SLR - Minimal

# import libraries
import pandas as pd
from statsmodels.formula.api import ols

# load dataset and create dataframe
df = pd.read_csv('data/edincome.csv').round(1)


# run regression
slr = ols('Income ~ Education', data=df).fit()

# print results
slr.summary()
OLS Regression Results
Dep. Variable:      Income              R-squared:            0.878
Model:              OLS                 Adj. R-squared:       0.875
Method:             Least Squares       F-statistic:          238.4
Date:               Tue, 25 Jan 2022    Prob (F-statistic):   1.17e-16
Time:               22:45:38            Log-Likelihood:       -119.61
No. Observations:   35                  AIC:                  243.2
Df Residuals:       33                  BIC:                  246.3
Df Model:           1
Covariance Type:    nonrobust

                coef   std err         t    P>|t|    [0.025    0.975]
Intercept   -23.1764     5.918    -3.917    0.000   -35.216   -11.137
Education     5.5742     0.361    15.440    0.000     4.840     6.309

Omnibus:          2.854    Durbin-Watson:       2.535
Prob(Omnibus):    0.240    Jarque-Bera (JB):    1.726
Skew:             0.502    Prob(JB):            0.422
Kurtosis:         3.420    Cond. No.            75.8

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
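
The summary columns are related: each t value is the coefficient divided by its standard error, and in a regression with a single predictor the F-statistic equals the square of the slope's t value. A minimal sketch of that check, assuming the rounded figures printed in the table (the variable names are illustrative and not part of statsmodels):

```python
# Values copied from the summary table (already rounded there,
# so the results only match the reported figures approximately)
coef_education = 5.5742
std_err_education = 0.361
f_statistic = 238.4

# t value for the slope: coefficient / standard error
t_stat = coef_education / std_err_education
print(round(t_stat, 2))       # close to the reported t of 15.440

# with one predictor, F is the square of the slope's t value
print(round(t_stat ** 2, 1))  # close to the reported F-statistic
```

Small differences from the table come from the rounding of the standard error.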
# predict new points
data = {'Education': [12,16,18]}
df_predict = pd.DataFrame(data).round(1)

df_predict['Income'] = slr.predict(df_predict).round(1)
df_predict
   Education  Income
0         12    43.7
1         16    66.0
2         18    77.2
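
As a sanity check, these predictions can be reproduced by hand from the fitted line, Income = Intercept + slope × Education. A minimal sketch, assuming the rounded coefficients printed in the regression summary (the names `intercept`, `slope` and `predict_income` are illustrative and not part of statsmodels):

```python
# Coefficients copied from the regression summary (rounded to 4 decimals)
intercept = -23.1764
slope = 5.5742

def predict_income(education):
    """Evaluate the fitted line at one Education value, rounded to 1 decimal."""
    return round(intercept + slope * education, 1)

# same Education values as in df_predict
manual = [predict_income(x) for x in [12, 16, 18]]
print(manual)  # [43.7, 66.0, 77.2]
```

The hand-computed values agree with the `Income` column returned by `slr.predict`.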