SLR - Minimal

# import libraries
import pandas as pd
from statsmodels.formula.api import ols

# load dataset and create dataframe
df = pd.read_csv('data/edincome.csv').round(1)


# run regression
slr = ols('Income ~ Education', data=df).fit()

# print results
slr.summary()

OLS Regression Results
Dep. Variable:	Income	R-squared:	0.878
Model:	OLS	Adj. R-squared:	0.875
Method:	Least Squares	F-statistic:	238.4
Date:	Tue, 25 Jan 2022	Prob (F-statistic):	1.17e-16
Time:	22:45:38	Log-Likelihood:	-119.61
No. Observations:	35	AIC:	243.2
Df Residuals:	33	BIC:	246.3
Df Model:	1
Covariance Type:	nonrobust

	coef	std err	t	P>|t|	[0.025	0.975]
Intercept	-23.1764	5.918	-3.917	0.000	-35.216	-11.137
Education	5.5742	0.361	15.440	0.000	4.840	6.309

Omnibus:	2.854	Durbin-Watson:	2.535
Prob(Omnibus):	0.240	Jarque-Bera (JB):	1.726
Skew:	0.502	Prob(JB):	0.422
Kurtosis:	3.420	Cond. No.	75.8

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
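
The columns of the coefficient table are related to one another: each t value is the coefficient divided by its standard error, and in a regression with a single predictor the F-statistic equals the square of the slope's t value. The sketch below checks this using only numbers copied from the summary above. Because the printed values are rounded, the results agree only approximately, and the variable names are illustrative rather than part of statsmodels.

```python
# Values copied from the summary table above (rounded as printed).
coef = {"Intercept": -23.1764, "Education": 5.5742}
std_err = {"Intercept": 5.918, "Education": 0.361}

# t statistic = coefficient / standard error
t_stats = {name: coef[name] / std_err[name] for name in coef}

# With one predictor, F equals the squared t statistic of the slope.
f_from_t = t_stats["Education"] ** 2

for name, t in t_stats.items():
    print(f"{name}: t = {t:.2f}")
print(f"F (from t^2) = {f_from_t:.1f}")
```

On a fitted model, the same quantities should be available directly as `slr.params`, `slr.bse`, `slr.tvalues`, and `slr.fvalue`.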

# predict new points (Education values are whole numbers, so no rounding is needed)
data = {'Education': [12, 16, 18]}
df_predict = pd.DataFrame(data)

df_predict['Income'] = slr.predict(df_predict).round(1)
df_predict

	Education	Income
0	12	43.7
1	16	66.0
2	18	77.2
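
These predictions can be reproduced by hand from the fitted line, Income = Intercept + Education coefficient × Education. The sketch below hard-codes the two coefficients as printed in the summary; `predict_income` is a hypothetical helper written for illustration, not a statsmodels function.

```python
# Coefficients copied from the regression summary above.
intercept = -23.1764
slope = 5.5742

def predict_income(education_years):
    """Evaluate the fitted line and round to one decimal, as in the table."""
    return round(intercept + slope * education_years, 1)

for years in [12, 16, 18]:
    print(years, predict_income(years))
```

Computing the values this way makes it clear that `slr.predict` simply evaluates the fitted line at the new Education values.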

