{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Multiple Linear Regression"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Load Libraries"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"from statsmodels.formula.api import ols"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Load and Verify Data"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"   GPA  Income  Sleep  Time  Grade\n",
"0  2.9   82461    6.5    47     77\n",
"1  3.7   61113    6.2    47     94\n",
"2  2.8   63632    6.2    39     69\n",
"3  2.0   66854    7.2    49     81\n",
"4  2.8   82721    5.5    49     78"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df = pd.read_csv(\"data/academicperformance.csv\")\n",
"df.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Fit the Model"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"# Regress Grade on GPA, Sleep, and Time; Income is not used as a predictor\n",
"mlr = ols('Grade ~ GPA + Sleep + Time', data=df).fit()"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\n",
"\"\"\"\n",
"                            OLS Regression Results\n",
"==============================================================================\n",
"Dep. Variable:                  Grade   R-squared:                       0.891\n",
"Model:                            OLS   Adj. R-squared:                  0.891\n",
"Method:                 Least Squares   F-statistic:                     5653.\n",
"Date:                Mon, 06 Dec 2021   Prob (F-statistic):               0.00\n",
"Time:                        13:56:31   Log-Likelihood:                -6191.4\n",
"No. Observations:                2077   AIC:                         1.239e+04\n",
"Df Residuals:                    2073   BIC:                         1.241e+04\n",
"Df Model:                           3\n",
"Covariance Type:            nonrobust\n",
"==============================================================================\n",
"                 coef    std err          t      P>|t|      [0.025      0.975]\n",
"------------------------------------------------------------------------------\n",
"Intercept    -39.7098      0.879    -45.179      0.000     -41.434     -37.986\n",
"GPA            9.0992      0.136     67.065      0.000       8.833       9.365\n",
"Sleep          7.2070      0.104     69.500      0.000       7.004       7.410\n",
"Time           1.0580      0.011     95.102      0.000       1.036       1.080\n",
"==============================================================================\n",
"Omnibus:                        1.358   Durbin-Watson:                   1.941\n",
"Prob(Omnibus):                  0.507   Jarque-Bera (JB):                1.305\n",
"Skew:                          -0.014   Prob(JB):                        0.521\n",
"Kurtosis:                       3.120   Cond. No.                         344.\n",
"==============================================================================\n",
"\n",
"Notes:\n",
"[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.\n",
"\"\"\""
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"mlr.summary()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Predictions"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"# Scenarios: rows 0-2 vary Sleep, rows 3-5 vary GPA, rows 6-8 vary Time\n",
"data = {'GPA': [3, 3, 3, 2, 3, 4, 2.5, 2.5, 2.5],\n",
"        'Sleep': [5, 6, 7, 6, 6, 6, 5, 5, 5],\n",
"        'Time': [30, 30, 30, 30, 30, 30, 40, 50, 60]}\n",
"df_predict = pd.DataFrame(data)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"# Store the model's predicted grade for each scenario, rounded to one decimal\n",
"df_predict['Grade'] = mlr.predict(df_predict).round(1)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"   GPA  Sleep  Time  Grade\n",
"0  3.0      5    30   55.4\n",
"1  3.0      6    30   62.6\n",
"2  3.0      7    30   69.8\n",
"3  2.0      6    30   53.5\n",
"4  3.0      6    30   62.6\n",
"5  4.0      6    30   71.7\n",
"6  2.5      5    40   61.4\n",
"7  2.5      5    50   72.0\n",
"8  2.5      5    60   82.6"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_predict"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.11"
}
},
"nbformat": 4,
"nbformat_minor": 4
}