{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "# Multiple Linear Regression\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load Libraries" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "\n", "import pandas as pd\n", "from statsmodels.formula.api import ols\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load and Verify Data" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
   GPA  Income  Sleep  Time  Grade
0  2.9   82461    6.5    47     77
1  3.7   61113    6.2    47     94
2  2.8   63632    6.2    39     69
3  2.0   66854    7.2    49     81
4  2.8   82721    5.5    49     78
\n", "
" ], "text/plain": [ " GPA Income Sleep Time Grade\n", "0 2.9 82461 6.5 47 77\n", "1 3.7 61113 6.2 47 94\n", "2 2.8 63632 6.2 39 69\n", "3 2.0 66854 7.2 49 81\n", "4 2.8 82721 5.5 49 78" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.read_csv(\"data/academicperformance.csv\")\n", "df.head()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Multiple Linear Regression" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "mlr = ols('Grade ~ GPA + Sleep + Time', df).fit()" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", " \n", "\n", "\n", " \n", "\n", "\n", " \n", "\n", "\n", " \n", "\n", "\n", " \n", "\n", "\n", " \n", "\n", "\n", " \n", "\n", "\n", " \n", "\n", "\n", " \n", "\n", "
OLS Regression Results
Dep. Variable:              Grade    R-squared:                0.891
Model:                        OLS    Adj. R-squared:           0.891
Method:             Least Squares    F-statistic:              5653.
Date:            Mon, 06 Dec 2021    Prob (F-statistic):        0.00
Time:                    13:56:31    Log-Likelihood:         -6191.4
No. Observations:            2077    AIC:                  1.239e+04
Df Residuals:                2073    BIC:                  1.241e+04
Df Model:                       3
Covariance Type:        nonrobust
\n", "\n", "\n", " \n", "\n", "\n", " \n", "\n", "\n", " \n", "\n", "\n", " \n", "\n", "\n", " \n", "\n", "
                coef    std err          t      P>|t|      [0.025      0.975]
Intercept   -39.7098      0.879    -45.179      0.000     -41.434     -37.986
GPA           9.0992      0.136     67.065      0.000       8.833       9.365
Sleep         7.2070      0.104     69.500      0.000       7.004       7.410
Time          1.0580      0.011     95.102      0.000       1.036       1.080
\n", "\n", "\n", " \n", "\n", "\n", " \n", "\n", "\n", " \n", "\n", "\n", " \n", "\n", "
Omnibus:            1.358    Durbin-Watson:        1.941
Prob(Omnibus):      0.507    Jarque-Bera (JB):     1.305
Skew:              -0.014    Prob(JB):             0.521
Kurtosis:           3.120    Cond. No.              344.


Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified." ], "text/plain": [ "\n", "\"\"\"\n", " OLS Regression Results \n", "==============================================================================\n", "Dep. Variable: Grade R-squared: 0.891\n", "Model: OLS Adj. R-squared: 0.891\n", "Method: Least Squares F-statistic: 5653.\n", "Date: Mon, 06 Dec 2021 Prob (F-statistic): 0.00\n", "Time: 13:56:31 Log-Likelihood: -6191.4\n", "No. Observations: 2077 AIC: 1.239e+04\n", "Df Residuals: 2073 BIC: 1.241e+04\n", "Df Model: 3 \n", "Covariance Type: nonrobust \n", "==============================================================================\n", " coef std err t P>|t| [0.025 0.975]\n", "------------------------------------------------------------------------------\n", "Intercept -39.7098 0.879 -45.179 0.000 -41.434 -37.986\n", "GPA 9.0992 0.136 67.065 0.000 8.833 9.365\n", "Sleep 7.2070 0.104 69.500 0.000 7.004 7.410\n", "Time 1.0580 0.011 95.102 0.000 1.036 1.080\n", "==============================================================================\n", "Omnibus: 1.358 Durbin-Watson: 1.941\n", "Prob(Omnibus): 0.507 Jarque-Bera (JB): 1.305\n", "Skew: -0.014 Prob(JB): 0.521\n", "Kurtosis: 3.120 Cond. No. 
344.\n", "==============================================================================\n", "\n", "Notes:\n", "[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.\n", "\"\"\"" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mlr.summary()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Predictions" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "data = {'GPA':[3,3,3,2,3,4,2.5,2.5,2.5],\n", " 'Sleep':[5,6,7,6,6,6,5,5,5],\n", " 'Time':[30,30,30,30,30,30,40,50,60]}\n", "df_predict = pd.DataFrame(data)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "df_predict['Grade'] = mlr.predict(df_predict).round(1)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
   GPA  Sleep  Time  Grade
0  3.0      5    30   55.4
1  3.0      6    30   62.6
2  3.0      7    30   69.8
3  2.0      6    30   53.5
4  3.0      6    30   62.6
5  4.0      6    30   71.7
6  2.5      5    40   61.4
7  2.5      5    50   72.0
8  2.5      5    60   82.6
\n", "
" ], "text/plain": [ " GPA Sleep Time Grade\n", "0 3.0 5 30 55.4\n", "1 3.0 6 30 62.6\n", "2 3.0 7 30 69.8\n", "3 2.0 6 30 53.5\n", "4 3.0 6 30 62.6\n", "5 4.0 6 30 71.7\n", "6 2.5 5 40 61.4\n", "7 2.5 5 50 72.0\n", "8 2.5 5 60 82.6" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_predict" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.11" } }, "nbformat": 4, "nbformat_minor": 4 }