

Theory (30 points)
1. This is a table of maintenance cost versus operation hours. You cannot use any Python packages.
    
a. Find a linear equation by least-squares method. Show all your work! (15 points)
b. Calculate MSE (Mean Squared Error). (5 points)
c. Calculate the R2 score. (5 points)
d. If operation hours increase two hours, how much maintenance cost will increase/decrease based on your regression result? (5 points)
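The four theory parts can be checked with a short plain-Python sketch that uses no packages. The data values are the ones listed in the posted solution further down, since the table itself did not survive on this page. The function name `fit_line` and the variable names are illustrative choices, not part of the assignment.

```python
# Operation hours (x) and maintenance cost (y), as listed in the posted solution.
op_hr = [18, 6, 30, 48, 6, 36, 18, 18, 30, 36]
main_cost = [25, 17, 48, 58, 23, 40, 30, 39, 40, 60]


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x (hypothetical helper)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # slope = S_xy / S_xx, intercept = mean_y - slope * mean_x
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# a. least-squares line
slope, intercept = fit_line(op_hr, main_cost)

# b. MSE = (sum of squared residuals) / n
pred = [intercept + slope * xi for xi in op_hr]
sse = sum((yi - pi) ** 2 for yi, pi in zip(main_cost, pred))
mse = sse / len(main_cost)

# c. R^2 = 1 - SSE / SST
mean_y = sum(main_cost) / len(main_cost)
sst = sum((yi - mean_y) ** 2 for yi in main_cost)
r2 = 1 - sse / sst

# d. change in predicted cost for two extra operation hours
increase_2h = 2 * slope

print("slope:", slope, "intercept:", intercept)
print("MSE:", mse, "R^2:", r2)
print("cost change for +2 hours:", increase_2h)
```

For part d, the intercept cancels when two predictions are subtracted, so the change for two extra hours depends only on the slope.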
Practice (70 points)
1. Using sklearn.linear_model, please make a linear regression model. (40 points)
a. Find “housing.csv” file and read it as data frame.
b. Choose “LSTAT” as x (explanatory variable) and “MEDV” as y (response variable). (5 points)
c. Do a linear regression and find intercept and slope. (5 points)
d. Draw scatter plot for all data and draw a line (scatter plot) you obtained in 1.c. on the same graph. (5 points)
e. Calculate R2 score. (5 points)
f. Do RANSAC with the same data set and find intercept and slope. Use “max_trials=100, min_samples=50” (10 points)
g. Draw scatter plot for all data (please distinguish between inliers and outliers) and draw a line you obtained in 1.f. on the same graph. (5 points)
h. Calculate R2 score for only inliers. (5 points)
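Steps 1.c, 1.e, 1.f and 1.h could be sketched as follows. This is only an outline under stated assumptions: “housing.csv” is not available here, so the arrays below are made-up synthetic numbers standing in for the LSTAT and MEDV columns, and `random_state=0` is added purely for reproducibility. The plotting steps (1.d, 1.g) are left out.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, RANSACRegressor
from sklearn.metrics import r2_score

# Synthetic stand-in for housing.csv (made-up numbers, NOT the real data).
# With the real file one would read it first, e.g. with pandas.read_csv.
rng = np.random.default_rng(0)
lstat = rng.uniform(2.0, 38.0, size=200)
medv = 34.0 - 0.9 * lstat + rng.normal(0.0, 2.0, size=200)
medv[:15] += 25.0  # a few artificial outliers

X = lstat.reshape(-1, 1)  # sklearn expects a 2-D feature array
y = medv

# 1.c / 1.e: ordinary least squares and its R^2 on all points
ols = LinearRegression().fit(X, y)
r2_all = r2_score(y, ols.predict(X))
print("OLS intercept:", ols.intercept_, "slope:", ols.coef_[0], "R^2:", r2_all)

# 1.f: RANSAC with the settings named in the assignment
ransac = RANSACRegressor(
    LinearRegression(), max_trials=100, min_samples=50, random_state=0
)
ransac.fit(X, y)
inliers = ransac.inlier_mask_   # boolean mask of points RANSAC kept
outliers = ~inliers
print("RANSAC intercept:", ransac.estimator_.intercept_,
      "slope:", ransac.estimator_.coef_[0])

# 1.h: R^2 computed on the inliers only
r2_inliers = r2_score(y[inliers], ransac.predict(X[inliers]))
print("R^2 (inliers only):", r2_inliers)
```

The two boolean masks are what part g needs in order to draw inliers and outliers in different colors on the same scatter plot.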
2. Using sklearn.linear_model, please make a linear regression model. (30 points)
a. Find “housing_hw.csv” file and read it as data frame.
b. Choose “LSTAT” and “RM” as x (explanatory variables) and “MEDV” as y (response variable). (5 points)
c. Do a (multivariate) linear regression and find optimal coefficients. (5 points)
d. Calculate R2 score. (5 points)
e. Now, please do polynomial regression with order=2 and find the optimal coefficients (5 points).
f. Calculate R2 score. (5 points)
g. Which variables are the most important to increase/decrease “MEDV”?
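A minimal sketch of steps 2.c to 2.f, again on synthetic stand-in data because “housing_hw.csv” is not available here. The generating formula and every number in it are invented for illustration; only the sklearn calls reflect what the assignment asks for.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in for the LSTAT, RM and MEDV columns (made-up numbers).
rng = np.random.default_rng(1)
n = 300
lstat = rng.uniform(2.0, 38.0, size=n)
rm = rng.normal(6.3, 0.7, size=n)
medv = (10.0 - 1.5 * lstat + 0.025 * lstat**2 + 5.0 * rm
        + rng.normal(0.0, 2.0, size=n))

X = np.column_stack([lstat, rm])  # two explanatory variables
y = medv

# 2.c / 2.d: multivariate linear regression and its R^2
lin = LinearRegression().fit(X, y)
r2_lin = lin.score(X, y)
print("linear coefficients:", lin.coef_, "intercept:", lin.intercept_)
print("linear R^2:", r2_lin)

# 2.e / 2.f: degree-2 polynomial regression
poly = PolynomialFeatures(degree=2, include_bias=False)
X_poly = poly.fit_transform(X)  # columns: LSTAT, RM, LSTAT^2, LSTAT*RM, RM^2
quad = LinearRegression().fit(X_poly, y)
r2_quad = quad.score(X_poly, y)
print("polynomial coefficients:", quad.coef_, "intercept:", quad.intercept_)
print("polynomial R^2:", r2_quad)
```

For part g, note that raw coefficients depend on each variable's units, so comparing their magnitudes directly can mislead. One common approach is to standardize the features before comparing coefficients.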
Answered Same Day Nov 17, 2021

Solution

Suraj answered on Nov 18 2021
{
"cells": [
{
"cell_type": "code",
"execution_count": 95,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Slope is 0.9523809523809523\n",
"intercept term is 14.571428571428573\n",
"Regression equation is 14.571428571428573 + 0.9523809523809523 x\n",
"Mean square e
or is 36.05714285714286\n",
"R-Square value is 0.8094231350045303\n",
"The maintenance cost is 16.476190476190478\n"
]
}
],
"source": [
"#1\n",
"import numpy as np\n",
"import itertools\n",
"op_hr=[18,6,30,48,6,36,18,18,30,36]\n",
"main_cost=[25,17,48,58,23,40,30,39,40,60]\n",
"mean_op_hr=np.mean(op_hr)\n",
"mean_main_cost=np.mean(main_cost)\n",
"sum1=0\n",
"for (i,j) in zip(op_hr,main_cost):\n",
" val=(i-mean_op_hr)*(j-mean_main_cost)\n",
" sum1=sum1+val\n",
"sum2=0\n",
"for i in op_hr:\n",
" val1=(i-mean_op_hr)**2\n",
" sum2=sum2+val1\n",
"slope=sum1/sum2\n",
"print(\"Slope is\",slope)\n",
"intercept=mean_main_cost-slope*mean_op_hr\n",
"print(\"intercept term is\",intercept)\n",
"print(\"Regression equation is\",intercept,\"+\",slope,\"x\")\n",
"#calculating mean square e
or\n",
"pred=[]\n",
"for i in op_hr:\n",
" predict=intercept+slope*i\n",
" pred.append(predict)\n",
"e
or=0\n",
"for (i,j) in zip(main_cost,pred):\n",
" val2=(i-j)**2\n",
" e
or=e
or+val2\n",
"mse=e
o
len(main_cost)\n",
"print(\"Mean square e
or is\",mse)\n",
"#R-square calculation\n",
"ss_total=0\n",
"for i in main_cost:\n",
" val3=(i-mean_main_cost)**2\n",
" ss_total=ss_total+val3\n",
"r_square=1-(e
o
ss_total)\n",
"print(\"R-Square value is\",r_square)\n",
"#Prediction when operation hours increases 2 hours\n",
"prediction=intercept+slope*2\n",
"print(\"The maintenance cost is\",prediction)"
]
},
{
"cell_type": "code",
"execution_count": 96,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"