Answered Same Day Dec 21, 2021

Solution

Subhanbasha answered on Dec 21 2021
BMB 620
## calling packages
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(stringr)
library(tidyr)
library(lubridate)
##
## Attaching package: 'lubridate'
## The following objects are masked from 'package:base':
##
## date, intersect, setdiff, union
library(ggplot2)
library(pwr)
## Warning: package 'pwr' was built under R version 4.1.2
Question 1
Model 1
## Reading data into R
state_dat <- data.frame(state.x77)

## column names of the data
colnames(state_dat)
## [1] "Population" "Income" "Illiteracy" "Life.Exp" "Murder"
## [6] "HS.Grad" "Frost" "Area"
colnames(state_dat)[6] <- 'HS_Grad'
# Regression model
model1 <- lm(Population~Area+Frost,data =state_dat )

# Model summary
summary(model1)
##
## Call:
## lm(formula = Population ~ Area + Frost, data = state_dat)
##
## Residuals:
## Min 1Q Median 3Q Max
## -6238 -2504 -1320 1096 14334
##
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept)  7.092e+03  1.442e+03   4.917 1.11e-05 ***
## Area         2.217e-03  7.204e-03   0.308   0.7597
## Frost       -2.874e+01  1.183e+01  -2.431   0.0189 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4295 on 47 degrees of freedom
## Multiple R-squared: 0.1121, Adjusted R-squared: 0.07433
## F-statistic: 2.967 on 2 and 47 DF, p-value: 0.06115
# Plots
plot(model1)
# Model 1 uses Population as the dependent variable and Area and Frost as the independent variables.
# In the model summary, only Frost is significant at the 5% level (p = 0.0189); Area is not (p = 0.7597).
# The R-squared is only about 11% (adjusted about 7%), and the overall F-test is not significant at the
# 5% level (p = 0.06115), so the model is a poor fit and needs improvement.
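The significance statements above can also be read off the fitted object rather than the printed summary. The following is a minimal sketch, assuming a 5% threshold; the names `s`, `coefs`, `sig`, `f` and `f_p` are illustrative and not part of the original answer.

```r
# Rebuild Model 1 from the built-in state.x77 data so the block runs on its own
state_dat <- data.frame(state.x77)
model1 <- lm(Population ~ Area + Frost, data = state_dat)

# Coefficient table as a numeric matrix (estimate, std. error, t value, p-value)
s <- summary(model1)
coefs <- coef(s)

# Terms whose p-value is below the assumed 5% threshold
sig <- rownames(coefs)[coefs[, "Pr(>|t|)"] < 0.05]
print(sig)

# p-value of the overall F-test, computed from the stored F statistic
f <- s$fstatistic
f_p <- pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE)
print(unname(f_p))

# R-squared and adjusted R-squared as plain numbers
print(c(r_squared = s$r.squared, adj_r_squared = s$adj.r.squared))
```

Working from `coef(summary(model1))` avoids copying numbers by hand when the same check is repeated for a second model.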
# Assumptions: The Normal Q-Q plot clearly shows that the residuals do not follow a normal distribution,
# which is one of the main assumptions of linear regression. The other three diagnostic plots show some
# outliers that affect the model and should be removed. So the model assumptions are not satisfied.
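The visual reading of the diagnostic plots could be backed up numerically. The sketch below is one possible approach, not the original author's method: it runs a Shapiro-Wilk test on the residuals and flags influential states by Cook's distance. The 4/n cutoff is a common rule of thumb and an assumption here, as are the names `sw`, `cd`, `cutoff` and `influential`.

```r
# Rebuild Model 1 from the built-in state.x77 data so the block runs on its own
state_dat <- data.frame(state.x77)
model1 <- lm(Population ~ Area + Frost, data = state_dat)

# Shapiro-Wilk test on the residuals: a small p-value suggests non-normality
sw <- shapiro.test(residuals(model1))
print(sw)

# Cook's distance for every state, named by row (state) name
cd <- cooks.distance(model1)

# Assumed rule-of-thumb cutoff of 4 / n for flagging influential points
cutoff <- 4 / nrow(state_dat)
influential <- names(cd)[cd > cutoff]
print(influential)
```

Any state listed in `influential` is a candidate for closer inspection before deciding whether to drop it and refit.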
Model 2
# Second model
# Regression model
model2 <- lm(Population~Area+Frost+Illiteracy+HS_Grad,data =state_dat )

# Model summary
summary(model2)
##
## Call:
## lm(formula = Population ~ Area + Frost + Illiteracy + HS_Grad,
## data = state_dat)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5843.6 -2511.5 -1107.4 771.1 13639.7
##
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept)  1.882e+04  8.598e+03   2.189  0.03386 *
## Area         8.560e-03  8.633e-03   0.991  0.32676
## Frost       -4.623e+01  1.654e+01  -2.795  0.00761 **
## Illiteracy  -3.063e+03  1.911e+03  -1.603  0.11601
## HS_Grad     -1.274e+02  1.204e+02  -1.058  0.29558
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4269 on 45 degrees of freedom
## Multiple R-squared: 0.1602, Adjusted R-squared: 0.08553
## F-statistic: 2.146 on 4 and 45 DF, p-value: 0.09064
# Plots
plot(model2)
# Model 2 uses Population as the dependent variable and Area, Frost, Illiteracy and HS_Grad as the
# independent variables. In the model summary, only Frost is significant at the 5% level (p = 0.00761);
# Area, Illiteracy and HS_Grad are not. The R-squared is only about 16% (adjusted about 8.6%), and the
# overall F-test is not significant at the 5% level (p = 0.09064), so the model is a poor fit and needs improvement.
# Assumptions: The Normal Q-Q plot clearly shows that the residuals do not follow a normal distribution,
# which is one of the main assumptions of linear regression. The other three diagnostic plots show some
# outliers that affect the model and should be removed. So the model assumptions are not satisfied.
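# Comparing the two models, the second performs slightly better than the first: R-squared rises from
# 0.1121 to 0.1602 and adjusted R-squared from 0.07433 to 0.08553, although neither overall F-test is
# significant at the 5% level.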
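Because Model 1 is nested inside Model 2, the comparison can also be made formally. The sketch below is an illustration under that assumption, not the original author's procedure: it runs a partial F-test with `anova()` and tabulates R-squared, adjusted R-squared and AIC side by side. The names `cmp` and `fit_stats` are hypothetical.

```r
# Rebuild both models from the built-in state.x77 data so the block runs on its own
state_dat <- data.frame(state.x77)
colnames(state_dat)[6] <- "HS_Grad"
model1 <- lm(Population ~ Area + Frost, data = state_dat)
model2 <- lm(Population ~ Area + Frost + Illiteracy + HS_Grad, data = state_dat)

# Partial F-test: do Illiteracy and HS_Grad jointly improve on Model 1?
cmp <- anova(model1, model2)
print(cmp)

# Side-by-side fit statistics (lower AIC indicates a better trade-off of fit and size)
fit_stats <- data.frame(
  model = c("model1", "model2"),
  r_squared = c(summary(model1)$r.squared, summary(model2)$r.squared),
  adj_r_squared = c(summary(model1)$adj.r.squared, summary(model2)$adj.r.squared),
  AIC = c(AIC(model1), AIC(model2))
)
print(fit_stats)
```

Plain R-squared never decreases when predictors are added, so adjusted R-squared, AIC or the partial F-test are the safer basis for calling one model better.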
Question 2
r <- seq(0.1,0.5,0.1)
nr <- length(r)
p...