Directions: Answer each question below in its entirety. Submit your answers through Canvas by the
due date. Your submission should be a Word document. Please round to the nearest hundredth in your calculations.
1.) Below is some output from a bivariate regression with respondents’ inflation-adjusted
personal incomes (in dollars) as the dependent variable and their educational attainment
(measured in years of schooling completed) as the independent variable. The mean income
for this sample is $34,554.85, and the mean years of schooling completed is XXXXXXXXXX years.
Please fill in blanks (a) through (c) and provide a sentence or two interpreting what each
number is telling us about the relationship between earnings and educational attainment.
Please show all of your work.
Analysis of Variance Table
Response: conrinc
Df Sum Sq Mean Sq F value Pr(>F)
educ 1 2.1145e+11 2.1145e XXXXXXXXXX
Residuals XXXXXXXXXX4851e+12 9.7637e+08
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) (a XXXXXXXXXXb) 4.85e-08 ***
educ XXXXXXXXXX
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 31250 on 1521 degrees of freedom
(1015 observations deleted due to missingness)
Multiple R-squared: (c), Adjusted R-squared: 0.1241
F-statistic: 216.6 on 1 and 1521 DF, p-value:
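As a reminder of how the pieces of this kind of output fit together, here is a minimal R sketch. It uses simulated data, not the survey data above: the variables x and y and every number in it are hypothetical. It only illustrates three identities that hold for any bivariate OLS fit: R-squared is the model sum of squares divided by the total sum of squares, the intercept equals the mean of y minus the slope times the mean of x, and each t value is the estimate divided by its standard error.

```r
# Simulated (hypothetical) data; NOT the income/education sample in the question
set.seed(1)
x <- rnorm(200, mean = 13, sd = 3)
y <- 5000 + 2500 * x + rnorm(200, sd = 20000)

fit   <- lm(y ~ x)
tab   <- anova(fit)                 # analysis of variance table
coefs <- summary(fit)$coefficients  # estimate, std. error, t value, p-value

# R-squared = model SS / (model SS + residual SS)
ss_model   <- tab["x", "Sum Sq"]
ss_resid   <- tab["Residuals", "Sum Sq"]
r2_by_hand <- ss_model / (ss_model + ss_resid)

# Intercept = mean(y) - slope * mean(x)
slope             <- coef(fit)[["x"]]
intercept_by_hand <- mean(y) - slope * mean(x)

# t value = estimate / standard error
t_by_hand <- coefs["(Intercept)", "Estimate"] / coefs["(Intercept)", "Std. Error"]

# Each comparison should report TRUE
all.equal(r2_by_hand, summary(fit)$r.squared)
all.equal(intercept_by_hand, coef(fit)[["(Intercept)"]])
all.equal(t_by_hand, coefs["(Intercept)", "t value"])
```

The same identities can be applied by hand to whatever numbers are visible in a printed table.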
2.) Below is some output from a bivariate regression with respondents’ educational attainment
(measured in years of schooling completed) as the dependent variable and their mother’s
educational attainment (also measured in years of schooling completed) as the independent
variable. Use the table to answer the following questions.
a. What is the regression equation? Please write it out, including the residual term.
b. What is the equation we use to predict educational attainment with mother’s
educational attainment? In other words, what do we have to do with the equation in
(a) in order to make predictions?
c. What is a person’s predicted years of schooling completed when their mother has a
10th grade education?
d. What is a person’s predicted years of schooling completed when their mother has a
high school diploma?
e. What is a person’s predicted years of schooling completed when their mother has a
four-year baccalaureate degree?
f. What is a person’s predicted years of schooling completed when their mother has
completed a two-year graduate degree on top of their four-year baccalaureate degree?
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept XXXXXXXXXX.40
maeduc XXXXXXXXXX41
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 2.734 on 2269 degrees of freedom
(267 observations deleted due to missingness)
Multiple R-squared: 0.1945, Adjusted R-squared: 0.1942
F-statistic: 547.9 on 1 and 2269 DF, p-value:
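The step from a regression equation to a prediction equation can be sketched in R as follows. The data are simulated, and the coefficients used to generate them (9 and 0.35) are hypothetical, not the values in the table above. The schooling values 10, 12, 16, and 18 assume one particular way of coding a 10th grade education, a high school diploma, a bachelor's degree, and a bachelor's plus a two-year graduate degree as years of schooling; check that coding against your course materials.

```r
# Simulated (hypothetical) data; NOT the sample behind the table above
set.seed(2)
maeduc <- sample(0:20, 300, replace = TRUE)
educ   <- 9 + 0.35 * maeduc + rnorm(300, sd = 2.7)

fit <- lm(educ ~ maeduc)
b   <- coef(fit)

# Assumed coding of the mother's schooling levels, in years
years <- c(10, 12, 16, 18)

# Prediction by hand: drop the residual term and plug in x
by_hand <- b[["(Intercept)"]] + b[["maeduc"]] * years

# The same predictions from predict()
by_predict <- predict(fit, newdata = data.frame(maeduc = years))

round(by_hand, 2)
all.equal(by_hand, unname(by_predict))  # should report TRUE
```

The point of the sketch is that a predicted value is just the intercept plus the slope times the chosen value of the independent variable, with no residual term.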
3.) Below is some output from a bivariate regression with respondents’ systolic blood pressure
readings (measured in millimeters of mercury) as the dependent variable and sex (where
0 = male and 1 = female) as the independent variable. Use the table to answer the following questions.
a. What is the regression equation? Please write it out, including the residual term.
b. What is the equation we use to predict systolic blood pressure with sex? In other
words, what do we have to do with the equation in (a) in order to make predictions?
c. What is the predicted systolic blood pressure for an average male?
d. What is the predicted systolic blood pressure for an average female?
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept XXXXXXXXXX XXXXXXXXXX
Female XXXXXXXXXX319
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 23.26 on 10335 degrees of freedom
Multiple R-squared: XXXXXXXXXX, Adjusted R-squared: XXXXXXXXXX
F-statistic: 69.21 on 1 and 10335 DF, p-value:
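When the only independent variable is a 0/1 indicator, the coefficients have a direct reading in terms of group means. The R sketch below illustrates this with simulated data. The variable names female and sbp and all numbers are hypothetical, not the values in the table above.

```r
# Simulated (hypothetical) data; NOT the blood pressure sample above
set.seed(3)
female <- rbinom(500, 1, 0.5)                      # 0 = male, 1 = female
sbp    <- 130 - 4 * female + rnorm(500, sd = 23)

fit <- lm(sbp ~ female)
b   <- coef(fit)

mean_male   <- mean(sbp[female == 0])
mean_female <- mean(sbp[female == 1])

# Intercept = mean of the group coded 0
all.equal(b[["(Intercept)"]], mean_male)                   # should report TRUE
# Intercept + slope = mean of the group coded 1
all.equal(b[["(Intercept)"]] + b[["female"]], mean_female) # should report TRUE
```

In other words, the slope on the indicator is the difference between the two group means.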
4.) Below is some output from a bivariate regression with respondents’ systolic blood pressure
readings (measured in millimeters of mercury) as the dependent variable and weight (in
kilograms) as the independent variable.
Please use whatever information you think is necessary to explain the relationship between
systolic blood pressure and weight that we should expect to see in the U.S. adult population.
Be sure to address any statistical and substantive significance you may see, as well as how
well the model “fits” the data (that is, how accurate our predictions appear to be). Your
answer should be in paragraph form. Remember, the “residual standard error” reported
below is the same thing as the “root MSE” in your slides.
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept XXXXXXXXXX94.58
weight XXXXXXXXXX32
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 22.37 on 10335 degrees of freedom
Multiple R-squared: XXXXXXXXXX, Adjusted R-squared: XXXXXXXXXX
F-statistic: 919.2 on 1 and 10335 DF, p-value:
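To see where the residual standard error (root MSE) comes from and one way to think about substantive significance, here is a short R sketch. The data are simulated and every number is hypothetical, not the values in the table above. The 10 kg comparison is just one illustrative choice of a meaningful difference in weight.

```r
# Simulated (hypothetical) data; NOT the blood pressure sample above
set.seed(4)
weight <- rnorm(400, mean = 75, sd = 15)            # kilograms
sbp    <- 95 + 0.4 * weight + rnorm(400, sd = 22)   # mm Hg

fit <- lm(sbp ~ weight)
s   <- summary(fit)

# Root MSE by hand: sqrt(residual sum of squares / residual df)
root_mse <- sqrt(sum(residuals(fit)^2) / df.residual(fit))
all.equal(root_mse, s$sigma)   # should report TRUE

# Substantive size: predicted difference in blood pressure per 10 kg
change_10kg <- 10 * coef(fit)[["weight"]]
round(change_10kg, 2)

# Share of variance explained
round(s$r.squared, 2)
```

A slope can be statistically significant in a large sample while the root MSE stays large relative to the predicted differences, which is why fit and substantive size are worth discussing separately.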
5.) You are a researcher in the automobile industry studying fuel efficiency differences between
vintage cars. You are interested in the following research question: “Did fuel efficiency
among vintage cars vary depending on vehicle weight?”
You have a random sample of 74 vintage cars from 1978. The dependent variable, “mpg,” is
an interval-ratio variable indicating the miles per gallon of that vehicle. The independent
variable, “weight,” is the weight of the vehicle in pounds. Carry out a simple OLS regression
to address this research question.
Carry out the following commands to load the dataset, called “auto”:
install.packages("foreign")
library(foreign)
auto
After a few seconds or so, you should see a new dataset in your RStudio global environment
called “auto.” Once you have done this, carry out the following commands:
auto.model <- lm(A ~ B, data = auto)
summary(auto.model)
where A is the name of the dependent variable and B is the name of the independent
variable.
Copy and paste the R output showing the regression analysis into your write-up. After you
do that, use whatever information you think is necessary to address your research question.
You should make reference to at least the following statistics in your write-up: the slope
coefficient, the t-statistic and p-value associated with the slope coefficient, the y-intercept,
and R-squared.
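If you want to pull these statistics out of the model object rather than read them off the printed summary, something like the following sketch may help. It uses R's built-in mtcars dataset as a stand-in, not the auto dataset from this question. In mtcars, wt is weight in thousands of pounds, so the slope is on a different scale than it would be for weight in pounds. The object names cars.model and cars.summary are illustrative.

```r
# Stand-in example with the built-in mtcars data; NOT the auto dataset
cars.model   <- lm(mpg ~ wt, data = mtcars)
cars.summary <- summary(cars.model)
coefs        <- cars.summary$coefficients

slope     <- coefs["wt", "Estimate"]           # slope coefficient
slope_t   <- coefs["wt", "t value"]            # t-statistic for the slope
slope_p   <- coefs["wt", "Pr(>|t|)"]           # p-value for the slope
intercept <- coefs["(Intercept)", "Estimate"]  # y-intercept
r_squared <- cars.summary$r.squared            # R-squared

# Round to the nearest hundredth, as the directions ask
round(c(slope = slope, t = slope_t, intercept = intercept, r2 = r_squared), 2)
```

The same extraction steps should work on auto.model once the dependent and independent variable names are substituted for A and B.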