DEPARTMENT OF ECONOMICS
ECON 4041H – RESEARCH METHODOLOGY
Fall 2021, Peterborough
Assignment #4
Due date: December 10, 2021
Instructions: You must provide your own unique solution. You may work with others, but each of
you is responsible for submitting your own problem set solution. Each question is 30
marks and each part is of equal value. Submission of one file knit from RMarkdown
is best, but acceptable alternatives are allowed.
1. Re-estimate the models from assignment 3, question 2, but now use the log(wage) as the
dependent variable. Include the same set of covariates/explanatory variables: age, age^2,
sex, educational attainment, sector of employment, collective agreement status, firm size,
immigrant status and province. In assessing the question, here are items to consider:
a. Should age be in log form? Note that none of the other explanatory variables are
numeric, so only consider the issue of the log-transformation for the numeric transfor-
mation of age.
b. Which model format fits best, with or without the log transformed wage as dependent
variable?
c. What does the model predict for unemployment rates across provinces?
d. Does immigrant status interact with the other explanatory variables in explaining wage
differences? To analyze, first estimate the model from assignment 3, question 2, part c.
Then generate predicted values of wages using emmeans() with the “type = ‘response’ ”
parameter to convert from log-transformed values back into dollars. Do this for each
pair of variables interacted with immigrant status, but no need to include age, age^2 or
province. Plots really help visualize these effects.
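For parts a and b, one way to compare specifications is sketched below. It is only an illustration on simulated data: the data frame `sim`, its columns, and every coefficient are made up and are not the LFS file. The key point is that the AIC of a log(wage) model is measured on the log scale, so it is not directly comparable to the AIC of a wage model until the Jacobian term 2 * sum(log(wage)) is added.

```r
set.seed(1)
n <- 1000

# Simulated stand-in for the LFS data; all names and values are hypothetical.
sim <- data.frame(
  age = sample(20:64, n, replace = TRUE),
  im2 = factor(sample(c("Immigrant", "Non-immigrant"), n, replace = TRUE))
)

# Wages generated with multiplicative (log-normal) errors.
sim$wage <- exp(2.5 + 0.04 * sim$age - 0.0004 * sim$age^2 +
                0.05 * (sim$im2 == "Non-immigrant") + rnorm(n, sd = 0.4))

level_mod <- lm(wage ~ age + I(age^2) + im2, data = sim)
log_mod   <- lm(log(wage) ~ age + I(age^2) + im2, data = sim)

# Part b: put both models on the wage scale before comparing.
# log-likelihood on the wage scale = log-likelihood on the log scale - sum(log(wage)),
# so the AIC of the log model increases by 2 * sum(log(wage)).
aic_level <- AIC(level_mod)
aic_log   <- AIC(log_mod) + 2 * sum(log(sim$wage))
c(level = aic_level, log = aic_log)

# Part a: models with the same dependent variable can be compared directly.
log_age_mod <- lm(log(wage) ~ log(age) + im2, data = sim)
c(quadratic_age = AIC(log_mod), log_age = AIC(log_age_mod))
```

The lower AIC indicates the better-fitting specification. With real data the ranking is an empirical question; the simulated wages here are log-normal by construction, so the log model is expected to win.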
2. Predict the impact of immigrant status on the probability of unemployment.
• You need to define the unemployment variable. The dataset contains the variable lfsstat
which has four categories, two identifying employed (at work, or absent from work),
one for unemployed, and one for those not in the labour force. Recode this variable by
defining a new variable, unemploy, as:
\[
\text{unemploy} =
\begin{cases}
\text{TRUE} & \text{for those unemployed} \\
\text{FALSE} & \text{for those employed (2 categories)} \\
\text{NA} & \text{for those not in the labour force}
\end{cases}
\tag{1}
\]
ECON 4041H - Assignment 4
• Estimate your model using a subset of the variables from question 1: age, age^2, sex,
educational attainment, sector of employment, immigrant status and province.[1]
• For ease of analysis, recode the variable cowmain into a new variable sector with
three categories: public, private, and self-employed. Drop the category "Unpaid family
worker" by setting it to NA.
• Since the variable unemploy takes on two values: TRUE and FALSE, estimate a logit
model. Use the glm() function with the option “family = binomial” which yields a logit
prediction model.
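The recoding and estimation steps above might look roughly like the following sketch. It runs on simulated data, and the category labels for lfsstat and cowmain are assumptions, not the actual LFS codes; check levels(lfs_df$lfsstat) and levels(lfs_df$cowmain) before adapting it.

```r
set.seed(2)
n <- 2000

# Simulated stand-in; the labels below are hypothetical.
sim <- data.frame(
  lfsstat = sample(c("Employed, at work", "Employed, absent from work",
                     "Unemployed", "Not in labour force"),
                   n, replace = TRUE, prob = c(0.55, 0.05, 0.08, 0.32)),
  cowmain = sample(c("Public sector employees", "Private sector employees",
                     "Self-employed incorporated", "Unpaid family worker"),
                   n, replace = TRUE, prob = c(0.20, 0.60, 0.18, 0.02)),
  age = sample(20:64, n, replace = TRUE),
  im2 = sample(c("Immigrant", "Non-immigrant"), n, replace = TRUE),
  stringsAsFactors = FALSE
)

# unemploy: TRUE if unemployed, FALSE if employed (either category),
# NA if not in the labour force.
sim$unemploy <- ifelse(sim$lfsstat == "Not in labour force", NA,
                       sim$lfsstat == "Unemployed")

# sector: public, private, self-employed; unpaid family workers become NA.
sim$sector <- factor(ifelse(
  sim$cowmain == "Unpaid family worker", NA_character_,
  ifelse(grepl("^Public", sim$cowmain), "public",
         ifelse(grepl("^Private", sim$cowmain), "private", "self-employed"))
))

# Logit model; glm() drops rows with NA in any model variable by default.
logit_mod <- glm(unemploy ~ age + I(age^2) + im2 + sector,
                 data = sim, family = binomial)
summary(logit_mod)
```

The full model in the assignment would also include sex, educational attainment and province; they are left out here only to keep the simulated data small.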
In assessing the question, here are items to consider:
a. Should age (and age^2) be in log form? Note that none of the other explanatory vari-
ables are numeric, so only consider the issue of the log-transformation for the numeric
version of age.
b. What does the model predict for unemployment rates across provinces?
c. Does immigrant status interact with the other explanatory variables in explaining dif-
ferences in the probabilities of unemployment? To analyze, first estimate a model with
immigrant status interacted with the other categorical explanatory variables (not age
or province), then generate predicted probabilities of unemployment using emmeans()
with the “type = ‘response’ ” parameter to convert log-odds into probabilities. Do this
for each pair of immigrant status-educational attainment, immigrant status-sex, and
immigrant status-sector of employment. Plots really help visualize these effects.
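As an illustration of part c, the sketch below interacts immigrant status with sex in a logit model and uses emmeans() with type = "response" to report probabilities instead of log-odds. The data are simulated and every effect size is made up; only the pattern of calls is meant to carry over.

```r
library(emmeans)

set.seed(3)
n <- 3000

# Simulated stand-in; names and effect sizes are hypothetical.
sim <- data.frame(
  im2 = factor(sample(c("Immigrant", "Non-immigrant"), n, replace = TRUE)),
  sex = factor(sample(c("Female", "Male"), n, replace = TRUE)),
  age = sample(20:64, n, replace = TRUE)
)

# Made-up unemployment probabilities with an immigrant-by-sex gap.
p <- plogis(-2.5 + 0.4 * (sim$im2 == "Immigrant") +
            0.3 * (sim$im2 == "Immigrant") * (sim$sex == "Female"))
sim$unemploy <- rbinom(n, 1, p) == 1

int_mod <- glm(unemploy ~ age + I(age^2) + im2 * sex,
               data = sim, family = binomial)

# type = "response" back-transforms the estimates from log-odds to probabilities.
emm <- emmeans(int_mod, ~ im2 * sex, type = "response")
emm_df <- as.data.frame(summary(emm))
emm_df

# Optional interaction plot of the predicted probabilities (needs ggplot2):
# emmip(int_mod, im2 ~ sex, type = "response")
```

The same pattern applies to the other pairs by swapping sex for educational attainment or sector of employment in both the model formula and the emmeans() specification.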
3. Use the General Social Survey Canada 2016 dataset “gssA4Q3.rds” to test whether
money can buy happiness. Specifically, does happiness increase with income? Include ap-
propriate covariates that might otherwise explain happiness. The file contains the following
variables:
• hap5: happiness, categorical: 1–least happy through 5–most happy
• ttlincg2: income (before tax), categorical
• agegr10: age of respondent, by 10-year age categories
• sex: sex of respondent
• marstat: marital status
• mar_110: main activity of respondent
• ehg3: educational attainment
• rlr_110: importance of religion
• vismin: visible minority status
• srh_110: self-reported health
The dependent variable hap5 is an ordered categorical variable with more than two cate-
gories. Treat the happiness variable as if it were unordered and estimate a model using
multinom(). Then re-estimate the same model treating happiness as an ordered categori-
cal variable and use polr(). All have been shown to influence happiness. There are many
other potential covariates of happiness. These are a selection that may be interesting and
that happen to have been addressed in the 2016 GSS of Canada. Try estimating your model
with different subsets of the variables listed above as covariates. Note, you cannot use all
the possible variables in the dataset in one model. emmeans() cannot handle a model with
too many categories. You will know when you have too many categories when emmeans()
gives you a message indicating the size exceeds the grid’s capacity. Once you have settled
on the set of independent variables, only estimate the one model with each function. Make
sure you address the question, does money buy happiness?

[1] We are dropping the variables collective agreement status and firm size because those values are NA for the
unemployed. Collective agreement status and firm size do not apply to the unemployed, and the survey doesn’t code for the
value of previous employment.

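For question 3, the two estimators could be set up along the following lines. This is a sketch on simulated data: the variables income and health are hypothetical stand-ins for ttlincg2 and srh_110, and the effect sizes are invented. multinom() comes from the nnet package and polr() from MASS.

```r
library(nnet)  # multinom()
library(MASS)  # polr()

set.seed(4)
n <- 1500

# Simulated stand-in for the GSS file; names and effects are hypothetical.
sim <- data.frame(
  income = factor(sample(c("low", "middle", "high"), n, replace = TRUE),
                  levels = c("low", "middle", "high")),
  health = factor(sample(c("poor", "good"), n, replace = TRUE),
                  levels = c("poor", "good"))
)

# Latent happiness score cut into five ordered categories.
latent <- 0.3 * as.integer(sim$income) + 0.8 * (sim$health == "good") + rlogis(n)
sim$hap5 <- cut(latent, breaks = quantile(latent, 0:5 / 5),
                include.lowest = TRUE, labels = 1:5, ordered_result = TRUE)
sim$hap5_unord <- factor(sim$hap5, ordered = FALSE)

# Unordered: one set of coefficients per category relative to the baseline.
multi_mod <- multinom(hap5_unord ~ income + health, data = sim, trace = FALSE)

# Ordered (proportional odds): one coefficient per covariate plus four cut points.
# Hess = TRUE stores the Hessian so summary() can report standard errors.
ord_mod <- polr(hap5 ~ income + health, data = sim, Hess = TRUE)

coef(ord_mod)  # positive income coefficients point toward higher happiness
c(multinom = AIC(multi_mod), polr = AIC(ord_mod))
```

The ordered model is far more parsimonious, which is why it is usually easier to read when the question is simply whether happiness rises with income.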

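library(dplyr)  # provides %>%, filter() and mutate()

lfs_df21 <- readRDS("~/data/lfs21.rds")

# Drop the oldest age group, then build the analysis variables.
lfs_df <- lfs_df21 %>%
  filter(age_12 != "70 and over") %>%
  mutate(
    # Caution: if age_12 is a factor, as.numeric() returns its integer
    # level codes (1, 2, ...), not ages in years.
    age  = as.numeric(age_12),
    wage = as.numeric(hrlyearn),
    # Two-category versions of immigrant and collective agreement status
    im2  = ifelse(immig == "Non-immigrant", "Non-immigrant", "Immigrant"),
    ca2  = ifelse(union == "Non-unionized", "Non-unionized",
                  "Covered-Collective-agreement")
  )
# %>% mutate_if(is.character, factor)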
Q.2(a).
Q2_mod <- lm(wage ~ age + I(age^2) + educ + sex + ca2 + im2 +
               cowmain + firmsize + prov, data = lfs_df)
summary(Q2_mod)
##
## Call:
## lm(formula = wage ~ age + I(age^2) + educ + sex + ca2 + im2 +
## cowmain + firmsize + prov, data = lfs_df)
##
## Residuals:
## Min 1Q Median 3Q Max
## XXXXXXXXXX XXXXXXXXXX
##
## Coefficients:
## XXXXXXXXXX Estimate Std. Error t value Pr(>|t|)
## (Intercept) XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX14e-15
## age XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX < 2e-16
## I(age^2) XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX < 2e-16
## educSome high school XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX
## educHigh school graduate XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX90e-08
## educSome postsecondary XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX64e-15
## educPostsecondary certificate or diploma XXXXXXXXXX XXXXXXXXXX < 2e-16
## educBachelor's degree XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX < 2e-16
## educAbove bachelor's degree XXXXXXXXXX XXXXXXXXXX41.757 < 2e-16
## sexFemale XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX < 2e-16
## ca2Non-unionized XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX
## im2Non-immigrant XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX < 2e-16
## cowmainPrivate sector employees XXXXXXXXXX XXXXXXXXXX < 2e-16
## firmsize20 to 99 employees XXXXXXXXXX XXXXXXXXXX13.949 < 2e-16
## firmsize100 to 500 employees XXXXXXXXXX XXXXXXXXXX21.855 < 2e-16
## firmsizeMore than 500 employees XXXXXXXXXX XXXXXXXXXX < 2e-16
## provPrince Edward Island XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX45e-09
## provNova Scotia XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX52e-09
## provNew Brunswick XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX29e-14
## provQuebec XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX21e-07
## provOntario XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX < 2e-16
## provManitoba XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX
## provSaskatchewan XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX
Answered 2 days After Nov 22, 2021

Solution

Subhanbasha answered on Nov 24 2021