Assignment #2
Due date: February 28, 2023
Instructions: You must provide your own unique solution. You may work with others, but each
of you is responsible for submitting your own problem set solution. Question values
are listed for each question. Submit solution through SafeAssign. Ideally you will
submit your RMarkdown file output, preferably in pdf format, but word or html are
acceptable. Blackboard won’t accept html files, so if submitting an html file, first
zip it and submit the zipped version.
For this assignment you will use three labour force surveys, from June 1977, June 1997 and June
2022. They are amalgamated and saved for you in the datafile lfs3.rds, in the Bb data folder. All
variables required will be referenced below.
1. Data cleaning and preparation. [5 marks]
We need to create two variables, and adjust wages:
a. Create a variable capturing part-time vs. full-time work status. The datafile contains a
variable ftptmain. Recode this variable from its current four categories into two. Code
it as a 1 for part-time and 0 for full-time. To confirm you have coded it correctly,
generate a 2x2 table of the original variable and the new variable.
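The recode-and-check step in 1.a can be sketched as follows. This is a minimal illustration on toy data, not the assignment solution: the four level labels are hypothetical stand-ins (the actual labels in the datafile are not listed here), and `parttime` is an assumed name for the new variable.

```r
# Toy data. The four level labels are hypothetical; inspect the real ones
# with levels() before recoding.
toy <- data.frame(
  ftptmain = factor(
    c("Full-time", "Full time", "Part-time", "Part time", "Full-time", "Part-time"),
    levels = c("Full-time", "Full time", "Part-time", "Part time")
  )
)

# Levels that should map to 1 (part-time); everything else maps to 0.
part_levels <- c("Part-time", "Part time")
toy$parttime <- as.integer(toy$ftptmain %in% part_levels)

# %in% returns FALSE for missing values, so restore NA explicitly.
toy$parttime[is.na(toy$ftptmain)] <- NA

# Cross-tabulate original against recoded: each original level should
# fall entirely into one column.
check <- table(original = toy$ftptmain, recoded = toy$parttime, useNA = "ifany")
print(check)
```

Each row of the table should have exactly one non-zero cell; a row split across both columns would indicate a recoding mistake.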
b. Create a numeric version of age. The datafile contains an age variable age_12, coded
as a factor variable with 12 levels. Create a numeric version of age_12 using the
as.numeric() function. Note that the variable age_12 consists of five-year age groupings
from 15-19 through 65-69, and then there is a catch-all category for 70 and over.
Drop the category for 70 and over so that the numeric variable captures five-year age
groupings. By converting a factor into a numeric with equal numeric spacing, it is a
true linear representation of age, and we can use polynomial functions of age. To confirm
you have it correctly coded, you could generate a basic 2x2 table, but that will
be large. Instead report the diagonal elements (function diag()) of the 2x2 table of the
original age variable and the numeric variable of age. If you have done it correctly, all
off-diagonal elements of this table will be zeroes, so the diagonal element will show
the correct number of observations.
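A small sketch of the factor-to-numeric conversion and the diag() check described in 1.b, using made-up data. The level labels (including "70+") and the name `age_num` are assumptions for illustration only.

```r
# Hypothetical level labels for the 12 age groups.
age_levels <- c("15-19", "20-24", "25-29", "30-34", "35-39", "40-44",
                "45-49", "50-54", "55-59", "60-64", "65-69", "70+")

# Deterministic toy data: 12 people in the first group, 11 in the second, ...
toy <- data.frame(
  age_12 = factor(rep(age_levels, times = 12:1), levels = age_levels)
)

# as.numeric() on a factor returns the integer level codes 1..12.
toy$age_num <- as.numeric(toy$age_12)

# Drop the open-ended top category so every remaining step is five years.
toy$age_num[toy$age_num == 12] <- NA

# Rows are the 12 original levels, columns the 11 retained codes
# (table() omits NA by default).
tab <- table(toy$age_12, toy$age_num)

# diag() also works on a non-square table; it returns the 11 matched cells.
diag(tab)
```

If the diagonal sums to the total count in the table, every off-diagonal cell is zero and the coding lines up.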
c. Adjust wages for inflation by multiplying the wage variable hrlyearn by the ratio of the
CPIs from June 2022 and June 1997, a ratio of 152.9/90.5. Multiplying the wages in
1997 by this ratio converts them into the same base year as for the 2022 data. Report
the mean, median, maximum and minimum wages for 1997 (adjusted) and for 2022.
Note: wages were not collected until 1997, so all earlier labour force surveys, including
1977, do not include wages.
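The inflation adjustment in 1.c is simple arithmetic: 152.9/90.5 is roughly 1.69, so a 1997 wage of $20.00 becomes about $33.79 in June 2022 dollars. A minimal sketch on invented wages, assuming a column named `year` and an assumed output name `wage_real`:

```r
# CPI ratio given in the assignment (June 2022 over June 1997).
cpi_ratio <- 152.9 / 90.5

# Invented wages; the column name `year` and its coding are assumptions.
toy <- data.frame(
  year     = c(1997, 1997, 2022, 2022),
  hrlyearn = c(10, 20, 18, 40)
)

# Scale only the 1997 wages; 2022 wages are already in 2022 dollars.
toy$wage_real <- ifelse(toy$year == 1997, toy$hrlyearn * cpi_ratio, toy$hrlyearn)

# Mean, median, maximum and minimum of the adjusted wage by year.
summ <- aggregate(
  wage_real ~ year, data = toy,
  FUN = function(w) c(mean = mean(w), median = median(w), max = max(w), min = min(w))
)
print(summ)
```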
2. Basic data descriptions of datafile lfs3.rds [10 marks]
Provide tables of counts of the following, and provide a brief paragraph explaining what you
find for:
a. education (educ4) by sex by year
b. industry (ind) by sex by year
c. part-time/full-time status by sex and year
d. wage rate (hrlyearn) by sex by year
e. wage rate by age (original or numeric version) by year
3. Basic model of wage. [15 marks]
Start with the following model:
wage = f (age, education, sex, part-time status, year)
making sure to use the numeric version of age generated in 1.b. above. Use educ4 for
education, and part-time status is a variable generated in question 1.
a. Run a regression on the basic wage model above. Report your regression results using
the command stargazer(). Fully explain your regression results.
b. Run the basic regression above again, but run the regression separately by year for both
years 1997 and 2022 (which means you will drop the variable year from each regression).
Report both regression results using stargazer(). Fully explain your regression
results. Compare results you get when running separately by year with those from the
previous section where you constrained the coefficient estimates to be the same across
both years.
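One way the pooled and by-year regressions of question 3 might be laid out side by side is sketched below on simulated data. The variable names `wage`, `age_num`, `parttime` and `year`, the level labels, and the data-generating numbers are all assumptions; only `educ4` and stargazer() come from the assignment text.

```r
library(stargazer)

# Simulated stand-in for the survey data (all values invented).
set.seed(123)
n <- 400
toy <- data.frame(
  age_num  = sample(1:11, n, replace = TRUE),
  educ4    = factor(sample(c("e1", "e2", "e3", "e4"), n, replace = TRUE)),
  sex      = factor(sample(c("Female", "Male"), n, replace = TRUE)),
  parttime = rbinom(n, 1, 0.2),
  year     = factor(sample(c("1997", "2022"), n, replace = TRUE))
)
toy$wage <- 20 + 1.5 * toy$age_num + 3 * (toy$sex == "Male") -
  4 * toy$parttime + rnorm(n, sd = 5)

# Pooled model: slopes constrained to be equal across years, with a year shift.
m_pooled <- lm(wage ~ age_num + educ4 + sex + parttime + year, data = toy)

# Separate models by year: year is dropped because it is constant in each subset.
m_1997 <- lm(wage ~ age_num + educ4 + sex + parttime, data = subset(toy, year == "1997"))
m_2022 <- lm(wage ~ age_num + educ4 + sex + parttime, data = subset(toy, year == "2022"))

# One table with three columns makes the coefficient comparison direct.
stargazer(m_pooled, m_1997, m_2022, type = "text",
          column.labels = c("Pooled", "1997", "2022"))
```

In an RMarkdown document knitted to pdf or html, `type = "latex"` or `type = "html"` with the chunk option `results = 'asis'` is the usual alternative to `type = "text"`.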
4. Wage variation by age: modeling age as a polynomial term. [20 marks]
a. Run the following regression:
wage = f (age, age^2, education, sex, part-time status, year)
Report and discuss regression results.
b. Compare the fit of the linear model (3.a) and the quadratic model (4.a) by comparing
the residual plots. Do you see any differences?
c. Add a cubic term for age to the regression above (4.a), run, report, discuss, and note
any differences from the quadratic model.
d. Add a quartic (fourth-order) term for age to the regression above (4.c), run, report,
discuss, and note any differences from the quadratic or cubic models.
e. Rerun the above two regressions with cubic and quartic age terms (4.c and 4.d), but now
use a scaled version of the numeric age variable. Create the new scaled age variable,
ages, defined as [definition not recoverable from the source]. Report results and comment,
and compare to unscaled estimates.
f. Estimate the basic regression from question 3.a but use a fractional polynomial model
on age. Compare regression results to the quadratic, cubic and quartic models (4.a, 4.c,
and 4.d).
g. To put all the models to a more practical test, generate four sets of predicted wages
(emmeans) by age using quadratic, cubic, quartic and fractional polynomial models
of age. Comment on results, and comment on how much explanation is provided by
adding in the higher-order polynomial terms.
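The polynomial comparison in question 4 can be sketched in base R as below. The data are simulated, the names `wage`, `age_num` and `age_s` are assumptions, and the divide-by-10 scaling is a hypothetical choice rather than the definition the assignment intends. The sketch uses predict() on an age grid as a plain stand-in for emmeans, and leaves out the fractional polynomial model.

```r
# Simulated wages with a concave age profile (all numbers invented).
set.seed(7)
n <- 500
toy <- data.frame(age_num = sample(1:11, n, replace = TRUE))
toy$wage <- 15 + 6 * toy$age_num - 0.45 * toy$age_num^2 + rnorm(n, sd = 4)

# I() protects the power so that ^ is arithmetic, not formula syntax.
m_lin  <- lm(wage ~ age_num, data = toy)
m_quad <- lm(wage ~ age_num + I(age_num^2), data = toy)
m_cub  <- lm(wage ~ age_num + I(age_num^2) + I(age_num^3), data = toy)
m_qrt  <- update(m_cub, . ~ . + I(age_num^4))

# Adjusted R-squared shows how much each extra term adds.
fit_stats <- sapply(
  list(linear = m_lin, quadratic = m_quad, cubic = m_cub, quartic = m_qrt),
  function(m) summary(m)$adj.r.squared
)
print(round(fit_stats, 4))

# Predicted wage at each age group under each specification.
grid <- data.frame(age_num = 1:11)
preds <- cbind(grid,
               quadratic = predict(m_quad, grid),
               cubic     = predict(m_cub, grid),
               quartic   = predict(m_qrt, grid))
print(round(preds, 2))

# Hypothetical rescaling: coefficients change size, fitted values do not.
toy$age_s <- toy$age_num / 10
m_cub_s <- lm(wage ~ age_s + I(age_s^2) + I(age_s^3), data = toy)
all.equal(fitted(m_cub), fitted(m_cub_s))
```

Under this scaling the cubic coefficient is 1000 times larger than in the unscaled model, which mainly makes very small higher-order coefficients easier to read in a results table.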
5. Allow for wage variation by age. [15 marks]
Estimate the age-quadratic model of wages from 4.a above, but interact age (both age and
age^2) with year to see if wage response by age differs over the two periods.
a. Report regression results and provide a brief interpretation. Compare these results to
those of the quadratic model in 4.a.
b. Generate predicted wages (emmeans) for each age category for both years.
c. Summarize your findings.
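A rough sketch of the age-by-year interaction in question 5, with predicted wages from the emmeans package. The data are simulated and the names `wage`, `age_num` and `year` are assumptions; the other covariates from 4.a are reduced to `sex` to keep the example short.

```r
library(emmeans)

# Simulated data in which the age profile is steeper in 2022 (invented numbers).
set.seed(21)
n <- 600
toy <- data.frame(
  age_num = sample(1:11, n, replace = TRUE),
  sex     = factor(sample(c("Female", "Male"), n, replace = TRUE)),
  year    = factor(sample(c("1997", "2022"), n, replace = TRUE))
)
toy$wage <- 18 + 4 * toy$age_num - 0.3 * toy$age_num^2 +
  1.0 * toy$age_num * (toy$year == "2022") + rnorm(n, sd = 4)

# Both the linear and the squared age term are interacted with year.
m_int <- lm(wage ~ (age_num + I(age_num^2)) * year + sex, data = toy)

# Predicted wage at each of the 11 age codes, separately by year.
# `at` sets the grid for the numeric predictor; sex is averaged over.
emm <- emmeans(m_int, ~ age_num | year, at = list(age_num = 1:11))
emm_df <- as.data.frame(emm)
print(emm_df)
```

The interaction coefficients measure how the 2022 age profile departs from the 1997 one; the predicted values are usually easier to interpret than the coefficients themselves.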
6. Allow for wage variation by sex and education. [20 marks]
a. Modify the age-quadratic model in Question 4.a by interacting sex and education.
i. Generate predicted wages (emmeans) for each level of education by sex and plot.
ii. Test the differences in predicted wages (emmeans) using contrast() by
A. education
B. by sex
iii. Summarize results of part a.
b. Modify the model in part a. above to allow for a three-way interaction among sex,
education and year.
i. Generate predicted wages varying all three variables. Note this will generate 2 x
4 x 2 = 16 predicted values. Present your results and explain what you see.
ii. Use contrast() to find the changes in wages from 1997 to 2022 for females and
males by level of education. This section has a lot of moving parts, so plan ahead
and test methods before deciding how to present the effects.
iii. Summarize your findings in this section.
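The emmeans and contrast() steps in question 6 might look roughly like this. Everything is simulated: the education labels, the names `wage`, `age_num`, `parttime` and `year`, and the effect sizes are assumptions, so this is a sketch of the mechanics rather than a solution.

```r
library(emmeans)

# Simulated data (invented numbers); all 16 sex x education x year cells are populated.
set.seed(99)
n <- 800
toy <- data.frame(
  age_num  = sample(1:11, n, replace = TRUE),
  sex      = factor(sample(c("Female", "Male"), n, replace = TRUE)),
  educ4    = factor(sample(c("e1", "e2", "e3", "e4"), n, replace = TRUE)),
  parttime = rbinom(n, 1, 0.2),
  year     = factor(sample(c("1997", "2022"), n, replace = TRUE))
)
toy$wage <- 18 + 3 * toy$age_num - 0.2 * toy$age_num^2 +
  2 * as.integer(toy$educ4) + 3 * (toy$sex == "Male") + rnorm(n, sd = 4)

# Part a: sex interacted with education.
m_se <- lm(wage ~ age_num + I(age_num^2) + sex * educ4 + parttime + year, data = toy)
emm_se <- emmeans(m_se, ~ sex | educ4)            # predicted wage by sex within education
sex_gap <- contrast(emm_se, method = "pairwise")  # Female - Male at each education level
edu_gap <- contrast(emmeans(m_se, ~ educ4 | sex), method = "pairwise")

# Part b: three-way interaction, 2 x 4 x 2 = 16 predicted values.
m_3 <- lm(wage ~ age_num + I(age_num^2) + sex * educ4 * year + parttime, data = toy)
emm_3 <- emmeans(m_3, ~ year | sex * educ4)

# "revpairwise" reverses the subtraction order, giving 2022 - 1997 in each cell.
chg <- contrast(emm_3, method = "revpairwise")
print(chg)
```

The `emmip()` function in the same package (for example `emmip(m_se, sex ~ educ4)`) is one option for the plot requested in part a.i.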
7. Test for industry wage differentials. [15 marks]
a. Add the industry categorical variable (ind) to the age-quadratic model of part 4.a.
i. Report regression results and discuss.
ii. Generate predicted wages by industry. Plot results.
iii. Use contrast() to identify the industry wage-differentials.
iv. Summarize findings.
b. Modify the model in part 7.a above by interacting sex and ind. Run separate regressions
by year. You will need to create two subsets of your dataframe, one for 1997 and one
for 2022.
i. Report regression results and discuss.
ii. Use contrast() to identify the industry sex wage-differentials. Do this separately
by year.
iii. Summarize findings.
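For question 7, the subset-by-year workflow and the two kinds of contrast could be sketched as follows. The industry labels, the names `wage`, `age_num` and `year`, and the data are invented, and using the "eff" contrast (each industry against the average of all industries) is one possible reading of "industry wage-differentials", not the only one.

```r
library(emmeans)

# Simulated data with three hypothetical industries (invented numbers).
set.seed(11)
n <- 600
toy <- data.frame(
  age_num = sample(1:11, n, replace = TRUE),
  sex     = factor(sample(c("Female", "Male"), n, replace = TRUE)),
  ind     = factor(sample(c("IndA", "IndB", "IndC"), n, replace = TRUE)),
  year    = factor(sample(c("1997", "2022"), n, replace = TRUE))
)
toy$wage <- 20 + 2 * toy$age_num - 0.15 * toy$age_num^2 +
  3 * (toy$ind == "IndB") + 2 * (toy$sex == "Male") + rnorm(n, sd = 4)

# 7.a: industry added to the age-quadratic model; "eff" compares each
# industry's predicted wage with the average over industries.
m_ind <- lm(wage ~ age_num + I(age_num^2) + sex + ind + year, data = toy)
ind_diff <- contrast(emmeans(m_ind, ~ ind), method = "eff")

# 7.b: one subset and one regression per year, with sex interacted with ind.
d97 <- subset(toy, year == "1997")
d22 <- subset(toy, year == "2022")
m97 <- lm(wage ~ age_num + I(age_num^2) + sex * ind, data = d97)
m22 <- lm(wage ~ age_num + I(age_num^2) + sex * ind, data = d22)

# Female - Male difference within each industry, separately by year.
gap97 <- contrast(emmeans(m97, ~ sex | ind), method = "pairwise")
gap22 <- contrast(emmeans(m22, ~ sex | ind), method = "pairwise")
print(gap97)
print(gap22)
```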