DEPARTMENT OF ECONOMICS
ECON 4041H – RESEARCH METHODOLOGY
Fall 2021, Peterborough
Assignment #3
Due date: November 17, 2021
Instructions: You must provide your own unique solution. You may work with others, but each of
you is responsible for submitting your own problem set solution. Each question is
50 marks and each part is of equal value. Submit solution through SafeAssign. Submission
of one file generated using RMarkdown is best, but acceptable alternatives
are allowed.
1. Use the dataset “klemA3.csv” to estimate the aggregate production function for an entire
economy. This is a nice application of the basic economic theory of production.
Start with a basic Cobb-Douglas production function
$Y = A K^{\alpha} L^{\beta}$
where Y is output (value-added), K is capital stock, and L is labour. We can estimate this
Cobb-Douglas production function as a linear model by log-transforming¹ it into
$\log(Y) = \log(A) + \alpha \log(K) + \beta \log(L).$
A more flexible functional form allows for interaction among factors, yielding:
$\log(Y) = \log(A) + \alpha_1 \log(K) + \alpha_2 (\log(K))^2 + \beta_1 \log(L) + \beta_2 (\log(L))^2 + \gamma\,(\log(K) \times \log(L)).$
The term log(A) represents a productivity parameter and is the coefficient on the constant in
the regression, so relabel it as α0.
$\log(Y) = \alpha_0 + \alpha_1 \log(K) + \alpha_2 (\log(K))^2 + \beta_1 \log(L) + \beta_2 (\log(L))^2 + \gamma\,(\log(K) \times \log(L)).$
Note the equation has $(\log(L))^2$, not $\log(L^2)$.
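The distinction matters when the squared terms are built in R. A minimal sketch, using made-up labour values (not taken from klemA3.csv), of how the two expressions differ:

```r
# Hypothetical labour values, for illustration only.
L <- c(10, 100, 1000)

sq_of_log <- log(L)^2   # (log(L))^2: the term the model calls for
log_of_sq <- log(L^2)   # log(L^2): equals 2 * log(L), a different quantity

print(cbind(L, sq_of_log, log_of_sq))
```

Inside an lm() formula the squared term would typically be wrapped as I(log(l)^2), because ^ has a special meaning in R formulas.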
The variables in the dataset are:
• ind: industry label
• indnum: an integer identifying an industry
• year: year
• y: value of gross output ($ millions)
• k: value of capital input ($ millions)
• l: value of labour input ($ millions)
• int: value of intermediate inputs ($ millions)
¹ log is the natural log (ln) in R.
a. Estimate the production function
$\log(Y) = \alpha_0 + \alpha_1 \log(K) + \beta_1 \log(L) + \alpha_2 (\log(K))^2 + \beta_2 (\log(L))^2 + \gamma\,(\log(K) \times \log(L)) + \varepsilon$
where Y is value-added, calculated as gross output minus intermediate inputs (y− int).
Report your results, and comment briefly.
b. Is the Cobb-Douglas production function sufficient? Or is the full flexible functional
form appropriate? Use formal test(s) to support your conclusion.
c. Generate predicted value-added levels Y using emmeans() for
i. mean values of K and L.
ii. values of K and L equal to half their mean values.
iii. values of K and L equal to twice their mean values.
Remember from micro theory that for a function $y = a K^{\alpha} L^{\beta}$, if doubling both inputs
yields
• double the output, the function displays constant returns to scale.
• less than double the output, the function displays diminishing returns to scale.
• more than double the output, the function displays increasing returns to scale.
Does this estimated production function display decreasing, constant, or increasing returns
to scale?
Note: you specify the values of K and L, and emmeans() will apply the log() transformation.
So provide values of K and L in the “at =” parameter, not values of log(K) or
log(L). Also, add the option
type = "response"
as an additional parameter to the emmeans() command. That option will convert the
predicted means from log() values back into levels, essentially applying the exp()
function to all output. To see that, try it without the option.
d. Estimate the marginal products of the two inputs. Since the marginal product of a factor
of production is the partial derivative of output Y with respect to the factor ($\partial Y / \partial K$ and
$\partial Y / \partial L$),
you can use the margins() function for this.
i. Estimate the marginal effect of value-added (Y) with respect to capital, and graph
the resulting estimates. Just like for emmeans() above, specify the values for K,
not log(K), in the “at =” parameter. You will need to specify a vector of values
of K to generate a vector of values of $\partial Y / \partial K$ to graph. Note that because the function
takes the log(K), your vector of values of K will have to be a geometric series (1,
2, 4, 8, . . . ; or 10, 100, 1000, . . . ) and not a linear series (1000, 2000, 3000, . . . ).
Play around with this until you get it right.
ii. Repeat 1.d.i but now with respect to labour L. Same details from above apply.
iii. What do the graphs reveal, and are they consistent with the economic theory of
production?
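For reference, the marginal products implied by the flexible functional form can be obtained by differentiating the log equation. A sketch of that derivation, using the symbols defined above and the chain rule $\partial Y / \partial K = (Y/K)\,\partial \log(Y) / \partial \log(K)$:

```latex
\frac{\partial \log(Y)}{\partial \log(K)} = \alpha_1 + 2\alpha_2 \log(K) + \gamma \log(L)
\quad\Longrightarrow\quad
\frac{\partial Y}{\partial K} = \frac{Y}{K}\,\bigl(\alpha_1 + 2\alpha_2 \log(K) + \gamma \log(L)\bigr)

\frac{\partial \log(Y)}{\partial \log(L)} = \beta_1 + 2\beta_2 \log(L) + \gamma \log(K)
\quad\Longrightarrow\quad
\frac{\partial Y}{\partial L} = \frac{Y}{L}\,\bigl(\beta_1 + 2\beta_2 \log(L) + \gamma \log(K)\bigr)
```

In the Cobb-Douglas special case ($\alpha_2 = \beta_2 = \gamma = 0$) these reduce to $\alpha_1 Y / K$ and $\beta_1 Y / L$.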
2. Using the labour force survey file, “lfs21.rds”, explore whether wages differ for immigrants
and non-immigrants. We will also explore interaction effects of immigrant status with other
potential explanatory variables.
Some data processing is required.
• The variable immig, has three categories. Two of the categories are for immigrants
and identify time since they arrived in Canada. The third category represents non-immigrants.
Recode the two immigrant categories into one, so that the new variable is
a binary categorical variable identifying immigrant status only. In other words, the new
variable will combine the two “Immigrant” status categories into one and the variable
will be coded “immigrant” and “non-immigrant”. Let’s identify this new variable as
im2.
• The variable union has three categories: union member, not unionized but under a collective
agreement, and non-unionized. Recode this variable into a new binary variable
combining the first two categories together into one that captures presence of a collective
agreement. The variable will now be coded as either covered by a collective agreement
or non-unionized. Let’s refer to it as ca2.
• Convert age_12 into a numeric variable, and drop the top age category “70 and over”.
Let’s refer to it as age.
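The three recoding steps above can be sketched in base R on a toy data frame. The category labels below are hypothetical stand-ins (the actual labels in lfs21.rds may differ), and the sketch assumes age_12 is stored as an ordered set of factor levels, so that as.numeric() returns the category code rather than an age in years.

```r
# Toy data with hypothetical labels; not the real lfs21.rds contents.
age_levels <- c("15 to 19 years", "20 to 24 years", "70 and over")
toy <- data.frame(
  immig  = c("Immigrant, recent", "Immigrant, established", "Non-immigrant"),
  union  = c("Union member", "Not a member but covered", "Non-unionized"),
  age_12 = factor(age_levels, levels = age_levels)
)

# Drop the top age category, then convert the factor to its level codes.
toy <- subset(toy, age_12 != "70 and over")
toy$age <- as.numeric(toy$age_12)

# Collapse each three-category variable into a binary factor.
toy$im2 <- factor(ifelse(toy$immig == "Non-immigrant",
                         "non-immigrant", "immigrant"))
toy$ca2 <- factor(ifelse(toy$union == "Non-unionized",
                         "non-unionized", "collective agreement"))

print(toy[, c("age", "im2", "ca2")])
```

Because age ends up as a category code, a one-unit change in age in the regressions corresponds to moving up one age bracket, not one year.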
a. Run a regression with wages (hrlyearn) as the dependent variable, and use the following
variables as explanatory variables: the numeric age in both linear and
quadratic terms, education (educ), sex, sector of employment (cowmain), collective
agreement status (ca2 from above), firmsize, immigrant status (im2 from above), and
province (prov). You will have nine explanatory variables, including age as both linear
and quadratic. Discuss the estimates and discuss what the coefficients mean. Provide a
complete discussion for all but province. We will address province next.
b. Do wages differ by province? Do they differ for every province, or are some similar?
Answer this part using both lht() and emmeans().
c. Now interact the immigration status variable with all the categorical variables in the
model except province, so use: educ, sex, cowmain, ca2, and firmsize. Very briefly
characterize the interaction terms.
d. Explaining interactions is challenging, so now use emmeans() to calculate the interaction
effect of immigration status on wages for each of the other five categorical explanatory
variables with which im2 is interacted. Run the interaction for each separately, i.e.
first run emmeans() for immigration status and education, then immigration status and
sex, etc. Feel free to add any additional analysis that helps provide explanation. It
is often useful to graph the results of emmeans() (hint, hint). You may also find the
contrast() function helpful after running emmeans(). I leave this part a bit open-ended
and invite you to explore these economic relationships using the tools we have been
reviewing.
Answered 3 days After Nov 11, 2021

Solution

Mohd answered on Nov 15 2021
Reg
Bassi
11/14/2021
library(readr)
library(magrittr)
library(dplyr)
library(ggplot2)
library(rmarkdown)
library(MASS)
library(skimr)
library(ggeffects)
klema3 <- read_csv("~/data/klema3.csv")
#View(klema3)
Descriptive stats
skim(klema3)
Data summary
    Name                     klema3
    Number of rows           4420
    Number of columns        7
    _______________________
    Column type frequency:
      character              1
      numeric                6
    ________________________
    Group variables          None

Variable type: character
    skim_variable  n_missing  complete_rate  min  max  empty  n_unique  whitespace
    ind                    0              1    5   66      0        65           0

Variable type: numeric
    skim_variable  n_missing  complete_rate       mean         sd       p0       p25       p50        p75       p100  hist
    indnum                 0              1      33.00      18.76     1.00     17.00     33.00      49.00       65.0  ▇▇▇▇▇
    year                   0              1    1980.50      19.63  1947.00   1963.75   1980.50    1997.25     2014.0  ▇▇▇▇▇
    y                      0              1  140818.40  257726.32   230.56  12308.21  45605.68  147478.41  2820728.0  ▇▁▁▁▁
    k                      0              1   33430.09  109737.20    20.90   2063.38   6984.37   24506.90  1939544.1  ▇▁▁▁▁
    l                      0              1   44697.50   90442.06    14.52   3194.63  13133.90   40102.09   790195.5  ▇▁▁▁▁
    int                    0              1   62690.81  103330.22    15.65   5241.25  20151.50   71683.25   788034.0  ▇▁▁▁▁
Q1(a) & (b)
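# Log-transform output and inputs (log() is the natural log in R).
# Note: log_y uses gross output y; the assignment defines Y as value-added (y - int).
klema3$log_y<-log(klema3$y)
klema3$log_l<-log(klema3$l)
klema3$log_k<-log(klema3$k)
# Flexible functional form: first-order, squared, and interaction terms in logs.
k_mod<-lm(log_y~I(log_k)+I((log_k)^2)+I(log_l)+I((log_l)^2)+I(log_k*log_l),data=klema3)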
summary(k_mod)
##
## Call:
## lm(formula = log_y ~ I(log_k) + I((log_k)^2) + I(log_l) + I((log_l)^2) +
## I(log_k * log_l), data = klema3)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -0.78160 -0.22818 -0.05519  0.18501  1.90069
##
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)
## (Intercept)       2.057833   0.108113   19.03   <2e-16 ***
## I(log_k)          0.402945   0.029025   13.88   <2e-16 ***
## I((log_k)^2)      0.058295   0.002982   19.55   <2e-16 ***
## I(log_l)          0.491054   0.027102   18.12   <2e-16 ***
## I((log_l)^2)      0.058983   0.003497   16.86   <2e-16 ***
## I(log_k * log_l) -0.113367   0.006044  -18.76   <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3494 on 4414 degrees of freedom
## Multiple R-squared: 0.959, Adjusted R-squared: 0.959
## F-statistic: 2.067e+04 on 5 and 4414 DF, p-value: < 2.2e-16
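Part (b) asks for a formal test of whether the Cobb-Douglas form is sufficient, which amounts to the joint restriction that the two squared terms and the interaction term have zero coefficients. One common way to sketch this is a nested-model F-test with anova(). The data below are simulated purely for illustration, and the names dat, cd_mod and flex_mod are hypothetical; nothing here is output from klemA3.csv.

```r
set.seed(1)
n <- 200

# Simulated inputs and value-added (hypothetical data, Cobb-Douglas by design).
k  <- exp(rnorm(n, mean = 8, sd = 1))
l  <- exp(rnorm(n, mean = 9, sd = 1))
va <- exp(1 + 0.3 * log(k) + 0.6 * log(l) + rnorm(n, sd = 0.2))
dat <- data.frame(va, k, l)

# Restricted model: Cobb-Douglas in logs.
cd_mod <- lm(log(va) ~ log(k) + log(l), data = dat)

# Unrestricted model: adds the squared and interaction terms.
flex_mod <- lm(log(va) ~ log(k) + log(l) + I(log(k)^2) + I(log(l)^2) +
                 I(log(k) * log(l)), data = dat)

# F-test of the three added terms jointly equal to zero.
ftest <- anova(cd_mod, flex_mod)
print(ftest)
```

A small p-value in the second row would suggest rejecting the Cobb-Douglas restriction in favour of the flexible form. The same comparison could presumably be run on the real data by fitting a restricted model alongside k_mod.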
Q1(C)
emmeans::emmeans(k_mod,specs = c("log_l","log_k"),at=c("l","k"),type="response")
## log_l log_k emmean SE df lower.CL upper.CL
## 9.324 8.869 10.55 0.007373 4414 10.53 10.56
##
## Confidence level used: 0.95
plot(k_mod)
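The emmeans() row above is evaluated at the means of log_k and log_l and stays on the log scale. As a hedged sketch of the half, mean and double comparison that part (c) describes, the predictions can also be computed by hand from the coefficients printed in summary(k_mod). This uses plain arithmetic rather than emmeans(), takes the means of k and l from the skim() table, and inherits the fact that k_mod was fitted to log gross output rather than value-added. The helper names pred_log_y and pred_y are hypothetical.

```r
# Coefficients copied from the summary(k_mod) output above.
b <- c(a0 = 2.057833, k1 = 0.402945, k2 = 0.058295,
       l1 = 0.491054, l2 = 0.058983, kl = -0.113367)

# Predicted log output at given values of log(K) and log(L).
pred_log_y <- function(lk, ll) {
  unname(b["a0"] + b["k1"] * lk + b["k2"] * lk^2 +
           b["l1"] * ll + b["l2"] * ll^2 + b["kl"] * lk * ll)
}

# Check against the emmeans() row above (log_k = 8.869, log_l = 9.324).
at_mean_logs <- pred_log_y(8.869, 9.324)

# Predicted levels at a multiple of the sample means of k and l.
k_bar <- 33430.09
l_bar <- 44697.50
pred_y <- function(scale) {
  exp(pred_log_y(log(scale * k_bar), log(scale * l_bar)))
}
y_half   <- pred_y(0.5)
y_mean   <- pred_y(1)
y_double <- pred_y(2)

# Output ratio when both inputs are doubled from their means.
ratio_double <- y_double / y_mean

print(round(at_mean_logs, 2))
print(c(half = y_half, mean = y_mean, double = y_double))
print(ratio_double)
```

A ratio below 2 would point to decreasing returns to scale at that point, a ratio of exactly 2 to constant returns, and a ratio above 2 to increasing returns.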
Q1.(d)
a2_1<-ggpredict(k_mod,c("log_l","log_k"))
a2_1
## # Predicted values of log_y
##
## # log_k = 7.07
##
## log_l | Predicted | 95% CI
## ----------------------------------
## 2.68 | 7.41 | [ 7.23, 7.59]
## 7.47 | 8.79 | [ 8.78, 8.81]
## 8.60 | 9.51 | [ 9.49, 9.53]
## 9.48 | 10.18 | [10.15, 10.21]
## 10.23 | 10.82 | [10.77, 10.87]
## 13.58 | 14.48 | [14.24, 14.72]
##
## # log_k = 8.87
##
## log_l | Predicted | 95% CI
## ----------------------------------
## 2.68 | 9.26 | [ 8.97, 9.55]
## 7.47 | 9.67 | [ 9.64, 9.69]
## 8.60 | 10.16 | [10.14, 10.17]
## 9.48 | 10.64 | [10.63, 10.66]
## 10.23 | 11.13 | [11.11, 11.15]
## 13.58 | 14.11 | [13.98, 14.24]
##
## # log_k = 10.67
##
## log_l | Predicted | 95% CI
## ----------------------------------
## 2.68 | 11.49 | [11.06, 11.93]
## 7.47 | 10.92 | [10.84, 10.99]
## 8.60 | 11.18 | [11.14, 11.21]
## 9.48 | 11.48 | [11.46, 11.51]
## 10.23 | 11.82 | [11.80, 11.84]
## 13.58 | 14.11 | [14.05, 14.18]
plot(a2_1)
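The ggpredict() tables above report predicted values of log_y rather than derivatives. As one possible sketch of the marginal product of capital that part (d) asks about, the analytic derivative of the fitted equation can be evaluated over a geometric grid of K. This is a hand calculation from the printed coefficients, not the margins() function named in the assignment; labour is held at its sample mean from the skim() table, and the model is the gross-output fit k_mod.

```r
# Coefficients copied from the summary(k_mod) output above.
a0 <- 2.057833; a1 <- 0.402945; a2 <- 0.058295
b1 <- 0.491054; b2 <- 0.058983; g  <- -0.113367

l_fixed <- 44697.50   # mean of l from the skim() table, held fixed
k_grid  <- 10^(1:6)   # geometric grid: 10, 100, ..., 1e6

# Predicted log output along the grid.
log_y <- a0 + a1 * log(k_grid) + a2 * log(k_grid)^2 +
  b1 * log(l_fixed) + b2 * log(l_fixed)^2 +
  g * log(k_grid) * log(l_fixed)

# Output elasticity with respect to K, then the marginal product dY/dK.
elast_k <- a1 + 2 * a2 * log(k_grid) + g * log(l_fixed)
mp_k    <- exp(log_y) / k_grid * elast_k

print(data.frame(k = k_grid, elasticity = elast_k, mp_k = mp_k))
# Optional plot with a log-scaled x axis:
# plot(k_grid, mp_k, log = "x", type = "b")
```

A geometric grid spaces the points evenly in log(K), which is the scale on which the model is linear in its parameters. The labour version would swap the roles of k and l and use the coefficients on the log_l terms.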
lfs21 <- readRDS("~/data/lfs21.rds")
# Drop the top age category before converting age to numeric.
lfs <- lfs21 %>%
  filter(age_12 != "70 and over")
# as.numeric() returns category codes here (assuming age_12 is a factor).
lfs$age <- as.numeric(lfs$age_12)
lfs$wage <- as.numeric(lfs$hrlyearn)
# Collapse immig and union into the binary variables im2 and ca2.
lfs <- lfs %>%
  mutate(im2 = ifelse(immig == "Non-immigrant", "Non-immigrant", "Immigrant")) %>%
  mutate(ca2 = ifelse(union == "Non-unionized", "Non-unionized", "Covered-Collective-agreement"))
#%>%
# mutate_if(is.character,factor)
Q.2(a).
l_mod<-lm(wage~age+I(age^2)+educ+sex+ca2+im2+cowmain+firmsize+prov,data=lfs)
summary(l_mod)
##
## Call:
## lm(formula = wage ~ age + I(age^2) + educ + sex + ca2 + im2 +
## cowmain + firmsize + prov, data = lfs)
##
## Residuals:
## Min 1Q Median 3Q Max
## -43.215 -7.428 -1.473 5.525 77.620
##
## Coefficients:
##                                           Estimate Std. Error t value Pr(>|t|)
## (Intercept)                               3.982917   0.510466   7.803 6.14e-15 ***
## age                                       4.774620   0.072091  66.230  < 2e-16 ***
## I(age^2)                                 -0.317012   0.005913 -53.609  < 2e-16 ***
## educSome high school                      1.261637   0.406858   3.101  0.00193 **
## educHigh school graduate                  2.129278   0.387420   5.496 3.90e-08 ***
## educSome postsecondary                    3.197833   0.411303   7.775 7.64e-15 ***
## educPostsecondary certificate or diploma  5.723664   0.382207  14.975  < 2e-16 ***
## educBachelor's degree                    11.948389   0.389321  30.690  < 2e-16 ***
## educAbove bachelor's degree              16.855738   0.403659  41.757  < 2e-16 ***
## sexFemale                                -5.201711   0.083482 -62.309  < 2e-16 ***
## ca2Non-unionized                          0.336650   0.110664   3.042  0.00235 **
## im2Non-immigrant                          4.331353   0.110756  39.107  < 2e-16 ***
## cowmainPrivate sector employees          -4.155718   0.119660 -34.729  < 2e-16 ***
## firmsize20 to 99 employees                1.956708   0.140281  13.949  < 2e-16 ***
## firmsize100 to 500 employees              3.141274   0.143731  21.855  < 2e-16 ***
## firmsizeMore than 500 employees           4.932237   0.121355  40.643  < 2e-16 ***
## provPrince Edward Island                 -2.032285   0.335886  -6.051 1.45e-09 ***
## provNova Scotia                          -1.760881   0.295396  -5.961 2.52e-09 ***
## provNew Brunswick                        -2.217299   0.292218  -7.588 3.29e-14 ***
## provQuebec                                1.337800   0.252766   5.293 1.21e-07 ***
## provOntario                               3.547768   0.248251  14.291  < 2e-16 ***
## provManitoba                              0.660738   0.267316   2.472  0.01345 *
## provSaskatchewan                          2.651351   0.280268   9.460  < 2e-16 ***
## provAlberta                               6.344972   0.267541  23.716  < 2e-16 ***
## provBritish Columbia                      3.918443   0.266103  14.725  < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard...
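The positive coefficient on age and the negative coefficient on I(age^2) in the table above imply a concave wage profile. As a worked sketch, the point at which predicted wages peak follows from setting the derivative with respect to age to zero. Here age is in the units of the numeric age variable, which are assumed to be age_12 category codes rather than years.

```latex
\frac{\partial\,\text{wage}}{\partial\,\text{age}}
  = 4.774620 - 2\,(0.317012)\,\text{age} = 0
\quad\Longrightarrow\quad
\text{age}^{*} = \frac{4.774620}{2 \times 0.317012} \approx 7.53
```

Under that assumption, predicted wages rise with each age bracket up to roughly the seventh or eighth category and decline afterwards, holding the other regressors fixed.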