The dataset Heckman Part-Time includes the following variables: age – the age of the respondent in...

Question

The dataset Heckman Part-Time includes the following variables:

age – the age of the respondent in years.
exp01 – the potential labour market experience of the respondent
maritstatus – a categorical variable indicating the marital status of the respondent

1=Married or cohabiting
2=Single
3=Divorced, widowed or separated

school – the number of years of schooling the respondent has.
children16 – the total number of dependent children in the household aged 16 or under.
female – a binary variable that takes the value 1 if the respondent is female and 0 otherwise.
industry – a categorical variable indicating the industry in which the respondent is employed

1= Agriculture & forestry
2= Electricity & water supply
3= Manufacturing
4= Construction
5= Distribution & hotels
6= Transport & communications
7= Banking & finance
8= Public administration, health & education
9= Others

region_resid – categorical variable indicating the region in which the respondent lives.

1= Northern
2= Yorkshire/Humberside
3= East Midlands
4= East Anglia
5= London
6= South East
7= South West
8= West Midlands
9= North West
10= Wales
11= Scotland
12= Northern Ireland

hwage – the hourly wage of the respondent.
Partime – takes the value 1 if the respondent is a part-time employee and 0 otherwise (full-time).

All individuals in the sample are currently in employment.
Use the dataset to which you have been allocated to estimate the following models:
1. A probit model of selection based on the binary variable parttime, which includes a constant, age and its square, the gender dummy, schooling in years, the number of dependent children in the household aged 16 or under, dummy variables indicating the industry of employment (omitting manufacturing), dummy variables indicating marital status (omitting married or cohabiting), and dummy variables indicating the region of residence (omitting Wales). Briefly comment on the results.
2. Use the results from the probit estimates to construct an estimate of the inverse Mills ratio and use this variable to estimate a sample-selection-adjusted hourly wage equation for part-time employees (those for whom parttime=1). The wage equation should have the natural logarithm of hourly earnings as the dependent variable and should include, in the following order, a constant, potential labour market experience and its square, the gender dummy, schooling in years, your estimate of the inverse Mills ratio, dummy variables indicating the industry of employment (omitting manufacturing), dummy variables indicating marital status (omitting married or cohabiting), and dummy variables indicating the region of residence (omitting Wales). With explanation, including stating the null and alternative hypotheses being tested, comment on whether these estimates suggest sample selection is an issue for this data (remember when producing these estimates to restrict the sample to only part-time employees, parttime=1).
3. Irrespective of your findings in (2), re-estimate the above model using the EViews Heckman two-step estimator (remembering to reset the sample to all employees for this estimator and to exclude the inverse Mills ratio from the list of explanatory variables). Keep the order of the variables in the selection equation and the response equation the same as in (1) and (2). Carefully explain why the results for the Heckman two-step estimator produced by EViews are different from those estimated in part (2) and why you might prefer to base inferences on the estimates produced by EViews.
4. Explain in general how (partial) Maximum likelihood estimates of the sample selection model can be obtained using an appropriately defined log-likelihood function. Compare and contrast this maximum likelihood method with the two-step method.
5. Explain how (partial) maximum likelihood estimates of this sample selection model (specified in parts 1-3) can be obtained. Use EViews to obtain maximum likelihood estimates of the model and briefly comment on the results.
6. Explain how you would obtain the margina

005_7kvqajo-ryxs3uii.wf1 005_qajo-sm1kanov.docx

David · Accepted Answer

Answer 1 
 
We need to define dummy variables for each of the categorical variables in our model. We assign k-1 dummy
variables for k categories.
 
For Gender, we have “Male” as our omitted category 
 
 D_FEMALE =1 if female, 0 otherwise (which is male here) 
 
For Marital Status, we have “Married or cohabiting” as our omitted category 
 
D_DIVORCED= 1 if Divorced, widowed or separated, 0 otherwise 
D_SINGLE =1 if Single, 0 otherwise 
 
For Industry, we have “Manufacturing” as our omitted category 
 
D_AGRI =1, if Agriculture & forestry, 0 otherwise 
D_BANK=1 if Banking & finance, 0 otherwise 
D_CONS =1 if Construction, 0 otherwise  
D_DISTR =1, if Distribution & hotels, 0 otherwise  
D_ELEC =1 if Electricity & Water Supply, 0 otherwise  
D_OTHERINDUSTRY =1 if Others, 0 otherwise  
D_PUBLIC =1 if Public administration, health & education, 0 otherwise
D_TRANS=1 if Transport & Communication, 0 otherwise   
 
For region_resid, we have “Wales” as our omitted category 
 
 
D_EASTANGLIA =1 if East Anglia, 0 otherwise
D_EASTMID =1 if East Midlands, 0 otherwise
D_LONDON =1 if London, 0 otherwise  
D_NORTHERN= 1 if Northern, 0 otherwise  
D_NORTHERNIRELAND =1 if Northern Ireland, 0 otherwise  
D_NORTHWEST =1 if North West, 0 otherwise  
D_SCOTLAND =1 if Scotland, 0 otherwise  
D_SOUTHEAST =1 if South East, 0 otherwise  
D_SOUTHWEST =1 if South West, 0 otherwise  
D_WESTMID =1 if West Midlands, 0 otherwise  
D_YORKSHIRE =1 if Yorkshire/Humberside, 0 otherwise 
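
The k-1 coding above can be sketched in a few lines of Python. This is only an illustration of the rule, not the EViews procedure used for the estimates: the helper `make_dummies` and the label `D_MARRIED` are hypothetical names introduced here (the omitted category never receives a dummy), while the codes 1, 2, 3 come from the variable list in the question.

```python
# Marital status codes from the question; D_MARRIED is a hypothetical label
# for the omitted category and is never emitted as a dummy.
MARITAL = {1: "D_MARRIED", 2: "D_SINGLE", 3: "D_DIVORCED"}


def make_dummies(code, names, omitted):
    """Return the k-1 dummies for one observation, dropping the omitted category."""
    if code not in names:
        raise ValueError(f"unknown category code: {code}")
    return {name: int(code == c) for c, name in names.items() if c != omitted}


# A divorced, widowed or separated respondent (code 3), omitting code 1.
divorced_row = make_dummies(3, MARITAL, omitted=1)
# A married or cohabiting respondent (code 1): every dummy is zero.
married_row = make_dummies(1, MARITAL, omitted=1)

print(divorced_row)
print(married_row)
```

A respondent in the omitted category has all dummies equal to zero, which is why each estimated dummy coefficient is read relative to that category.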
 
 
We run the probit model in EViews. The output is shown in Table 1.

Table 1 
 
Dependent Variable: PARTIME 
Method: ML - Binary Probit  (Newton-Raphson / Marquardt steps) 
Date: 04/29/17   Time: 13:08   
Sample: 1 17079   
Included observations: 17079   
Convergence achieved after 6 iterations  
Coefficient covariance computed using the Huber-White method 
     
     Variable Coefficient Std. Error z-Statistic Prob.   
     
     C -0.163949 0.196500 -0.834348 0.4041 
AGE -0.108587 0.008835 -12.29113 0.0000 
AGE^2 0.001504 0.000106 14.14263 0.0000 
D_FEMALE 1.270667 0.030345 41.87397 0.0000 
SCHOOL -0.028879 0.004727 -6.109081 0.0000 
CHILDREN16 0.386359 0.014552 26.55036 0.0000 
D_DIVORCED -0.125482 0.036540 -3.434124 0.0006 
D_SINGLE -0.082769 0.035109 -2.357470 0.0184 
D_AGRI 0.380603 0.170986 2.225924 0.0260 
D_BANK 0.459914 0.057229 8.036404 0.0000 
D_CONS 0.227523 0.080538 2.825030 0.0047 
D_DISTR 1.029037 0.054625 18.83805 0.0000 
D_ELEC 0.040387 0.114154 0.353795 0.7235 
D_OTHERINDUSTRY 0.700298 0.073594 9.515678 0.0000 
D_PUBLIC 0.600043 0.051423 11.66881 0.0000 
D_TRANS 0.305459 0.068057 4.488274 0.0000 
D_EASTANGLIA 0.059455 0.076548 0.776704 0.4373 
D_EASTMID -0.091177 0.069608 -1.309872 0.1902 
D_LONDON -0.163152 0.069418 -2.350282 0.0188 
D_NORTHERN -0.063787 0.074915 -0.851456 0.3945 
D_NORTHERNIRELAND -0.079461 0.107131 -0.741713 0.4583 
D_NORTHWEST -0.055886 0.066049 -0.846125 0.3975 
D_SCOTLAND -0.057346 0.067722 -0.846782 0.3971 
D_SOUTHEAST -0.037192 0.060746 -0.612254 0.5404 
D_SOUTHWEST 0.106336 0.066261 1.604808 0.1085 
D_WESTMID 0.056196 0.068811 0.816674 0.4141 
D_YORKSHIRE -0.047302 0.067524 -0.700532 0.4836 
     
     McFadden R-squared 0.232685     Mean dependent var 0.245506 
S.D. dependent var 0.430400     S.E. of regression 0.371017 
Akaike info criterion 0.858478     Sum squared resid 2347.275 
Schwarz criterion 0.870723     Log likelihood -7303.976 
Hannan-Quinn criter. 0.862515     Deviance 14607.95 
Restr. Deviance 19037.76     Restr. log likelihood -9518.880 
LR statistic 4429.808     Avg. log likelihood -0.427658 
Prob(LR statistic) 0.000000    
     
     Obs with Dep=0 12886      Total obs 17079 
Obs with Dep=1 4193    
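
The LR statistic and McFadden R-squared reported in Table 1 can be cross-checked by hand from the two log likelihoods. The sketch below assumes the standard definitions, LR = 2(logL - logL_restricted) and McFadden R-squared = 1 - logL / logL_restricted, where the restricted model contains only a constant. The variable names are chosen here for illustration.

```python
# Log likelihoods copied from Table 1.
loglik = -7303.976         # full probit model
restr_loglik = -9518.880   # constant-only (restricted) model

# Likelihood-ratio statistic for the joint null that all slope coefficients are zero.
lr_stat = 2.0 * (loglik - restr_loglik)

# McFadden pseudo R-squared.
mcfadden_r2 = 1.0 - loglik / restr_loglik

print(f"LR statistic       = {lr_stat:.3f}")
print(f"McFadden R-squared = {mcfadden_r2:.6f}")
```

Both values agree with the table, and the reported Prob(LR statistic) of 0.000000 means the slope coefficients are jointly significant.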
     

We look at the p-value of each independent variable to judge whether it is statistically significant,
assuming a 5 percent level of significance. The variables with p-values above 5 percent are the constant,
D_ELEC and every region dummy except D_LONDON. For these we do not reject the null hypothesis that the
true coefficient is zero, so each of them is statistically insignificant.
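
As a rough check on how the z-Statistic and Prob. columns of Table 1 are related, the sketch below recomputes them from a coefficient and its standard error. It assumes the usual two-sided test against a standard normal distribution. The function name `z_and_p` is introduced here for illustration only.

```python
import math


def z_and_p(coef, std_err):
    """Return the z-statistic and two-sided normal p-value for H0: coefficient = 0."""
    z = coef / std_err
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the standard normal CDF Phi.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Coefficients and standard errors copied from Table 1.
z_london, p_london = z_and_p(-0.163152, 0.069418)  # D_LONDON
z_elec, p_elec = z_and_p(0.040387, 0.114154)       # D_ELEC

alpha = 0.05
print(f"D_LONDON: z = {z_london:.4f}, p = {p_london:.4f}, reject H0: {p_london < alpha}")
print(f"D_ELEC:   z = {z_elec:.4f}, p = {p_elec:.4f}, reject H0: {p_elec < alpha}")
```

D_LONDON falls below the 5 percent threshold while D_ELEC does not, matching the Prob. column.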
The probit regression in Table 2 is run to see whether Gender, Industry, Marital Status and Region of residence
are significant overall, with each categorical variable entered as a single regressor using its numeric code.
The p-value of every variable except Region of residence is below 5 percent, so all of these variables except
Region of residence are statistically significant. We then return to Table 1 to see whether the individual
categories of each variable are significant.
 
 
Table 2
Dependent Variable: PARTIME   
Method: ML - Binary Probit  (Newton-Raphson / Marquardt steps) 
Date: 04/29/17   Time: 14:06   
Sample: 1 17079   
Included observations: 17079   
Convergence achieved after 5 iterations  
Coefficient covariance computed using the Huber-White method 
     
     Variable Coefficient Std. Error z-Statistic Prob.   
     
     C 0.541099 0.174555 3.099875 0.0019 
AGE -0.121477 0.008385 -14.48686 0.0000 
AGE^2 0.001635 0.000103 15.94211 0.0000 
FEMALE 1.316211 0.029510 44.60217 0.0000 
SCHOOL -0.036481 0.004622 -7.893106 0.0000 
CHILDREN16 0.381259 0.013948 27.33341 0.0000 
INDUSTRY 0.033453 0.006402 5.225186 0.0000 
MARITSTATUS -0.055542 0.016850 -3.296260 0.0010 
REGION_RESID 0.006435 0.003927 1.638636 0.1013 
     
     McFadden R-squared 0.203797     Mean dependent var 0.245506 
S.D. dependent var 0.430400     S.E. of regression 0.377174 
Akaike info criterion 0.888571     Sum squared resid 2428.377 
Schwarz criterion 0.892653     Log likelihood -7578.956 
Hannan-Quinn criter. 0.889917     Deviance 15157.91 
Restr. Deviance 19037.76     Restr. log likelihood -9518.880 
LR statistic 3879.848     Avg. log likelihood -0.443759 
Prob(LR statistic) 0.000000    
     
     Obs with Dep=0 12886      Total obs 17079 
Obs with Dep=1 4193    
     

In Table 1, all the dummy variables for Region of residence except D_LONDON are statistically
insignificant, with p-values above 5 percent. This is consistent with what we see in Table 2.
The other independent variables (Age, Gender, Schooling in years, the number of dependent children
in the household aged 16 or under, Industry and Marital status) are statistically significant,
the one exception among the individual dummies being D_ELEC.
