Modeling Project
Introduction to Econometrics: Modeling Project
General Instructions
The modeling project for this course is intended to give you hands-on experience in constructing an
econometric model for a real-world problem. Keep a copy of this project to show prospective
employers as evidence that you have learned a good deal of econometric modeling; it will
strengthen your resume. In this project, however, you will not be involved in the data
collection effort, which is a major and exciting learning experience in any econometric
analysis. The data provided to you have the features described in the following section.
The modeling project report must be typewritten and double-spaced, and must not exceed eight
pages. The report must not be submitted as an EXCEL sheet or a STATA sheet. The STATA printout
of the regression results must be attached as an APPENDIX, which does not count toward the
8-page limit. Your title page should give the name of the course, the semester (for instance,
Summer 2021), the title you have chosen for your report, and your name.
Data Description
You are an economist at the headquarters of a major real estate company interested in the
Chicago urban area. Your task is to investigate the effects on home value of various structural,
locational, and access factors, as well as factors relating to local government spending. Your
programming assistant has compiled data for a randomly selected sample of about 2000 property
transactions from Cook and DuPage counties in the Chicago metropolitan area.
Only use the data set assigned to you. (The data set has been attached.)
The details of the data, such as variable descriptions, original source, and units of
measurement, are available in the library or on a specific Internet site. You need to have them
ready before you start working on your modeling project.
• The following article is attached: Sudip Chattopadhyay, "Estimating the Demand for Air
Quality: New Evidence Based on the Chicago Housing Market," Land Economics, volume 75,
number 1, pp. 22-38, 1999.
• When you download a PDF copy of the journal article, look for Table 3 in the article for
variable definition, source, etc.
Instruction on the Modeling Project Write Up
1. Introduction:
In this section, write a paragraph or two discussing the following aspects of your
project.
• Explain, in your own words, what economic issues you are addressing in the project.
• Explain, in your own words, why the subject may be interesting.
• Discuss, in specific terms, what you wish to predict or explain (the subject of your paper).
• Explain the dependent variable and each of the explanatory variables. Specify the units in
which they are measured.
• Write down the population regression equation as follows:
2. Data and Method
In this section, describe the data by explaining what each variable represents in the
context of the Chicago metropolitan area counties. Also, describe the econometric model you
want to estimate and the relationship (positive or negative) you expect each variable to have
with the selling price of homes.
log(sprice) = β0 + β2 log(nrooms) + β3 log(lvarea) + β4 log(hageeff) + β5 log(lsize) + β6 aircon +
β7 nbath + β8 garage + β9 log(ptaxes) + β10 pctwht + β11 log(medinc) + β12 log(dfcl) + β13 dfni +
β14 log(sspend) + β15 log(mspend) + β16 cook + β17 ohare + u
(Note that in the above regression model SPRICE, NROOMS, LVAREA, HAGEEFF, LSIZE, PTAXES,
MEDINC, DFCL, SSPEND, and MSPEND are transformed into natural logarithms. Keep the rest of the
variables in unlogged form, since they take zero values in the sample. You must carry out the
log transformation before you start estimating the above population model in STATA.)
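The transformation described in the note can be sketched as follows. This is only an illustration, written in Python, of what the logging step does; the project itself must be estimated in STATA. The helper name `add_log_columns`, the `log_` prefix, and the sample observation are assumptions made up for this example and do not come from the assigned data.

```python
import math

# Variables the note says to log (names taken from the model equation).
LOG_VARS = ["sprice", "nrooms", "lvarea", "hageeff", "lsize",
            "ptaxes", "medinc", "dfcl", "sspend", "mspend"]


def add_log_columns(row, log_vars=LOG_VARS, prefix="log_"):
    """Return a copy of one observation with natural logs added.

    Raises ValueError if a variable to be logged is not strictly positive,
    because ln(x) is undefined for x <= 0. This is why variables that take
    zero values are left in unlogged form.
    """
    out = dict(row)
    for name in log_vars:
        value = row[name]
        if value <= 0:
            raise ValueError(f"cannot take log of {name}={value}")
        out[prefix + name] = math.log(value)
    return out


# Made-up observation, for illustration only.
example = {"sprice": 150000, "nrooms": 6, "lvarea": 1800, "hageeff": 25,
           "lsize": 7500, "ptaxes": 2400, "medinc": 52000, "dfcl": 18,
           "sspend": 5200, "mspend": 610, "aircon": 1, "garage": 0}
logged = add_log_columns(example)
```

The unlogged variables (here `aircon` and `garage`) pass through unchanged, so the resulting observation carries both the original and the logged columns.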
3. Empirical Results
In this section, you must do the following:
3.1 Full Regression
In this sub-section, you must present and discuss the estimated full regression model with all
the available explanatory variables.
i) Present the estimated regression equation for the first computer run, with standard errors
in parentheses under each coefficient. Also, present the F-statistic and R² for the estimated
model. You must use all the available explanatory variables for this run of the OLS model.
ii) Interpret R².
iii) Perform a test of the overall significance of the regression equation (F-test for the full set of
regression parameters). Provide all the details of the test, including decision and conclusion.
iv) Perform the test to see if the variable hageeff is statistically significant at the 5% level.
Provide all the details of the test.
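The test statistics behind steps iii) and iv) can be sketched as below. This is a minimal Python illustration of the arithmetic, not a substitute for the STATA output you must report. It uses the standard textbook form of the overall F statistic, F = (R²/k) / ((1 - R²)/(n - k - 1)), with k slope coefficients and n observations. The function names and all numbers are hypothetical, and the decision rule uses the standard normal critical value as a large-sample approximation to the t distribution, which is an assumption that is reasonable only because the sample has about 2000 observations.

```python
from statistics import NormalDist


def overall_f(r_squared, n_obs, n_slopes):
    """F statistic for H0: all slope coefficients are zero."""
    df_num = n_slopes
    df_den = n_obs - n_slopes - 1
    return (r_squared / df_num) / ((1.0 - r_squared) / df_den)


def t_statistic(coef, std_err, null_value=0.0):
    """t statistic for H0: beta = null_value."""
    return (coef - null_value) / std_err


def reject_two_sided(t_stat, alpha=0.05):
    """Two-sided decision using the standard normal critical value.

    Large-sample approximation: with roughly 2000 observations the
    t distribution is very close to the standard normal.
    """
    critical = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return abs(t_stat) > critical


# Hypothetical numbers, for illustration only (not estimation results).
f_stat = overall_f(r_squared=0.75, n_obs=2000, n_slopes=16)
t_hageeff = t_statistic(coef=-0.05, std_err=0.02)
decision = reject_two_sided(t_hageeff)
```

In the write-up, the computed F statistic is compared with the critical value of the F distribution with (k, n - k - 1) degrees of freedom, or equivalently the p-value reported by STATA is compared with the chosen significance level.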
3.2 Final Regression
In this sub-section, present and discuss the final model by carrying out the following steps:
v) Drop the insignificant variables, one at a time, by looking at the p-values from the regression
results. This means you need to drop the one with the highest p-value, then run the
regression, look for the highest p-value again, then drop the associated variable, and
continue this way until all coefficients are significant at the 0.05 level of significance.
vi) Now do the subset test. That is, using the full regression model from (i) and the final model
obtained in (v), test whether the variables you dropped are significant as a group, using the
F-test for a subset of the explanatory variables. Rejection of the null hypothesis would
suggest that you might have dropped an important variable, and you should reconsider
including one or more of the variables you dropped earlier.
vii) Present your final regression equation, with standard errors in parentheses under each
coefficient. Also, present the F-statistic and R² for this final regression.
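The subset test in step vi) can be sketched as below. This Python snippet illustrates the R² form of the F statistic for exclusion restrictions, F = ((R²_full - R²_final)/q) / ((1 - R²_full)/(n - k - 1)), where q is the number of dropped variables and k is the number of slope coefficients in the full model. This form applies only when both regressions have the same dependent variable, as they do here. The function name and the numbers are hypothetical and are not estimation results.

```python
def subset_f(r2_full, r2_final, n_obs, k_full, n_dropped):
    """F statistic for H0: the coefficients on all dropped variables are zero.

    r2_full   : R-squared of the full (unrestricted) regression
    r2_final  : R-squared of the final (restricted) regression
    n_obs     : number of observations
    k_full    : number of slope coefficients in the full regression
    n_dropped : number of variables dropped to reach the final regression
    """
    if n_dropped <= 0:
        raise ValueError("at least one variable must have been dropped")
    df_den = n_obs - k_full - 1
    return ((r2_full - r2_final) / n_dropped) / ((1.0 - r2_full) / df_den)


# Hypothetical numbers, for illustration only.
f_subset = subset_f(r2_full=0.7500, r2_final=0.7490,
                    n_obs=2000, k_full=16, n_dropped=3)
```

The statistic is compared with the critical value of the F distribution with (q, n - k - 1) degrees of freedom. If it exceeds the critical value, the dropped variables are jointly significant and the final model should be reconsidered.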
3.3 Analysis and Inferences
In this sub-section, present the following analyses/inferences pertaining to the revised model
(i.e., after dropping all the insignificant explanatory variables):
• Interpret the three most highly significant estimated regression coefficients in the context
of the problem.
• Choose two explanatory variables from the final regression, then construct and interpret the
confidence intervals for the population coefficients of each of your chosen explanatory
variables.
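The interpretations and intervals asked for above can be sketched as follows. This is a hedged Python illustration of the arithmetic only. Because the dependent variable is log(sprice), the coefficient on a logged regressor is read as an elasticity, and the coefficient on a 0/1 regressor such as aircon converts to an exact percentage effect of 100·(exp(b) - 1). The function names and coefficient values are hypothetical, and the interval uses the standard normal critical value as a large-sample approximation, which is an assumption.

```python
import math
from statistics import NormalDist


def elasticity_effect(coef, pct_change_x=1.0):
    """Approximate % change in price for a given % change in a logged regressor."""
    return coef * pct_change_x


def dummy_effect_pct(coef):
    """Exact % change in price when a 0/1 regressor switches from 0 to 1."""
    return 100.0 * (math.exp(coef) - 1.0)


def confidence_interval(coef, std_err, level=0.95):
    """Large-sample confidence interval: coef +/- z * std_err."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return (coef - z * std_err, coef + z * std_err)


# Hypothetical estimates, for illustration only.
lvarea_effect = elasticity_effect(0.40)        # % price change per 1% more living area
aircon_effect = dummy_effect_pct(0.08)         # % price premium for air conditioning
low, high = confidence_interval(0.40, 0.05)    # 95% interval for the lvarea coefficient
```

With these made-up numbers, a 1% increase in living area is associated with about a 0.40% higher selling price, holding the other variables fixed, and the interval gives the range of elasticities consistent with the data at the 95% confidence level.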
4. Discussion and Conclusion (one or two paragraphs)
In this section, provide a holistic discussion of the results in general. Then conclude with your
own observations on your findings.
• State in your own words your conclusions regarding the final (revised) model you have
estimated. Base your conclusions on a careful review of the final (revised) model and the
causal relationships you observe in it. Discuss any problems your final model
might have. Do not hesitate to describe the strengths and weaknesses of your final model and
your results.
• Finally, offer any interesting implications of your findings that you might like to convey to
your boss in a non-technical way, for example, based on your findings in sub-section 3.3.
5. Appendix (Computer printout)
In this section, include the STATA printout of the full and the final regressions. Please do not
include a printout of the data set.
Estimating the Demand for Air Quality:
New Evidence Based on the
Chicago Housing Market
Sudip Chattopadhyay
ABSTRACT. This paper combines a new, large household-level data set with the two-stage
hedonic-estimation technique to derive new estimates of willingness to pay (WTP) for reduced
air pollution. The WTP estimates are found robust against functional-form specification.
Marginal WTP estimates for a reduction in particulate matter (PM-10) are found to be quite
comparable with some previous estimates. Benefits of non-marginal changes exhibit consistently
higher monetary returns in the case of PM-10 than in the case of SO2, signifying that
households dislike particulate pollution more than they do sulfur. (JEL Q25)
I. INTRODUCTION
Measuring the welfare impact of environmental degradation, particularly air pollution, using
hedonic techniques has remained an important area of empirical research in the past few
decades. The use of hedonic benefit estimates of clean air is no longer limited to addressing
welfare issues but extends to include such important aspects as incorporating monetary values
for changes in environmental quality into national accounts (Smith and Huang XXXXXXXXXX After
the theoretical paper by Rosen XXXXXXXXXX developing the hedonic model, there have been
numerous empirical studies which estimate willingness to pay (WTP) for marginal changes in air
quality. Unfortunately, studies that estimate the environmental benefits of non-marginal
changes in air quality are so far, very limited. Such estimates, which are needed to gauge the
benefits of large changes in air quality, require knowledge of the parameters of the consumer
utility function. Deriving such parameter estimates requires the application of the hedonic
two-stage estimation technique on household-level data. The present paper combines a new,
large household-level data set with the two-stage estimation technique to derive new estimates
of WTP for reduced air pollution.
The study models the Chicago housing market to estimate the demand for clean air measured in
terms of concentration of particulate matter (PM-10) and sulfur dioxide (SO2). There have been
a few studies that estimate WTP for reduced air pollution in the Chicago housing market (see,
e.g., Atkinson and Crocker 1987; Bender, Gronberg, and Hwang XXXXXXXXXX But these studies have
limited appeal, for two reasons. First, the hedonic data sets considered in these studies
pertain to the 1960s or the early 1970s. Second, the estimates are only for marginal changes
in air quality, which are not useful for welfare analysis (see, e.g., Bartik 1988; Palmquist
1988, for discussion of exact measurement of welfare). Since Chicago falls in the designated
non-attainment region by National Ambient Air Quality Standards (NAAQS) (National Air Quality
Emissions Trend Report 1990), it is worthwhile to carry out hedonic analysis with more recent
data to estimate WTP for both marginal and non-marginal changes in air quality. The present
research compares the new estimates of marginal benefits with the estimates obtained in some
previous studies. Using the estimates of non-marginal benefits from the second-stage hedonic
regression, the study also analyzes the size of the monetary returns to reduced
The author is with the Department of Economics, Kansas State University. He wishes to thank Jan
Brueckner, his thesis adviser, for advice and guidance. Thanks are due to John Braden for his
comments on an earlier version of the paper and Robert Shaw at Housing and Urban Development
Office, Washington, DC, for providing the data on housing, and two anonymous referees for their
helpful suggestions that improved the paper.
Land Economics • February 1999 • 75 (1): 22-38
75(1) Chattopadhyay: Demand for Air Quality in Chicago 23
air pollution, the knowledge of which is necessary for important policy discussions.
Two major econometric issues that must be addressed in reliable estimation of the he-