---
title: "Assignment 5: IV/2SLS"
date:
output:
  bookdown::html_document2:
    toc: true
    toc_float: true
    theme: flatly
    highlight: monochrome
    code_folding: hide
markdown_extensions:
  - admonition
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(
    echo = TRUE,
    message = FALSE,
    warning = FALSE,
    messages = FALSE
)
# kable
options(knitr.kable.NA = '')
# Set the graphical theme
ggplot2::theme_set(ggplot2::theme_light())
library(tidyverse)
library(AER)
library(kableExtra)
library(gridExtra)
library(broom)
library(haven)
# load model summary and set options
library(modelsummary)
gm <- modelsummary::gof_map
gm$omit <- TRUE
gm$omit[1] <- FALSE
gm$omit[6] <- FALSE
gm$omit[5] <- FALSE
gm$omit[17] <- FALSE
```
# Clean Air
Economic theory suggests that the amenities of a location are reflected in the prices people are willing to pay to live there. For example, Vancouver has more desirable amenities than Windsor, so housing prices are higher in Vancouver than in Windsor, all else equal. Unfortunately, establishing a causal link between a particular amenity and prices is not an easy task. In this problem set we ask whether house prices reflect the level of air pollution. If we believe the price differential reflects the causal effect of air pollution on housing prices, then it can be used to evaluate how much people are willing to pay for cleaner air.
The `pollution_HP` data set, loaded from my GitHub below, contains data on house prices and pollution levels in 966 U.S. counties. In addition, there are a number of "control" variables, and all variables are measured in changes between 1970 and 1980. The house price variable `Change.house.price` is the change in the log of average house prices. Air pollution is measured as average total suspended particulate matter (TSPs). The variable `Change.TSPs` measures the change in average TSPs between 1970 and 1980.
Regulation of air pollution started in the U.S. during the 1970s. Counties with particularly high levels of pollution in the early to mid 1970s were regulated by the Environmental Protection Agency (EPA) and had to take specific measures to reduce pollution. Areas that already had lower pollution levels were not regulated. Two variables in the data set capture whether a county was regulated: `Regulated.74` captures the level of TSPs in 1974, and regulation by the EPA was more likely in counties with higher pollution levels. `Regulated.7576` is a dummy variable indicating whether the county was regulated by the EPA in 1975 or 1976. We expect that regulation led to a faster decline in air pollution.
## Data
The control variables capture changes in other characteristics of the counties, aside from pollution and regulation. They include:
```{r, eval = F}
# Variable names and definitions
Change.dens = 1970-80 change in population density,
Change.mnfcg = change in % manufacturing employment,
Change.white = change in fraction of population that is white,
Change.feml = change in fraction female,
Change.age65 = change in fraction over 65 years old,
Change.hs = change in fraction with at least a high school degree,
Change.coll = change in fraction with at least a college degree,
Change.urban = change in fraction living in urban area,
Change.unemp = change in unemployment rate,
Change.income = change in income per-capita,
Change.poverty = change in poverty rate,
Change.owner = change in fraction of houses that are owner-occupied,
Change.plumb = change in fraction of houses with plumbing,
Change.revenue = change in government revenue per-capita,
Change.taxprop = change in property taxes per-capita,
Change.epend = change in general expenditures per-capita,
Change.educ = change in fraction of spending on education,
Change.hghwy = change in % spending on highways,
Change.welfr = change in % spending on public welfare,
Change.hlth = change in % spending on health,
```
```{r}
# data loaded here
pollution <- read_rds(url("https://github.com/ben-sand/ben-sand.github.io/blob/master/files/Pollution_HP.rds?raw=true"))
```
## Economic Question:
Does air quality (causally) affect housing prices?
The outcome variable is the change in county housing prices during the 1970s, `Change.house.price`. We want to estimate the "causal" effect of air pollution changes on housing price changes. The regressor of interest, or treatment variable, is the change in pollution, as measured by `Change.TSPs`. We can write down an econometric model as:
$$
\Delta \log (\text{house price})_i = \alpha_1 + \delta \Delta TSPs_i + \Delta \epsilon_i \tag{1}
$$
where $i$ indexes counties and $\Delta \epsilon_i$ includes other factors that affect changes in house prices between 1970 and 1980. A finding that $\delta$ is negative would indicate that property values decline with the level of pollution, i.e., clean air has economic benefits.
**(A)**
The data we have access to comes in changes or differences. For example the variable $\Delta \log \text{house price}_i$ (`Change.house.price` in the data) is defined as $\log(\text{house price})_{i,1980} - \log(\text{house price})_{i,1970}$. This means that we could write outcomes in levels as:
$$
\log (\text{house price})_{it} = \alpha + \delta TSPs_{it} + \eta_{it} \tag{2}
$$
where $t$ indexes years, with $t \in \{1970,1980\}$, and $i$ indexes counties. Suppose the unobserved term $\eta_{it}$ takes the form:
$$
\eta_{it} = \alpha_1 1[year = 1980] + \alpha_{i} + \epsilon_{it}
$$
where $1[year = 1980]$ is a dummy variable for 1980, $\alpha_1$ is its coefficient, and $\alpha_i$ is an unobserved county effect that does not vary over time. Derive equation (1) from (2). Show that the unobserved effect is eliminated. Give one example of what $\alpha_i$ could denote (i.e., an unobserved fixed feature of a county) and one example of what $\epsilon_{it}$ could denote.
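As a hint, the first-differencing step can be sketched as follows. This is only an outline under the stated form of $\eta_{it}$, reading its first term as $\alpha_1$ times the 1980 dummy. The examples of $\alpha_i$ and $\epsilon_{it}$ are left for the answer.

$$
\begin{aligned}
\log (\text{house price})_{i,1980} &= \alpha + \delta TSPs_{i,1980} + \alpha_1 + \alpha_i + \epsilon_{i,1980} \\
\log (\text{house price})_{i,1970} &= \alpha + \delta TSPs_{i,1970} + \alpha_i + \epsilon_{i,1970} \\
\Delta \log (\text{house price})_i &= \alpha_1 + \delta \Delta TSPs_i + \Delta \epsilon_i
\end{aligned}
$$

The third line is the first minus the second, with $\Delta TSPs_i = TSPs_{i,1980} - TSPs_{i,1970}$ and $\Delta \epsilon_i = \epsilon_{i,1980} - \epsilon_{i,1970}$. Both $\alpha$ and $\alpha_i$ appear in each year and cancel in the difference.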
**Answer:**
**(B)**
Start by assuming that the effect of air pollution changes on housing prices is the same everywhere (i.e. there is no heterogeneity in the treatment effect). Run a bivariate OLS regression of the change in housing prices on the change in TSPs as in equation (1). Is this procedure likely to give you an estimate of the causal effect of air quality on housing prices? What are the likely biases you might expect? What is the sign of these biases? What does this tell you about your estimate?
**Answer:**
```{r}
# code here
```
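To build intuition for the bias question, here is a minimal simulation sketch in base R. It does not use the assignment data. The variable names (`growth`, `d_tsp`, `d_price`), the coefficient values, and the assumption that an unobserved growth shock raises both pollution and prices are all hypothetical, chosen only to show how an omitted factor can move the bivariate OLS slope away from the true effect.

```r
# Simulated first-differenced data; every number here is an assumption
set.seed(42)
n <- 5000

# Hypothetical unobserved shock (e.g. local economic growth)
growth <- rnorm(n)

# Pollution change rises with growth
d_tsp <- 0.8 * growth + rnorm(n)

# True causal effect of pollution on log prices (assumed)
delta_true <- -0.5

# Prices respond to pollution AND directly to growth
d_price <- delta_true * d_tsp + 1.0 * growth + rnorm(n)

# Bivariate OLS that omits growth
ols <- lm(d_price ~ d_tsp)
delta_ols <- unname(coef(ols)["d_tsp"])

# Omitted variable bias: estimate minus truth
bias <- delta_ols - delta_true
print(c(delta_true = delta_true, delta_ols = delta_ols, bias = bias))
```

In this setup the omitted shock is positively related to both variables, so the OLS slope is pushed upward, toward zero. Whether the real data behave this way is exactly what the question asks you to reason about.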
**(C)**
Suppose that air quality changes are randomly assigned conditional on covariates, that is, the conditional independence assumption (CIA) holds. Obtain an estimate of the impact of air quality changes on housing prices under this assumption. To do so, estimate three specifications, adding additional controls in each specification. Explain why you chose the controls you did. How does your estimate differ from your estimate in part (B)? What does this tell you about the biases you analyzed in part (B)?
**Answer:**
```{r}
# code here
```
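The idea of adding controls step by step can be illustrated with another small simulation. This is a sketch on made-up data, not the assignment's answer: the two confounders `d_income` and `d_dens` and all coefficients are hypothetical, and the CIA holds here by construction once both are included.

```r
# Simulated data where two observed covariates confound the effect
set.seed(1)
n <- 5000
d_income <- rnorm(n)
d_dens   <- rnorm(n)
d_tsp    <- 0.6 * d_income + 0.4 * d_dens + rnorm(n)

delta_true <- -0.5  # assumed true effect
d_price <- delta_true * d_tsp + 0.8 * d_income + 0.5 * d_dens + rnorm(n)

sim <- data.frame(d_price, d_tsp, d_income, d_dens)

# Three nested specifications, each adding one control
specs <- list(
  none   = d_price ~ d_tsp,
  income = d_price ~ d_tsp + d_income,
  both   = d_price ~ d_tsp + d_income + d_dens
)

# Coefficient on d_tsp from each specification
est <- sapply(specs, function(f) unname(coef(lm(f, data = sim))["d_tsp"]))
print(round(est, 3))
```

Each added control removes part of the confounding, so the estimate moves toward the assumed true value. With the real data, the same comparison across columns is what `modelsummary` can tabulate.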
**(D)**
Suppose that EPA regulation is a potential instrumental variable for pollution changes during the 1970s. In class we discussed two assumptions necessary for a good instrument. Explain in words what these assumptions mean in this particular context.
**Answer:**
**(E)**
Use the variable `Regulated.7576` as an instrument and construct a Wald estimate of the effect of pollution changes on housing prices. Demonstrate that the Wald estimate is identical to a bivariate instrumental variables regression of house price changes on pollution changes. How does this estimate differ from your estimates in parts (B) and (C)? How would you interpret the differences?
**Answer:**
```{r}
# code here
```
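The equivalence between the Wald estimate and bivariate IV can be checked on simulated data. The sketch below uses base R only, with a hypothetical binary instrument `regulated` and made-up coefficients. It computes the Wald ratio from group means and the IV slope from the covariance ratio.

```r
# Simulated data with a binary instrument; all values are assumptions
set.seed(7)
n <- 2000
regulated <- rbinom(n, 1, 0.3)   # hypothetical regulation dummy
shock     <- rnorm(n)            # unobserved confounder
d_tsp     <- -8 * regulated + 3 * shock + rnorm(n)
d_price   <- -0.02 * d_tsp + 0.1 * shock + rnorm(n, sd = 0.1)
sim <- data.frame(d_price, d_tsp, regulated)

# Wald estimate: difference in mean outcomes over difference in mean treatment
reduced_form <- with(sim, mean(d_price[regulated == 1]) - mean(d_price[regulated == 0]))
first_stage  <- with(sim, mean(d_tsp[regulated == 1]) - mean(d_tsp[regulated == 0]))
wald <- reduced_form / first_stage

# Bivariate IV slope: cov(y, z) / cov(x, z)
iv <- with(sim, cov(d_price, regulated) / cov(d_tsp, regulated))

print(c(wald = wald, iv = iv))
```

With the assignment data, the same slope should come out of `ivreg(Change.house.price ~ Change.TSPs | Regulated.7576, data = pollution)` from the `AER` package loaded in the setup chunk, which also reports standard errors.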
**(F)**
Now run a 2SLS regression with `Regulated.7576` as the instrument, using the same covariates as your preferred specification in part (C). Run the first stage regression for this problem. Does regulation affect the change in air pollution in the way you would expect? Run the reduced form regression for this problem. Does regulation affect housing prices in the way you would expect? Obtain the ratio of the reduced form coefficient on `Regulated.7576` to the first stage coefficient on `Regulated.7576`. If you have done everything correctly, you should obtain the 2SLS coefficient (this is called indirect least squares and is just the analogue of what you are doing in computing the Wald estimate).
**Answer:**
```{r}
# code here
```
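The indirect least squares logic can be sketched with two `lm()` calls and a manual second stage. Everything below is simulated: the covariate `d_income`, the instrument `regulated`, and the coefficients are hypothetical stand-ins for the assignment variables.

```r
# Simulated data with one covariate and one binary instrument (assumed values)
set.seed(11)
n <- 2000
d_income  <- rnorm(n)
regulated <- rbinom(n, 1, plogis(0.5 * d_income))
shock     <- rnorm(n)  # unobserved confounder
d_tsp     <- -8 * regulated + 2 * d_income + 3 * shock + rnorm(n)
d_price   <- -0.02 * d_tsp + 0.3 * d_income + 0.1 * shock + rnorm(n, sd = 0.1)
sim <- data.frame(d_price, d_tsp, d_income, regulated)

# First stage: treatment on instrument and covariate
first <- lm(d_tsp ~ regulated + d_income, data = sim)

# Reduced form: outcome on instrument and covariate
reduced <- lm(d_price ~ regulated + d_income, data = sim)

# Indirect least squares: ratio of the two coefficients on the instrument
ils <- unname(coef(reduced)["regulated"] / coef(first)["regulated"])

# Manual second stage using first-stage fitted values.
# Note: the standard errors from this lm() are NOT valid 2SLS standard errors.
sim$d_tsp_hat <- fitted(first)
second <- lm(d_price ~ d_tsp_hat + d_income, data = sim)
tsls <- unname(coef(second)["d_tsp_hat"])

print(c(ils = ils, tsls = tsls))
```

With a single instrument the two numbers agree up to floating point error. For inference on the real data, a dedicated routine such as `AER::ivreg` is the safer choice because it corrects the second-stage standard errors.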
**(G)**
Now suppose that the effects of pollution changes on house price changes are heterogeneous across counties (i.e., there are heterogeneous treatment effects). What parameter can you possibly identify using instrumental variables in this case? What assumption do you need, in addition to those you stated earlier, to identify this parameter? Do you think this assumption is satisfied in this application?
**Answer:**
**(H)**
Summarize your results. What have you learned about the effect of clean air on housing prices? Which is your most reliable estimate?
To obtain an idea of the magnitude of these effects, perform the following analysis:
1. Estimate the impact of the regulations on pollution (hint: this comes from the first-stage)
2. Multiply this impact with your best estimate of the causal effect of pollution on housing prices.
This product gives you the impact of the cleaner air due to regulation on house prices. The average house cost was about $100,000 (this is in 1997 dollars). What is the dollar amount the average home owner is willing to pay for this level of improvement in air quality? Does this seem like a reasonable number?
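The arithmetic of the two steps above can be laid out as a short calculation. The numbers for `first_stage_effect` and `delta_hat` below are hypothetical placeholders, not estimates from the data. Substitute your own first-stage coefficient and preferred estimate of $\delta$. Because the outcome is a change in log prices, the product is in log points, and the sketch shows both the linear approximation and the exact conversion to dollars.

```r
# Hypothetical placeholder values, NOT results from the assignment data
first_stage_effect <- -10      # assumed change in TSPs due to regulation
delta_hat          <- -0.002   # assumed effect of one TSP unit on log price
avg_house_price    <- 100000   # average house price from the text (1997 dollars)

# Effect of regulation-induced cleaner air on log house prices
log_price_effect <- first_stage_effect * delta_hat

# Dollar value: linear approximation and exact conversion from log points
wtp_approx <- avg_house_price * log_price_effect
wtp_exact  <- avg_house_price * (exp(log_price_effect) - 1)

print(c(log_points = log_price_effect, approx = wtp_approx, exact = wtp_exact))
```

For small log changes the two dollar figures are close, so either is a reasonable way to report the magnitude.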
# Something Fishy is going on at the Fulton market...
Fish is sold by about 35 different dealers at the Fulton fish market, although only six of the dealers regularly sell whiting. There are no posted prices, and each dealer is free to charge a different price to each customer. The buyers at the Fulton market generally own retail fish shops or restaurants.
Whiting is a good choice for a study of the wholesale fish market because more transactions take place in whiting than almost any other fish. Whiting also vary less in size and quality than other fish. Finally, there is probably very little substitution between whiting and other fish. Whiting is a very cheap fish in large supply that is oily and distinctive tasting. Other fish would rarely be sold at a low enough price and in sufficient quantities to be attractive to retailers or restaurants as a substitute for whiting.
The data used in Graddy were obtained from a single dealer who supplied his inventory sheets for the period December 2nd, 1991 through May 8th, 1992. Total price and quantity for each transaction are recorded on the inventory sheets. These data are supplemented by data that were collected by direct observation from the same dealer during the period April 13th through May 8th, 1992. For this study, the prices and quantities are aggregated by day, for the 97