Washington, D.C. is the capital of the United States. Washington's population is approaching 700,000
people and has been growing since 2000 following a half-century of population decline. The city is highly
segregated and features a high cost of living. In 2017, the average price of a single-family home in the
district was $649,000. The dataset (DC_Property_Train.CSV) provides insight on the housing stock of the
district.
Explanations for Columns
ID: House ID
BATHRM: Number of Full Bathrooms
HF_BATHRM: Number of Half Bathrooms (no bathtub or shower)
HEAT: Heating
AC: Cooling
NUM_UNITS: Number of Units
ROOMS: Number of Rooms
BEDRM: Number of Bedrooms
AYB: The earliest time the main portion of the building was built
YR_RMDL: Year structure was remodeled
EYB: The year an improvement was built more recent than actual year built
STORIES: Number of stories in primary dwelling
PRICE: Price of most recent sale
GBA: Gross building area in square feet
BLDG_NUM: Building Number on Property
STYLE: Style
STRUCT: Structure
LANDAREA: Land area of property in square feet
ASSESSMENT_NBHD: Neighborhood ID
In this problem, you are required to complete the following tasks. All necessary steps must be clearly
documented in your report. Use the data set DC_Property_Train.CSV for questions 1 to 6.
1. Plot a histogram for EYB. Describe the plotted pattern. (4 marks)
2. Plot a histogram for PRICE. Describe the plotted pattern and analyze the potential reasons for the
high-priced properties. (4 marks)
3. Summarize the average PRICE for each ASSESSMENT_NBHD. Sort the processed data and make a
bar plot of average prices for the top 10 neighborhoods. (6 marks)
4. Plot boxplots of PRICE by ASSESSMENT_NBHD for the top 10 neighborhoods. Explain the pros of using
boxplots instead of average prices. (6 marks)
5. Plot boxplots of PRICE by the categories of STRUCT using the facet approach. Compare these boxplots
and summarize your findings. (6 marks)
6. Visualize the relationship between PRICE and GBA. Identify outliers based on the visualization and list
their IDs. (6 marks)
7. Create a regression model for predicting PRICE through selected variables (you decide which ones to
use) from the data set DC_Property_Train.CSV. You may exclude the identified outliers from the
previous steps. Quantitatively evaluate the model performance using R² and MSE. Fill the
PREDICTED_PRICE column of the data set DC_Property_Test.CSV using the predicted values from your
model. (8 marks)
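The two metrics named in task 7 can be computed by hand once a model has produced predictions. The following is a minimal sketch in base R, not the posted solution: the data frame `toy`, its coefficients, and the 80/20 split are all hypothetical stand-ins simulated for illustration, and only the column names PRICE, GBA, and BATHRM are taken from the data dictionary above.

```r
# Simulated stand-in data (hypothetical values, NOT from DC_Property_Train.CSV)
set.seed(42)
n <- 200
toy <- data.frame(
  GBA    = runif(n, min = 500, max = 4000),
  BATHRM = sample(1:4, n, replace = TRUE)
)
toy$PRICE <- 50000 + 250 * toy$GBA + 40000 * toy$BATHRM + rnorm(n, sd = 60000)

# Hold out 20% of the rows for evaluation (an assumed split, not part of the task)
test_idx <- sample(seq_len(n), size = 0.2 * n)
train <- toy[-test_idx, ]
test  <- toy[test_idx, ]

# Fit a linear regression on the training rows and predict the held-out rows
fit  <- lm(PRICE ~ GBA + BATHRM, data = train)
pred <- predict(fit, newdata = test)

# MSE: mean of squared prediction errors
mse <- mean((test$PRICE - pred)^2)

# R^2: 1 minus residual sum of squares over total sum of squares
r2 <- 1 - sum((test$PRICE - pred)^2) / sum((test$PRICE - mean(test$PRICE))^2)

# The same R^2 formula applied in-sample, for comparison with summary(fit)
r2_train <- 1 - sum(residuals(fit)^2) / sum((train$PRICE - mean(train$PRICE))^2)

cat("Held-out MSE:", mse, "\n")
cat("Held-out R^2:", r2, "\n")
cat("In-sample R^2:", r2_train, "\n")
```

Evaluating on held-out rows rather than on the training rows gives a less optimistic estimate of how the model might behave on DC_Property_Test.CSV, where the true PRICE values are not available for scoring.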
Answered Same Day Nov 29, 2021

Solution

Hemanth answered on Nov 30 2021
# Installing required packages
install.packages("dplyr")
install.packages("ggplot2")
install.packages("caret")
# Loading required packages
library(dplyr)
library(ggplot2)
library(caret)
# Removing all objects from the workspace (global environment)
rm(list = ls())
# Reading dataset
Property_Train <- read.csv("DC_Property_Train.csv", header = TRUE, sep = ",")
# Showing first SIX records
head(Property_Train)
# Showing structure of the data
str(Property_Train)
# Plotting histogram of EYB variable
par(mfrow = c(1,1))
hist(Property_Train$EYB,
     xlab = 'EYB',
     ylab = 'Frequency',
     main = 'Histogram of EYB',
     col = rainbow(7))
# Plotting histogram of PRICE variable
hist(Property_Train$PRICE,
     xlab = 'PRICE',
     ylab = 'Frequency',
     main = 'Histogram of PRICE',
     col = rainbow(7))
# Summarizing the average PRICE for each ASSESSMENT_NBHD.
# Sorting prices and selecting the top 10 neighborhoods.
Top_10_Avg <- Property_Train %>%
  group_by(ASSESSMENT_NBHD) %>%
  summarise(Average = as.integer(mean(PRICE))) %>%
  arrange(desc(Average)) %>%
  head(n = 10)
# Making bar plot of average Prices
ggplot(Top_10_Avg, aes(x = ASSESSMENT_NBHD, y = Average)) +
geom_bar(stat = 'identity', fill = rainbow(10)) +...