PART A (38 MARKS)
For this part, you examine how the housing values are affected by a number of variables such as air pollution, features of house and other socio-economic factors using the data set (“House_price_Data.xls”) provided on the Blackboard. This data file contains the data on the following variables from 506 communities in Australia:
Hpricei    =    median house price in community i (Hprice is measured in $1,000)
NOxi    =    Amount of nitrogen oxides (NOx) in the air (measured in parts per million or PPM)
DISTi    =    distance of community i from the state capital (DIST is measured in miles)
ROOMSi    =    Average number of rooms per house in community i
STRatioi    =    Average student-teacher ratio of schools in community i
CRIMEi    =    Crime committed (measured per 100 residents in community i)
Using the data set provided in the Excel file (“House_price_Data.xls”), transform the required variables to estimate a multiple regression model in which the natural log of Hprice is regressed on the following variables:
· NOx,
· NOx squared,
· DIST,
· DIST squared,
· the natural log of ROOMS,
· STRatio, and
· CRIME.
Answer Questions 1-8 in Part A in reference to the multiple regression output that you have obtained.
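The variable transformations listed above can be sketched in a few lines. This is a minimal illustration, not the required Excel workflow: it uses the first three rows of the Sheet1 excerpt, and the helper name `transform` is an assumption made for the example.

```python
import math

# First three rows of the Sheet1 excerpt:
# Hprice, NOx, DIST, ROOMS, STRatio, CRIME
rows = [
    (240.0, 5.38, 4.09, 6.57, 15.3, 0.006),
    (215.99, 4.69, 4.97, 6.42, 17.8, 0.027),
    (347.0, 4.69, 4.97, 7.18, 17.8, 0.027),
]

def transform(row):
    """Return (lnHprice, [NOx, NOx^2, DIST, DIST^2, lnROOMS, STRatio, CRIME])."""
    hprice, nox, dist, rooms, stratio, crime = row
    y = math.log(hprice)  # dependent variable: natural log of Hprice
    x = [nox, nox ** 2, dist, dist ** 2, math.log(rooms), stratio, crime]
    return y, x

transformed = [transform(r) for r in rows]
```

Each transformed row then holds the dependent variable and the seven regressors in the order the question lists them.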
Question 1: Provide an output from estimating the multiple regression model specified above and state the estimated sample regression equation. (4 marks)
(i) Provide the output from estimation results (3 marks)
        

(ii) State the estimated regression equation (1 mark)
Question 2: Interpret the estimated coefficients for INTERCEPT, lnROOMS and NOx in the context of the estimated regression model and comment briefly on whether the estimated coefficients make sense. (2 marks each, 6 marks total)
(i) INTERCEPT
(ii) lnROOMS
(iii) NOx
Question 3: Following the steps provided below, test the hypothesis that the true population coefficient for DIST is negative at a 1% significance level. (4 marks total)
(i) State the null and alternative hypotheses. (1 mark)
(ii) Calculate the test statistics. (2 marks)

(iii) Obtain the relevant critical value and complete the test (that is, determine whether you should reject or not reject the null hypothesis). (1 mark)
Question 4: Construct a 99% confidence interval of the coefficient for CRIME and interpret this interval.
(5 marks total)
(i) Provide the relevant critical value and construct the confidence interval manually using the estimated coefficient and standard error in your regression output (do not use an option available in Excel’s regression command to construct the confidence interval). (3 marks)
(ii) Interpret the confidence interval. (1 mark)
(iii) State what the confidence interval suggests about the significance of the coefficient for CRIME.
(1 mark)
Question 5: Answer the following four questions regarding the relationship between lnHPrice and DIST as implied by the estimated regression model. (5 marks total)
(i) Sketch the implied relationship between lnHPrice and DIST in a two-dimensional diagram. Clearly label the horizontal and vertical axes. (1 mark)
(ii) Describe in words the implied relationship between lnHPrice and DIST. (1 mark)
(iii) Do the estimation results suggest that a quadratic relationship between lnHPrice and DIST is appropriate? Briefly describe a reasoning of your answer. (1 mark)
(iv) Find the level of DIST at which the marginal effect of increased DIST on lnHPrice changes its sign (that is, from positive to negative or negative to positive). State all relevant steps. (2 marks)
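For part (iv), the marginal effect of DIST in a quadratic specification is b_DIST + 2*b_DIST2*DIST, so it changes sign where that expression equals zero. A small sketch of the calculation follows. The function names are made up for illustration, and the two input coefficients are the DIST values quoted in the posted answer (-37.46 and 2.24), used here only as example inputs that have not been verified against the data.

```python
def marginal_effect(b_lin, b_sq, dist):
    """Derivative of b_lin*DIST + b_sq*DIST^2 with respect to DIST."""
    return b_lin + 2.0 * b_sq * dist

def turning_point(b_lin, b_sq):
    """DIST at which the marginal effect equals zero."""
    if b_sq == 0:
        raise ValueError("no turning point without a quadratic term")
    return -b_lin / (2.0 * b_sq)

# Illustrative inputs only: DIST and DIST^2 coefficients from the posted answer
dist_star = turning_point(-37.46, 2.24)
```

With a negative linear term and a positive squared term, the effect is negative below `dist_star` and positive above it, which describes a U-shaped relationship.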
Question 6: Answer the following three questions regarding the F-statistic and Significance F figures obtained in the regression output. (1 mark each, 3 marks total)
(i) State the null and alternative hypotheses that can be tested by the reported F-statistic.
(ii) Provide the critical value to be used for testing the hypothesis in (i) at the 5% significance level. Present the answer to at least 6 decimal places.
    
(iii) Based on the reported F-statistic, what would you conclude about the test of the hypothesis stated in part (i)?
Question 7: Answer the following two questions regarding the R-Square and standard error of regression obtained in your regression output. (1 mark each, 2 marks total)
(i) Interpret the reported R Square.
(ii) Interpret the reported standard error of regression.
Question 8: Following the steps provided below, test a joint hypothesis that neither DIST nor STRatio affects the house price. (9 marks total)
(i) State the null and alternative hypothesis. (1 mark)
(ii) State a regression model that needs to be estimated to test the hypothesis stated in (i). (2 marks)
(iii) Estimate the regression model proposed in step (ii) and calculate an appropriate test statistic to test the hypothesis stated in step (i). State the relevant formula used for calculating the statistic, specify the inputs and present the final result. (4 marks)
(iv) Obtain the relevant critical value at the 1% significance level and complete the test (that is, determine whether to reject or not to reject the hypothesis). (2 marks)
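One common way to carry out the joint test in Question 8 is the restricted-versus-unrestricted F statistic, F = ((SSR_r - SSR_u)/q) / (SSR_u/(n - k - 1)). The sketch below assumes that approach. Because DIST enters the model through both DIST and DIST squared, dropping DIST and STRatio imposes q = 3 restrictions. The two residual sums of squares are hypothetical placeholders, not values from the data set.

```python
def restricted_f(ssr_restricted, ssr_unrestricted, q, n, k):
    """F statistic for q joint zero restrictions.

    n is the sample size and k the number of regressors (excluding the
    intercept) in the unrestricted model.
    """
    df_resid = n - k - 1
    numerator = (ssr_restricted - ssr_unrestricted) / q
    denominator = ssr_unrestricted / df_resid
    return numerator / denominator

# Hypothetical residual sums of squares, for illustration only
f_stat = restricted_f(ssr_restricted=20.0, ssr_unrestricted=18.0,
                      q=3, n=506, k=7)
```

The statistic is then compared with the F critical value with (3, 498) degrees of freedom at the chosen significance level.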
PART B (12 MARKS)
Suppose that a researcher is interested in the determinants of income for Australian workers and has estimated the following multiple linear regression model,
    $INC_i = \beta_0 + \beta_1 GEN_i + \beta_2 EDU_i + \beta_3 (EDU_i \times GEN_i) + \beta_4 AGE_i + \beta_5 AGE_i^2 + \varepsilon_i$
where   $INC_i$ = monthly income earned by individual i (measured in dollars per month),
        $GEN_i$ = gender of individual i (= 1 if female, 0 if male),
        $AGE_i$ = individual i’s age (measured in years), and
        $EDU_i$ = individual i’s education (measured in years).
The model estimated with a sample of 166 Australian workers has yielded the following regression results:
    SUMMARY OUTPUT

    Regression Statistics
    Multiple R        0.3724
    R Square          0.1387
    Adj R Square      0.1118
    Std Error         XXXXXXXXXX
    Observations      166

    ANOVA
                  df     SS            MS            F         Significance F
    Regression    5      XXXXXXXXXX    XXXXXXXXXX    5.1527    0.0002
    Residual      160    XXXXXXXXXX    4036971
    Total         165    XXXXXXXXXX

                 Coefficients    Std Error     t Stat     P-value    Lower 95%     Upper 95%
    Intercept    XXXXXXXXXX      XXXXXXXXXX    -2.6651    0.0085     XXXXXXXXXX    XXXXXXXXXX
    GEN          XXXXXXXXXX      XXXXXXXXXX    0.5277     0.5984     XXXXXXXXXX    XXXXXXXXXX
    EDU          XXXXXXXXXX      89.7156       2.9600     0.0035     88.3756       XXXXXXXXXX
    EDU*GEN      XXXXXXXXXX      XXXXXXXXXX    -0.3463    0.7296     XXXXXXXXXX    XXXXXXXXXX
    AGE          XXXXXXXXXX      XXXXXXXXXX    3.2078     0.0016     XXXXXXXXXX    XXXXXXXXXX
    AGE^2        -4.4013         1.4059        -3.1307    0.0021     -7.1777       -1.6248
Please answer the following three questions (Questions 9 through 11) in reference to the regression output provided on previous page.
Question 9: Interpret the estimated coefficients for GEN and EDU*GEN in Model (2).
(2 marks each, 4 marks total)
(i) GEN
(ii) EDU*GEN
Question 10: In a two-dimensional diagram, sketch the relationship between INC and EDU as implied by the estimated regression equation and clearly indicate how this relationship differs between male and female workers of the same age. Clearly label the horizontal and vertical axes. (2 marks)

Question 11: Calculate the difference in the expected wage between a male and a female who are the same age and have 12 years of education. (2 marks)
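For Question 11, holding AGE and EDU fixed, every term except GEN and EDU*GEN cancels, so the female-minus-male difference in expected income is b_GEN + b_EDU*GEN * EDU. A minimal sketch follows. The two coefficient values are hypothetical placeholders, since the actual estimates are not legible in the output above, and the function name is an assumption.

```python
def gender_gap(b_gen, b_edu_gen, edu):
    """Expected INC of a female minus expected INC of a male,
    for the same AGE and the same years of education."""
    return b_gen + b_edu_gen * edu

# Hypothetical coefficient values, for illustration only
gap = gender_gap(b_gen=500.0, b_edu_gen=-20.0, edu=12)
```

A positive result means the expected income is higher for the female worker, a negative result that it is higher for the male worker.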
Question 12: Suppose that a researcher has quarterly data on Gross Domestic Product (GDP) from 1980 to XXXXXXXXXX The researcher has created a time variable t by setting t = 1 for the 1st quarter of 1980. Also, the researcher has created a quarterly dummy variable, Q2, and set its value such that Q2,t = 1 if observation t is from quarter 2 and Q2,t = 0 otherwise. Similarly, dummy variables are created for Q3 and Q4. Based on the data set created, the researcher has estimated a linear trend model with the time variable t and the three quarterly dummy variables to account for seasonal variation in GDP. The relevant regression output is as follows:
    SUMMARY OUTPUT

    Regression Statistics
    Multiple R           XXXXXXXXXX
    R Square             XXXXXXXXXX
    Adjusted R Square    XXXXXXXXXX
    Standard Error       XXXXXXXXXX
    Observations         124

    ANOVA
                  df     SS             MS            F             Significance F
    Regression    4      5.91937E+11    1.4798E+11    XXXXXXXXXX    4.47E-97
    Residual      119    XXXXXXXXXX     XXXXXXXXXX
    Total         123    6.05521E+11

                 Coefficients    Standard Error    t Stat        P-value        Lower 95%     Upper 95%
    Intercept    XXXXXXXXXX      XXXXXXXXXX        XXXXXXXXXX    2.209E-72      XXXXXXXXXX    XXXXXXXXXX
    t            XXXXXXXXXX      XXXXXXXXXX        XXXXXXXXXX    1.9481E-99     XXXXXXXXXX    XXXXXXXXXX
    Q2           XXXXXXXXXX      XXXXXXXXXX        XXXXXXXXXX    XXXXXXXXXX     XXXXXXXXXX    XXXXXXXXXX
    Q3           XXXXXXXXXX      XXXXXXXXXX        XXXXXXXXXX    XXXXXXXXXX     XXXXXXXXXX    XXXXXXXXXX
    Q4           XXXXXXXXXX      XXXXXXXXXX        XXXXXXXXXX    4.24008E-13    XXXXXXXXXX    XXXXXXXXXX
Please answer the following question in reference to the information and regression output provided on previous page.
Question 12: Interpret the estimated coefficients for time variable (t), Q3 and forecast GDP in 3rd Quarter in XXXXXXXXXXmarks)
i) Interpret the estimated coefficient for time variable, t. (1 mark)
ii) Interpret the estimated coefficient for quarter 2, Q2. (1 mark)
iii) Forecast GDP in 3rd Quarter in 2012. Present the relevant equation, input and final result. (2 marks)
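The forecast in part iii) needs the value of t for the target quarter. With t = 1 in the 1st quarter of 1980, the 3rd quarter of 2012 corresponds to t = (2012 - 1980)*4 + 3 = 131, and the forecast adds the Q3 dummy coefficient to the trend. The sketch below assumes that setup. The coefficient values passed to `forecast` are hypothetical placeholders because the estimates are not legible in the output above.

```python
def time_index(year, quarter, start_year=1980):
    """Value of t when t = 1 in quarter 1 of start_year."""
    return (year - start_year) * 4 + quarter

def forecast(b0, b_t, b_q2, b_q3, b_q4, year, quarter):
    """Linear trend forecast with quarterly dummies (quarter 1 is the base)."""
    t = time_index(year, quarter)
    seasonal = {1: 0.0, 2: b_q2, 3: b_q3, 4: b_q4}[quarter]
    return b0 + b_t * t + seasonal

t_2012_q3 = time_index(2012, 3)
# Hypothetical coefficients, for illustration only
example = forecast(100.0, 2.0, 1.0, 5.0, 3.0, 2012, 3)
```

The same index function gives t = 124 for the 4th quarter of 2010, which matches the 124 observations reported in the output.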

Sheet1
    Hprice    NOx    DIST    ROOMS    STRatio    CRIME
    240    5.38    4.09    6.57    15.3    0.006
    215.99    4.69    4.97    6.42    17.8    0.027
    347    4.69    4.97    7.18    17.8    0.027
    334    4.58    6.06    7    18.7    0.032
    361.99    4.58    6.06    7.15    18.7    0.069
    287.01    4.58    6.06    6.43    18.7    0.03
    229    5.24    5.56    6.01    15.2    0.088
    271    5.24    5.95    6.17    15.2    0.145
    165    5.24    6.08    5.63    15.2    0.211
    189    5.24    6.59    6    15.2    0.17
    150    5.24    6.35    6.38    15.2    0.225
    189    5.24    6.23    6.01    15.2    0.117
    217    5.24    5.45    5.89    15.2    0.094
    204    5.38    4.71    5.95    21    0.63
    182    5.38    4.46    6.1    21    0.638
    199    5.38    4.5    5.83    21    0.627
    231    5.38    4.5    5.93    21    1.054
    175    5.38    4.26    5.99    21    0.784
    202    5.38    3.8    5.46    21    0.803
    182    5.38    3.8    5.73    21    0.726
    136    5.38    3.8    5.57    21    1.252
    196    5.38    4.01    5.96    21    0.852
    152    5.38    3.98    6.14    21    1.232
    145    5.38    4.1    5.81    21    0.988
    156    5.38    4.4    5.92    21    0.75
    139    5.38    4.45    5.6    21    0.841
    166    5.38    4.68    5.81    21    0.672
    148    5.38    4.45    6.05    21    0.956
    184    5.38    4.45    6.49    21    0.773
    210    5.38    4.23    6.67    21    1.002
    127    5.38    4.23    5.69    21    1.131
    145    5.38    4.17    6.07    21    1.355
    132    5.38    3.99    5.95    21    1.388
    131    5.38    3.79    5.7    21    1.152
    135    5.38    3.76    6.1    21    1.613
    189    4.99    3.36    5.93    19.2    0.064
    200    4.99    3.38    5.84    19.2    0.097
    136.68    4.99    3.93    5.85    19.2    0.08
    247.01    4.99    3
Answered Same Day Nov 03, 2021

Solution

Rochak answered on Nov 03 2021
Question 1: Provide an output from estimating the multiple regression model specified above and state the estimated sample regression equation. (4 marks)
(i) Provide the output from estimation results (3 marks)
Answer:
    
(ii) State the estimated regression equation (1 mark)
Answer:
Regression Equation,
Hprice = 101.65 - 55.72*NOx + 1.53*NOx^2 - 37.46*DIST + 2.24*DIST^2 + 391.55*ROOMS - 12.32*STRatio - 1.77*CRIME
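For readers who want to check an estimated equation outside Excel, ordinary least squares can be sketched with the normal equations. This is a generic illustration, not the method used to produce the numbers above: the function name `ols` is an assumption, and the data are small synthetic, noise-free values chosen so that the known coefficients (1, 2, -3) are recovered exactly.

```python
def ols(X, y):
    """Least-squares coefficients [b0, b1, ...] solving (X'X) b = X'y."""
    rows = [[1.0] + list(r) for r in X]  # prepend the intercept column
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gaussian elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(A[idx][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("regressors are perfectly collinear")
        A[col], A[pivot] = A[pivot], A[col]
        for row in range(col + 1, k):
            factor = A[row][col] / A[col][col]
            for c in range(col, k + 1):
                A[row][c] -= factor * A[col][c]
    # Back substitution
    beta = [0.0] * k
    for i in reversed(range(k)):
        s = sum(A[i][j] * beta[j] for j in range(i + 1, k))
        beta[i] = (A[i][k] - s) / A[i][i]
    return beta

# Synthetic, noise-free data: y = 1 + 2*x1 - 3*x2
X = [(1.0, 2.0), (2.0, 1.0), (3.0, 5.0), (4.0, 3.0), (5.0, 8.0)]
y = [1.0 + 2.0 * a - 3.0 * b for a, b in X]
beta = ols(X, y)
```

For the model in the question, each row of X would hold the seven transformed regressors and y the natural log of Hprice.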
Question 2: Interpret the estimated coefficients for INTERCEPT, lnROOMS and NOx in the context of the estimated regression model and comment briefly on whether the estimated coefficients make sense. (2 marks each, 6 marks total)
(i) INTERCEPT
Answer: The intercept is the expected value of the dependent variable, which in this case is “Hprice”, when all the independent variables are equal to 0.
Yes, the coefficients estimated from the regression model make sense, because each one represents the amount by which the corresponding independent variable affects the dependent variable.
(ii) lnROOMS
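Answer: “lnROOMS” is the natural log of ROOMS. Because the model specified in the question also has the dependent variable in logs, the coefficient on lnROOMS is an elasticity: it gives the approximate percentage change in the median house price associated with a 1% increase in the average number of rooms, holding the other variables constant.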
(iii) NOx
Answer: NOx is the amount of nitrogen oxides in the air, measured in parts per million (PPM). Its estimated coefficient is -55.72, so higher air pollution is associated with lower house prices, which makes sense. Because NOx squared is also in the model, the marginal effect of NOx is not constant: it equals -55.72 + 2*1.53*NOx.
Question 3: Following the steps provided below, test the hypothesis that the true population coefficient for DIST is negative at a 1% significance level. (4 marks total)
(i) State the null and alternative hypotheses. (1 mark)
Answer: Null Hypothesis: The true population coefficient for DIST is zero or positive (H0: βDIST ≥ 0)
Alternative Hypothesis: The true population coefficient for DIST is negative (H1: βDIST < 0)
(ii) Calculate the test statistics. (2 marks)
Answer: The test statistic is the estimated coefficient for DIST divided by its standard error; the value reported for the regression run is “0.94”
(iii) Obtain the relevant critical value and complete the test (that is, determine whether you should reject or not reject the null hypothesis). (1 mark)
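Answer: With 506 - 8 = 498 degrees of freedom, the one-tailed critical value at the 1% level is approximately -2.33, so the null hypothesis is rejected only if the test statistic falls below -2.33. The reported statistic does not (its p-value of 0.34 is also above the 0.01 significance level), therefore we cannot reject the null hypothesis.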
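The decision rule for a left-tailed test like this one can be sketched as follows. This is an approximation rather than the exact procedure: it uses the standard normal quantile in place of the t critical value, which is close when the degrees of freedom are large (498 here), and the function name is an assumption. The input 0.94 is the statistic reported in the answer.

```python
from statistics import NormalDist

def left_tail_test(t_stat, alpha=0.01):
    """Return (critical value, reject?) for H1: coefficient < 0.

    Uses the standard normal quantile as a large-sample stand-in
    for the t critical value.
    """
    crit = NormalDist().inv_cdf(alpha)
    return crit, t_stat < crit

crit, reject = left_tail_test(0.94)
```

The null hypothesis is rejected only when the statistic lies to the left of the critical value, so a statistic near zero leads to no rejection.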
Question 4: Construct a 99% confidence interval of the coefficient for CRIME and interpret this interval.
(5 marks total)
(i) Provide the relevant critical value and construct the confidence interval manually using the estimated coefficient and standard error in your regression output (do not use an option available in Excel’s regression command to construct the confidence interval). (3 marks)
Answer:
Critical Values,
Sample Mean = 3.61
Sample Standard Deviation = 8.59
Sample Size = 506
Confidence level value = 2.576
Confidence Interval = Sample Mean ± Confidence Level Value * (Sample Standard Deviation/SQRT(sample size))
= 3.61 ± 2.576*(8.59/SQRT(506))
= 3.61 ± 0.98
Therefore, Confidence Interval (2.63, 4.59)
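The question asks for an interval around the CRIME coefficient, which has the form coefficient ± critical value * standard error of the coefficient. A minimal sketch of that calculation is given here. The coefficient -1.77 is the value quoted in the regression equation above. The standard error of 0.5 is a hypothetical placeholder, because the regression output is not reproduced. The normal quantile (about 2.576, the same value used above) stands in for the t critical value.

```python
from statistics import NormalDist

def coef_confidence_interval(b, se, level=0.99):
    """Two-sided interval b ± z*se using the normal quantile."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return b - z * se, b + z * se

# se=0.5 is a hypothetical standard error, for illustration only
lower, upper = coef_confidence_interval(b=-1.77, se=0.5)
```

If the resulting interval excludes zero, the coefficient is statistically significant at the matching significance level (1% for a 99% interval).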
(ii) Interpret the confidence interval. (1 mark)
Answer: The confidence interval means that we can be 99% confident that the true population value lies within the range (2.63, 4.59); that is, 99% of intervals constructed this way from repeated samples would contain it.
(iii) State what the confidence interval suggests about the significance of the coefficient for CRIME.
(1 mark)
Answer: The confidence interval does not contain zero, which suggests that the coefficient for CRIME is statistically significant at the 1% level.
Question 5: Answer the following four questions regarding the relationship between lnHPrice and DIST as implied by the estimated regression model. (5 marks total)
(i) Sketch the implied relationship between lnHPrice and DIST in a two-dimensional diagram. Clearly label the horizontal and vertical axes. (1...