
You must submit 1 Word document with your screenshots and answers. You should also submit the file where you conducted your statistical analysis (Excel, R, Python, etc.)

Consider yourself working for a global retailer that over the years has added a web-based channel to their physical store locations. Now, after learning more about mobile-led changes in retailing, they are excited about what the mobile ecosystem offers. They are seeking your help as they embark on using mobile as a channel. They want to commission an app development team to deploy a presence on iOS and Android. However, several questions arise about the deployment of the app. Your job is to provide data-driven insights to help them navigate this complex landscape. Specifically, you are tasked with:
1. Using the data, estimate a linear model for the relationship between demand and price. For this you have access to a large volume of app level data (in a file called assignment_2_apps.csv), including information about the ‘rank’ of the app on the app store. Assume rank = (1/sales)*1,000,000 (don’t worry about the details behind this assumption, just make the assumption). Specifically, estimate a univariate regression where the dependent variable is sales and the independent variable is price. Provide a screenshot of both estimated coefficients and the associated P-Values. Briefly explain the interpretation of the coefficient associated with price. Provide an explanation for what the P-Value (associated with the price coefficient) indicates (be very specific). (2 Points)
2. Create a dummy/binary variable for region. This variable should have a value of 1 if the region is CN (China) and 0 if the region is US (USA). Estimate a univariate regression of sales on this newly created variable. Provide a screenshot and an interpretation of both estimated coefficients. Be specific. (2 Points)
3. Create another dummy/binary variable for in app advertisements. This variable should have a value of 1 if the device has in app advertising and a value of 0 if the device does NOT have in app advertising. Estimate a regression of sales on the dummy variable created in part 2 and this newly created dummy variable (all in the same model). Provide a screenshot of the results and provide an interpretation of all the coefficients. (2 Points)
4. Estimate a univariate regression of sales on rank (similar to part 1) except in this case your model should be able to speak in terms of elasticity. By elasticity you want to speak to your management in percentage terms – what is the % change in sales for a % increase in price? (Tip: we do this using log-log regression models.) Since price can have a value of 0, you will have to adjust the variable. You can do this by adding 1 to each price and then taking the log. Provide a screenshot of the results and provide an interpretation for the coefficients. (https://stats.idre.ucla.edu/other/mult-pkg/faq/general/faqhow-do-i-interpret-a-regression-model-when-some-variables-are-log-transformed/) (2 Points)
5. The app retailer believes that other factors, specifically the app store, the filesize, the number of screenshots, and the average rating may also be associated with sales. The retailers want a model that estimates the relationship between price and sales (similar to 4) except they want the impact of the above mentioned factors (app store, filesize, etc.) to be controlled for. Estimate a model that accomplishes this. Your model should speak in terms of elasticity. Provide screenshots of your results and discuss how this model achieves what the retailers want. Provide an interpretation of all the estimated coefficients. (3 Points)
6. The retailer is also interested in understanding the impact of the in-app purchase option. Specifically, the retailer believes that the relationship between price and sales is different for apps with an in-app purchase option and apps without an in-app purchase option. They are worried that if they just create two subsamples (with and without the in-app purchase option) they will not adequately capture the variation in the other variables beyond price. Therefore, they want to estimate a model that allows the relationship between price and sales to be different based upon whether the app has in-app purchasing or not. The model should speak in elasticity terms. Do NOT control for the variables that you controlled for in problem 5. Write the model you will estimate. Estimate this model and present a screenshot of the results. Provide an interpretation of all the estimated coefficients. (4 Points)
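As a rough illustration of the model forms that parts 4 and 6 describe, here is a minimal sketch in R. It runs on simulated data only: the variable names (`price`, `iap`, `log_sales`) and the simulated elasticities are assumptions made for this example and are not taken from assignment_2_apps.csv. Because the regressor is log(price + 1), the slope is only approximately an elasticity with respect to price itself.

```r
# Simulated data only; names and numbers are illustrative assumptions.
set.seed(1)
n <- 500
price <- round(runif(n, 0, 10), 2)
price[sample(n, 200)] <- 0        # many free apps, so log(price) alone is undefined
iap <- rbinom(n, 1, 0.4)          # 1 = app offers in-app purchases, 0 = it does not
log_price <- log(price + 1)       # add 1 before taking logs, as the task suggests

# Simulated elasticities: -0.5 without in-app purchases, -0.2 with them
log_sales <- 9 - 0.5 * log_price + 0.3 * iap +
  0.3 * log_price * iap + rnorm(n, sd = 0.1)

# Part 4 style: univariate log-log model, slope read as an elasticity
fit_loglog <- lm(log_sales ~ log_price)

# Part 6 style: the interaction term lets the slope differ by in-app purchase status
fit_inter <- lm(log_sales ~ log_price * iap)
b <- coef(fit_inter)
elasticity_no_iap <- b[["log_price"]]
elasticity_iap <- b[["log_price"]] + b[["log_price:iap"]]
```

In the interaction model, the coefficient on `log_price` is the elasticity for apps without in-app purchases, and the coefficient on `log_price:iap` is the difference in elasticity for apps that have them.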
Answered Same Day Nov 26, 2021

Solution

Sourav answered on Nov 30 2021
Solution:- 1)
data1 <- read.csv(file.choose(),header = T)
# Obtaining the value of sales from rank value
sales = 1000000/data1$rank
summary(sales)
Min. 1st Qu. Median Mean 3rd Qu. Max.
2451 3571 5714 20444 11905 1000000
summary(data1$price)
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.000 0.000 0.000 1.341 0.990 99.990
# detecting and removing outliers
minmax1 = mean(sales) + (c(-3,3)*sqrt(var(sales)))
minmax2 = mean(data1$price) + (c(-3,3)*sqrt(var(data1$price)))
minmax1
[1] -205764.1 246652.8
minmax2
[1] -9.60199 12.28468
pos1 = which(sales > minmax1[2],arr.ind = T);head(pos1)
[1] 220 376 384 419 468 480
pos2 = which(data1$price > minmax2[2],arr.ind = T);head(pos2)
[1] 24844 24845 24846 24847 24848 24849
data2 = data1[c(-pos1,-pos2),]
sales = sales[c(-pos1,-pos2)]
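# Rows above the upper 3-standard-deviation bound for sales or price have been removed, so data2 and sales no longer contain such outliers.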
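The 3-standard-deviation rule used above can also be written with a logical mask instead of negative indices. The sketch below is an illustration on toy vectors; `flag_outliers` is a hypothetical helper name introduced here, not part of the original answer. Unlike the code above, it also checks the lower bound.

```r
# Hypothetical helper: TRUE where a value lies outside mean +/- k standard deviations.
flag_outliers <- function(x, k = 3) {
  bounds <- mean(x) + c(-k, k) * sd(x)
  x < bounds[1] | x > bounds[2]
}

# Toy vectors standing in for sales and price (not the assignment data)
sales_toy <- c(rep(100, 20), 100000)           # one extreme sales value
price_toy <- rep(c(0, 0.99), length.out = 21)  # no extreme prices

# Drop a row if either variable is flagged
drop <- flag_outliers(sales_toy) | flag_outliers(price_toy)
sales_clean <- sales_toy[!drop]
price_clean <- price_toy[!drop]
```

One reason to prefer a mask: in R, `x[-integer(0)]` returns an empty vector, so negative indexing with `which()` results would drop every row if no outliers were found.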
# Building a univariate linear regression model of sales on price
# Ho: The model is not significant
# H1: The model is significant
reg = lm(sales~data2$price)
summary(reg)
Call:
lm(formula = sales ~ data2$price)
Residuals:
Min 1Q Median 3Q Max
-10842 -9417 -7269 -1446 189782
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 13293.15 167.73 79.251 < 2e-16 ***
data2$price -307.86 74.39 -4.139 3.5e-05 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 22940 on 24478 degrees of freedom
Multiple R-squared: 0.0006993,    Adjusted R-squared: 0.0006584
F-statistic: 17.13 on 1 and 24478 DF, p-value: 3.505e-05
# Conclusion
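Since the estimated coefficient of price is -307.86, a 1-unit increase in price
is associated with a decrease of about 307.86 units in sales, on average.
Since p-value = 3.5e-05 < 0.05, we reject the null hypothesis that the price coefficient is zero; price is statistically significant at the 5% level.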
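To make the link between the coefficient, its t value and its p-value concrete, the following sketch pulls them out of a fitted model programmatically. It uses simulated data; the names `price_sim` and `sales_sim` and the simulated slope of -30 are assumptions for illustration only.

```r
# Simulated data only; not the assignment data.
set.seed(42)
price_sim <- runif(200, 0, 10)
sales_sim <- 1000 - 30 * price_sim + rnorm(200, sd = 20)

fit <- lm(sales_sim ~ price_sim)
coefs <- coef(summary(fit))   # matrix with Estimate, Std. Error, t value, Pr(>|t|)

slope   <- coefs["price_sim", "Estimate"]
se      <- coefs["price_sim", "Std. Error"]
p_value <- coefs["price_sim", "Pr(>|t|)"]

# The t value is the estimate divided by its standard error, and the p-value
# is the two-sided tail probability of that t value under H0: slope = 0.
t_manual <- slope / se
p_manual <- 2 * pt(abs(t_manual), df = fit$df.residual, lower.tail = FALSE)
```

Read this way, the p-value is the probability of seeing a t value at least this extreme if the true price coefficient were zero.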
    # ------------------------------------------------------------------------------------
#Solution 2.
summary(data2$region)
CN US
14237 10243
str(data2$region)
Factor w/ 2 levels "CN","US": 1 1 1 1 1 2 1 1 2 1 ...
# Creating a dummy variable for region
dummyregion = data2$region
library(plyr)
dummyregion = revalue(dummyregion, c("CN"=1))
dummyregion = revalue(dummyregion, c("US"=0))
str(dummyregion)
Factor w/ 2 levels "1","0": 1 1 1 1 1 2 1 1 2 1 ...
#2 Building a univariate linear regression model of sales on region
# Ho: The model is not significant
# H1: The model is significant
reg2 = lm(sales~dummyregion)
summary(reg2)
Call:
lm(formula = sales ~ dummyregion)
Residuals:
Min 1Q Median 3Q Max
-11727 -9127 -7185 -1438 187923
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 12076.8 192.1 62.861 < 2e-16 ***
dummyregion0 2101.0 297.0 7.074 1.55e-12 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 22920 on 24478 degrees of freedom
Multiple R-squared: 0.00204,    Adjusted R-squared: 0.001999
F-statistic: 50.04 on 1 and 24478 DF, p-value: 1.545e-12
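# Here the estimated coefficient is for "US" (level 0 of the dummy), while China (CN, level 1) is taken as the reference category.
The intercept (12076.8) is the estimated mean sales for CN apps, and US apps sell about 2101.0 units more on average.
Since the p-value is 1.55e-12 < 0.05, the variable 'region' is significant.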
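The task asks for a dummy equal to 1 for CN and 0 for US. One possible way to get a coefficient that reads directly as the CN effect is a numeric 0/1 column rather than a relabelled factor. The sketch below uses a small made-up data set; `region_toy`, `sales_toy` and `is_cn` are illustrative names, not from the original answer.

```r
# Toy data only: three CN apps and three US apps
region_toy <- factor(c("CN", "CN", "US", "CN", "US", "US"))
sales_toy  <- c(10, 12, 20, 14, 22, 24)

# Numeric dummy: 1 if the region is CN, 0 if it is US
is_cn <- as.integer(region_toy == "CN")

fit_region <- lm(sales_toy ~ is_cn)
coef(fit_region)
# (Intercept) is the mean of the US group (is_cn = 0);
# the is_cn coefficient is the CN mean minus the US mean.
```

With a numeric dummy, US is the baseline, so the sign of the slope is reversed relative to a model that uses CN as the reference level.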
    
    
#------------------------------------------------------------------
#Solution 3.
summary(data2$in_app_ads)
IN_APP_ADS NO_IN_APP_ADS
8034 16446
str(data2$in_app_ads)
Factor w/ 2 levels "IN_APP_ADS","NO_IN_APP_ADS": 2 2 2 2 1 1 2 2 2 2 ...
#Creating a dummy variable for in_app_ads
dummyin_app_ads = data2$in_app_ads
dummyin_app_ads = revalue(dummyin_app_ads, c("NO_IN_APP_ADS"=0))
dummyin_app_ads = revalue(dummyin_app_ads, c("IN_APP_ADS"=1))
str(dummyin_app_ads)
Factor w/ 2 levels "1","0": 2 2 2 2 1 1 2 2 2 2 ...
#Building a linear regression model of sales on region and in_app_ads
# Ho: The model is not significant
# H1: The model is significant
reg3 = lm(sales~ dummyregion + dummyin_app_ads)
summary(reg3)
Call:
lm(formula = sales ~ dummyregion + dummyin_app_ads)
Residuals:
...
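The output above is cut off, so the actual estimates for this model are not available here. As a general illustration of how the coefficients of an additive two-dummy model are read, here is a sketch on simulated data; the column names `is_cn` and `has_ads` and all numbers are assumptions made for this example.

```r
# Simulated data only: 5 observations for each of the 4 dummy combinations
d <- expand.grid(is_cn = c(0, 1), has_ads = c(0, 1))
d <- d[rep(1:4, each = 5), ]
set.seed(7)
d$sales <- 100 + 20 * d$is_cn - 15 * d$has_ads + rnorm(nrow(d), sd = 1)

fit <- lm(sales ~ is_cn + has_ads, data = d)
b <- coef(fit)

# Intercept: expected sales when both dummies are 0 (US, no in-app ads).
# Each dummy coefficient: shift in expected sales when that dummy goes
# from 0 to 1, holding the other dummy fixed.
pred_cn_ads <- b[["(Intercept)"]] + b[["is_cn"]] + b[["has_ads"]]
```

Summing the intercept and both dummy coefficients gives the fitted value for an app with both dummies equal to 1, which matches what `predict()` returns for that combination.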