You must submit 1 word document with your screenshots and answers. You should also submit the file...

Question

You must submit one Word document with your screenshots and answers. You should also submit the file where you conducted your statistical analysis (Excel, R, Python, etc.).

Consider yourself working for a global retailer that over the years has added a web-based channel to their physical store locations. Now, after learning more about mobile-led changes in retailing, they are excited about what the mobile ecosystem offers. They are seeking your help as they embark on using mobile as a channel. They want to commission an app development team to deploy a presence on iOS and Android. However, several questions arise about the deployment of the app. Your job is to provide data-driven insights to help them navigate this complex landscape.

Specifically, you are tasked with:
1. Using the data, estimate a linear model for the relationship between demand and price. For this you have access to a large volume of app-level data (in a file called assignment_2_apps.csv), including information about the ‘rank’ of the app on the app store. Assume rank = (1/sales)*1,000,000 (don’t worry about the details behind this assumption, just make the assumption). Specifically, estimate a univariate regression where the dependent variable is sales and the independent variable is price. Provide a screenshot of both estimated coefficients and the associated P-Values. Briefly explain the interpretation of the coefficient associated with price. Provide an explanation for what the P-Value (associated with the price coefficient) indicates (be very specific). (2 Points)
2. Create a dummy/binary variable for region. This variable should have a value of 1 if the region is CN (China) and 0 if the region is US (USA). Estimate a univariate regression of sales on this newly created variable. Provide a screenshot and an interpretation of both estimated coefficients. Be specific. (2 Points)
3. Create another dummy/binary variable for in-app advertisements. This variable should have a value of 1 if the device has in-app advertising and a value of 0 if the device does NOT have in-app advertising. Estimate a regression of sales on the dummy variable created in part 2 and this newly created dummy variable (all in the same model). Provide a screenshot of the results and provide an interpretation of all the coefficients. (2 Points)
4. Estimate a univariate regression of sales on price (similar to part 1), except in this case your model should be able to speak in terms of elasticity. By elasticity you want to speak to your management in percentage terms: what is the % change in sales for a % increase in price? (Tip: we do this using log-log regression models.) Since price can have a value of 0, you will have to adjust the variable. You can do this by adding 1 to each price and then taking the log. Provide a screenshot of the results and provide an interpretation for the coefficients. (https://stats.idre.ucla.edu/other/mult-pkg/faq/general/faqhow-do-i-interpret-a-regression-model-when-some-variables-are-log-transformed/) (2 Points)
5. The app retailer believes that other factors, specifically the app store, the filesize, the number of screenshots, and the average rating may also be associated with sales. The retailers want a model that estimates the relationship between price and sales (similar to 4) except they want the impact of the above mentioned factors (app store, filesize, etc.) to be controlled for. Estimate a model that accomplishes this. Your model should speak in terms of elasticity. Provide screenshots of your results and discuss how this model achieves what the retailers want. Provide an interpretation of all the estimated coefficients. (3 Points)
6. The retailer is also interested in understanding the impact of the in-app purchase option. Specifically, the retailer believes that the relationship between price and sales is different for apps with an in-app purchase option and apps without an in-app purchase option. They are worried that if they just create two subsamples (with and without the in-app purchase option) they will not adequately capture the variation in the other variables beyond price. Therefore, they want to estimate a model that allows the relationship between price and sales to be different based upon whether the app has in-app purchasing or not. The model should speak in elasticity terms. Do NOT control for the variables that you controlled for in problem 5. Write the model you will estimate. Estimate this model and present a screenshot of the results. Provide an interpretation of all the estimated coefficients. (4 Points)
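
Parts 4 and 6 both rest on the log-log form, so a small illustration may help. The sketch below runs on simulated data, not on assignment_2_apps.csv, and the column names price, iap and sales are hypothetical stand-ins. One way to write the part 6 model is log(sales) = b0 + b1*log(price + 1) + b2*iap + b3*iap*log(price + 1), where b1 is the elasticity for apps without in-app purchases and b1 + b3 is the elasticity for apps with them.

```r
# Simulated data only; price, iap and sales are hypothetical column names.
set.seed(42)
n <- 2000
price <- round(pmax(rexp(n, rate = 1) - 0.5, 0), 2)  # many prices are exactly 0
iap   <- rbinom(n, 1, 0.4)                           # 1 = app offers in-app purchases

# Elasticities used to generate the data: -0.8 without IAP, -0.3 with IAP.
log_sales <- 9 - 0.8 * log1p(price) + 0.2 * iap +
  0.5 * iap * log1p(price) + rnorm(n, sd = 0.3)
apps <- data.frame(sales = exp(log_sales), price = price, iap = iap)

# Part 4 style: log-log regression, with log1p(price) = log(price + 1).
fit_loglog <- lm(log(sales) ~ log1p(price), data = apps)

# Part 6 style: the interaction lets the price slope differ by IAP status.
fit_inter <- lm(log(sales) ~ log1p(price) * iap, data = apps)

b <- coef(fit_inter)
elasticity_no_iap   <- b[["log1p(price)"]]
elasticity_with_iap <- b[["log1p(price)"]] + b[["log1p(price):iap"]]
print(c(no_iap = elasticity_no_iap, with_iap = elasticity_with_iap))
```

Because the regressor is log(price + 1) rather than log(price), the slope is strictly an elasticity with respect to price + 1; it approximates the price elasticity well only when prices are large relative to 1.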

assignment2-t1mjviud.docx assignment2apps-buotfh5i.csv

Sourav · Accepted Answer

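Solution:- 1)
> data1 <- read.csv("assignment_2_apps.csv")  # load step reconstructed; file name as given in the question
> sales = 1000000/data1$rank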
> summary(sales)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   2451    3571    5714   20444   11905 1000000 
> summary(data1$price)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  0.000   0.000   0.000   1.341   0.990  99.990 
# detecting and removing outliers
> minmax1 = mean(sales) + (c(-3,3)*sqrt(var(sales)))
> minmax2 = mean(data1$price) + (c(-3,3)*sqrt(var(data1$price)))
> minmax1
[1] -205764.1  246652.8
> minmax2
[1] -9.60199 12.28468
> pos1 = which(sales > minmax1[2],arr.ind = T);head(pos1)
[1] 220 376 384 419 468 480
> pos2 = which(data1$price > minmax2[2],arr.ind = T);head(pos2)
[1] 24844 24845 24846 24847 24848 24849
> data2 = data1[c(-pos1,-pos2),]
> sales = sales[c(-pos1,-pos2)]
> # Observations more than 3 standard deviations above the mean have now been removed.
> #  Building a univariate linear regression model of sales on price
> # H0: The slope on price is zero (the model is not significant)
> # H1: The slope on price is not zero (the model is significant)
> reg = lm(sales~data2$price)
> summary(reg)
Call:
lm(formula = sales ~ data2$price)
Residuals:
   Min     1Q Median     3Q    Max 
-10842  -9417  -7269  -1446 189782 
Coefficients:
                    Estimate     Std. Error  t value   Pr(>|t|)    
(Intercept)     13293.15     167.73    79.251
#Solution 2.
> summary(data2$region)
   CN    US 
14237 10243 
> str(data2$region)
 Factor w/ 2 levels "CN","US": 1 1 1 1 1 2 1 1 2 1 ...
> # Creating a dummy variable for region 
> dummyregion = data2$region
> library(plyr)
> dummyregion = revalue(dummyregion, c("CN"=1))
> dummyregion = revalue(dummyregion, c("US"=0))
> str(dummyregion)
 Factor w/ 2 levels "1","0": 1 1 1 1 1 2 1 1 2 1 ...
> #2 Building a univariate linear regression model of sales on region
> # H0: The slope on the region dummy is zero (the model is not significant)
> # H1: The slope on the region dummy is not zero (the model is significant)
> reg2 = lm(sales~dummyregion)
> summary(reg2)
Call:
lm(formula = sales ~ dummyregion)
Residuals:
   Min     1Q Median     3Q    Max 
-11727  -9127  -7185  -1438 187923 
Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)   12076.8      192.1  62.861
#------------------------------------------------------------------
#Solution 3.
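
The transcript stops here, so what follows is only a hedged sketch of how the two dummies for parts 2 and 3 could be built, not the answerer's own code. It uses toy data, and the column name in_app_ads and its "yes"/"no" coding are assumptions. One point worth noting from the transcript above: revalue() returns a factor whose levels are "1", "0" in that order, so lm() treats "1" (CN) as the baseline and reports the coefficient for level "0" (US). Coding the dummy as a numeric 0/1 column avoids that reversal.

```r
# Toy data; the column names and the "yes"/"no" coding are hypothetical.
apps <- data.frame(
  region     = c("CN", "US", "CN", "US", "CN", "US", "CN", "US"),
  in_app_ads = c("yes", "no", "no", "yes", "yes", "no", "no", "yes"),
  sales      = c(12, 9, 10, 11, 13, 8, 10, 12)
)

# Numeric dummies: 1 = CN / has ads, 0 = US / no ads.
apps$cn  <- as.integer(apps$region == "CN")
apps$ads <- as.integer(apps$in_app_ads == "yes")

# Part 2 style: the intercept is the mean sales of the 0 group (US) and
# the slope is the CN mean minus the US mean.
fit_region <- lm(sales ~ cn, data = apps)

# Part 3 style: both dummies in one model. Each slope is the difference
# for that dummy holding the other one fixed.
fit_both <- lm(sales ~ cn + ads, data = apps)

print(coef(fit_region))
print(coef(fit_both))
```

With a single 0/1 regressor, the fitted intercept and slope can be checked directly against the group means from tapply(apps$sales, apps$cn, mean).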
