
In week 6 we will look at a selection of regression problems: how to detect them, what their consequences are, and what some potential remedies are. We will cover:
Omitted variables, where we leave an x-variable out of our regression but we shouldn't have;
Irrelevant variables, where we include x-variables in our regression but we shouldn't have;
Non-linear relationships between our y-variable and our x-variable(s);
Non-constant error variance, where some of our observations are more difficult to predict than others;
Serial correlation, where observations occur in some order and are related to other observations close to them in order.
For our homework, we'll consult for a supermarket chain that has been accused of sexist salary practices, and try to discover why the data make it look that way.
Answered 4 days After Jun 19, 2022

Solution

Subhanbasha answered on Jun 24 2022
Report of Fresh Market Analysis
Introduction:
    The aim of this report is to summarise the market data on employees’ salaries and to draw out the trends and other insights that explain how salaries at the firm have behaved in the past. The analysis also looks at how salary depends on the various employee factors recorded in the data.
    Finally, we give recommendations to help the firm meet its goals regarding salary policy. The relevant factors may vary with the data, but the analysis follows the same path: statistical analysis of the data, an executive summary, and recommendations.
    The data describe the salaries of the firm’s employees together with several employee characteristics. Salaries may depend on these characteristics, but they are also governed by the firm’s rules and policies, so they should stay within those policies.
Data:
    The data give the salaries of the firm’s employees. Salaries may differ between employees because of individual qualities, but they are bound by the firm’s policies. The HR department of any firm maintains such policies, and under them some factors should not be considered when setting an employee’s salary.
    The data have columns for the employee’s age, experience, education, college, race, gender and DexScore. These factors were collected together with the salary each employee received. Salary may depend on them, but without statistical evidence we cannot say whether any difference is significant.
The above plot shows the relation between salary and age. There is a positive but modest correlation between the two, so salary shows some dependence on age.
The above plot shows the relation between salary and experience. Again there is a positive but modest correlation, so salary shows some dependence on experience. A positive relation between experience and salary is also what we would expect in practice.
Executive Summary:
    Summary statistics:
    Average of Salary by Gender
    Gender         Average of Salary
    Female         31.97142857
    Male           35.01612903
    Grand Total    33.91752577
The above table gives the average salary by gender. The average salary is higher for male employees, by about 3 (35.02 against 31.97), but from the averages alone we cannot say whether that difference is significant. Statistical analysis is needed to draw a conclusion about the claim.
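As a small arithmetic check on this table, the grand average must be a weighted average of the two group averages, so the three averages together with the total number of observations imply how many women and men are in the data. The sketch below assumes the total of 97 is the "Observations" figure from the regression output later in this report; the variable names are illustrative and not part of the original workbook.

```python
# Averages copied from the summary table above.
mean_female = 31.97142857
mean_male = 35.01612903
mean_all = 33.91752577

# Assumption: total sample size taken from "Observations" in the regression output.
n_total = 97

# The grand mean is a weighted average of the group means:
#   mean_all = (n_female * mean_female + n_male * mean_male) / n_total
# Solving for n_female gives the implied number of female employees.
n_female = n_total * (mean_male - mean_all) / (mean_male - mean_female)
n_male = n_total - n_female

# Raw (unadjusted) gap in average salary.
gap = mean_male - mean_female

print(f"implied group sizes: female ~ {n_female:.2f}, male ~ {n_male:.2f}")
print(f"raw gap in average salary (male - female): {gap:.2f}")
```

The implied group sizes are unequal, which is worth keeping in mind when comparing the two averages: the raw gap says nothing yet about whether the groups differ in age or experience.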
    Average of Salary by Gender and Race
    Gender         Race                Average of Salary
    Female         Maori               29.75
    Female         Pacific Islander    32.21666667
    Female         White               32.48695652
    Female Total                       31.97142857
    Male           Maori               32.88888889
    Male           Pacific Islander    36.33846154
    Male           White               35.065
    Male Total                         35.01612903
    Grand Total                        33.91752577
The above summary table gives the average salary by gender and race, to check whether salaries differ across these groups. Among female employees, Maori employees receive the lowest average salary of the three race categories, and the same is true among male employees. Within each race category, male employees receive a higher average salary than female employees; for example, male Maori employees average 32.89 against 29.75 for female Maori employees. So at the gender level male employees receive the higher average salary, and at the race level Maori employees receive the lowest average salary in both genders.
Analysis:
Next, we carry out a correlation analysis of the data.
                   Salary         Age            Experience     HS Dummy       College Dummy  Maori Dummy    PI Dummy       Gender code    DexScore
    Salary         1
    Age            0.329429       1
    Experience     0.410084       0.108396       1
    HS Dummy       0.071114       -0.07102       0.051067       1
    College Dummy  -0.01506       -0.28562       0.138811       0.235542       1
    Maori Dummy    -0.16714       -0.17018       -0.08693       -0.13816       0.158203       1
    PI Dummy       0.094513       0.01521        -0.0206        0.083021       0.002422       -0.21109       1
    Gender code    -0.25016       -0.0394        -0.48659       0.07215        -0.07807       0.034889       -0.04628       1
    DexScore       -0.04833       -0.06622       -0.04209       -0.06213       -0.16561       -0.05373       0.075916       0.204363       1
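The above table gives the pairwise correlations between the variables. A correlation measures the strength of the linear association between two variables; it does not by itself show the impact of an independent variable on the dependent variable. Salary is most strongly correlated with Experience (0.41) and Age (0.33), and negatively correlated with Gender code (-0.25). Gender code is also negatively correlated with Experience (-0.49).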
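Whether a single correlation differs significantly from zero can be checked with the usual t statistic, t = r * sqrt(n - 2) / sqrt(1 - r^2), on n - 2 degrees of freedom. The sketch below applies it to the Salary column of the correlation matrix. It assumes n = 97 (the "Observations" figure in the regression output) and uses 1.99 as an approximate two-sided 5% critical value for 95 degrees of freedom; both are assumptions of this illustration, not results stated in the report.

```python
import math

# Assumption: sample size from "Observations" in the regression output.
n = 97

# Correlations with Salary, copied from the first column of the matrix above.
r_with_salary = {
    "Age": 0.329429,
    "Experience": 0.410084,
    "HS Dummy": 0.071114,
    "College Dummy": -0.01506,
    "Maori Dummy": -0.16714,
    "PI Dummy": 0.094513,
    "Gender code": -0.25016,
    "DexScore": -0.04833,
}

# Assumption: approximate two-sided 5% critical value of t with n - 2 = 95 df.
T_CRIT = 1.99


def corr_t_stat(r, n):
    """t statistic for H0: the population correlation is zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


t_stats = {name: corr_t_stat(r, n) for name, r in r_with_salary.items()}
significant = {name for name, t in t_stats.items() if abs(t) > T_CRIT}

for name, t in t_stats.items():
    verdict = "significant" if name in significant else "not significant"
    print(f"{name:14s} r = {r_with_salary[name]:+.3f}  t = {t:+.2f}  {verdict}")
```

These are tests of one pair of variables at a time. They do not adjust for the other variables, which is what the regression in the next step does.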
In the next step we fit a regression with salary as the dependent variable and all other variables as independent variables. From it we can read off the regression line and the effect of each independent variable on salary, holding the others fixed, together with measures of how well the model fits.
The coefficient table of the fitted model is as follows:
                     Coefficients    Standard Error    t Stat      P-value
    Intercept        10.04996        7.440576          1.350697    0.180256
    Age              0.794461        0.272414          2.916375    0.004492
    Experience       1.234901        0.395949          3.118843    0.002454
    HS Dummy         0.725685        1.125575          0.644724    0.520783
    College Dummy    0.229853        2.69283           0.085357    0.932171
    Maori Dummy      -1.03676        1.569708          -0.66048    0.51067
    PI Dummy         1.109231        1.38425           0.801323    0.425103
    Gender code      -0.88966        1.314183          -0.67697    0.500203
    DexScore         -0.00882        0.196478          -0.04487    0.964314
The ANOVA table of the regression analysis is as follows:
    ANOVA
                  df    SS          MS          F          Significance F
    Regression    8     900.8593    112.6074    4.10642    0.000339
    Residual      88    2413.161    27.42228
    Total         96    3314.02
The table of fit measures is as follows:
    Regression Statistics
    Multiple R           0.521376
    R Square             0.271833
    Adjusted R Square    0.205636
    Standard Error       5.236629
    Observations         97
From the above table, the model is not a particularly good fit for the data: R Square is 0.27 and Adjusted R Square is 0.21, so the model explains only about a quarter of the variation in salary.
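The fit measures can be reproduced from the sums of squares and degrees of freedom in the ANOVA table, which is a useful check on how the tables relate to each other. The sketch below uses only the reported figures; the variable names are illustrative.

```python
import math

# Figures copied from the ANOVA table and the regression statistics.
ss_regression, df_regression = 900.8593, 8
ss_residual, df_residual = 2413.161, 88
n = 97  # Observations

ss_total = ss_regression + ss_residual

# Mean squares are sums of squares divided by their degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# Overall F statistic: explained variance per df over unexplained variance per df.
f_stat = ms_regression / ms_residual

# R Square: share of the total variation in salary explained by the model.
r_squared = ss_regression / ss_total

# Adjusted R Square penalises the number of predictors used.
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / df_residual

# Standard error of the regression: square root of the residual mean square.
std_error = math.sqrt(ms_residual)

print(f"F = {f_stat:.4f}")
print(f"R Square = {r_squared:.4f}, Adjusted R Square = {adj_r_squared:.4f}")
print(f"Standard Error = {std_error:.4f}")
```

The gap between R Square and Adjusted R Square reflects the eight predictors in the model, several of which contribute very little.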
From the coefficient table, Age and Experience have a significant effect in the model (p-values 0.0045 and 0.0025), which means these variables are useful in the model. None of the other variables is significant at the 5% level (95% confidence).
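Each t Stat in the coefficient table is the coefficient divided by its standard error, and a variable is significant at the 5% level when the absolute t Stat exceeds the critical value. The sketch below recomputes the t statistics and an approximate 95% interval for the Gender code coefficient. It assumes 1.99 as an approximate two-sided 5% critical value for the 88 residual degrees of freedom; that value is an assumption of this illustration.

```python
# (coefficient, standard error) pairs copied from the coefficient table.
coefficients = {
    "Intercept": (10.04996, 7.440576),
    "Age": (0.794461, 0.272414),
    "Experience": (1.234901, 0.395949),
    "HS Dummy": (0.725685, 1.125575),
    "College Dummy": (0.229853, 2.69283),
    "Maori Dummy": (-1.03676, 1.569708),
    "PI Dummy": (1.109231, 1.38425),
    "Gender code": (-0.88966, 1.314183),
    "DexScore": (-0.00882, 0.196478),
}

# Assumption: approximate two-sided 5% critical value of t with 88 df.
T_CRIT = 1.99

# t statistic = coefficient / standard error.
t_stats = {name: coef / se for name, (coef, se) in coefficients.items()}
significant = [name for name, t in t_stats.items() if abs(t) > T_CRIT]

# Approximate 95% confidence interval for the Gender code coefficient.
gender_coef, gender_se = coefficients["Gender code"]
gender_ci = (gender_coef - T_CRIT * gender_se, gender_coef + T_CRIT * gender_se)

for name, t in t_stats.items():
    print(f"{name:14s} t = {t:+.3f}")
print("significant at 5%:", significant)
print(f"Gender code 95% CI: ({gender_ci[0]:.2f}, {gender_ci[1]:.2f})")
```

The interval for Gender code includes zero, which is another way of saying that the model cannot distinguish its effect from no effect once the other variables are held fixed.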
So we can drop the variables other than Experience and Age to get a simpler model of the data.
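Before dropping Gender code, it helps to see why its coefficient is so small even though the raw salary averages differ by gender. With two predictors, the standardized regression coefficients can be computed directly from the three pairwise correlations. The sketch below does this for Gender code and Experience only, using the values from the correlation matrix. It is a two-predictor illustration of the omitted-variable idea, not the full eight-variable model fitted in this report, and how Gender code is coded (which gender takes the higher value) is not stated in the data description.

```python
# Pairwise correlations copied from the correlation matrix.
r_salary_gender = -0.25016
r_salary_experience = 0.410084
r_gender_experience = -0.48659

# Standardized coefficients for a regression of Salary on Gender code and Experience:
#   beta_1 = (r_y1 - r_y2 * r_12) / (1 - r_12 ** 2)
denominator = 1 - r_gender_experience ** 2
beta_gender = (r_salary_gender - r_salary_experience * r_gender_experience) / denominator
beta_experience = (r_salary_experience - r_salary_gender * r_gender_experience) / denominator

print(f"Gender code alone (simple correlation): {r_salary_gender:+.3f}")
print(f"Gender code, controlling for Experience: {beta_gender:+.3f}")
print(f"Experience, controlling for Gender code: {beta_experience:+.3f}")
```

Because Gender code and Experience are correlated, a model that leaves Experience out attributes part of the experience effect to gender. Once Experience is included, the standardized association between Gender code and salary shrinks to a fraction of the raw correlation.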
Next we analyse the residual and fitted plots.
In the residual plot for Age, the residuals are approximately, but not exactly, normally distributed.
In the residual plot for Experience, the residuals are right-tailed rather than normally distributed, although Experience is significant in the model, meaning it explains part of the variance in the dependent variable.
The fit plot of Age against salary shows a linear relation between age and salary, consistent with Age being significant in the model.
The fit plot of Experience against salary also shows a linear relation, though it looks weaker in the plot; in the model, Experience is nonetheless significant.
The fitted plots of some of the non-significant variables against salary are also shown.
The relation between HS Dummy and salary is clearly not linear. This is because HS Dummy is a derived categorical variable, so a linear relation is not to be expected.
From the above fit plot, the relation between DexScore and salary is linear, and the predicted values lie within the range of the original salary values.
But, as mentioned earlier, these variables are not significant in the model.
Conclusion/Recommendations:
    The conclusion from the above analysis is that several of the variables in the model are not significant. So we need to remove them from the model, or treat them separately, for example by transforming or normalizing the variables so that they have a linear relation with the dependent variable salary. Alternatively, we can fit a nonlinear regression model to the data.
    Here the recommendation is that except...