Great Deal! Get Instant $10 FREE in Account on First Order + 10% Cashback on Every Order Order Now

TERM PROJECT Aim: Data analysis, transformation, model development and assessment. Please submit a word document that includes all of your answers for each question and submit your R scripting files...

1 answer below »
TERM PROJECT
Aim: Data analysis, transformation, model development and assessment. Please submit a word document that includes all of your answers for each question and submit your R scripting files with the report.
Data files: MidWest_balanced and Prediction Scoring
MidWest Airlines has entered into a partnership with the wireless phone company Telcom to sell the latter's service via direct mail. The MidWest-Airlines data file contains a subset of a data sample who has already received a test offer. About 13% accepted. You are asked to develop a model to classify MidWest customers as to whether they purchase a wireless phone service contract (target variable Phone_sale), that can be used to predict classifications for additional customers.
1. Explain the role of Data Dictionary below using the CRISP-DM process for a successful predictive analytics project in the MidWest Co.
2. First, transform variables based on the data dictionary given below (as.factor etc) and form a dataframe in RStudio. Present summary statistics for Balance, Bonus_miles, and Topflight variables.
3. Develop a logistic regression model to predict customer purchase behavior (Target variable= Phone_sale). Report the model output. Which factors are statistically significant predictors of the purchase decision. Present Odds Ratios of the significant variables with 95% Confidence Intervals. Plot the ROC curve of the model.
4. Use a step() function in R for feature selection. Which factors are significant in this final model? Present the Odds Ratios for these variables. Plot the ROC curve of the model.
5. Construct a Neural Network model. Change the number of nodes in the hidden layer from 20 to 50 and rerun the model. Plot the ROC curves of both models.
6. Fill the following table for the models developed.
7. Which model would you suggest marketing department should use for targeting customers? Summarize your results in a discussion.
8. Use one of the models you developed above and the provided records of new customers in the PredictionScoring.xlsx data file for predicting if they would by the Telcom service as a result of the direct mailing campaign or not.
Supplement: Data Dictionary
    FIELD NAME
    DATA TYPE
    MAX DATA LENGTH
    RAW DATA OR TELCOM CREATED FIELD?
    DESCRIPTION
    ID#
    NUMBER
     
    Telcom
    Unique ID
    Topflight
    CHAR
    1
    Raw
    Indicates whether flyer has attained elite "Topflight" status, 1 = yes, 0 = no
    Balance
    NUMBER
    8
    Raw
    Number of miles eligible for award travel
    Qual_miles
    NUMBER
    8
    Raw
    Number of miles counted as qualifying for Topflight status
    cc1_miles?
    CHAR
    1
    Raw
    Has member earned miles with airline freq. flyer credit card in the past 12 months (1=Yes/0=No)?
    cc2_miles?
    CHAR
    1
    Raw
    Has member earned miles with Rewards credit card in the past 12 months (1=Yes/0=No)?
    cc3_miles?
    CHAR
    1
    Raw
    Has member earned miles with Small Business credit card in the past 12 months (1=Yes/0=No)?
    Bonus_miles
    NUMBER
     
    Raw
    Number of miles earned from non-flight bonus transactions in the past 12 months
    Bonus_trans
    NUMBER
     
    Raw
    Number of non-flight bonus transactions in the past 12 months
    Flight_miles_12mo
    NUMBER
     
    Raw
    Number of flight miles in the past 12 months
    Flight_trans_12
    NUMBER
     
    Raw
    Number of flight transactions in the past 12 months
    Online_12
    NUMBER
     
    Raw
    Number of online purchases within the past 12 months
    Email
    CHAR
    1
    Raw
    E-mail address on file. 1= yes, 0 = no?
    Club_member
    NUMBER
     
    Telcom
    Member of the airline's club (paid membership), 1=yes, 0=no
    Any_cc_miles_12mo
    NUMBER
     
    Telcom
    Dummy variable indicating whether member added miles on any credit card type within the past 12 months (1='Y', 0='N')
    Phone_sale
    NUMBER
     
    Telcom
    Dummy variable indicating whether member purchased Telcom service as a result of the direct mail campaign (1=sale, 0=no sale)

ID#,Topflight,Balance,Qual_miles,cc1_miles?,cc2_miles?,cc3_miles?,Bonus_miles,Bonus_trans,Flight_miles_12mo,Flight_trans_12,Online_12,Email,Club_member,Any_cc_miles_12mo,Phone_sale
1,0,28143,0,0,1,0,174,1,0,0,0,1,0,1,0
2,0,19244,0,0,0,0,215,2,0,0,0,0,0,0,0
3,0,41354,0,1,0,0,4123,4,0,0,0,1,0,1,0
4,0,14776,0,0,0,0,500,1,0,0,0,1,0,0,0
5,1,97752,0,1,0,0,43300,26,2077,4,0,1,0,1,0
6,0,16420,0,0,0,0,0,0,0,0,0,0,1,0,0
7,0,84914,0,1,0,0,27482,25,0,0,0,1,1,1,0
8,1,20856,0,0,0,0,5250,4,250,1,0,1,0,0,0
9,1,443003,0,1,1,0,1753,43,3850,12,0,1,0,1,0
10,1,104860,0,1,0,0,28426,28,1150,3,0,1,0,1,0
11,0,40091,0,1,0,0,7278,10,0,0,0,1,0,1,0
12,1,96522,0,1,0,0,61105,19,0,0,1,1,1,1,1
13,0,43382,0,1,0,0,11150,20,0,0,0,1,0,1,0
14,0,43097,0,0,1,0,3258,6,0,0,0,1,0,1,0
15,0,17648,0,0,0,0,0,0,0,0,0,1,0,0,0
16,0,28495,0,1,0,0,49442,15,0,0,0,1,1,1,0
17,0,51890,0,1,0,0,48963,16,0,0,0,1,0,1,0
18,0,13958,0,0,0,0,4291,5,0,0,0,1,0,0,0
19,1,91473,0,1,0,0,27408,17,0,0,0,1,0,1,0
20,0,23354,0,1,0,0,10447,5,0,0,0,1,0,1,0
21,1,120576,0,1,0,0,58831,23,250,2,0,1,0,1,1
22,1,185681,2024,0,0,0,13300,16,1800,9,0,1,0,0,0
23,0,20584,0,0,0,0,3450,11,3450,11,0,0,0,0,0
24,0,66275,0,0,0,0,2533,11,150,1,0,1,0,0,0
25,0,205651,500,0,0,0,4025,21,700,4,1,1,0,0,0
26,0,20726,0,0,0,0,1375,4,0,0,0,0,0,0,0
27,0,18521,0,0,0,0,1227,2,1227,2,0,0,0,0,0
28,0,8828,0,0,0,0,0,0,0,0,0,0,0,0,0
29,1,59763,0,1,0,0,33772,20,100,1,0,1,0,1,0
30,0,19221,0,0,0,0,4655,8,500,1,0,0,0,0,0
31,0,177926,0,1,0,0,20797,13,0,0,0,1,0,1,0
32,0,10021,0,0,0,0,0,0,0,0,0,0,0,0,1
33,0,276571,0,1,0,0,42044,23,0,0,0,1,0,1,0
34,0,18047,0,0,0,0,100,1,0,0,0,1,0,0,0
35,0,43832,0,0,0,0,0,0,0,0,0,1,0,0,0
36,1,123759,0,1,0,0,50572,23,2500,5,0,1,1,1,1
37,0,23193,0,0,0,0,650,4,550,3,0,0,0,0,1
38,0,68666,0,0,0,0,350,2,350,2,0,0,1,0,0
39,1,59990,0,1,0,0,27878,17,0,0,0,1,1,1,0
40,0,2176,0,0,0,0,0,0,0,0,0,0,0,0,1
41,0,34616,0,0,0,0,1750,4,500,1,0,0,0,0,0
42,1,10470,0,1,0,0,38094,26,0,0,0,0,1,1,0
43,1,60313,0,0,0,0,10000,26,3250,9,0,1,0,0,0
44,0,619393,0,1,0,0,15008,14,0,0,0,1,0,1,0
45,0,121260,0,1,0,0,18493,18,0,0,0,1,0,1,0
46,1,288865,967,0,0,0,23600,14,2000,4,0,1,0,0,0
47,0,92336,0,1,0,0,11214,6,0,0,0,1,0,1,0
48,1,36924,0,0,0,0,5900,6,300,2,0,1,0,0,0
49,1,70312,0,1,0,0,34678,24,500,1,0,1,0,1,0
50,0,17051,0,0,0,0,1150,4,1150,4,0,1,0,0,0
51,0,108137,0,0,0,0,6368,5,6368,5,0,0,0,0,0
52,0,1300,0,0,0,0,370,1,0,0,0,0,0,0,0
53,0,118531,0,1,0,0,44577,38,0,0,0,0,0,1,1
54,1,121395,0,0,0,0,4970,8,650,2,2,1,0,0,0
55,0,38348,0,0,0,0,0,0,0,0,0,1,1,0,0
56,0,14448,0,0,0,0,1625,6,0,0,0,0,0,0,0
57,0,75971,0,1,0,0,34339,14,0,0,0,1,0,1,0
58,0,36298,0,0,0,0,3100,5,600,3,0,1,0,0,1
59,0,38077,0,1,0,0,34024,8,0,0,0,1,0,1,0
60,0,95118,0,1,0,0,23188,23,2200,7,0,0,0,1,0
61,0,134457,0,1,0,0,15588,14,0,0,0,1,0,1,0
62,0,19918,0,1,0,0,17601,11,0,0,0,1,0,1,0
63,0,10120,0,0,0,0,0,0,0,0,0,0,0,0,0
64,1,362642,0,1,0,0,28079,8,0,0,0,1,0,1,0
65,0,80250,0,0,0,0,895,2,0,0,0,1,0,0,0
66,1,53914,0,1,0,0,33767,45,5550,29,0,1,0,1,1
67,0,9375,0,0,0,0,1750,7,0,0,0,0,0,0,0
68,0,83237,0,1,0,0,35287,18,0,0,0,1,0,1,0
69,0,230715,0,1,0,0,24047,12,0,0,0,0,0,1,0
71,0,47457,0,1,0,0,12621,16,0,0,0,0,0,1,0
72,1,84409,5031,1,0,0,15436,16,1150,4,0,1,0,1,0
73,0,252386,0,1,0,0,39787,13,0,0,0,1,1,1,0
74,0,550367,0,1,0,0,12500,13,50,1,0,0,1,1,0
75,0,12646,0,0,0,0,631,4,631,4,0,0,0,0,0
76,0,123867,0,1,0,0,25308,17,0,0,0,1,0,1,0
78,0,129871,0,1,0,0,15776,22,0,0,0,0,0,1,0
79,1,109380,0,1,0,0,7537,16,0,0,0,1,1,1,0
80,0,111157,0,1,0,0,32883,19,0,0,0,0,0,1,0
81,0,49238,0,1,0,0,38037,18,0,0,0,0,0
Answered 7 days After Oct 06, 2021

Solution

Mohd answered on Oct 11 2021
129 Votes
TERM PROJECT
Aim: Data analysis, transformation, model development and assessment. Please submit a word document that includes all of your answers for each question and submit your R scripting files with the report.
Data files: MidWest_balanced and Prediction Scoring
MidWest Airlines has entered into a partnership with the wireless phone company Telcom to sell the latter's service via direct mail. The MidWest-Airlines data file contains a subset of a data sample who has already received a test offer. About 13% accepted. You are asked to develop a model to classify MidWest customers as to whether they purchase a wireless phone service contract (target variable Phone_sale), that can be used to predict classifications for additional customers.
1. Explain the role of Data Dictionary below using the CRISP-DM process for a successful predictive analytics project in the Midwest Co.
A. Data dictionary provides blue prints for data type, sample values and it's meaning in different business context.  More we have clarity about our dataset more clarity will be in our analysts and results. For model selection, we must have clear understanding of datatype. In data The MidWest-Airlines. We have response variable as character type datatype. We will transform this datatype into factor for logistic regression.  In that case we have two value 0 and 1 and two levels.
B. First, transform variables based on the data dictionary given below (as.factor etc) and form a dataframe in...
SOLUTION.PDF

Answer To This Question Is Available To Download

Related Questions & Answers

More Questions »

Submit New Assignment

Copy and Paste Your Assignment Here