Great Deal! Get Instant $10 FREE in Account on First Order + 10% Cashback on Every Order Order Now

Homework: This homework will test and develop your knowledge on predictive model construction and using their predictive performance for decision making. Please submit your answers for each section in...

1 answer below »
Homework: This homework will test and develop your knowledge on predictive model construction and using their predictive performance for decision making. Please submit your answers for each section in a word document with your explanations with a supporting R script file you used for this assignment.
Use the BankruptcyData.csv file and the BankruptcyNew.csv data files for this assignment. The attributes to develop a predictive model for predicting bankruptcy of firms. Data dictionary is provided below with the variables’ labels in the data sets and the long descriptive names.
    Attribute
No
    Short
Name
    Long
Name
    1
    fyea
    Fiscal yea
    2
    at
    Assets-total
    3
    bkvlps
    Book value per share
    4
    invt
    Inventories- total
    5
    Lt
    Liabilities-total
    6
    rect
    Receivables-total
    7
    cogs
    Cost of goods sold
    8
    dvt
    Dividends-Total
    9
    ebit
    Earnings before interests and taxes
    10
    gp
    Gross profit
    11
    ni
    Net income (loss)
    12
    oiadp
    Operating income after depreciation
    13
    revt
    Revenue- total
    14
    dvpsx_f
    Dividends per share -ex-date fiscal
    15
    mkvalt
    Market value -Total -Fiscal
    16
    prch_f
    Price High -Annual-Fiscal
    17
    bankruptcy
    Output (Y/N)
1. Are there any variables with missing values? If so, remove records with any missing value.
Write your script for checking and removing records with missing values.
2. Build a logistic regression model using the 70% of the data for training and 30% for the testing to predict bankruptcy in R. Present a summary table of the odds ratios for the significant predictors (p<0.05).
3. Use a variable selection algorithm to remove multicollinearity. Write the final model with the list of final predictors in the model.
4. Report the confusion matrix using the final logistic regression model and the test data. Calculate model accuracy, precision, sensitivity and F1 score.
5. Plot the ROC curve and report the AUC statistic.
6. Construct a decision tree model (called DT-1) to predict the bankruptcy with 70% of the data for training and 30% for the testing.
7. Calculate model accuracy, precision, sensitivity and F1 score. Plot the ROC curve with the AUC statistic reported.
8. For pruning the decision tree what complexity parameter value would you use? Prune the tree and construct the confusion matrix for the pruned model (called DT-2). Calculate model accuracy, precision, sensitivity and F1 score for DT-2. Plot the ROC curve with the AUC statistic reported.
9. Which model would you use for predicting the new applications? Explain why.
10. The credit organization received new applications with the attribute values in the new Bankrupcynew.csv data file. Use one of the models you constructed to predict if the companies will undergo bankruptcy or not. Report the resulting table.
BankruptcyData.csv
BankruptcyData
    fyear    at    bkvlps    invt    Lt    rectr    cogs    dvt    ebit    gp    ni    oiadp    revt    dvpsx_f    mkvalt    prch_f    Bankrupt
    2007    1.733    -4.3373    0    65.444    0    0    0.168    -1.009    0    -1.181    -1.009    0    0    4.4088    0.5    yes
    2006    2.614    -4.2455    0    64.976    0    0    0.179    -0.534    0.002    -0.121    -0.534    0.002    0    7.348    0.75    yes
    2005    4.193    -4.2252    0    66.256    0    0    0.113    -0.406    0    -0.448    -0.406    0    0    4.1149    0.65    yes
    2004    1.841    -4.1869    0    63.342    0    0    0.113    -0.357    0    -0.395    -0.357    0    0    4.4088    0.75    yes
    2008    0    -1.4509    0    0.062    0    0    0    -0.062    0    -0.062    -0.062    0    0    0.0256    0.09    no
    2007    0    -1.4266    0    0    0    0    0    0    0    -1.343    0    0    0    0.1559    0.1    no
    2008    2.406    -1.1355    0    62.004    0    0    0.146    -2.343    0    -2.791    -2.343    0    0    10.5026    0.6    yes
    2006    0    -1.0736    0    0.321    0    0    0    -0.142    0    -0.142    -0.142    0    0    0.1196    1.01    yes
    2002    3    -1.007    0    3.587    0    0    0    -1.154    0    -1.217    -1.154    0    0    2.997    3.75    no
    2002    2.643    -0.9861    0    12.25    0    0    0    -0.133    0.737    -1.278    -0.133    0.737    0    0.0974    0.02    yes
    2004    0    -0.7597    0    0.098    0    0    0    -0.061    0    -0.021    -0.061    0    0    0.1419    6.9231    yes
    2005    0    -0.6595    0    0.184    0    0    0    -0.198    0    -0.198    -0.198    0    0    0.0698    3.5    yes
    2005    2.137    -0.537    0    7.501    0    0    0    -1.041    0.074    -0.676    -1.041    0.074    0    1.9978    0.44    yes
    2004    3.367    -0.4692    0    8.055    0    0    0    -1.231    0.048    -1.231    -1.231    0.048    0    1.3989    0.45    yes
    2002    1.009    -0.4164    0    5.644    0    0.548    0    -8.807    0.396    -29.56    -8.807    0.944    0    1.6967    214    no
    2003    5.386    -0.346    0    8.843    0    0    0    -0.596    0.091    2.742    -0.596    0.091    0    2.9976    0.47    yes
    2002    0    -0.2529    0    1.446    0    0    0    -0.11    0    -0.097    -0.11    0    0    0.343    1.65    no
    2003    0.001    -0.1581    0    0.376    0    0    0    -0.821    0    -0.815    -0.821    0    0    0.3605    1.75    no
    2004    0.023    -0.1276    0    0.417    0    0    0    -0.065    0    -0.079    -0.065    0    0    0.9261    2.75    no
    2009    0.649    -0.1265    0    7.444    0    1.529    0    0.155    3.014    -4.551    0.155    4.543    0    0.7795    0.06    no
    2006    0.007    -0.1177    0    5.022    0    0    0    -0.523    0    -1.314    -0.523    0    0    0.4262    0.035    no
    2008    0.386    -0.0776    0    10.267    0    0    0    -0.449    0    -6.383    -0.449    0    0    0.1909    0.5    no
    2007    0.005    -0.075    0    0.792    0    0    0    -0.037    0    -0.037    -0.037    0    0    2.6248    0.8    no
    2009    0.007    -0.0526    0    1.335    0    0    0    -1.316    0    -1.363    -1.316    0    0    6.8132    2.99    no
    2003    0.009    -0.0521    0    0.614    0    0    0    -0.182    0    0.514    -0.182    0    0    2.9043    0.27    no
    2002    0    -0.0345    0    0.03    0    0.014    0    -0.012    -0.002    -0.014    -0.012    0.012    0    0.0608    7    no
    2002    0.128    -0.0308    0    1.011    0    0    0    -0.469    0    -0.505    -0.469    0    0    0.717    0.24    no
    2005    0.144    -0.0278    0    0.387    0    0    0    -1.089    0    -0.957    -1.089    0    0    3.6662    1.7    no
    2009    0    -0.0273    0    18.523    0    0    0    -4.671    0    -6.52    -4.671    0    0    19.1323    0.11    no
    2002    0.001    -0.0267    0    0.299    0    0    0    -0.777    0    -4.265    -0.777    0    0    0.3409    0.63    no
    2006    0.062    -0.0236    0    0.45    0    0    0    -1.539    0.012    -1.589    -1.539    0.012    0    5.7624    1.01    no
    2008    0    -0.0218    0    13.91    0    4.677    0    -4.625    -4.625    -5.361    -4.625    0.052    0    58.9719    0.155    no
    2007    0.5    -0.0217    0    2.517    0    2.31    0    -4.261    1.363    -5.116    -4.261    3.673    0    7.4292    1.25    yes
    2009    1.652    -0.0213    0    11.933    0    0    0    -0.441    0    3.736    -0.441    0    0    0.1931    0.02    no
    2004    0.002    -0.0213    0    0.411    0    0.296    0    -0.293    -0.293    -0.33    -0.293    0.003    0    5.3774    2    no
    2007    0.856    -0.0185    0    1.858    0    0    0    -3.172    0    -3.275    -3.172    0    0    5.4191    0.375    yes
    2003    0    -0.0175    0    0.309    0    0    0    -0.376    0    -0.468    -0.376    0    0    26.4795    5    no
    2007    1.21    -0.0175    0    1.658    0    2.764    0    -2.519    -2.484    -2.463    -2.519    0.28    0    30.2835    1.95    no
    2008    0.003    -0.0161    0    0.403    0    0    0    -1.513    0    -1.513    -1.513    0    0    77.8932    22.75    no
    2007    0.001    -0.0155    0    9.788    0    0    0    -4.022    0     XXXXXXXXXX    -4.022    0    0    42.2333    0.23    no
    2002    0    -0.013    0    0.026    0    0    0    -0.098    0    -0.098    -0.098    0    0    0.3    32    no
    2007    3.248    -0.0094    0    5.033    0    0    0    -4.777    0    -43.072    -4.777    0    0    4.2016    0.145    no
    2009    0.409    -0.0091    0    0.814    0    0    0    -0.356    0    -3.626    -0.356    0    0    19.9976    0.45    no
    2009    0.287    -0.0078    0    0.64    0    0.09    0    -1.014    -0.09    -1.222    -1.014    0    0    5.211    0.16    no
    2004    0.108    -0.0073    0    0.368    0    0    0    -1.627    0    -1.627    -1.627    0    0    11.8206    0.78    no
    2007    0    -0.0055    0    0.092    0    0    0    -0.213    0    -0.04    -0.213    0    0    0.8361    0.21    no
    2008    0.154    -0.0045    0    0.875    0    0    0    -0.088    0    -0.088    -0.088    0    0    8.05    1    no
    2008    0.56    -0.0028    0    0.536    0    0.398    0    0.012    0.215    -0.931    0.012    0.613    0    2.2099    0.1    no
    2009    0.005    -0.0026    0    0.121    0    0    0    -0.057    0    -0.057    -0.057    0    0    0.4892    0.02    no
    2002    0    -0.0023    0    0.197    0    0    0    -0.129    0    12    -0.129    0    0    0.3554    0.014    yes
    2007    0.001    -0.0014    0    0.489    0    0    0.03    -0.842    0    -1.51    -0.842    0    0    1.8811    0.036    no
    2008    0.002    -0.0013    0    0.062    0    0    0    -0.117    0    -0.107    -0.117    0    0    0.4448    0.19    no
    2003    0    -0.0011    0    0.095    0    0    0    -0.01    0    0.102    -0.01    0    0    0.1692    0.005    yes
    2005    0.011    -0.0007    0    0.017    0    0    0    -0.037    0    -0.037    -0.037    0    0    0.4199    0.15    no
    2005    0.027    -0.0004    0    0.111    0    0.007    0.052    -5.266    -0.004    -21.573    -5.266    0.003    0    5.643    0.37    no
    2004    0.055    -0.0003    0    0.086    0    2.841    0.166    -2.881    -2.833    -3.382    -2.881    0.008    0    11.5321    1.1136    no
    2009    0.315    -0.0001    0    0.325    0    1.79    0    -1.531    -1.341    -1.531    -1.531    0.449    0    4.3334    0.4    no
    2006    0.776    0.0011    0    0.462    0    0    0.03    -1.676    0    -1.703    -1.676    0    0    9.2324    0.149    no
    2009    0.022    0.0019    0    0.001    0    0    0    -0.023    0    -0.023    -0.023    0    0    0.3455    0.15    no
    2003    4.18    0.0019    0    4.094    0    0.65    0    0.595    0.595    0.514    0.595    1.245    0    2.4268    0.09    no
    2003    0.127    0.0034    0    0.059    0    0    0    -2.409    0    -2.409    -2.409    0    0    74.1955    4.15    no
    2007    5.913    0.0034    0    5.882    0    0    26.745    26.532    27.484    26.74    26.532    27.484    3.05     XXXXXXXXXX    41    no
    2004    0.042    0.0036    0    0.012    0    0    0    -0.05    0    3.124    -0.05    0    0    0.8397    1.4    no
    2006    7.204    0.0039    0    7.168    0    0    30.288    30.095    31.079    30.259    30.095    31.079    2.92     XXXXXXXXXX    43.75    no
    2008    0.047    0.0042    0    0.002    0    0    0    -0.025    0    -0.025    -0.025    0    0    0.4319    0.68    no
    2003    4.064    0.005    0    4.019    0    0    17.418    17.367    18.169    17.398    17.367    18.169    1.88     XXXXXXXXXX    24.62    no
    2008    9.525    0.0063    0    9.467    0    0    33.638    33.569    34.645    33.665    33.569    34.645    3.27     XXXXXXXXXX    40.39    no
    2007    0.071    0.0064    0    0.002    0    0    0    -0.03    0    -0.03    -0.03    0    0    1.2956    0.51    no
    2005    3.92    0.007    0    3.856    0    0    20.225    20.164    21.085    20.223    20.164    21.085    2.13     XXXXXXXXXX    34.4    no
    2002    3.459    0.0073    0    3.394    0    0    16.88    16.825    17.436    16.886    16.825    17.436    2.11    196.482    25.5    no
    2004    3.014    0.0074    0    2.948    0    0    14.286    14.286    15.061    14.308    14.286    15.061    1.71     XXXXXXXXXX    25.99    no
    2006    0.1    0.0092    0    0.001    0    0    0    -0.014    0    -0.014    -0.014    0    0    1.9435    0.25    no
    2009    3.586    0.0102    0    3.492    0    0    27.664    27.688    28.724    27.699    27.688    28.724    3.66     XXXXXXXXXX    36.939    no
    2005    4.166    0.0134    0    3.57    0    0    0    -0.259    0    -0.259    -0.259    0    0    0.2851    0.02    no
    2005    1.444    0.0173    0    0.567    0    0.191    0    -0.188    -0.188    -0.075    -0.188    0.003    0    2.0272    0.65    no
    2004    4.175    0.0192    0    3.32    0    0.55    0    -0.525    -0.525    0.485    -0.525    0.025    0    0.6682    0.08    no
    2009    7.823    0.0413    0    7.633    0    3.642    0    -3.312    -3.12    -5.469    -3.312    0.522    0    3.68    1.35    no
    2007    1.213    0.0868    0    0.738    0    0.011    0    -0.038    -0.011    -0.061    -0.038    0    0    25.2988    8    no
    2009    16.114    0.1072    0    7.766    0    0    0    -1.105    0    -1.783    -1.105    0    0    26.9288    2.4    no
    2010    21.495    0.1547    0    7.317    0    0.459    0    -1.314    0.141    -2.104    -1.314    0.6    0    11.2049    0.72    no
    2006    125    0.1839    0    5.562    0    0    0    -3.91    0    -3.514    -3.91    0    0     XXXXXXXXXX    0.51    no
    2005    125.003    0.1878    0    3.036    0    0    0    -7.372    0    -14.928    -7.372    0    0     XXXXXXXXXX    0.65    no
    2008    6.138    0.6613    0    2.779    0    0.03    0    -7.916    0.126    -9.202    -7.916    0.156    0    1.5346    3.2488    no
    2006    17.136    1.0083    0    1.034    0    0.307    0    -6.217    0.539    -5.615    -6.217    0.846    0    19.3225    2.05    no
    2003    28.51    1.291    0    6.99    0    0.223    0    -6.871    0.488    -10.704    -6.871    0.711    0    22.5986    1.65    no
    2005    22.46    1.3138    0    1.284    0    0.345    0    -5.895    0.693    -5.468    -5.895    1.038    0    18.9709    2.92    no
    2008    11.813    1.3545    0    6.273    0    2.089    0    -1.302    -1.121    -0.195    -1.302    0.968    0    2.0859    2.3    no
    2004    28.682    1.6555    0    1.673    0    0.282    0    -6.312    0.635    6.004    -6.312    0.917    0    46.3863    4.28    no
    2002    50.247    1.9108    0    16.237    0    0.516    0    -6.823    0.391    -11.108    -6.823    0.907    0    13.0808    1.37    no
    2007    13.29    2.9689    0    1.438    0    0.297    0    -5.271    0.436    -4.418    -5.271    0.733    0    7.0658    5.96    no
    2002    248.497    7.7239    0    229.952    0    5.233    0.228    3.658    11.486    1.052    3.658    16.719    0.095    22.5694    10.66    no
    2003    342.469    7.7958    0    323.689    0    6.151    0.276    4.574    13.877    1.341    4.574    20.028    0.085    30.1125    12.5    no
    2004    405.047    7.9469    0    385.291    0    7.565    0.332    4.028    13.816    0.926    4.028    21.381    0.165    45.7424    18.6    no
    2004    254.115    8.5196    0    238.26    0    3.306    0    3.919    9.094    1.218    3.919    12.4    0    63.3671    34.05    no
    2003    222.613    9.6061    0    207.714    0    2.762    0.008    2.462    7.129    0.482    2.462    9.891    0    40.0934    26.5    no
    2009    1098.14    9.6977    0    202.855    0    256.774    0     XXXXXXXXXX     XXXXXXXXXX     XXXXXXXXXX     XXXXXXXXXX    138.257    0    1641.81    34.28    yes
    2005    470.641    9.7106    0    439.266    0    11.38    0.445    5.315    16.998    1.407    5.315    28.378    0.155    67.0433    21.64    no
    2008    75.464    10.1891    0    34.412    0    0    0    -13.253    0    -4.884    -13.253    0    0    38.3158    11.75    yes
    2008     XXXXXXXXXX    10.7222    0    226.877    0    233.989    0    -15.36    30.001    -35.883    -15.36    263.99    0     XXXXXXXXXX    46.82    yes
    2009    76.89    10.788    0    33.425    0    0    0    -0.698    0    2.084    -0.698    0    0    33.199    10.01    yes
    2005    845.108    11.4069    0    786.431    0    14.961    2.665    14.867    38.968    6.511    14.867    53.929    0.5245    98.5076    26.8182    no
    2007    64    11.6654    0    17    0    0    19    -4    0    7    -4    0    6.39    48.348    19.25    yes
    2006    872.094    12.2791    0    808.832    0    20.04    2.787    15.806    41.374    7.335    15.806    61.414    0.54    108.192    23    no
    2008    913.359    12.3917    0    854.585    0    39.829    0.854    -3.499    15.772    -7.112    -3.499    55.601    0.135    32.4896    16.55    no
    2007    914.781    13.1717    0    846.802    0    25.053    2.808    15.07    41.745    7.669    15.07    66.798    0.545    89.0273    21.35    no
    2006    645.983    13.5647    0    581.7    0    19.11    0.701    7.379    21.259    2.415    7.379    40.369    0.175     XXXXXXXXXX    30.5    no
    2007    807.53    14.0794    0    740.695    0    27.842    0.854    8.91    26.432    2.678    8.91    54.274    0.18    75.8096    26.52    no
    2006    110    15.8848    0    46    0    0    10    -13    0    15    -13    0    5    74.1336    21    yes
    2005    125    18.0962    0    52    0    3    49    -22    1    29    -22    4    13    70.595    32    yes
    2008    0.306    -0.0109    0.181    3.845    0    0.736    0    -1.796    -0.54    -1.962    -1.796    0.196    0    1.0331    0.048    no
    2005    0.905    -0.059    0.202    3.813    0    0.561    0    -2.396    0.201    -3.806    -2.396    0.762    0    6.922    0.155    no
    2003    220.21    23.0332    0.214    200.793    0    3.862    0.303    3.708    10.552    1.275    3.708    14.414    0.36    16.6493    19.9    no
    2006    516.299    7.934    0.242    450.804    0    12.956    0    7.02    18.203    2.466    7.02    31.159    0    68.5165    12.5143    no
    2002    179.537    10.109    0.302    164.606    0    3.162    0.002    2.272    5.774    0.784    2.272    8.936    0    21.7267    16.0952    no
    2002    578.359    4.0905    0.331    527.171    0    7.337    3.244    22.908    40.223    11.715    22.908    47.56    0.26     XXXXXXXXXX    19.75    no
    2007    222.218    13.1435    0.338    200.058    0    6.053    0.337    4.588    11.733    0.927    4.588    17.786    0.25    7.7725    10.75    no
    2002    640.01    10.9734    0.368    591.398    0    12.904    2.323    11.889    33.351    5.702    11.889    46.255    0.525    73.981    17.5    no
    2008    599.385    8.1279    0.4    516.933    0    11.983    0.016    8.165    19.835    2.91    8.165    31.818    0    55.5441    8.33    no
    2009    42.967    7.3097    0.51    5.388    0    23.292    3.132    3.755    6.42    3.057    3.755    29.712    0.605    62.7202    14.96    no
    2010    41.41    7.0729    0.52    5.006    0    22.211    3.188    2.29    4.873    1.851    2.29    27.084    0.62    72.5727    16.5    no
    2002     XXXXXXXXXX    7.1159    0.533     XXXXXXXXXX    0    71.851    6.459    27.093    54.878    16.308    27.093    126.729    0.4     XXXXXXXXXX    22.5    no
    2002    36.563    6.346    0.541    3.881    0    21.754    2.371    5.394    8.056    3.819    5.394    29.81    0.46    58.401    12.25    no
    2003    37.537    6.4123    0.565    4.584    0    21.731    3.112    5.113    7.645    3.583    5.113    29.376    0.48    60.3833    12.15    no
    2007    45.835    7.6591    0.582    6.498    0    23.647    2.209    5.493    8.327    4.189    5.493    31.974    0.57    87.5688    17.1    no
    2004    40.58    6.7919    0.583    5.683    0    21.218    2.723    4.895    7.216    4.702    4.895    28.434    0.52    73.2165    15.25    no
    2006    43.13    7.22    0.625    6.041    0    22.806    2.877    4.91    7.514    3.64    4.91    30.32    0.56    74.7434    14.8    no
    2005    42.549    7.044    0.626    6.357    0    21.169    2.8    5.171    7.438    3.849    5.171    28.607    0.54    71.7779    14.6    no
    2007    503.506    7.8763    0.658    434.62    0    13.947    0.004    5.474    16.324    0.844    5.474    30.271    0    69.5307    10.0381    no
    2004    805.475    12.1442    0.757    750.073    0    9.027    2.529    15.036    37.77    6.161    15.036    46.797    0.555     XXXXXXXXXX    28    no
    2008    44.057    7.4406    0.801    5.842    0    23.16    3.569    4.313    6.944    3.535    4.313    30.104    0.69    71.6472    17.43    no
    2004    1.423    -0.0526    0.836    3.312    0    1.697    0    -1.502    1.083    -1.658    -1.502    2.78    0    7.446    0.173    no
    2006    268.008    12.9075    0.951    246.246    0    5.64    0.337    5.183    12.402    1.786    5.183    18.042    0.2    16.6914    13    no
    2003    689.21    11.4567    0.965    637.46    0    9.673    2.434    11.869    35.06    5.805    11.869    44.733    0.5425     XXXXXXXXXX    25.05    no
    2009    577.658    4.7768    1.055    522.139    0    14.393    0.743    6.414    17.043    -25.768    6.414    31.436    0    27.6784    7.63    no
    2004    230.541    12.0641    1.099    210.201    0    3.663    0.337    4.223    11.732    1.139    4.223    15.395    0.4    42.15    25    no
    2005    245
Answered 3 days After Sep 29, 2021

Solution

Pritam Kumar answered on Oct 03 2021
140 Votes
#load data and check for missing values
data <- read.csv("D:/New
ankruptcy
ankruptcydata.csv")
data <- as.data.frame(data)
which(is.na(data))
data$Bankrupt <- ifelse(data$Bankrupt == "yes", 1, 0)
#logistic regression
li
ary(caTools)
split <- sample.split(data, SplitRatio = 0.7)
training <- subset(data, split == "TRUE")
testing <- subset(data, split == "FALSE")
log_model <- glm(Bankrupt~., training, family = 'binomial')
summary(log_model)
es <- predict(log_model, training, type = "response")
#odd-ratio for significant features
exp(coef(log_model))
#multicollinearity issue
li
ary(performance)
check_collinearity(log_model)
#final logit-model
log_model2...
SOLUTION.PDF

Answer To This Question Is Available To Download

Related Questions & Answers

More Questions »

Submit New Assignment

Copy and Paste Your Assignment Here