IS6052 ‐ Descriptive and Predictive Analytics 
 
2022‐2023 Individual CA Project 
 
 
Due Date: Thursday December 8th 
Submit your project report as a single pdf file on Canvas 
 
 
Loan Appraisal for FNB Bank 
You are a credit analyst working for FNB Bank. Your responsibilities include analysing the loan 
applications and making recommendations to management, based upon your findings, to help them 
make data‐driven decisions on lending.  
“Loan.csv” file includes data on 40,000 FNB customers that were granted a loan in the past and the 
respective outcome, i.e. whether they were identified as “write‐offs” or “not write‐offs”. 
Using this data, analyse the new loan applications whose data are provided in “NewApplications.csv” 
and try to predict whether each new applicant will repay the requested loan if it is approved. 
 
Data available: 
1. Gender  M=male; F=female 
2. Age  an integer parameter   
3. marital_status  widowed; married; single; divorced   
4. education  basic; highsch; univ; postgrad   
5. nb_depend_child  number of dependent children (0,1,2,3)   
6. employ_status  employment status (full_time; part_time; unemployed; self_employ; retired) 
7. yrs_current_job  years at the current employment 
8. yrs_employed  total number of years employed so far   
9. net_income  an integer parameter 
10. spouse_work  yes; no   
11. spouse_income  if the spouse works, what is his/her income? 
12. residential_status  home owner (owner); tenant; home owner with a mortgage (owner_morg); 
living with parents (w_parents)   
13. yrs_current_address  years at the current address  
14. loan_amount  an integer parameter   
15. loan_purpose  debt consolidation (debt_consol); wedding; home improvement (home_improv); 
vehicle; holidays; other 
16. loan_length  the duration of the loan   
17. collateral  yes; no 
18. writeoff  yes; no 
 
Annotation (typewritten over a cross-out): Monday December 19th
 
Your report should contain: 
 
1. an investigation of the data and a summary of your descriptive analyses; (18 pts) 
2. a discussion on the pros and cons of the prediction methods that can be used to address FNB's loan 
appraisal problem; (6 pts) 
3. a brief description (and the assumptions made, if any) of how selected prediction methods are 
applied; (7 pts) 
4. R codes developed; (12 pts) 
5. an evaluation of the results obtained by each prediction method tried on the data; (20 pts) 
6. a comparative analysis of the results; (20 pts) 
7. your final recommendation to FNB on which customers should be granted a loan; (10 pts) 
8. a discussion on any additional data that you think would be useful, if collected, to make better 
predictions in the future. (7 pts)
Answered 13 days After Dec 01, 2022

Solution

Subhanbasha answered on Dec 15 2022
FNB Bank – prediction model
Data and summary
The data describe past loan customers of FNB Bank and whether each loan ended as a write-off. I used descriptive and predictive analytics to find patterns in the past behaviour of these customers, and then used those patterns to predict whether new applicants should be granted loans.
Inspecting the data shows a mix of numerical and categorical variables. There are 40,000 customer records and 18 columns: 17 candidate inputs to the model plus the writeoff outcome. The variables fall into the following types.
Character variables:
· Gender
· marital_status
· education
· employ_status
· spouse_work
· residential_status
· loan_purpose
· collateral
· writeoff
Numerical/integer variables:
· age
· nb_depend_child
· yrs_current_job
· yrs_employed
· net_income
· spouse_income
· yrs_current_address
· loan_amount
· loan_length
In the data, age ranges from a minimum of 20 to a maximum of 65. The maximum number of dependent children is 3, and the maximum number of years in the current job is 25. The average net income is 42956 and the maximum is 178500. The average spouse’s income is 10266 and the maximum is 167298. The average loan amount is 30702 and the maximum is 272343. The average loan tenure is 37 months and the maximum is 96 months.
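Summary figures of this kind (minimum, mean, maximum per numeric column) can be reproduced with a few lines of code. The sketch below is in Python rather than the R the assignment asks for; in R, calling summary() on the data frame gives the same kind of figures. The four rows and the helper name summarise are hypothetical and are not taken from Loan.csv.

```python
import csv
import io
import statistics

# Hypothetical rows for illustration only; these are NOT values from Loan.csv.
SAMPLE_CSV = """age,net_income,loan_amount,loan_length,writeoff
25,30000,12000,24,no
40,52000,35000,48,yes
61,41000,20000,36,no
33,47000,28000,60,no
"""

NUMERIC_COLUMNS = ("age", "net_income", "loan_amount", "loan_length")


def summarise(rows, column):
    """Return the minimum, mean and maximum of one numeric column."""
    values = [float(row[column]) for row in rows]
    return {
        "min": min(values),
        "mean": statistics.mean(values),
        "max": max(values),
    }


# DictReader yields one dict per record, keyed by the header row.
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
summary = {column: summarise(rows, column) for column in NUMERIC_COLUMNS}

for column, stats in summary.items():
    print(column, stats)
```

To run this on the real file, the io.StringIO(SAMPLE_CSV) argument would be replaced by an opened "Loan.csv", assuming its header row uses the variable names listed in the assignment.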
Prediction Methods
The historical data describe each customer through personal, employment and loan details. Several machine learning algorithms can be used to build a model on this data and make predictions. Because the outcome (writeoff: yes or no) is categorical, this is a classification problem, so classification algorithms are used.
The methods used for prediction are as follows:
· Decision Trees
· Random Forest
· Naïve Bayes
· Support Vector Machine
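Because all four methods predict the same yes/no outcome, they can be compared on the same held-out records with a confusion matrix. The following is a minimal Python sketch of that comparison step, not the evaluation code from this report; the label vectors and the function names confusion_counts and metrics are hypothetical.

```python
from collections import Counter


def confusion_counts(actual, predicted, positive="yes"):
    """Count true/false positives and negatives, treating `positive` as a write-off."""
    counts = Counter(tp=0, fp=0, fn=0, tn=0)
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            counts["tp"] += 1
        elif a == positive:
            counts["fn"] += 1  # write-off that the model missed
        elif p == positive:
            counts["fp"] += 1  # good customer flagged as a write-off
        else:
            counts["tn"] += 1
    return counts


def metrics(counts):
    """Derive accuracy, recall and precision from the four counts."""
    total = sum(counts.values())
    actual_pos = counts["tp"] + counts["fn"]
    predicted_pos = counts["tp"] + counts["fp"]
    return {
        "accuracy": (counts["tp"] + counts["tn"]) / total if total else 0.0,
        "recall": counts["tp"] / actual_pos if actual_pos else 0.0,
        "precision": counts["tp"] / predicted_pos if predicted_pos else 0.0,
    }


# Hypothetical held-out labels and predictions, for illustration only.
actual = ["yes", "no", "no", "yes", "no", "no", "yes", "no"]
predicted = ["yes", "no", "yes", "no", "no", "no", "yes", "no"]

counts = confusion_counts(actual, predicted)
scores = metrics(counts)
print(dict(counts), scores)
```

For a lender, recall on the write-off class is usually worth reporting next to accuracy, since a missed write-off and a wrongly rejected good customer carry different costs.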
Decision Trees:
Pros:
· It takes relatively little time and effort to build and is a straightforward algorithm.
· It is easy for users to understand.
· It does not require scaling of the raw data, because splits depend only on the ordering of values.
· Missing values have little effect on the model.
Cons:
· The tree can change drastically when a small part of the data changes.
· Decision trees are generally not recommended for continuous variables.
· This will take some time to make...
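To make the points about scaling and continuous variables concrete, the sketch below searches for a single split threshold on one numeric column using Gini impurity, which is one common splitting criterion for classification trees. It is a simplified Python illustration under that assumption, not the tree fitted in this report; the income values, labels and function names are hypothetical.

```python
def gini(labels, positive="yes"):
    """Gini impurity of a binary label list: 2 * p * (1 - p)."""
    if not labels:
        return 0.0
    p = sum(1 for label in labels if label == positive) / len(labels)
    return 2 * p * (1 - p)


def best_threshold(values, labels):
    """Return (threshold, weighted_gini) for the best single split on one column."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, float("inf"))
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        if weighted < best[1]:
            best = (threshold, weighted)
    return best


# Hypothetical incomes and outcomes, for illustration only.
net_income = [20000, 25000, 30000, 45000, 50000, 60000]
writeoff = ["yes", "yes", "yes", "no", "no", "no"]

split = best_threshold(net_income, writeoff)

# Rescaling the column moves the threshold but not the resulting partition,
# which is why trees do not need feature scaling.
split_scaled = best_threshold([v / 1000 for v in net_income], writeoff)
print(split, split_scaled)
```

A continuous column is reduced to a series of such thresholds, so a small change in the data near a threshold can move the split and reshape everything below it in the tree.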