

The U.S. Small Business Administration (SBA) was founded in 1953 on the principle of promoting and assisting small enterprises in the U.S. credit market. Small businesses have been a primary source of job creation in the United States; fostering small business formation and growth therefore has social benefits, creating job opportunities and reducing unemployment. One way the SBA assists these enterprises is through a loan guarantee program designed to encourage banks to grant loans to small businesses. The SBA acts much like an insurance provider: it reduces a bank's risk by guaranteeing a portion of the loan. If a loan goes into default, the SBA covers the amount it guaranteed.

There have been many success stories of start-ups receiving SBA loan guarantees, such as FedEx and Apple Computer. However, there have also been small businesses and start-ups that defaulted on their SBA-guaranteed loans. The rate of default on these loans has been a source of controversy for decades. Conservative economists believe that credit markets perform efficiently without government participation. Supporters of SBA-guaranteed loans argue that the social benefits of job creation by the small businesses receiving government-guaranteed loans far outweigh the costs incurred from defaulted loans.

Since the SBA guarantees only a portion of the loan balance, a bank will incur some losses if a small business defaults on its SBA-guaranteed loan. Banks therefore still face a difficult choice about whether to grant such a loan, given the high risk of default. One way to inform that decision is to analyze relevant historical data, such as the datasets provided here. The case study focuses on loans in the Real Estate and Rental and Leasing industry in California.
The relevant data were extracted from the National SBA file to create this file, which has 2,102 observations and 35 variables.

(a) Apply k-NN, Naïve Bayes, and Classification Trees (use GridSearchCV on the training data coupled with cross-validation) to classify a loan application as “lower risk” (approve) or “higher risk” (deny), using appropriate predictors. Partition the data into training (60%) and validation (40%) sets. Normalize the data where appropriate. Find the best k for k-NN. Report the classification accuracy rate for both the training and validation data. Produce the lift and gains charts for all classifiers.
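
The workflow in part (a) could be sketched roughly as follows. This is only an illustration under stated assumptions: it runs on synthetic data from `make_classification` rather than the SBA file, the label coding (1 = higher risk), the parameter grids, and the helper `cumulative_gains` are all hypothetical choices, and the real predictors would have to be selected from the 35 variables first.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the loan data (assumption: 1 = higher risk, 0 = lower risk).
X, y = make_classification(n_samples=600, n_features=8, n_informative=5, random_state=1)

# 60% training, 40% validation, as the exercise specifies.
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, train_size=0.6, stratify=y, random_state=1
)

# k-NN is distance based, so the scaler sits inside the pipeline and is
# re-fit on each cross-validation fold, using training-fold statistics only.
knn_search = GridSearchCV(
    Pipeline([("scale", StandardScaler()), ("knn", KNeighborsClassifier())]),
    param_grid={"knn__n_neighbors": list(range(1, 22, 2))},  # illustrative grid
    cv=5,
    scoring="accuracy",
)
tree_search = GridSearchCV(
    DecisionTreeClassifier(random_state=1),
    param_grid={"max_depth": [2, 4, 6, 8], "min_samples_leaf": [1, 5, 20]},  # illustrative grid
    cv=5,
    scoring="accuracy",
)
models = {"k-NN": knn_search, "Naive Bayes": GaussianNB(), "Tree": tree_search}


def cumulative_gains(y_true, scores):
    """Cumulative count of positives when cases are ranked by descending score."""
    order = np.argsort(-np.asarray(scores), kind="stable")
    return np.cumsum(np.asarray(y_true)[order])


results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    results[name] = {
        "train_acc": model.score(X_train, y_train),
        "valid_acc": model.score(X_valid, y_valid),
        # Gains curve on the validation set, ranked by predicted P(class 1).
        "gains": cumulative_gains(y_valid, model.predict_proba(X_valid)[:, 1]),
    }
    print(f"{name}: train={results[name]['train_acc']:.3f} "
          f"valid={results[name]['valid_acc']:.3f}")

best_k = knn_search.best_params_["knn__n_neighbors"]
print("best k from cross-validation:", best_k)
```

Plotting each `gains` array against the number of cases (with the diagonal from 0 to the total number of positives as the baseline) gives the gains chart; a decile lift chart can be derived from the same ranking.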

(b) Repartition the data into training, validation, and test sets (50%:30%:20%). Apply the k-NN classifier with the k chosen using the validation set. Compare the confusion matrix of the test set with that of the training and validation sets.
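
One possible way to carry out the three-way partition in part (b) is two successive calls to `train_test_split`. The sketch below again uses synthetic data as a placeholder for the SBA file, and the candidate range for k is an assumption; it picks k by validation accuracy and then builds a confusion matrix for each partition.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic placeholder data (assumption: binary target, numeric predictors).
X, y = make_classification(n_samples=1000, n_features=8, n_informative=5, random_state=1)

# Carve off 50% for training, then split the remainder 60/40,
# which gives 30% validation and 20% test overall.
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, train_size=0.5, stratify=y, random_state=1
)
X_valid, X_test, y_valid, y_test = train_test_split(
    X_rest, y_rest, train_size=0.6, stratify=y_rest, random_state=1
)

# Fit the scaler on the training set only, then apply it to all three sets.
scaler = StandardScaler().fit(X_train)
parts = {
    "train": (scaler.transform(X_train), y_train),
    "valid": (scaler.transform(X_valid), y_valid),
    "test": (scaler.transform(X_test), y_test),
}

# Choose k by accuracy on the validation set (odd k values, illustrative range).
valid_acc = {}
for k in range(1, 22, 2):
    knn = KNeighborsClassifier(n_neighbors=k).fit(*parts["train"])
    valid_acc[k] = knn.score(*parts["valid"])
best_k = max(valid_acc, key=valid_acc.get)

# Refit with the chosen k and compare confusion matrices across partitions.
final = KNeighborsClassifier(n_neighbors=best_k).fit(*parts["train"])
matrices = {
    name: confusion_matrix(y_part, final.predict(X_part))
    for name, (X_part, y_part) in parts.items()
}
for name, cm in matrices.items():
    print(name, "confusion matrix (rows = actual, columns = predicted):")
    print(cm)
```

Because k was tuned on the validation set, the validation matrix tends to look slightly optimistic; the test matrix is the one that has not influenced any modelling choice.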

Answered 1 day after May 19, 2021

Solution

Suraj answered on May 20 2021
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"from sklearn.preprocessing import StandardScaler\n",
"from sklearn.model_selection import train_test_split\n",
"from sklearn.neighbors import KNeighborsClassifier\n",
"from sklearn.naive_bayes import GaussianNB\n",
"from sklearn.tree import DecisionTreeClassifier\n",
"from sklearn.metrics import accuracy_score\n",
"from sklearn.metrics import confusion_matrix\n",
"from sklearn.model_selection import cross_val_score\n",
"from sklearn.model_selection import GridSearchCV"
]
},
{
"cell_type": "code",
"execution_count": 88,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"