Great Deal! Get Instant $10 FREE in Account on First Order + 10% Cashback on Every Order Order Now

This assignment requires you to perform machine learning data analysis based on a bankruptcy dataset that’s free on Kaggle. Please use the dataset and perform machine learning analysis on this topic...

1 answer below »
This assignment requires you to perform machine learning data analysis based on a bankruptcy dataset that’s free on Kaggle.
Please use the dataset and perform machine learning analysis on this topic and submit a research report. The report contains the following:
· Abstract
· Which classification algorithms will we use?
· Logistic Regression
· Naive Bayes Classifier
· K-nearest neighbo
· Decision Trees
· Performance Measures?
· Confusion Matrix
· Precision
· Recall/ Sensitivity
· Specificity
· F1-Score
· AUC & ROC Curve
· Motivation
· Many businesses have gone into default due to the recent pandemic which may cause investors to be wary of providing further capital. We seek to provide a model that interprets whether a company will become bankrupt, which can be used by investors to provide confidence and guidance. In order to address this problem, we will use a company bankruptcy dataset found on Kaggle along with the help of classification algorithms like Logistic Regression and Support Vector Machine
· Literature Survey
· Whatever algorithms we choose we can find literature on
· Methodology
· Data research - Taiwan economy during 1999 to 2009. the great recession
· Data Exploration and Visualization - look for co
elation between features
· Data Preprocessing - The data is very imbalanced. The total entries are 6,819 with 6,599 not bankrupt and 220 being bankrupt. This is an obstacle we would have to come up with a solution through random sampling (under sampling/oversampling). Feature engineering - removing/adding features to determine which yield the best results, experiment with attribute combinations. We have 95 features.
· Data Cleaning - check for nulls, check for missing entries, etc. Handling text and categorical attributes, feature scaling
· Data split into training and testing data.
· Train and evaluate for different models
· Seek better evaluation through fine tuning
· Apply performance measures to find best algorithm or best feature combinations
Answered 1 days After Mar 09, 2021

Solution

Neha answered on Mar 11 2021
151 Votes
SOLUTION.PDF

Answer To This Question Is Available To Download

Related Questions & Answers

More Questions »

Submit New Assignment

Copy and Paste Your Assignment Here