
My college has asked me to do a project using a script, and I am seeking your help with it.
The instructions are as follows: You are expected to pick a topic of interest, find an open-source dataset, define a business analytics question, apply appropriate business analytics methods, interpret results, and share the insights in your project report and presentations. There are many open-source datasets. Here are just a few examples, and you are welcome to explore other sources.
1. Kaggle
2. UCI Machine Learning Repository
3. Google Public Data Explorer
4. National Information Center on Health Services Research and Health Care Technology
5. Centers for Disease Control and Prevention (CDC) public datasets
6. U.S. Census Bureau
7. Open Data Boston
8. Data World
9. Airline Industry
10. Statlib datasets
11. van der Maaten's web page, which provides high-dimensional data sets
12. Open Gov. Data
13. Federal Reserve Economic Data
14. World Bank
15. Youtube labeled Video Dataset
16. Past KDD Cups
17. Awesome Public Datasets
18. WRDS and Bloomberg: access to finance data for all students and faculty of SOM
Could you kindly suggest a topic? Please be advised that this is a beginner-level project. Once we decide on a topic, we can talk about the price and everything. Thank you!
Answered 26 days After Mar 26, 2022


Suraj answered on Mar 27 2022
Data Analysis using R
Since the project allows any open-source data set, I have selected data from the medical field in which several variables describing each patient's state of health are measured, and the response variable is the death event, that is, whether the person survived or not. The population of the data set is all patients who previously suffered from cardiovascular disease. A brief explanation of the variables is given as follows:
Description Table
There are 299 rows and 12 different variables in the data set. The data set does not contain a missing value in any row. This was checked using RStudio as follows:
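The original output of this check is not reproduced here, so the following is only a minimal sketch of how such a missing-value check could look in base R. The data frame `heart` is a small hypothetical stand-in (not the real 299-row data), its column names are assumptions, and one `NA` is planted on purpose so that the check has something to find.

```r
# Hypothetical stand-in for the data set; column names are assumed, not taken
# from the original file. With the real data you would load it via read.csv().
heart <- data.frame(
  age                 = c(75, 55, NA, 50, 60, 90),  # one NA planted on purpose
  diabetes            = c(0, 0, 1, 0, 1, 0),
  high_blood_pressure = c(1, 0, 0, 0, 1, 1),
  death_event         = c(1, 0, 1, 0, 0, 1)
)

# Number of rows and columns
data_dim <- dim(heart)

# Count of missing values in each column
missing_per_column <- colSums(is.na(heart))

# TRUE if at least one value is missing anywhere in the data frame
any_missing <- anyNA(heart)

print(data_dim)
print(missing_per_column)
print(any_missing)
```

On a data set with no missing values, every entry of `missing_per_column` would be zero and `any_missing` would be `FALSE`.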
Missing Value table
Many of the variables are categorical, so our research question will be based on these categorical variables.
The data set is available from the following link:
Descriptive statistics:
The descriptive statistics of the different variables are calculated as follows:
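The original statistics table is not available here. As a rough illustration only, descriptive statistics in base R could be computed along these lines; the data frame and its column names are hypothetical stand-ins, not the actual data.

```r
# Hypothetical stand-in data; column names are assumed for illustration.
heart <- data.frame(
  age                 = c(75, 55, 65, 50, 60, 90, 45, 70),
  diabetes            = c(0, 0, 1, 0, 1, 0, 1, 1),
  high_blood_pressure = c(1, 0, 0, 0, 1, 1, 0, 1),
  death_event         = c(1, 0, 1, 0, 0, 1, 0, 1)
)

# Numeric variable: minimum, quartiles, median, mean and maximum
age_summary <- summary(heart$age)
age_sd      <- sd(heart$age)

# Categorical variable: frequency counts and proportions
diabetes_counts <- table(heart$diabetes)
diabetes_props  <- prop.table(diabetes_counts)

print(age_summary)
print(age_sd)
print(diabetes_counts)
print(diabetes_props)
```

For 0/1 coded variables such as diabetes, counts and proportions are usually more informative than a mean and standard deviation, which is why the sketch treats them separately from age.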
Why is this data set important?
This is quite an interesting question, and answering it can build our analytical thinking skills. This is a medical data set, so we are interested in analysing some medical aspects of human life: for example, whether a diabetic person has a higher chance of death, and whether the presence of certain disease features leads to a heart attack and affects whether the person survives. All the variables are important for the analysis, so no variable is deleted from the data set.

Research question:
Since we are interested in analysing the categorical variables, our variables of interest are diabetes, high blood pressure and death event. The research question is therefore: is there any association between diabetes, high blood pressure and the death event? In other words, does the death event depend on diabetes and high blood pressure?
Since the question is based on categorical variables and we are checking for association, the best approach to this type of analysis is by using...
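The answer is cut off before it names its method, so the following is an assumption rather than the author's approach: one common way to check association between two categorical variables is a contingency table combined with Pearson's chi-square test of independence. The sketch below uses made-up counts and assumed column names purely to show the mechanics.

```r
# Made-up counts for illustration only; not taken from the real data set.
# 60 non-diabetic patients (40 survived, 20 died) and
# 40 diabetic patients (28 survived, 12 died).
heart <- data.frame(
  diabetes    = rep(c(0, 1), times = c(60, 40)),
  death_event = c(rep(c(0, 1), times = c(40, 20)),
                  rep(c(0, 1), times = c(28, 12)))
)

# 2 x 2 contingency table of diabetes against death event
tab <- table(diabetes = heart$diabetes, death_event = heart$death_event)

# Pearson chi-square test of independence (no continuity correction)
test <- chisq.test(tab, correct = FALSE)

print(tab)
print(test$statistic)  # chi-square statistic
print(test$parameter)  # degrees of freedom
print(test$p.value)    # p-value
```

A p-value above the chosen significance level (0.05 is conventional) would mean there is no evidence of association in the data tested. The same steps could be repeated with high blood pressure in place of diabetes.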
