MIS771 Descriptive Analytics and Visualisations Page 1 of 8

MIS771 Descriptive Analytics and Visualisation
DEPARTMENT OF INFORMATION SYSTEMS AND BUSINESS ANALYTICS
DEAKIN BUSINESS SCHOOL
FACULTY OF BUSINESS AND LAW, DEAKIN UNIVERSITY

Assignment One
Background
This is an individual assignment. You need to analyse a given data set, and then interpret and draw
conclusions from your analysis. You then need to convey your conclusions in a written report to a Business
professional with very little or no knowledge of Business Analytics.

Percentage of the final grade 30%
The Due Date and Time 8pm Thursday 20th August 2020
Submission instructions
The assignment must be submitted by the due date, electronically in CloudDeakin. When submitting
electronically, you must check that you have submitted the work correctly by following the instructions provided
in CloudDeakin. Please note that we will NOT accept any paper or email copies, or part of the assignment
submitted after the deadline.
Information for students seeking an extension BEFORE the due date
If you wish to seek an extension for this assignment prior to the due date, you need to apply directly to the Unit
Chair by completing the Assignment and Online Test Extension Application Form (PDF, 188.6KB). Please make
sure you attach all supporting documentation and a draft of your assignment.
This needs to occur as soon as you become aware that you will have difficulty in meeting the due date.
Please note: Unit Chairs can only grant extensions up to two weeks beyond the original due date. If you require
more than two weeks, or have already been provided an extension by the Unit Chair and require additional time,
you must apply for Special Consideration via StudentConnect within 3 business days of the due date.
Conditions under which an extension will normally be considered include:
• Medical – to cover medical conditions of a serious nature, e.g. hospitalisation, serious injury or chronic
illness.
Note: temporary minor ailments such as headaches, colds and minor gastric upsets are not serious medical
conditions and are unlikely to be accepted. However, serious cases of these may be considered.
• Compassionate – e.g. death of a close family member, significant family and relationship problems.
• Hardship/Trauma – e.g. sudden loss or gain of employment, severe disruption to domestic
arrangements, victim of crime.
Note: misreading the due date, assignment anxiety or returning home will not be accepted as grounds for
consideration.
https://www.deakin.edu.au/__data/assets/pdf_file/0006/2055552/BL_AssignmentExtensionForm_Feb2020.pdf


Information for students seeking an extension AFTER the due date
If the due date has passed, you require more than two weeks extension, or you have already been provided with
an extension and require additional time, you must apply for Special Consideration via StudentConnect. Please
be aware that applications are governed by University procedures and must be submitted within three business
days of the due date or extension due date.
Please be aware that in most instances the maximum amount of time that can be granted for an assignment
extension is three weeks after the due date, as Unit Chairs are required to have all assignments submitted before
results/feedback can be released back to students.

Penalties for late submission
The following marking penalties will apply if you submit an assessment task after the due date without an
approved extension:
• 5% will be deducted from available marks for each day, or part thereof, up to five days.
• Work that is submitted more than five days after the due date will not be marked; you will receive 0%
for the task.
Note: 'Day' means calendar day.
The Unit Chair may refuse to accept a late submission where it is unreasonable or impracticable to assess the
task after the due date.
Additional information: For advice regarding academic misconduct, special consideration, extensions, and
assessment feedback, please refer to the document “Rights and responsibilities as a student” in the “Unit Guide
and Information” folder under the “Content” section in the MIS771 CloudDeakin site.
The assignment uses the dataset file Insurance.xlsx, which can be downloaded from CloudDeakin. Analysis of
the data requires the use of techniques studied in Module-1.
Assurance of Learning
This assignment assesses the following Graduate Learning Outcomes and related Unit Learning Outcomes:
Graduate Learning Outcomes (GLO):
GLO1: Discipline-specific knowledge and capabilities - appropriate to the level of study related to a discipline or
profession.
GLO2: Communication - using oral, written and interpersonal communication to inform, motivate and effect change.
GLO5: Problem Solving - creating solutions to authentic (real world and ill-defined) problems.
GLO6: Self-Management - working and learning independently, and taking responsibility for personal actions.
Unit Learning Outcomes (ULO):
ULO 1: Apply quantitative reasoning skills to solve complex problems.
ULO 2: Plan, monitor, and evaluate own learning as a data analyst.
ULO 3: Deduce clear and unambiguous solutions in a form that is useful for decision making and research
purposes and for communication to the wider public.


Feedback before submission
You can seek assistance from the teaching staff to ascertain whether the assignment conforms to submission
guidelines.
Feedback after submission
An overall mark, together with feedback, will be released via CloudDeakin, usually within 15 working days. You
are expected to compare your answers with the feedback to understand any areas for improvement.


The Case Study
The United States has one of the highest healthcare costs in the world, spending trillions of dollars on
healthcare, which typically exceeds $10,000 per individual. Studies report that healthcare costs have gone up
from 5% of gross domestic product (GDP) to 18% during the period 1960 to 2018. Moreover, a number of
articles have suggested that predisposing personal characteristics, such as income, age, state, job etc. could
possibly be related to the cost of health services.



“The Americans dying because they can't afford medical
care - A December 2019 poll conducted by Gallup found
25% of Americans say they or a family member have
delayed medical treatment for a serious illness due to the
costs of care.”
Illustration: Mikyung Lee/The Guardian XXXXXXXXXX

“Millions of Americans – as many as 25% of the population – are delaying getting medical help because of
skyrocketing costs”
Michael Sainato Tue 7 Jan XXXXXXXXXXAEDT Last modified on Wed 8 Jan XXXXXXXXXXAEDT

“Young people, who are expected to benefit from lower premiums should the GOP repeal-and-replace efforts
succeed, already pay the least. But even their costs can be considerable, depending on where they live. In
2016, the financial data site ValuePenguin found that the average costs for coverage for a 21-year-old go from
$180 a month in Utah, plus a $2,160 deductible (potentially $4,320 a year, total), to $426 a month in Alaska,
with a $5,112 deductible (potentially $10,224 a year, total).”
Published Fri, Jun XXXXXXXXXX:52 AM EDT Updated Mon, Oct XXXXXXXXXX:55 AM EDT

“Average annual costs per person hit $10,345 in 2016. In 1960, the average cost per person was only $146 —
and, adjusting for inflation, that means costs are nine times higher now than they were then.”
Published Fri, Jun XXXXXXXXXX:52 AM EDT Updated Mon, Oct XXXXXXXXXX:55 AM EDT

“Americans pay a lot for healthcare. Depending on where they live, typical workers shelled out between
$4,500 and $8,300 for healthcare in 2017. But the US government pays even more.”
Tanza Loudenback Mar 8, 2019, 12:25 AM

The UnitedHealth Group: America’s most prominent health insurance provider aims to identify the
characteristics of the population to improve their understanding of the potential influence of these
characteristics on their high medical costs billed by an insurance provider. They have access to a sample of US
Health Insurance data containing 1338 insured personnel with their Age, Sex, Body Mass Index, Number of
Children, Smoking status, Region and Insurance charges.
You are a Data Analyst working for UnitedHealth Group. Your Manager – Edmond Kendrick has asked you to
conduct a preliminary analysis. In particular, you are expected to apply a series of statistical techniques and
produce a report based on your findings.
Edmond’s email is reproduced on the next page.
https://www.theguardian.com/us-news/2020/jan/07/americans-healthcare-medical-costs#img-1


Email from Edmond Kendrick
To: Your Name

From: Edmond Kendrick
Subject: Analysis of US Health Insurance data
Hi,
As per our conversation, I have spoken with our reporting team and we have the following questions relating to
the US health insurance data. Please complete the following analysis for me. Your responses will assist
them in writing the feature section of our next issue.
1. Provide your insights on how the specific attributes of the whole insured population are affecting their
insurance premiums based upon our sample data:
(a) An estimate of the difference in medical costs for a female versus a male
(b) An estimate of the difference in medical costs for a single person versus someone with a family
(c) Males with no dependents have claimed that they have, on average, been charged more than
their female counterparts. Can you check whether this claim is possibly true?
(d) We would also like to know if there is gender bias in smoking behaviours. Specifically, is there a greater
proportion of males who are smokers compared to females? Can you check whether this claim can
also be substantiated? Briefly advise the findings in regard to the proportion of males and females who
are smokers.

2. Can you further analyse to see whether the beneficiary's residential area/region in the US affects how
health insurance providers bill their medical costs?

3. We believe that individual medical costs billed by health insurance differ significantly across the age groups of
primary beneficiary (young adults: 18 to 35 years; middle age: 36 to 55 years; and older adulthood:
Answered Same Day Aug 19, 2021 MIS771 Deakin University

Solution

Gajula Nagasai answered on Aug 20 2021
Introduction:
The main purpose of this report is to analyse the data with several statistical techniques and to determine, among other things, whether there is a significant difference behind the claim that males with no dependents have, on average, been charged more than their female counterparts.
Using a confidence level, we can give a range within which the true population parameter, or the estimated difference between two population parameters, is expected to lie.
We can further analyse whether the beneficiary's residential area/region in the US affects how health insurance providers bill their medical costs.
We can also use a chi-square test to examine whether the smoking behaviour of primary beneficiaries differs across the Body Mass Index levels (Under, Normal, Overweight and Obese).
Definition of confidence interval: In statistics, a confidence interval is a type of estimate computed from the observed data. It proposes a range of plausible values for an unknown parameter. The interval has an associated confidence level, which describes how often intervals constructed in this way would contain the true parameter.
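The meaning of the confidence level can be illustrated with a small simulation. The sketch below is illustrative only: the true proportion of 0.5, the sample size of 500, the number of trials, the seed and the function names are all assumptions chosen for the example and do not come from Insurance.xlsx. It draws many samples, builds a 95% interval from each, and counts how often the interval contains the true value.

```python
import math
import random


def wald_interval(successes, n, z=1.96):
    """Return the (lower, upper) 95% interval for a single proportion."""
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def coverage(true_p=0.5, n=500, trials=1000, seed=771):
    """Fraction of simulated samples whose interval contains true_p.

    true_p, n, trials and seed are hypothetical values for illustration.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # Simulate n yes/no observations and count the "yes" outcomes.
        successes = sum(rng.random() < true_p for _ in range(n))
        lower, upper = wald_interval(successes, n)
        if lower <= true_p <= upper:
            hits += 1
    return hits / trials


coverage_rate = coverage()
print(f"Share of intervals containing the true proportion: {coverage_rate:.3f}")
```

The printed share should be close to 0.95, which is what "95% confidence" refers to: the long-run success rate of the procedure, not the probability attached to any single interval.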
Calculation:
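1.
a) An estimate of the difference in medical costs for a female versus a male
The formula for a confidence interval (CI) for the difference between two population proportions is
(p1 – p2) ± z* × √(p1(1-p1)/n + p2(1-p2)/n)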
Here p1 – p2 is the difference between the two sample proportions.
p1 = x1/n, where x1 is the number of males.
We have x1 = 676 and the sample size n = 1338.
Then p1 = 676/1338 = 0.5052
p2 = x2/n, where x2 is the number of females.
We have x2 = 662 and the sample size n = 1338.
Then p2 = 662/1338 = 0.4948
The standard error is S.E. = √(p1(1-p1)/n + p2(1-p2)/n) = 0.01933
z* is the critical value. At the 0.05 level of significance, the z critical value is 1.96.
The margin of error is z* × S.E. = 1.96 × 0.01933 = 0.0379
And p1 – p2 = (676 – 662)/1338 = 0.0105
The confidence interval for the estimate of the difference is
Lower bound is 0.0105 – 0.0379 = -0.0274
Upper bound is 0.0105 + 0.0379 = 0.0484
Conclusion: From the above statistical evaluation, we are 95% confident that the difference between the proportions of males and females lies between -0.0274 and 0.0484. Because the interval contains zero, the two proportions are not significantly different. Note that this interval describes the gender split of the insured population; estimating the difference in medical costs itself requires comparing the mean charges of the two groups.
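The interval arithmetic above can be checked with a short script. This is a minimal sketch that applies the formula exactly as written in this solution to the counts 676, 662 and 1338; the function name diff_proportion_ci is hypothetical. In the code, the quantity √(p1(1-p1)/n + p2(1-p2)/n) is called the standard error, and z* times that quantity is called the margin of error.

```python
import math


def diff_proportion_ci(x1, n1, x2, n2, z=1.96):
    """CI for p1 - p2 using the unpooled standard error from the text."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    margin = z * se  # half-width of the interval
    diff = p1 - p2
    return diff, se, margin, (diff - margin, diff + margin)


# Counts quoted in the solution: 676 males and 662 females out of 1338.
diff, se, margin, (lower, upper) = diff_proportion_ci(676, 1338, 662, 1338)
print(f"difference      = {diff:.4f}")
print(f"standard error  = {se:.4f}")
print(f"margin of error = {margin:.4f}")
print(f"95% CI          = ({lower:.4f}, {upper:.4f})")
```

Calling the same function with 764, 1338, 574, 1338 reproduces the family versus single calculation.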
b) An estimate of the difference in medical costs for a single person versus someone with a family
The formula for a confidence interval (CI) for the difference between two population proportions is
(p1 – p2) ± z* × √(p1(1-p1)/n + p2(1-p2)/n)
Here p1 – p2 is the difference between the two sample proportions.
p1 = x1/n, where x1 is the number of insured persons with a family.
We have x1 = 764 and the sample size n = 1338.
Then p1 = 764/1338 = 0.571
p2 = x2/n, where x2 is the number of single persons.
We have x2 = 574 and the sample size n = 1338.
Then p2 = 574/1338 = 0.429
The standard error is S.E. = √(p1(1-p1)/n + p2(1-p2)/n) = 0.01914
z* is the critical value. At the 0.05 level of significance, the z critical value is 1.96.
The margin of error is z* × S.E. = 1.96 × 0.01914 = 0.0375
And p1 – p2 = 0.571 – 0.429 = 0.142
The confidence interval for the estimate of the difference is
Lower bound is 0.142 – 0.0375 = 0.1045
Upper bound is 0.142 + 0.0375 = 0.1795
Conclusion: From the above statistical evaluation, we are 95% confident that the difference between the proportion of insured persons with a family and the proportion of single persons lies between 0.1045 and 0.1795. Note that this interval describes the make-up of the insured population; estimating the difference in medical costs itself requires comparing the mean charges of the two groups.
c) Aim: To check the claim that males with no dependents have, on average, been charged more than their female counterparts
To compare the means of two population groups when the population standard deviations are unknown, a two-sample t test is recommended
Setting of Hypothesis:
Null hypothesis Ho: µ1 - µ2 = 0, that is, males with no dependents have, on average, not been charged more than their female counterparts
Alternative hypothesis Ha: µ1 - µ2 > 0, that is, males with no dependents have, on average, been charged more than their female counterparts
The following is the formula for the pooled two-sample t test statistic
t = (x1 bar – x2 bar) / √(sp² × (1/n1 + 1/n2)), where sp² = (SS1 + SS2)/(n1+n2-2) is the pooled variance and SSi = (ni – 1) × si²
x1 bar and x2 bar are the two sample means, and s1 and s2 are the two sample standard deviations
Here we have x1 bar = 12832.69, x2 bar = 11905.71, s1 = 12560.75 and s2 = 11471.88
The sample sizes are n1 = 285 and n2 = 289
The pooled variance is (SS1 + SS2)/(n1+n2-2) = 144596737.7
The standard error of the difference between the means is 1003.8385
t = 926.98/1003.8385 = 0.9234
With t = 0.9234 at 572 d.f., the one-tailed p value is 0.1781
Decision rule: if the p value is lower than 0.05, reject the null hypothesis; otherwise, do not reject the null hypothesis
Here the p value is greater than 0.05, so we do not reject the null hypothesis
Conclusion: From the above statistical evaluation, we conclude that there is not sufficient evidence to support the claim that males with no dependents have, on average, been charged more than their female counterparts
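The test statistic above can be reproduced from the summary statistics alone. The following is a minimal sketch using only the Python standard library; the function name pooled_t is hypothetical, and it assumes group 1 (n1 = 285) is the males and group 2 (n2 = 289) is the females. Because the standard library has no t distribution, the p value here uses a normal approximation, which is an assumption that is reasonable only because 572 degrees of freedom is large; it will differ slightly from the exact t-based value of 0.1781.

```python
import math
from statistics import NormalDist


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled-variance two-sample t statistic from summary statistics."""
    df = n1 + n2 - 2
    # Pooled variance: (SS1 + SS2) / (n1 + n2 - 2), with SSi = (ni - 1) * si^2.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / se
    return t, df, se, pooled_var


# Summary statistics quoted in the solution (group 1 assumed to be males).
t_stat, df, se, pooled_var = pooled_t(12832.69, 12560.75, 285,
                                      11905.71, 11471.88, 289)

# One-tailed p value via a large-sample normal approximation (an assumption).
p_approx = 1 - NormalDist().cdf(t_stat)

print(f"pooled variance = {pooled_var:.1f}")
print(f"standard error  = {se:.4f}")
print(f"t = {t_stat:.4f} on {df} d.f.")
print(f"approximate one-tailed p value = {p_approx:.4f}")
```

Since the approximate p value is well above 0.05, the sketch reaches the same decision as the working above: the null hypothesis is not rejected.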
d) Aim: To check whether there is a greater proportion of males who are smokers compared to females, and to advise briefly on the findings regarding the proportions of males and females who are smokers
Null Hypothesis Ho: p1 - p2 = 0
Alternative hypothesis Ha: p1 – p2 ≠ 0
The formula for a confidence interval (CI) for the difference between two population proportions is
(p1 – p2) ± z* × √(p1(1-p1)/n + p2(1-p2)/n)
Here p1 – p2 is the difference between the two proportions
p1 = x1/n, where x1 is the number of male smokers
We have x1 = 159 and n = 274, the total number of smokers
Then p1 = 159/274 = 58.03%
p2 = x2/n, where x2 is the number of female smokers
We have x2 = 115 and n = 274, the total number of smokers
Then p2 = 115/274 = 41.97%
Pooled...
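The working above divides by the 274 smokers, which gives the share of smokers who are male. A different reading of the question, sketched here as an assumption and not as the author's method, compares the share of males who smoke with the share of females who smoke. It uses the group sizes from part (a) (676 males, 662 females), takes the number of female smokers as 274 – 159 = 115, and applies a one-sided two-proportion z test with a pooled standard error. The function name two_proportion_z is hypothetical.

```python
import math
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: p1 = p2 using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # overall smoking rate under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return p1, p2, z


# 159 of 676 males smoke; 115 of 662 females smoke (115 = 274 - 159).
p_male, p_female, z = two_proportion_z(159, 676, 115, 662)

# One-sided p value for Ha: p_male - p_female > 0.
p_value = 1 - NormalDist().cdf(z)

print(f"share of males who smoke   = {p_male:.4f}")
print(f"share of females who smoke = {p_female:.4f}")
print(f"z = {z:.3f}, one-sided p value = {p_value:.4f}")
```

Under this reading the denominators are the numbers of males and females, so each proportion answers "what fraction of this gender smokes", which is the comparison the email asks about.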