MIS771 Descriptive Analytics and Visualisations Page 1 of 8

MIS771 Descriptive Analytics and Visualisation
DEPARTMENT OF INFORMATION SYSTEMS AND BUSINESS ANALYTICS
DEAKIN BUSINESS SCHOOL
FACULTY OF BUSINESS AND LAW, DEAKIN UNIVERSITY

Assignment One
Background
This is an individual assignment. You need to analyse a given data set, and then interpret and draw
conclusions from your analysis. You then need to convey your conclusions in a written report to a Business
professional with very little or no knowledge of Business Analytics.

Percentage of the final grade 30%
The Due Date and Time 8pm Thursday 20th August 2020
Submission instructions
The assignment must be submitted by the due date, electronically in CloudDeakin. When submitting
electronically, you must check that you have submitted the work correctly by following the instructions provided
in CloudDeakin. Please note that we will NOT accept any paper or email copies, or part of the assignment
submitted after the deadline.
Information for students seeking an extension BEFORE the due date
If you wish to seek an extension for this assignment prior to the due date, you need to apply directly to the Unit
Chair by completing the Assignment and Online Test Extension Application Form (PDF, 188.6KB). Please make
sure you attach all supporting documentation and a draft of your assignment.
This needs to occur as soon as you become aware that you will have difficulty in meeting the due date.
Please note: Unit Chairs can only grant extensions up to two weeks beyond the original due date. If you require
more than two weeks, or have already been provided an extension by the Unit Chair and require additional time,
you must apply for Special Consideration via StudentConnect within 3 business days of the due date.
Conditions under which an extension will normally be considered include:
• Medical – to cover medical conditions of a serious nature, e.g. hospitalisation, serious injury or chronic
illness.
Note: temporary minor ailments such as headaches, colds and minor gastric upsets are not serious medical
conditions and are unlikely to be accepted. However, serious cases of these may be considered.
• Compassionate – e.g. death of a close family member, significant family and relationship problems.
• Hardship/Trauma – e.g. sudden loss or gain of employment, severe disruption to domestic
arrangements, victim of crime.
Note: misreading the due date, assignment anxiety or returning home will not be accepted as grounds for
consideration.
https://www.deakin.edu.au/__data/assets/pdf_file/0006/2055552/BL_AssignmentExtensionForm_Feb2020.pdf


Information for students seeking an extension AFTER the due date
If the due date has passed, you require more than two weeks extension, or you have already been provided with
an extension and require additional time, you must apply for Special Consideration via StudentConnect. Please
be aware that applications are governed by University procedures and must be submitted within three business
days of the due date or extension due date.
Please be aware that in most instances the maximum amount of time that can be granted for an assignment
extension is three weeks after the due date, as Unit Chairs are required to have all assignments submitted before
results/feedback can be released back to students.

Penalties for late submission
The following marking penalties will apply if you submit an assessment task after the due date without an
approved extension:
• 5% will be deducted from available marks for each day, or part thereof, up to five days.
• Work that is submitted more than five days after the due date will not be marked; you will receive 0%
for the task.
Note: 'Day' means calendar day.
The Unit Chair may refuse to accept a late submission where it is unreasonable or impracticable to assess the
task after the due date.
Additional information: For advice regarding academic misconduct, special consideration, extensions, and
assessment feedback, please refer to the document “Rights and responsibilities as a student” in the “Unit Guide
and Information” folder under the “Content” section in the MIS771 CloudDeakin site.
The assignment uses the dataset file Insurance.xlsx, which can be downloaded from CloudDeakin. Analysis of
the data requires the use of techniques studied in Module-1.
Assurance of Learning
This assignment assesses the following Graduate Learning Outcomes and related Unit Learning Outcomes:
Graduate Learning Outcome (GLO) Unit Learning Outcome (ULO)
GLO1: Discipline-specific knowledge and capabilities -
appropriate to the level of study related to a discipline or
profession.
GLO2: Communication - using oral, written and interpersonal
communication to inform, motivate and effect change
GLO5: Problem Solving - creating solutions to authentic (real
world and ill-defined) problems.
GLO6: Self-Management - working and learning independently,
and taking responsibility for personal actions
ULO 1: Apply quantitative reasoning skills
to solve complex problems.

ULO 2: Plan, monitor, and evaluate own
learning as a data analyst.

ULO 3: Deduce clear and unambiguous
solutions in a form that is useful
for decision making and research
purposes and for communication
to the wider public.

Feedback before submission
You can seek assistance from the teaching staff to ascertain whether the assignment conforms to submission
guidelines.
Feedback after submission
An overall mark together with feedback, will be released via CloudDeakin, usually within 15 working days. You
are expected to refer and compare your answers to the feedback to understand any areas of improvement.


The Case Study
The United States has one of the highest healthcare costs in the world, spending trillions of dollars on
healthcare, which typically exceeds $10,000 per individual. Studies report that healthcare costs have gone up
from 5% of gross domestic product (GDP) to 18% during the period 1960 to 2018. Moreover, a number of
articles have suggested that predisposing personal characteristics, such as income, age, state, job etc. could
possibly be related to the cost of health services.



“The Americans dying because they can't afford medical
care - A December 2019 poll conducted by Gallup found
25% of Americans say they or a family member have
delayed medical treatment for a serious illness due to the
costs of care.”
Illustration: Mikyung Lee/The Guardian XXXXXXXXXX

“Millions of Americans – as many as 25% of the population – are delaying getting medical help because of
skyrocketing costs”
Michael Sainato Tue 7 Jan XXXXXXXXXXAEDT Last modified on Wed 8 Jan XXXXXXXXXXAEDT

“Young people, who are expected to benefit from lower premiums should the GOP repeal-and-replace efforts
succeed, already pay the least. But even their costs can be considerable, depending on where they live. In
2016, the financial data site ValuePenguin found that the average costs for coverage for a 21-year-old go from
$180 a month in Utah, plus a $2,160 deductible (potentially $4,320 a year, total), to $426 a month in Alaska,
with a $5,112 deductible (potentially $10,224 a year, total).”
Published Fri, Jun XXXXXXXXXX:52 AM EDT Updated Mon, Oct XXXXXXXXXX:55 AM EDT

“Average annual costs per person hit $10,345 in 2016. In 1960, the average cost per person was only $146 —
and, adjusting for inflation, that means costs are nine times higher now than they were then.”
Published Fri, Jun XXXXXXXXXX:52 AM EDT Updated Mon, Oct XXXXXXXXXX:55 AM EDT

“Americans pay a lot for healthcare. Depending on where they live, typical workers shelled out between
$4,500 and $8,300 for healthcare in 2017. But the US government pays even more.”
Tanza Loudenback Mar 8, 2019, 12:25 AM

The UnitedHealth Group: America’s most prominent health insurance provider aims to identify the
characteristics of the population to improve their understanding of the potential influence of these
characteristics on their high medical costs billed by an insurance provider. They have access to a sample of US
Health Insurance data containing 1338 insured personnel with their Age, Sex, Body Mass Index, Number of
Children, Smoking status, Region and Insurance charges.
You are a Data Analyst working for UnitedHealth Group. Your Manager – Edmond Kendrick has asked you to
conduct a preliminary analysis. In particular, you are expected to apply a series of statistical techniques and
produce a report based on your findings.
Edmond’s email is reproduced on the next page.
https://www.theguardian.com/us-news/2020/jan/07/americans-healthcare-medical-costs#img-1


Email from Edmond Kendrick
To:
Your Name

From: Edmond Kendrick
Subject: Analysis of US Health Insurance data
Hi,
As per our conversation, I have spoken with our reporting team and we have following questions relating to
the US health insurance data. Please complete the following analysis for me. Your responses will assist
them in writing the feature section of our next issue.
1. Provide your insights on how the specific attributes of the whole insured population is affecting their
insurance premiums based upon our sample data:
(a) An estimate of the difference in medical costs for a female versus a male
(b) An estimate of the difference in medical costs for a single person versus someone with a family
(c) Males with no dependents have claimed that they have, on average, been charged more than
their female counterparts. Can you check whether this claim is possibly true?
(d) We would also like to know if there is gender bias in smoking behaviours. Specifically, is there a greater
proportion of males who are smokers compared to females? Can you check whether this claim can
also be substantiated? Briefly advise the findings in regard to the proportion of males and females who
are smokers.

2. Can you further analyse to see whether the beneficiary's residential area/region in the US affect how
health insurance provider bill their medical costs?

3. We believe that individual medical costs billed by health insurance differs significantly across age group of
primary beneficiary (young adults: 18 to 35 years; middle age: 36 to 55 years; and older adulthood:
56 years and older) with their smoking behaviour. Is there any evidence to support this assertion?
What associations, if any, exists focusing on smokers in the diverse age groups?

Solution

Biswajit answered on Aug 09 2021
Analysis
1. Provide your insights on how the specific attributes of the whole insured population is affecting their insurance premiums based upon our sample data:
(a) An estimate of the difference in medical costs for a female versus a male
Ans: The estimated difference in mean medical costs for females versus males is -1387. The 95% confidence interval for the difference runs from -2682 to -92.
Our hypotheses were:
Null Hypothesis (H0): There is no difference in mean medical costs between females and males.
Alternate Hypothesis (Ha): The mean medical cost for females is not the same as that for males.
As the p-value of 0.0359 is less than the 0.05 level of significance, we reject the null hypothesis.
At the 5% level of significance, and given that the whole confidence interval lies below zero, we conclude that mean medical costs for females are lower than those for males.
(b) An estimate of the difference in medical costs for a single person versus someone with a family
Ans: The estimated difference in mean medical costs for a single person versus someone with a family is -1584. The 95% confidence interval for the difference runs from -2894 to -274.
Our hypotheses here were:
Null Hypothesis (H0): There is no difference in mean medical costs between a single person and someone with a family.
Alternate Hypothesis (Ha): The mean medical cost for a single person is not the same as that for someone with a family.
As the p-value of 0.0178 is less than the 0.05 level of significance, we reject the null hypothesis.
At the 5% level of significance, and given that the whole confidence interval lies below zero, we conclude that mean medical costs for a single person are lower than those for someone with a family.
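The estimates in (a) and (b) are both differences between two group means, reported with a 95% confidence interval and a two-sided p-value. The following is a minimal Python sketch of that calculation, not the Excel workflow used for the answer. It assumes a large-sample normal approximation with unequal variances, so a t-based tool will give slightly different figures. The function name `diff_in_means` and the charges are hypothetical and are not taken from Insurance.xlsx.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def diff_in_means(a, b, confidence=0.95):
    """Estimate mean(a) - mean(b) with a normal-approximation CI and two-sided p-value."""
    diff = mean(a) - mean(b)
    # Standard error without assuming equal variances in the two groups.
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    z_crit = NormalDist().inv_cdf(0.5 + confidence / 2)
    z = diff / se
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return diff, (diff - z_crit * se, diff + z_crit * se), p_two_sided


# Hypothetical charges for illustration only (assumption: not the real sample).
female = [1200.0, 1500.0, 1100.0, 1800.0, 1400.0]
male = [2100.0, 1900.0, 2500.0, 2300.0, 2200.0]

diff, (low, high), p = diff_in_means(female, male)
print(f"difference = {diff:.0f}, 95% CI = ({low:.0f}, {high:.0f}), p = {p:.4g}")
```

With only five values per group the normal approximation is rough; it is more defensible for a sample the size of the 1338 records described in the case study.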
(c) Males with no dependents have claimed that they have, on average, been charged more than their female counterparts. Can you check whether this claim is possibly true?
Ans: Here our hypotheses were:
Null Hypothesis (H0): Males with no dependents have, on average, been charged less than or the same as females with no dependents.
Alternate Hypothesis (Ha): Males with no dependents have, on average, been charged more than females with no dependents.
As the p-value of 0.1781 is greater than the 0.05 level of significance, we fail to reject the null hypothesis.
At the 5% level of significance, there is insufficient evidence that males with no dependents are charged more than females with no dependents.
The estimated difference in mean medical costs between males and females with no dependents is 927, and the 95% confidence interval for the difference runs from -1045 to 2899, which includes zero.
So the claim by males with no dependents that they have been charged more than their female counterparts is not supported by the sample.
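As a rough consistency check, the one-sided p-value in (c) can be approximately recovered from the reported estimate (927) and confidence interval (-1045 to 2899). The sketch below assumes the interval is symmetric and uses a normal approximation in place of the t distribution, so it should only be expected to land near the reported 0.1781, not to match it exactly.

```python
from statistics import NormalDist

# Figures reported in the answer to 1(c).
estimate = 927.0
ci_low, ci_high = -1045.0, 2899.0

# Back out the standard error from the width of the 95% interval
# (assumption: normal critical value in place of the t critical value).
z_crit = NormalDist().inv_cdf(0.975)
se = (ci_high - ci_low) / (2 * z_crit)

# One-sided test of H1: males are charged more, i.e. difference > 0.
z = estimate / se
p_one_sided = 1 - NormalDist().cdf(z)

print(f"standard error ~ {se:.0f}, z ~ {z:.2f}, one-sided p ~ {p_one_sided:.3f}")
```

The test statistic is below one standard error from zero, which is why the null hypothesis is not rejected.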
(d) We would also like to know if there is gender bias in smoking behaviours. Specifically, is there a greater proportion of males who are smokers compared to females? Can you check whether this claim can also be substantiated? Briefly advise the findings in regard to the proportion of males and females who are smokers.
Ans: Here our hypotheses were:
Null Hypothesis (H0): The proportion of males who smoke is less than or equal to the proportion of females who smoke.
Alternate Hypothesis (Ha): The proportion of males who smoke is greater than the proportion of females who smoke.
As the p-value of 0.0027 is less than the 0.05 level of significance, we reject the null hypothesis.
At the 5% level of significance, we conclude that the proportion of males who smoke is greater than the proportion of females who smoke.
The estimated difference between the proportions of male and female smokers is 6.15%, and the 95% confidence interval runs from 6.01% to 6.29%.
So the claim that a greater proportion of males than females are smokers is substantiated.
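The comparison in (d) is a one-sided test for the difference between two proportions. A minimal Python sketch follows, assuming the usual pooled standard error for the test statistic and an unpooled standard error for the confidence interval. The function name `two_proportion_test` and the counts are hypothetical, since the smoker counts by sex are not given in the text.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_test(x1, n1, x2, n2, confidence=0.95):
    """One-sided z-test of H1: p1 > p2, plus a CI for p1 - p2."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    # Pooled proportion under H0 (equal proportions) for the test statistic.
    pooled = (x1 + x2) / (n1 + n2)
    se_pooled = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = diff / se_pooled
    p_value = 1 - NormalDist().cdf(z)
    # Unpooled standard error for the confidence interval.
    se_unpooled = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_crit = NormalDist().inv_cdf(0.5 + confidence / 2)
    ci = (diff - z_crit * se_unpooled, diff + z_crit * se_unpooled)
    return diff, z, p_value, ci


# Hypothetical counts for illustration only: smokers and group sizes.
diff, z, p_value, ci = two_proportion_test(x1=120, n1=500, x2=90, n2=500)
print(f"difference = {diff:.2%}, z = {z:.2f}, p = {p_value:.4f}")
print(f"95% CI = ({ci[0]:.2%}, {ci[1]:.2%})")
```

The width of the interval depends on the group sizes, so it is worth checking an interval from any tool against the standard error it implies.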
2. Can you further analyse to see whether the beneficiary's residential area/region in the US affect how health insurance provider bill their medical costs?
Ans: We used a one-way ANOVA with the following hypotheses:
Null Hypothesis (H0): There are no differences in mean medical costs across the 4 regions.
Alternate Hypothesis (Ha): Mean medical costs differ between at least one pair of regions.
The ANOVA built-in function in Excel gives a p-value of 0.0308, which is less than 0.05, so mean medical costs differ between at least one pair of regions.
As evident in the Tukey-Kramer test, mean medical costs differ between the southeast and southwest regions.
So the beneficiary's residential region is associated with how the health insurance provider bills medical costs.
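The regional comparison above is a one-way ANOVA. The answer used Excel; the sketch below shows the same kind of test in Python with SciPy's `f_oneway`, as an assumed alternative tool. The charges are hypothetical and chosen so that one region stands out; they are not taken from Insurance.xlsx.

```python
from scipy.stats import f_oneway

# Hypothetical charges per region for illustration only (assumption).
charges_by_region = {
    "northeast": [10.0, 12.0, 11.0, 13.0],
    "northwest": [11.0, 12.0, 10.0, 13.0],
    "southeast": [15.0, 17.0, 16.0, 18.0],
    "southwest": [9.0, 11.0, 10.0, 12.0],
}

# H0: all region means are equal; Ha: at least one pair of means differs.
f_stat, p_value = f_oneway(*charges_by_region.values())

print(f"F = {f_stat:.2f}, p = {p_value:.4g}")
if p_value < 0.05:
    print("Reject H0: at least one pair of region means differs.")
else:
    print("Fail to reject H0: no evidence of a difference between regions.")
```

A significant F-test only says that some pair differs. A post-hoc pairwise procedure, such as the Tukey-Kramer test mentioned above, is still needed to identify which pair.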
3. We believe that individual medical costs billed by health insurance differs significantly across age group of primary beneficiary (young adults: 18 to 35 years; middle age: 36 to 55 years; and older adulthood: 56 years and older) with their smoking behaviour. Is there any evidence to support this assertion?
What associations, if any, exists focusing on smokers in the diverse age groups?
Ans: Here we used a two-way ANOVA with three pairs of hypotheses.
Null Hypothesis (H0): Mean medical costs of the different age groups are equal.
Alternate Hypothesis (Ha): Mean medical costs of at least two age groups are different.
Null Hypothesis (H0): Mean medical costs of the smoking and non-smoking groups are the same.
Alternate Hypothesis (Ha): Mean medical costs of the smoking and non-smoking groups are different.
Null Hypothesis (H0): Age group and smoking status do not interact to affect medical costs.
Alternate Hypothesis (Ha): Age group and smoking status interact to affect medical costs.
In the two-way ANOVA, the associated p-values (2.64E-94, 3.87E-14 and 1.71E-06) are all less than 0.05, so we reject all three null hypotheses.
We conclude that mean medical costs differ between at least two age groups, that mean medical costs differ between smokers and non-smokers, and that there is an interaction between age group and smoking behaviour.
Focusing on smokers only, a one-factor ANOVA across the age groups gives a p-value of 0.025, which is less than 0.05, so at least two age groups among smokers have statistically different mean medical costs.
The Tukey-Kramer test shows that the 18-35 and 36-55 age groups have statistically different costs, with an absolute difference of 5034.
4. We are interested in comparing the smoking behaviour of primary beneficiaries across each of the Body mass index levels (Under, Normal, Overweight, and Obese). Could we say that there are differences in the proportion of smokers across the four BMI levels?
Further, some studies have claimed that the heavy smoking is often associated with higher body mass index (BMI). Could we reach a similar conclusion for all beneficiaries?
Ans: Here we used a chi-square test.
Null Hypothesis (H0): There is no difference in the proportion of smokers across the body mass index levels.
Alternate Hypothesis (Ha): There is a difference in the proportion of smokers across the body mass index levels.
The Marascuilo table shows a difference in the proportion of smokers between the normal and overweight groups.
While there is a statistically significant difference in the proportion of smokers between the normal and overweight groups, we cannot conclusively say that heavy smoking is associated with higher body mass index, because no statistically significant difference was found between the other pairs of groups.
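The chi-square test above compares observed smoker and non-smoker counts in each BMI level with the counts expected if the proportion of smokers were the same in every level. A minimal Python sketch of that calculation follows, using only the standard library. The counts are hypothetical, and the helper `chi2_sf_3dof` is an illustrative name for a closed-form p-value that is valid only for 3 degrees of freedom, which is what a table of 4 BMI levels by 2 smoking statuses gives.

```python
from math import erfc, exp, pi, sqrt

# Hypothetical (smokers, non-smokers) counts per BMI level; not the real data.
counts = {
    "Under": (10, 30),
    "Normal": (50, 150),
    "Overweight": (60, 340),
    "Obese": (150, 550),
}


def chi_square_statistic(table):
    """Return the chi-square statistic and degrees of freedom for a count table."""
    rows = list(table.values())
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(col_totals)
    stat = 0.0
    for row in rows:
        row_total = sum(row)
        for observed, col_total in zip(row, col_totals):
            # Expected count if smoking status were independent of BMI level.
            expected = row_total * col_total / grand_total
            stat += (observed - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return stat, dof


def chi2_sf_3dof(x):
    """Upper-tail probability of a chi-square variable with exactly 3 degrees of freedom."""
    return erfc(sqrt(x / 2)) + sqrt(2 * x / pi) * exp(-x / 2)


stat, dof = chi_square_statistic(counts)
p_value = chi2_sf_3dof(stat)
print(f"chi-square = {stat:.2f}, dof = {dof}, p = {p_value:.4g}")
```

As with ANOVA, a significant result here only indicates that the proportions are not all equal; a pairwise procedure such as the Marascuilo comparison is what identifies the specific levels that differ.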
Introduction :
America’s most prominent health insurance provider, UnitedHealth Group, aims to identify the characteristics of the population to improve their understanding of the potential influence of these characteristics on their high medical costs billed by an insurance provider. They have access to a sample of US Health Insurance data containing 1338 insured personnel with their Age, Sex, Body Mass Index, Number of Children, Smoking status, Region and Insurance charges. We will analyse the correlations, patterns, etc. in the data and derive significant insights to help improve our business.
We...