STA2300 Data Analysis S1, 18

Assignment 3
Due Date: 29 May, 2018
Weighting: 25%
Full Marks: 100


• Answering the questions in this assignment should not be your first attempt at these types of
questions. It is essential that you work through practice exercises from the tutorial sheets and
Text Book first.
• This assignment is important in checking your knowledge, providing feedback and helping to
establish competency in essential skills.
• Answer all the questions. The questions are not of equal weight; some questions are worth much
more than others.
• The questions relate to material up to and including Module 10.

• Before starting this assignment read Notes Concerning Assignments under the Introductory
Material link on the StudyDesk.
• When you are asked to comment on a finding, usually a short paragraph is required.

• Do not copy/paste SPSS output into your assignment unless specifically asked to do so. In many
cases the SPSS output contains much more information than is required for a correct and
complete answer. In those cases just reproducing the output may not attract any marks. Make
sure you report only the information from the SPSS output relevant to your answer.

• Unless instructed otherwise, show all working and formulae used in calculating confidence
intervals and performing hypothesis tests. (Answers may of course be checked where possible
using SPSS).

• In order to obtain full marks for any question you must show all working.

• Submission is via the link on the StudyDesk.
• This assessment item consists of 5 questions.

Requirements for a passing grade:
As you may have seen in the Course Specification, to receive a passing grade you must achieve at least 40% (i.e., 20/50) of the
marks available in the final examination, and at least 50% of the total weighted marks available for the course.

If you get over 50% weighted marks in all assessments, but did not get at least 40% marks in the final exam, you will not pass
the course.

Also, if you get at least 40% marks in the final exam but do not get at least 50% of weighted marks in the course, you will not
pass the course.

Note on Assignment 3 Solutions, Marks and Late Submissions:
Because of the timing of Assignment 3, marks for this assignment will not be available until after the exam. However, feedback
for this assignment will be available in the form of comprehensive worked solutions before the exam via the StudyDesk. As a
result, any Assignment 3 submitted after 5pm AEST on the Friday before the exam period will not receive any marks.


Question 1 (25 marks)

Use the information in the dataset DHS18.sav to answer the following questions. You should use SPSS
to calculate the sample statistics you will need to do this question, but for parts (a) and (d) you are
required to do the rest of the calculations by hand, using a calculator.

(a) (7 marks) Estimate the population mean weight of women with no education in 2011, using a
99% confidence interval (show all working). Make sure you ONLY select women who have no
education.

(b) (6 marks) Check the appropriate conditions and assumptions needed for the validity of the
confidence interval or hypothesis test for the population mean weight of women with no
education (include an appropriate graph to support your answer).

(c) (3 marks) From historical data, a researcher knows that the average weight of women in
developing countries who have no education is 52.5 kg. State appropriate hypotheses (define
any symbols used) to perform a hypothesis test to see if there is evidence to support her
suspicion, based on the data in this study, that the average weight of women in developing
countries in 2011 who have no education is greater than the historical value (regardless of
whether the conditions in part (b) are satisfied).

(d) (2 marks) Calculate the value of a suitable test statistic for the test in part (c).

(e) (4 marks) Find the P-value of the test, based on the test statistic calculated in part (d), and write
a meaningful conclusion at the 1% level of significance.

(f) (3 marks) Now, check your answers for parts (d) and (e) by finding the value of the test statistic
and the P-value using SPSS. Include SPSS output in your answer and comment on the comparison
with the hand calculated values. Explain any differences.



Question 2 (27 marks)

Use the information in the dataset DHS18.sav to answer the following questions. You should use SPSS
to calculate any sample statistics you will need to do this question, but for parts (d)-(g) you are required
to do the rest of the calculations by hand, using a calculator and statistical tables.

According to the Bureau of Statistics in the developing country being surveyed, 5% of women were
‘higher educated’ before 2011. The researcher believes that the proportion of all women in the
developing country with such qualifications was no longer 5% in 2011.

(a) (1 mark) What is the variable of interest to the researcher?

(b) (3 marks) State the appropriate hypotheses (define any symbols used) to test the researcher’s
claim that the proportion of women who are ‘higher educated’ in 2011 was no longer 5%.

(c) (4 marks) Check the conditions and assumptions for the test in part (b).

(d) (4 marks) Calculate the test statistic for the test in part (b).

(e) (8 marks) Find the P-value for the test in part (d) and write a meaningful conclusion in the context
of this situation.

(f) (4 marks) If the researcher wants to be 99% confident that the margin of error of the estimate
of the true proportion of women who are ‘higher educated’ is within 0.06, what minimum
sample size is required? Use a conservative method in determining the sample size.

(g) (3 marks) The researcher decides that instead of using a conservative method (as required in
part (f)), she will use information obtained from the DHS18.sav data to decide how many women
she would need to survey (keeping the same level of confidence and margin of error). What is
the impact of this decision? (Include evidence to support your answer).


Question 3 (16 marks)

Use the information in the dataset DHS18.sav again. The systolic blood pressure (BP) of the women was
measured in 2011 and in a follow-up in 2014. The researcher wants to know, if, on average, the systolic
BP of the poorest women in 2014 is significantly greater than the systolic BP of the same cohort in 2011.
Make sure you select ONLY poorest women.

(a) (3 marks) State appropriate hypotheses (define any symbols used).

(b) (2 marks) State (but do not check) the assumptions for carrying out this test. Describe the
assumptions in the context of this question.

(c) (2 marks) Without using SPSS, calculate the value of a suitable test statistic for this test. You can
use SPSS for calculating appropriate sample statistics.

(d) (3 marks) Without using SPSS, calculate the P-value of this test.

(e) (3 marks) Interpret the P-value and describe the outcome of the test in the context of this
question.

(f) (3 marks) Now use SPSS to carry out the analysis. Copy and paste the relevant SPSS output to
your assignment solution. Do these results agree with those found in part (e)? (Hint: comment
on the P-value).


Question 4 (20 marks)

Use the information in the dataset DHS18.sav to answer the following questions. You should use SPSS to
calculate any sample statistics you will need to do this question, but for part (e) you are required to do
the rest of the calculations by hand, using a calculator.

The researcher is concerned that the weight of women in 2014 depends on their wealth. She believes
that the average weight of ‘poorer’ women is greater than that of the ‘richer’ women in this developing
country.

(a) (4 marks) Use an appropriate graph to compare the distribution of weight of ‘poorer’ women
with that of ‘richer’ women. Label the axes correctly, include a unit of measure and provide an
appropriate title. Make sure you select ONLY ‘richer’ and ‘poorer’ women.

(b) (2 marks) Using the graph produced in part (a), briefly describe the distribution of weight for the
two groups of women (poorer and richer).

(c) (3 marks) State appropriate hypotheses (defining all symbols) to answer the question: ‘Is the
average weight of women greater for all ‘poorer’ women compared to all ‘richer’ women in this
developing country in 2011?’

(d) (2 marks) Check the assumptions for carrying out the test in part (c).

(e) (2 marks) Without using SPSS, calculate a suitable test statistic for the test in part (c).

(f) (4 marks) Without using SPSS, find the P-value of the test. Interpret the P-value and describe
the outcome of the original question.

(g) (1 mark) Now use SPSS to check your results for this hypothesis test. Copy and paste the relevant
output from SPSS for this test into your assignment.

(h) (2 marks) Briefly comment on how the test statistic and P-value from SPSS output are similar to
or differ from your hand calculations.


Question 5 (12 marks)

Give a brief answer to each of the following six (6) questions:

(a) (2 marks) State the differences between convenience sampling and cluster sampling.

(b) (2 marks) Explain the difference between a Type 1 and
Answered Same Day May 22, 2020 STA2300

Solution

Pooja answered on May 24 2020
Q1)
Sample statistics:
    Statistics: weight
    N (Valid)                 87
    N (Missing)               0
    Mean                      53.0816
    Std. Error of Mean        .10619
    Median                    53.0000
    Std. Deviation            .99048
    Variance                  .981
    Skewness                  -.090
    Std. Error of Skewness    .258
a)
CI = mean +- t(a/2, n-1)*(sd/sqrt(n)), with t(0.005, 86) = 2.634 (the population sd is unknown, so t is used rather than z)
Lower = 53.0816 - 2.634*(0.99048/sqrt(87)) = 52.8019
Upper = 53.0816 + 2.634*(0.99048/sqrt(87)) = 53.3613
The 99% confidence interval for the population mean weight of women with no education in 2011 is (52.8019, 53.3613).
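The interval (52.8019, 53.3613) can be reproduced from the SPSS summary statistics. The following is a minimal Python sketch, not part of the original answer. It assumes a critical value of about 2.634 for a t distribution with 86 degrees of freedom, read from a t-table rather than computed.

```python
import math

# Summary statistics taken from the SPSS "Statistics" output above
n = 87
mean = 53.0816
sd = 0.99048

# ASSUMPTION: two-sided 99% critical value for t with df = n - 1 = 86,
# read from a t-table (about 2.634); it is not computed here.
t_star = 2.634

se = sd / math.sqrt(n)      # standard error of the mean
margin = t_star * se        # margin of error
lower = mean - margin
upper = mean + margin

print(f"SE = {se:.5f}, margin of error = {margin:.4f}")
print(f"99% CI: ({lower:.4f}, {upper:.4f})")
```

The standard error printed by the sketch should agree with the "Std. Error of Mean" row in the SPSS table.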
b)
    Tests of Normality
                          Kolmogorov-Smirnov(a)          Shapiro-Wilk
              education   Statistic   df    Sig.         Statistic   df    Sig.
    weight    .00         .055        87    .200*        .991        87    .811
              1.00        .061        180   .200*        .995        180   .795
              2.00        .058        118   .200*        .991        118   .654
              3.00        .079        26    .200*        .967        26    .539
    *. This is a lower bound of the true significance.
    a. Lilliefors Significance Correction
H0: the data are normally distributed vs. H1: the data are not normally distributed. For women with no education (education = .00), the Shapiro-Wilk statistic is .991 with a corresponding p-value of .811 > 5%, so I fail to reject H0: there is no evidence against normality.
From the graph, the distribution of weight of women with no education is only slightly negatively skewed (skewness = -.090).
c)
Here u denotes the population mean weight (kg) of women in developing countries in 2011 who have no education.
Null hypothesis, H0: the average weight of women in developing countries in 2011 who have no education is equal to the historical value. u = 52.5
Alternative hypothesis, H1: the average weight of women in developing countries in 2011 who have no education is greater than the historical value. u > 52.5
d)
Mean = 53.0816
sd = sqrt(var) = 0.99048
u = 52.5
n = 87
Test statistic (one-sample t with df = n - 1 = 86, since the population sd is unknown),
t = (mean-u)/(sd/sqrt(n))
= (53.0816-52.5)/(0.99048/sqrt(87))
= 5.477
e)
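P-value
= P(Z > 5.477)
= 1 - P(Z < 5.477)
= 1 - NORMSDIST(5.477)
= 0.0000 (to 4 decimal places)
The standard normal is used here as an approximation to the t distribution with 86 df; either gives a P-value below 0.0001.
With t = 5.477 and P-value < 1%, the null hypothesis is rejected at the 1% level of significance, and we conclude that the average weight of women in developing countries in 2011 who have no education is greater than the historical value (u > 52.5).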
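The hand calculation in parts d) and e) can be checked with a short Python sketch. This is an illustration rather than part of the original answer. It uses the unrounded SPSS summary statistics and, as an assumption that mirrors the NORMSDIST step, a standard normal upper tail in place of the exact t distribution with 86 df.

```python
import math

# Summary statistics from the SPSS output; mu0 is the historical value
n, mean, sd, mu0 = 87, 53.0816, 0.99048, 52.5

se = sd / math.sqrt(n)
t_stat = (mean - mu0) / se   # one-sample test statistic

# ASSUMPTION: normal approximation to the t distribution (df = 86).
# P(Z > t_stat) = 0.5 * erfc(t_stat / sqrt(2)), the same quantity as
# 1 - NORMSDIST(t_stat) in a spreadsheet.
p_upper = 0.5 * math.erfc(t_stat / math.sqrt(2))

print(f"t = {t_stat:.3f}")
print(f"one-sided P-value (normal approx.) = {p_upper:.2e}")
print("reject H0 at 1%" if p_upper < 0.01 else "do not reject H0 at 1%")
```

Using the rounded values 53.08 and 0.99 instead would give a slightly different statistic, which is why the unrounded figures are used here.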
f)
    One-Sample Statistics
              N     Mean      Std. Deviation   Std. Error Mean
    weight    87    53.0816   .99048           .10619

    One-Sample Test (Test Value = 52.5)
                                                                95% Confidence Interval of the Difference
              t       df    Sig. (2-tailed)   Mean Difference   Lower    Upper
    weight    5.477   86    .000              .58161            .3705    .7927
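SPSS gives t = 5.477 with df = 86 and Sig. (2-tailed) = .000; the one-tailed P-value is half of this, so it is also below 1%. The null hypothesis is therefore rejected at the 1% level of significance, and we conclude that the average weight of women in developing countries in 2011 who have no education is greater than the historical value (u > 52.5).
The hand calculation and the SPSS output give the same test statistic and lead to the same conclusion.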
Q2)
a)
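The variable of interest is the education level of each woman, specifically whether or not she is ‘higher educated’ (a categorical variable). The parameter of interest is the proportion of all women in the developing country who are ‘higher educated’.
b)
Let p be the proportion of all women in the developing country who were ‘higher educated’ in 2011.
H0: the proportion of women who are ‘higher educated’ in 2011 was 5%. p = 0.05
H1: the proportion of women who are ‘higher educated’ in 2011 was no longer 5%. p =/= 0.05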
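For two-sided hypotheses about a single proportion like these, the usual procedure is a one-proportion z-test. The Python sketch below shows the mechanics only. The function name `one_proportion_z_test` and the counts passed to it are hypothetical and are not taken from DHS18.sav.

```python
import math

def one_proportion_z_test(successes, n, p0):
    """Two-sided one-proportion z-test; returns (p_hat, z, p_value)."""
    p_hat = successes / n
    se0 = math.sqrt(p0 * (1 - p0) / n)           # standard error under H0
    z = (p_hat - p0) / se0
    p_value = math.erfc(abs(z) / math.sqrt(2))   # equals 2 * P(Z > |z|)
    return p_hat, z, p_value

# HYPOTHETICAL counts, for illustration only (not from the dataset)
p_hat, z, p_value = one_proportion_z_test(successes=30, n=400, p0=0.05)

# Common rule-of-thumb check for the normal approximation:
# expected successes and failures under H0 are both at least 10.
conditions_ok = 400 * 0.05 >= 10 and 400 * (1 - 0.05) >= 10

print(f"p_hat = {p_hat:.4f}, z = {z:.3f}, two-sided P-value = {p_value:.4f}")
print("success/failure condition met:", conditions_ok)
```

With the real data, `successes` would be the number of ‘higher educated’ women and `n` the total number of women in the sample.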
c)
    education
    
    Frequency
    Percent
    Valid Percent
    Cumulative...