Statistics and Probability
Assignment on Hypothesis Testing
January 30, 2023
Instructions
This document contains the questions for your assignment project
on Statistical Testing. The questions refer to the data given in the
individual worksheets in the Excel document ‘Assignment Datasets.xlsx’.
Please read the following points.
1. All submissions must be in the form of PDF documents. Spreadsheets
exported to PDF will be accepted, but calculations must
be annotated or explained.
2. It is up to you how you do the calculations in each question, but
you must explain how you arrived at your answer for any given
calculation. This can be done with a written explanation and
by using the relevant equations, along with showing the results
of intermediate stages of the calculations. In other words, you
need to show that you know how to do a calculation for a statistic
other than using spreadsheet functions.
3. Each one of the questions involves a statistical test. Marks within
each question will generally be awarded for:
• Deciding which statistical test to use,
• Framing your Hypotheses and proper conclusions,
• Identifying the parameters for the test and
• Showing a reasonable level of clarity, detail and explanation
in the calculations needed to carry out the test.
4. The data you have been given is in the worksheets of an Excel
spreadsheet. This spreadsheet is locked against editing. Please
do not try to circumvent this; if you wish to use a spreadsheet to
do your calculations, you should copy and paste your data into
your own spreadsheet and work with that.
Question 1
The lifetimes (in units of 10^6 seconds) of certain satellite components
are shown in the frequency distribution given in ‘Dataset1’.
1. Draw a frequency polygon, histogram and cumulative frequency
polygon for the data.
2. Calculate the frequency mean, the frequency standard deviation,
the median and the first and third quartiles for this grouped data.
3. Compare the median and the mean and state what this indicates
about the distribution. Comment on how the answer to this question
relates to your frequency polygon and histogram.
4. Explain the logic behind the equations for the mean and standard
deviation for grouped data, starting from the original equations
for a simple list of data values. (This does not just mean ‘explain
how the equations are used’.)
5. Carry out an appropriate statistical test to determine whether the
data is normally distributed.
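As a rough illustration of parts 2 and 3, the grouped-data calculations can be sketched in Python using the class limits and frequencies listed in ‘Dataset1’. The sketch assumes that each class midpoint stands in for the individual values in that class, that the standard deviation uses the n - 1 divisor, and that the median and quartiles are found by linear interpolation within a class. The names `classes` and `grouped_quantile` are illustrative only.

```python
import math

# (lower limit, upper limit, frequency) for each class, as listed in Dataset1
classes = [
    (300, 307, 12), (307, 314, 18), (314, 321, 44), (321, 328, 88),
    (328, 335, 86), (335, 342, 41), (342, 349, 15), (349, 356, 9),
]

n = sum(f for _, _, f in classes)

# Grouped mean: each class midpoint is counted f times
mean = sum(f * (lo + hi) / 2 for lo, hi, f in classes) / n

# Grouped sample standard deviation (assumption: n - 1 divisor)
ss = sum(f * ((lo + hi) / 2 - mean) ** 2 for lo, hi, f in classes)
sd = math.sqrt(ss / (n - 1))


def grouped_quantile(p):
    """Value below which a fraction p of the data lies (linear interpolation)."""
    target = p * n
    cumulative = 0
    for lo, hi, f in classes:
        if cumulative + f >= target:
            return lo + (target - cumulative) / f * (hi - lo)
        cumulative += f
    return classes[-1][1]


q1, median, q3 = (grouped_quantile(p) for p in (0.25, 0.5, 0.75))
print(f"n = {n}, mean = {mean:.3f}, sd = {sd:.3f}")
print(f"Q1 = {q1:.3f}, median = {median:.3f}, Q3 = {q3:.3f}")
```

Writing the sums out in this way mirrors the hand calculation: a simple list in which the midpoint of each class is repeated f times gives the same totals.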
Question 2
A manufacturer of metal plates makes two claims concerning the
thickness of the plates they produce. They are stated here:
• Statement A: The mean is 200 mm.
• Statement B: The variance is 1.5 mm^2.
To investigate Statement A, the thickness of a sample of metal plates
produced in a given shift was measured. The values found are listed
in Part (a) of worksheet ‘Dataset2’, with millimetres (mm) as the unit.
1. Calculate the sample mean and sample standard deviation for the
data in Part (a) of ‘Dataset2’. Explain why we are using the phrase
‘sample’ mean or ‘sample’ standard deviation.
2. Set up the framework of an appropriate statistical test on Statement A.
Explain how knowing the sample mean before carrying
out the test will influence the structure of your test.
3. Carry out the statistical test and state your conclusions.
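The calculation behind parts 1 to 3 can be sketched as follows, using the Part (a) values listed in ‘Dataset2’. This is only one possible approach: it assumes a one-sample t statistic is the appropriate choice, with the hypothesised mean of 200 mm from Statement A, and it stops at the test statistic so that the critical value can be read from tables. The variable names are illustrative.

```python
import math

# Part (a) of Dataset2, thickness in mm
part_a = [
    207.20, 202.13, 196.93, 198.16, 197.74, 198.15,
    207.65, 203.68, 197.13, 197.06, 196.60, 197.55,
    208.93, 202.22, 198.63, 197.09, 197.40, 198.36,
    207.51, 201.32, 197.97, 198.31, 198.04, 198.78,
    206.02, 200.07, 196.67, 199.85, 199.05, 200.31,
    205.84, 199.09, 197.67, 198.40, 200.32, 199.29,
    204.36, 198.89, 196.90, 197.34, 199.11, 200.46,
]
mu0 = 200.0  # hypothesised mean from Statement A

n = len(part_a)
mean = sum(part_a) / n

# Sample variance: squared deviations divided by n - 1
var = sum((x - mean) ** 2 for x in part_a) / (n - 1)
sd = math.sqrt(var)

# Standard error of the mean and the t statistic on n - 1 degrees of freedom
se = sd / math.sqrt(n)
t_stat = (mean - mu0) / se

print(f"n = {n}, mean = {mean:.4f}, sd = {sd:.4f}")
print(f"t = {t_stat:.4f} on {n - 1} degrees of freedom")
```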
To investigate the second claim, the thickness of a second sample of
metal plates was measured. The values found are listed in Part (b) of
worksheet ‘Dataset2’, with millimetres (mm) as the unit.
1. Calculate the sample mean and then the sample variance and
standard deviation for the data in Part (b).
2. Set up the framework of an appropriate statistical test on Statement B.
Explain how knowing the sample variance before carrying
out the test would influence the structure of your test.
3. Carry out the statistical test and state your conclusions.
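For the variance claim, a minimal sketch is given below using the Part (b) values listed in ‘Dataset2’. It assumes the chi-square statistic (n - 1)s^2 / sigma0^2 is the intended test, with sigma0^2 = 1.5 mm^2 taken from Statement B. The comparison with tabulated critical values is left to the reader, and the variable names are illustrative.

```python
import math

# Part (b) of Dataset2, thickness in mm
part_b = [
    203.64, 197.56, 198.07, 198.70, 198.13, 202.00,
    203.23, 198.43, 199.61, 197.65, 198.25, 200.55,
    198.56, 199.07, 199.70, 199.13, 203.00, 204.23,
]
sigma0_sq = 1.5  # hypothesised variance from Statement B, in mm^2

n = len(part_b)
mean = sum(part_b) / n

# Sample variance (n - 1 divisor) and standard deviation
var = sum((x - mean) ** 2 for x in part_b) / (n - 1)
sd = math.sqrt(var)

# Chi-square statistic on n - 1 degrees of freedom
chi2 = (n - 1) * var / sigma0_sq

print(f"n = {n}, mean = {mean:.4f}, variance = {var:.4f}, sd = {sd:.4f}")
print(f"chi-square = {chi2:.3f} on {n - 1} degrees of freedom")
```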
Question 3
A manager of an inter-county hurling team is concerned that his team
lose matches because they ‘fade away’ in the last ten minutes. He
has measured GPS data showing how much ground particular players
cover within a given time period; this is the data in list (a) in worksheet
‘Dataset3’. He has acquired the corresponding data from an opposing,
more successful team, which is given in list (b).
1. Calculate the sample mean and sample standard deviation for the
two sets of data.
2. Set up the framework of an appropriate statistical test to determine
whether there is a difference in the distances covered by the
two groups of players.
3. Explain how having the results of the calculations above in advance
of doing your statistical test will influence the structure of
that test.
4. Carry out the statistical test and state your conclusions.
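The quantities needed for a two-sample comparison can be sketched as below, using lists (a) and (b) from ‘Dataset3’. Which statistic is appropriate depends on whether the two variances can be treated as equal, so this sketch reports the variance ratio together with both the pooled and the unpooled (Welch) t statistics. The function names and the choice to show both versions are assumptions made for illustration.

```python
import math

list_a = [
    1571.96, 1521.34, 1469.33, 1481.63, 1477.40, 1481.47, 1500.52,
    1576.48, 1536.76, 1471.25, 1470.57, 1465.99, 1475.55, 1470.84,
    1589.26, 1522.16, 1486.27, 1470.91, 1473.99, 1483.60, 1504.37,
    1575.09, 1513.18, 1479.72, 1483.13, 1480.39, 1487.82,
]
list_b = [
    1548.18, 1488.69, 1454.72, 1486.54, 1478.50, 1491.10, 1477.71,
    1546.39, 1478.92, 1464.75, 1472.01, 1491.17, 1480.93, 1489.03,
    1531.61, 1476.90, 1457.05, 1461.36, 1479.14, 1492.60, 1472.54,
    1524.35, 1463.57, 1468.71, 1474.99, 1469.29, 1508.04, 1484.82,
    1520.35, 1472.29, 1484.15, 1464.52, 1470.48, 1493.46,
]


def mean_var(xs):
    """Return sample size, sample mean and sample variance (n - 1 divisor)."""
    n = len(xs)
    m = sum(xs) / n
    v = sum((x - m) ** 2 for x in xs) / (n - 1)
    return n, m, v


def two_sample_stats(xs, ys):
    n1, m1, v1 = mean_var(xs)
    n2, m2, v2 = mean_var(ys)
    # Ratio of larger to smaller variance, for an F-test of equal variances
    f_ratio = max(v1, v2) / min(v1, v2)
    # Pooled-variance t statistic on n1 + n2 - 2 degrees of freedom
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    t_pooled = (m1 - m2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    # Welch t statistic with Welch-Satterthwaite degrees of freedom
    se2 = v1 / n1 + v2 / n2
    t_welch = (m1 - m2) / math.sqrt(se2)
    df_welch = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return {"f_ratio": f_ratio, "t_pooled": t_pooled,
            "t_welch": t_welch, "df_welch": df_welch}


for name, data in (("a", list_a), ("b", list_b)):
    n, m, v = mean_var(data)
    print(f"list ({name}): n = {n}, mean = {m:.3f}, sd = {math.sqrt(v):.3f}")
print(two_sample_stats(list_a, list_b))
```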
Question 4
A study was carried out to determine whether the resistance of the
control circuits in a machine is lower when the machine motor is
running. To investigate this question, a set of the control circuits was
tested as follows. Their resistance was measured while the machine
motor was not running for a certain period of time and then again
while the motor was running. The values found are listed in worksheet
‘Dataset4’, with kilo-Ohms as the unit of measurement.
1. Set up the structure of an appropriate statistical test to determine
whether the resistance of the control circuits in a machine is
lower when the machine motor is running.
2. Explain how the order of subtraction chosen to calculate the differences
will influence the structure of the test.
3. Give a reason why the data is measured with the motor not running
first and then with the motor running.
4. Explain how knowing the mean of the differences in advance will
influence the structure of your statistical test.
5. Carry out the statistical test and state your conclusions.
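A paired-differences calculation along these lines can be sketched as follows, using the values listed in ‘Dataset4’. Two assumptions are made here: the columns are read in the order the worksheet header gives them (motor running first, motor not running second), and each difference is taken as running minus not running. Reversing the order of subtraction flips the sign of the statistic and therefore the tail of the test.

```python
import math

# (motor running, motor not running) in kilo-Ohms, in worksheet column order
pairs = [
    (15.72, 15.12), (15.80, 15.18), (16.00, 15.27), (15.76, 15.10),
    (15.52, 14.74), (15.50, 14.74), (15.26, 14.62), (15.14, 14.45),
    (15.08, 14.28), (14.90, 14.18), (15.14, 14.36), (14.90, 14.21),
    (14.76, 14.20), (14.56, 13.88), (14.40, 13.63), (14.36, 13.59),
    (14.14, 13.47), (14.28, 13.74), (15.44, 14.78), (15.44, 14.69),
]

# Assumption: difference = running - not running
diffs = [running - not_running for running, not_running in pairs]

n = len(diffs)
mean_d = sum(diffs) / n

# Sample standard deviation of the differences (n - 1 divisor)
sd_d = math.sqrt(sum((d - mean_d) ** 2 for d in diffs) / (n - 1))

# Paired t statistic on n - 1 degrees of freedom
t_stat = mean_d / (sd_d / math.sqrt(n))

print(f"n = {n}, mean difference = {mean_d:.4f}, sd = {sd_d:.4f}")
print(f"t = {t_stat:.3f} on {n - 1} degrees of freedom")
```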
Question 5
A study was carried out to determine the influence of a trace element
found in soil on the yield of potato plants grown in that soil, defined as
the weight of potatoes produced at the end of the season. A large field
was divided up into 14 smaller sections for this experiment. For each
section, the experimenter recorded the amount of the trace element
found (in milligrams per metre squared) and the corresponding weight
of the potatoes produced (in kilograms). This information is presented
in the worksheet ‘Dataset5’ in the Excel document. Define X as the
trace element amount and Y as the yield.
1. Draw a scatterplot of your data set.
2. Calculate the coefficients of a linear equation to predict the yield
Y as a function of X.
3. Calculate the correlation coefficient for the paired data values.
4. Set up the framework for an appropriate statistical test to establish
if there is a correlation between the amount of the trace element
and the yield. Explain how having the scatterplot referred
to above and having the value of r in advance will influence the
structure of your statistical test.
5. Carry out and state the conclusion of your test on the correlation.
6. Comment on how well the regression equation will perform based
on the results above.
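The regression and correlation calculations in parts 2 to 5 can be sketched as below, using the values listed in ‘Dataset5’. The sketch assumes a least-squares line Y = a + bX and the usual t statistic for a correlation coefficient, t = r * sqrt(n - 2) / sqrt(1 - r^2), on n - 2 degrees of freedom. The column labelled ‘Additive’ in the worksheet is taken to be X, and the names `linear_fit`, `additive` and `yield_kg` are illustrative.

```python
import math

# Trace element amount X (mg per square metre) and yield Y (kg), from Dataset5
additive = [77, 74.37, 75.1, 76.07, 73.74, 72.83, 71.81,
            70.94, 68.53, 65.94, 68.11, 70.43, 70.61, 69.07]
yield_kg = [119.68, 119.26, 119.78, 119.29, 118.88, 119.49, 118.92,
            118.72, 119.08, 119.04, 119.29, 119.85, 120.5, 120.06]


def linear_fit(xs, ys):
    """Least-squares slope and intercept, plus the correlation coefficient r."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


n = len(additive)
slope, intercept, r = linear_fit(additive, yield_kg)

# t statistic for testing zero correlation, on n - 2 degrees of freedom
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

print(f"Y = {intercept:.4f} + {slope:.4f} X")
print(f"r = {r:.4f}, r^2 = {r * r:.4f}, t = {t_stat:.4f} on {n - 2} df")
```

The value of r^2 printed at the end is the proportion of the variation in Y accounted for by the line, which is one way to approach part 6.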
Question 6
A multinational corporation is conducting a study to see how its employees
in five different countries respond to three gifts in an incentive
scheme. The numbers of employees who choose each of the three gifts
(G1 to G3) in each of the five countries (A to E) are given in the table
in ‘Dataset6’ in the Excel document.
1. Set up the structure of an appropriate statistical test to determine
whether the data supports a link between choice of gift and
country, including the statistic to be used.
2. Carry out this test, showing clearly in your work how the expected
values are calculated for your test statistic.
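The expected values for a contingency table can be sketched as follows, using the counts listed in ‘Dataset6’. This assumes a chi-square test of independence, where each expected count is (row total × column total) / grand total and the degrees of freedom are (rows - 1)(columns - 1). The variable names are illustrative.

```python
# Observed counts: rows are A to E, columns are gifts G1 to G3 (Dataset6)
observed = [
    [10, 13, 14],  # A
    [18, 11, 6],   # B
    [16, 20, 17],  # C
    [12, 25, 13],  # D
    [5, 22, 14],   # E
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Expected count for each cell under independence
expected = [[r * c / total for c in col_totals] for r in row_totals]

# Chi-square statistic: sum of (O - E)^2 / E over all cells
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(row_totals) - 1) * (len(col_totals) - 1)

for label, exp_row in zip("ABCDE", expected):
    print(label, [round(e, 3) for e in exp_row])
print(f"chi-square = {chi2:.3f} on {df} degrees of freedom")
```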
Dataset 1
Assignment : Hypothesis Testing
Type the last three digits of your student number in the green cell: 678
Groups        Frequencies
300 to 307    12
307 to 314    18
314 to 321    44
321 to 328    88
328 to 335    86
335 to 342    41
342 to 349    15
349 to 356    9
Dataset 2
Part (a)
207.20 202.13 196.93 198.16 197.74 198.15
207.65 203.68 197.13 197.06 196.60 197.55
208.93 202.22 198.63 197.09 197.40 198.36
207.51 201.32 197.97 198.31 198.04 198.78
206.02 200.07 196.67 199.85 199.05 200.31
205.84 199.09 197.67 198.40 200.32 199.29
204.36 198.89 196.90 197.34 199.11 200.46
Part (b)
203.64 197.56 198.07 198.70 198.13 202.00
203.23 198.43 199.61 197.65 198.25 200.55
198.56 199.07 199.70 199.13 203.00 204.23
Dataset 3
List (a)
1571.96 1521.34 1469.33 1481.63 1477.40 1481.47 1500.52
1576.48 1536.76 1471.25 1470.57 1465.99 1475.55 1470.84
1589.26 1522.16 1486.27 1470.91 1473.99 1483.60 1504.37
1575.09 1513.18 1479.72 1483.13 1480.39 1487.82
List (b)
1548.18 1488.69 1454.72 1486.54 1478.50 1491.10 1477.71
1546.39 1478.92 1464.75 1472.01 1491.17 1480.93 1489.03
1531.61 1476.90 1457.05 1461.36 1479.14 1492.60 1472.54
1524.35 1463.57 1468.71 1474.99 1469.29 1508.04 1484.82
1520.35 1472.29 1484.15 1464.52 1470.48 1493.46
Dataset 4
Resistance (kilo-Ohms):
Motor running    Motor not running
15.72            15.12
15.80            15.18
16.00            15.27
15.76            15.10
15.52            14.74
15.50            14.74
15.26            14.62
15.14            14.45
15.08            14.28
14.90            14.18
15.14            14.36
14.90            14.21
14.76            14.20
14.56            13.88
14.40            13.63
14.36            13.59
14.14            13.47
14.28            13.74
15.44            14.78
15.44            14.69
Dataset 5
Additive    Yield
77          119.68
74.37       119.26
75.1        119.78
76.07       119.29
73.74       118.88
72.83       119.49
71.81       118.92
70.94       118.72
68.53       119.08
65.94       119.04
68.11       119.29
70.43       119.85
70.61       120.5
69.07       120.06
Dataset 6
School  G1  G2  G3
A       10  13  14
B       18  11   6
C       16  20  17
D       12  25  13
E        5  22  14
Reference
Student numbers Last3 Seed value
B XXXXXXXXXX 204 1
B XXXXXXXXXX 224 1
B XXXXXXXXXX 476 1
B XXXXXXXXXX 935 0
B XXXXXXXXXX 479 1
B XXXXXXXXXX 662 1
B XXXXXXXXXX 463 1
B XXXXXXXXXX 837 0
B XXXXXXXXXX 309 1
B XXXXXXXXXX 219 1
B XXXXXXXXXX 85 1
B XXXXXXXXXX 353 1
B XXXXXXXXXX 414 1
B XXXXXXXXXX 800 0
B XXXXXXXXXX 347 1
B XXXXXXXXXX 307 1
B XXXXXXXXXX 44 1
B XXXXXXXXXX 110 1
B XXXXXXXXXX 967 0
B XXXXXXXXXX 464 1
B XXXXXXXXXX 570 1
B XXXXXXXXXX 650 1
B XXXXXXXXXX 882 0
B XXXXXXXXXX 304 1
B XXXXXXXXXX 276 1
B XXXXXXXXXX 488 1
B XXXXXXXXXX 585 1
B XXXXXXXXXX 295 1
B XXXXXXXXXX 300 1
B XXXXXXXXXX 346 1
B XXXXXXXXXX 534 1
B XXXXXXXXXX 448 1
B XXXXXXXXXX 233 1
B XXXXXXXXXX 32 1
B XXXXXXXXXX 458 1
B XXXXXXXXXX 56 1
B XXXXXXXXXX 582 1
B XXXXXXXXXX 439 1
B XXXXXXXXXX 196 1
B XXXXXXXXXX 627 1
B XXXXXXXXXX 455 1
B XXXXXXXXXX 814 0
B XXXXXXXXXX 322 1
B XXXXXXXXXX 901 0
B XXXXXXXXXX 724 0
B XXXXXXXXXX 328 1
B XXXXXXXXXX 853 0
B XXXXXXXXXX 7 1
B XXXXXXXXXX 463 1
B XXXXXXXXXX 522 1
B XXXXXXXXXX 878 0
B XXXXXXXXXX 983 0
B XXXXXXXXXX 503 1
B XXXXXXXXXX 367 1
B XXXXXXXXXX 975 0
B XXXXXXXXXX 51 1
B XXXXXXXXXX 146 1
B XXXXXXXXXX 765 0
B XXXXXXXXXX 210 1
B XXXXXXXXXX 959 0
B XXXXXXXXXX 834 0
B XXXXXXXXXX 572 1
B XXXXXXXXXX 67 1
B XXXXXXXXXX 640 1
B XXXXXXXXXX 863 0
B XXXXXXXXXX 876 0
B XXXXXXXXXX 39 1
B XXXXXXXXXX 956 0
B XXXXXXXXXX 73 1
B XXXXXXXXXX 969 0
B XXXXXXXXXX 10 1
B XXXXXXXXXX 688 0
B XXXXXXXXXX 187 1
B XXXXXXXXXX 882 0
B XXXXXXXXXX 112 1
B XXXXXXXXXX 282 1
B XXXXXXXXXX 654 1
B XXXXXXXXXX 373 1
B XXXXXXXXXX 176 1
B XXXXXXXXXX 24 1
B XXXXXXXXXX 972 0
B XXXXXXXXXX 339 1
B XXXXXXXXXX 312 1
B XXXXXXXXXX 10 1
B XXXXXXXXXX 95 1
B XXXXXXXXXX 610 1
B XXXXXXXXXX 198 1
B XXXXXXXXXX 430 1
B XXXXXXXXXX 754 0
B XXXXXXXXXX 841 0
B XXXXXXXXXX 311 1
B XXXXXXXXXX 946 0
B XXXXXXXXXX 852 0
B XXXXXXXXXX 676 1
B XXXXXXXXXX 770 0
B XXXXXXXXXX 18 1
B XXXXXXXXXX 255 1
B XXXXXXXXXX 502 1
B XXXXXXXXXX 838 0
B XXXXXXXXXX 111 1
B XXXXXXXXXX 30 1