
# Directions: Answer each question below in its entirety.

Submit your answers through Canvas by the due date. Your submission should be a Word document and named with the following structure:
LASTNAME_hw4.doc. Be sure to show your work and to list the names of any students with whom
you worked on this assignment. Please round to the nearest hundredth in your calculations.
Part I: Two-Sample Hypothesis Tests
1.) An Amazon warehouse manager wants to know if their day crew is more time efficient with
their work than the night crew. They draw a random sample of 23 day crew workers and 24
night crew workers. Using these samples, they find that the day crew manages to box an
average of 132 boxes per hour while the night crew manages 122 boxes per hour. The
standard deviation for the day crew is 20 boxes per hour; the standard deviation for the night
crew is 16 boxes per hour. Using this information and an α = .05, decide on the appropriate
hypothesis test, calculate the test statistic, and interpret the results. Be sure to follow the
“five step” system for hypothesis testing outlined in class and in the textbook. Show all of
your work.
2.) An English professor teaching a university-required expository writing class is curious if the
English majors in their class are more likely to read the material prior to lecture than the
non-English majors. The class is much too large to get information from every student, so
they randomly sample 55 English majors and 56 non-English majors in the class. In these
samples, 37% of the English majors and 22% of the non-English majors read the material
prior to class. Using this information and an α = .01, decide on the appropriate hypothesis
test, calculate the test statistic, and interpret the results. Be sure to follow the “five step”
system for hypothesis testing outlined in class and in the textbook. Show all of your work.
3.) A researcher studying perceptions of neighborhood safety wants to know if there is a
difference between men and women when it comes to fear of walking in their neighborhood
at night. They run a hypothesis test in R using the “fear” measure in the 2014 General Social
Survey (GSS)—a dichotomous variable where 0 indicates they are not afraid to walk in their
neighborhood at night and 1 indicates they are afraid. Using the output below, answer the
following questions:
a. What is the dependent variable? What is the independent variable?
b. What test do you think they ran, and why did they run it?
c. What is the null hypothesis? What is the alternative hypothesis?
d. The z-statistic is XXXXXXXXXX. What is this number telling us? Be sure to interpret it in
relation to the null hypothesis.
e. How would you interpret the outcome of this test? What is the p-value telling you?
SOC 353 2

| | P(Afraid) | z-statistic | p-value (two-tailed) |
|---|---|---|---|
| Male | 0.224 | | |
| Female | 0.421 | | |
| Difference | XXXXXXXXXX | | |

4.) A researcher wants to know if people who see themselves as working class tend to work
longer hours than people who see themselves as middle class. The dependent variable (from
the same GSS dataset as before) is the average number of hours usually worked per week by
the respondent. The researcher runs an independent samples t-test and presents you with the
output below. They interpret the p-value for the one-tailed test to the right as telling them
that there is a 33.85% chance that the class difference in hours worked is non-zero in the
population. They are happy with these chances, so they say that this difference is statistically
significant. Explain to them why they are wrong, and how they are interpreting the p-value
incorrectly. After you do this, show them how to correctly interpret the results from the test.

| | n | Mean | t-statistic | p-value (one-tailed, to right) |
|---|---|---|---|---|
| Working Class | XXXXXXXXXX | | | |
| Middle Class | XXXXXXXXXX | | | |
| Difference | XXXXXXXXXX | | | |

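One way to see what a one-tailed p-value of 0.3385 does and does not say is to simulate the situation the null hypothesis describes. The sketch below is only an illustration: the group size, mean, and standard deviation are made-up values (not from the GSS), and it uses a z-test with a known standard deviation as a simplification of the t-test in the question. It draws many pairs of samples from the same population and checks how often the p-value comes out at or below 0.3385.

```python
import math
import random
from statistics import NormalDist, mean

random.seed(353)

SIGMA = 10.0         # hypothetical known SD of weekly hours (assumption)
N = 50               # hypothetical sample size per group (assumption)
OBSERVED_P = 0.3385  # one-tailed p-value quoted in the question
ALPHA = 0.05         # conventional significance level (assumption)


def one_tailed_p(sample1, sample2):
    """Right-tailed p-value for mean(sample1) - mean(sample2), known SIGMA."""
    se = SIGMA * math.sqrt(1 / len(sample1) + 1 / len(sample2))
    z = (mean(sample1) - mean(sample2)) / se
    return 1 - NormalDist().cdf(z)


# Both groups come from the SAME population, so the null hypothesis is true.
p_values = []
for _ in range(2000):
    group_a = [random.gauss(40, SIGMA) for _ in range(N)]
    group_b = [random.gauss(40, SIGMA) for _ in range(N)]
    p_values.append(one_tailed_p(group_a, group_b))

# Share of null-world samples whose p-value is at or below the observed one.
share_at_most = sum(p <= OBSERVED_P for p in p_values) / len(p_values)
significant = OBSERVED_P < ALPHA

print(f"share of null samples with p <= {OBSERVED_P}: {share_at_most:.3f}")
print(f"significant at alpha = {ALPHA}: {significant}")
```

Under these assumptions, roughly a third of samples drawn when there is no true difference produce a result at least this extreme. The p-value is the probability of such data given the null hypothesis, not the probability that the difference is non-zero, and 0.3385 is well above any conventional alpha.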
Part II: Two-Sample Hypothesis Tests in R
5.) There is a dataset called the “Swiss Fertility and Socioeconomic Indicators XXXXXXXXXX Data”
(Mosteller and Tukey XXXXXXXXXX). The dataset provides a set of fertility and socioeconomic
variables for 47 French-speaking Swiss provinces around 1888. I have provided an adapted
version of the dataset as part of this assignment. Download the “swiss_data.rds” file and
move it to your working directory. Once it is there, continue with the prompt below.
You are an historical sociologist interested in the following question: “Were provinces with
higher proportions of Catholic residents more likely to have higher proportions of residents
in agricultural occupations than provinces with higher proportions of Protestant residents?”
Looking at the data, you see two relevant variables. The first variable, “Agriculture,” lists the
total proportion of working men in each province with an agricultural profession. The
second variable, “Religion,” is a dichotomous variable where “More Catholic” is a province
where at least 50% of residents are Catholic and “More Protestant” is a province where at
least 50% of residents are Protestant.
Carry out an independent samples t-test in R to answer this question. You will want to carry
out the following commands (once the RDS file is in your working directory):
swiss <- readRDS("swiss_data.rds")
t.test(A ~ B, alternative = "C", var.equal = FALSE)
where the A indicates where you’ll need to specify the dataset (or “object,” in R lingo) and
the dependent variable name, B indicates where you’ll need to again specify the dataset and the
independent variable name, and “C” indicates whether the test is two-tailed or one-tailed. If
the test is two-tailed, then “C” will be “two.sided”; if the test is one-tailed, then “C” will be
one of “less” or “greater” (depending on the direction of the hypothesis).
Copy and paste the R output showing the t-test into your write-up. After you do that, please
interpret the output. Why did you run an independent samples t-test instead of an
independent samples z-test?
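For readers who want to see what an unequal-variance (Welch) t-test computes when the outcome is split by a two-level grouping variable, here is a rough Python analogue of the calculation. It is a sketch, not R's implementation: the rows are made-up numbers (not the swiss data), and the helper name `welch_t` is hypothetical. It returns only the t statistic and the Welch degrees of freedom, not a p-value.

```python
import math
from statistics import mean, variance


def welch_t(group1, group2):
    """Return (t, df) for Welch's t-test computed from two raw samples."""
    v1 = variance(group1) / len(group1)  # s1^2 / n1 (sample variance)
    v2 = variance(group2) / len(group2)  # s2^2 / n2
    t = (mean(group1) - mean(group2)) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (
        v1 ** 2 / (len(group1) - 1) + v2 ** 2 / (len(group2) - 1)
    )
    return t, df


# Made-up (label, value) rows for illustration only; NOT the swiss dataset.
rows = [
    ("More Catholic", 60.0),
    ("More Catholic", 70.0),
    ("More Catholic", 80.0),
    ("More Protestant", 40.0),
    ("More Protestant", 50.0),
    ("More Protestant", 60.0),
]

# Split the outcome by the grouping variable, like "outcome ~ group".
by_group = {}
for label, value in rows:
    by_group.setdefault(label, []).append(value)

t_stat, df = welch_t(by_group["More Catholic"], by_group["More Protestant"])
print(f"t = {t_stat:.3f}, df = {df:.1f}")
```

The sample standard deviations are estimated from the data rather than known, which is the usual reason for comparing the statistic to a t distribution instead of the standard normal.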
Reference
Mosteller, F. and J. W. Tukey XXXXXXXXXX Data Analysis and Regression: A Second Course in Statistics.

## Solution

Pooja answered on Nov 02 2021
1)
Step 1:
H0: There is no significant difference in mean time efficiency (boxes per hour) between the day crew and the night crew.
H1: The day crew is more time efficient with their work than the night crew.
Step 2:
alpha = 5%
Step 3:

| | Sample 1 (day crew) | Sample 2 (night crew) |
|---|---|---|
| n | 23 | 24 |
| mean | 132.00 | 122.00 |
| s | 20.0000 | 16.0000 |
| s^2/n | 17.3913 | 10.6667 |

test statistic, t = (Xbar1 - Xbar2) / sqrt(s1^2/n1 + s2^2/n2)
t = (132 - 122) / sqrt(17.3913 + 10.6667)
t = 1.8879
Step 4:
df = (s1^2/n1 + s2^2/n2)^2 / ((s1^2/n1)^2/(n1-1) + (s2^2/n2)^2/(n2-1))
df = (17.3913 + 10.6667)^2 / (17.3913^2/(23-1) + 10.6667^2/(24-1))
df = 42.11, rounded down to 42
t(alpha, df) = t(0.05, 42)
t(alpha, df) = abs(T.INV(0.05, 42))
t(alpha, df) = 1.682
Step 5:
Since t = 1.8879 exceeds the critical value of 1.682, I reject the null hypothesis at the 5% level of significance and conclude that the day crew is more time efficient with their work than the night crew.
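The arithmetic in Steps 3 to 5 can be checked with a short script. This is a minimal sketch using only the summary statistics given in the question; the function name `welch_t_from_summary` is hypothetical, and the critical value 1.682 is taken from Step 4 rather than recomputed.

```python
import math


def welch_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for an unequal-variance two-sample t-test."""
    v1 = sd1 ** 2 / n1  # s1^2 / n1
    v2 = sd2 ** 2 / n2  # s2^2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Day crew: mean 132, SD 20, n 23. Night crew: mean 122, SD 16, n 24.
t_stat, df = welch_t_from_summary(132, 20, 23, 122, 16, 24)

T_CRITICAL = 1.682  # one-tailed critical value at alpha = .05, df = 42 (Step 4)
reject_null = t_stat > T_CRITICAL

print(f"t = {t_stat:.4f}, df = {df:.2f}, reject H0: {reject_null}")
```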
2)
Step 1:
H0: There is no difference between English majors and non-English majors in the proportion who read the material prior to lecture.
H1: English majors are more likely to read the material prior to lecture than non-English majors.
Step 2:
alpha = 1%
Step 3:
p = (p1 * n1 + p2 * n2) / (n1 + n2)
p = (0.37*55+0.22*56)/(55+56)
p = 0.29430
z = (p1 - p2) / sqrt{ p*(1-p) * [ (1/n1) + (1/n2) ]...
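The formula above is cut off in the source. As a sketch, assuming it is the standard pooled two-proportion z-test that its visible terms point to, the calculation can be carried through in Python with the figures from the question. The function name `two_proportion_z` is hypothetical, and the one-tailed critical value is computed from the standard normal distribution rather than taken from the original solution.

```python
import math
from statistics import NormalDist


def two_proportion_z(p1, n1, p2, n2):
    """Return (pooled proportion, z) for a pooled two-proportion z-test."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return pooled, (p1 - p2) / se


# English majors: 37% of 55. Non-English majors: 22% of 56.
pooled, z = two_proportion_z(0.37, 55, 0.22, 56)

alpha = 0.01
z_critical = NormalDist().inv_cdf(1 - alpha)  # right-tailed critical value
p_value = 1 - NormalDist().cdf(z)             # right-tailed p-value
reject_null = z > z_critical

print(f"pooled p = {pooled:.4f}, z = {z:.2f}")
print(f"critical z = {z_critical:.3f}, p-value = {p_value:.4f}")
print(f"reject H0: {reject_null}")
```

Under these assumptions the z statistic falls short of the one-tailed critical value at alpha = .01, so the sketch does not reject the null hypothesis.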