Week 12 Lab: Bivariate Inferential Statistics
Name: ___________________________________
Part 1:
Interpret an ANOVA in SPSS:
Introduction:
You are interested in the importance of email in modern American society. You wonder why written correspondence (even if it’s electronically transferred) is still so central to the lives of Americans given the widespread use of cell phones and the ease of calling, texting, and video chats. You believe that the use of email as a means of formal “work correspondence” might explain its continued prominence. Therefore, you hypothesize that email is likely used more for business than for personal use. To test this hypothesis, you examine two variables from a sample of Americans in the 2016 General Social Survey.
Your dependent variable, number of hours spent using email per week (emailhr in GSS), is a continuous variable. Your independent variable, work status (wrksts in GSS), measured as “full-time,” “part-time,” or “not employed,” is discrete. Based on the level of measurement of your variables, you decide to perform an ANOVA to test whether work status is significantly related to hours spent emailing per week in the population. You want to be 95% confident that any relationship you might find in the GSS sample is also true for the US population in XXXXXXXXXX. Your null hypothesis is that there is no relationship in the population between work status and number of hours spent emailing (i.e., the mean hours of emailing in the population is the same for all three categories of work status). Your alternative hypothesis is that full-time workers use email more often than part-time workers, who in turn use email more often than those who are not working.
1. First you create a bar chart in SPSS to jointly describe your variables to see if it looks like there is a relationship in your sample.
Based on the bar chart that you’ve produced, does it look like there’s a relationship between work status and the number of hours spent emailing per week in the sample? Explain. How does average number of hours spent emailing each week change by work status?
______________________________________________________________________
______________________________________________________________________
2. Next you perform an ANOVA in SPSS to determine whether there are differences in email use by work status in the population.
ANOVA
EMAIL HOURS PER WEEK

                 Sum of Squares   df     Mean Square   F        Sig.
Between Groups   XXXXXXXXXX       2      XXXXXXXXXX    51.297   .000
Within Groups    XXXXXXXXXX       1466   120.390
Total            XXXXXXXXXX       1468

Based on the ANOVA output that you’ve produced above…
What is the value of the F-ratio for this model? ___________
Is there a significant relationship between work status and email usage in the 2016 US population (to 95% confidence)? Explain your answer. _______________________________
____________________________________________________________________________
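The decision rule behind the significance question can be sketched in a few lines of Python. This is only an illustration, not SPSS output: the function name is_significant and the example p-values are assumptions made up for the sketch. Keep in mind that SPSS rounds the “Sig.” column to three decimals, so a displayed .000 means the p-value is smaller than .0005, not that it is exactly zero.

```python
def is_significant(p_value, alpha=0.05):
    """Return True when the null hypothesis is rejected.

    alpha = 0.05 corresponds to the 95% confidence level used in this lab.
    """
    return p_value < alpha


# Hypothetical p-values, chosen only to show both outcomes.
print(is_significant(0.0004))  # True: reject the null hypothesis
print(is_significant(0.20))    # False: fail to reject the null hypothesis
```

A p-value below .05 means a relationship this large would be unlikely to appear in the sample if there were no relationship in the population.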
3. Calculate eta-squared, a measure of the strength of the relationship, using the ANOVA table above.
a. Look in your output table for two numbers: “Between Groups” and “Total” in the “Sum of Squares” column.
b. To find eta-squared, divide the value for between groups by the value for total and write your answer below (rounding to two decimal places is fine).
What is the value for eta-squared? ___________
What is the strength of the relationship (weak? moderate? strong?) and why?
______________________________________________________________________________
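The eta-squared arithmetic from step 3 can be sketched in Python as follows. The sums of squares used here are hypothetical numbers invented for the example, not the values from your ANOVA table, and the function name eta_squared is likewise an assumption of this sketch.

```python
def eta_squared(ss_between, ss_total):
    """Eta-squared = between-groups sum of squares / total sum of squares."""
    if ss_total <= 0:
        raise ValueError("total sum of squares must be positive")
    return ss_between / ss_total


# Hypothetical sums of squares (NOT taken from the lab's output).
ss_between = 150.0
ss_within = 850.0
ss_total = ss_between + ss_within  # total = between + within

print(round(eta_squared(ss_between, ss_total), 2))  # 0.15
```

Eta-squared is the proportion of the total variation in the dependent variable that is accounted for by group membership, so it always falls between 0 and 1.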
Part 2:
Interpret a crosstab in SPSS and check for a significant relationship using chi-square:
Introduction:
You are interested in the potential factors that might influence whether Americans are proud of their nation’s history. You think political views might be an important factor. By definition, conservative means holding traditional values. Therefore, you believe that politically conservative Americans, who are probably more likely to hold traditional values, might have fonder feelings towards the past (and ultimately American history), than those that are less conservative. To test this hypothesis, you examine two variables from a sample of 1,180 Americans in the 2016 General Social Survey.
Your independent variable, political views, is an orderable discrete (ordinal) variable where higher values indicate greater political conservatism. The categories for political views (polviews in GSS) are in the following order from lowest levels of conservatism to highest: “Extremely Liberal,” “Liberal,” “Slightly Liberal,” “Moderate,” “Slightly Conservative,” “Conservative,” “Extremely Conservative.” Your dependent variable, feelings of proudness toward American history (proudhis in GSS), is also an orderable discrete variable with the following categories: “Not proud at all,” “Not very proud,” “Somewhat proud,” “Very proud.” Based on the level of measurement of your variables, you decide to perform a chi-square test to determine whether political views are significantly correlated with the feeling of proudness respondents have toward American history. You want to be 95% confident that any relationship you might find in the GSS sample is also true for the US population in XXXXXXXXXX. Your null hypothesis is that there is no relationship in the population between political views and degree of proudness. Your alternative hypothesis is that more politically conservative respondents will have greater feelings of proudness toward American history.
1. First you create a crosstab to determine whether or not political ideology is related to proudness of American history in the sample.
The output above shows you the crosstabulation with actual frequencies for each combination of categories for the two variables and the percentages of each column for those frequencies.
What percentage of people in the “conservative” category are “not very proud” of American history? _________
What percentage of people in the “liberal” category are “not very proud” of American history? _________
What percentage of people in the “conservative” category are “very proud” of American history? _________
What percentage of people in the “liberal” category are “very proud” of American history? _________
Given your answers to the four questions above, does there appear to be a pattern consistent with our alternative hypothesis in the sample? Explain why you said yes or no.
_____________________________________________________________________________
2. Next you perform a chi-square test to determine whether or not political ideology is related to proudness of American history in the U.S. population.
Chi-Square Tests

                               Value       df   Asymp. Sig. (2-sided)
Pearson Chi-Square             59.120(a)   18   .000
Likelihood Ratio               64.082      18   .000
Linear-by-Linear Association   42.175      1    .000
N of Valid Cases               1180

The box above shows you the statistical significance test.
What is the chi-square value for this model? ___________
Is there a significant relationship between political views and proudness of American history in the 2016 US population (to 95% confidence)? Explain your answer. ____________________
____________________________________________________________________________
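To see where a Pearson chi-square value and its degrees of freedom come from, here is a minimal Python sketch. SPSS does this calculation for you. The function name chi_square and the 2×2 table of counts are hypothetical and are not the GSS crosstab from this lab.

```python
def chi_square(observed):
    """Return (chi-square statistic, degrees of freedom) for a table of counts.

    observed is a list of rows; each row is a list of cell frequencies.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Expected count if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (count - expected) ** 2 / expected

    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, df


# Hypothetical 2x2 crosstab (NOT the lab's data).
stat, df = chi_square([[30, 20], [20, 30]])
print(round(stat, 3), df)  # 4.0 1
```

The same degrees-of-freedom rule, (rows − 1) × (columns − 1), gives (4 − 1) × (7 − 1) = 18 for a table with four categories of proudness and seven categories of political views, which matches the df reported in the Chi-Square Tests table.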
3. Because both variables are orderable discrete, you calculate Gamma in SPSS to find the strength of the relationship.
Symmetric Measures

                             Value   Asymp. Std. Error(a)   Approx. T   Approx. Sig.
Ordinal by Ordinal   Gamma   .255    .036                   6.910       .000
N of Valid Cases             1180

What is the value for gamma? ___________
What is the strength of the relationship (weak? moderate? strong?) and why?
______________________________________________________________________________
Is the relationship POSITIVE or NEGATIVE (circle one)? Explain what that means in terms of the independent and dependent variables (e.g. “The more/less conservative someone is, the more/less proud of American history they are.”):
______________________________________________________________________________
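Gamma compares pairs of respondents who are ranked in the same order on both variables (concordant pairs) with pairs ranked in opposite orders (discordant pairs). The Python sketch below shows that idea under stated assumptions: the function name goodman_kruskal_gamma and the 2×2 table are hypothetical, and both rows and columns are assumed to be ordered from lowest to highest category.

```python
def goodman_kruskal_gamma(table):
    """Gamma = (concordant - discordant) / (concordant + discordant).

    table is a list of rows of counts; rows and columns must both be
    ordered from the lowest category to the highest.
    """
    n_rows, n_cols = len(table), len(table[0])
    concordant = 0
    discordant = 0
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):      # cells in a higher row
                for m in range(n_cols):
                    pairs = table[i][j] * table[k][m]
                    if m > j:                   # higher on both variables
                        concordant += pairs
                    elif m < j:                 # higher on one, lower on the other
                        discordant += pairs
    if concordant + discordant == 0:
        raise ValueError("gamma is undefined when there are no untied pairs")
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical 2x2 crosstab (NOT the lab's data).
print(round(goodman_kruskal_gamma([[30, 20], [20, 30]]), 3))  # 0.385
```

A positive gamma means concordant pairs outnumber discordant pairs, so higher values on one variable tend to go with higher values on the other. A negative gamma means the opposite.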
Part 3:
Interpret a linear regression in SPSS and check for a significant relationship:
Introduction:
You are concerned about a friend who has had recent trauma, and now spends a lot of time watching TV. You wonder if TV watching might be a symptom of (i.e., is related to) poor mental health. You are unsure of the causal mechanism, but are interested in testing whether or not there is a significant relationship between amount of time watching TV and mental health. You would expect, based on your friend’s behavior, that individuals who frequently experience poor mental health might watch more TV on average. To test this hypothesis, you examine two variables from the 2016 General Social Survey.
Your independent variable, number of poor mental health days in the past month (mntlhlth in GSS), is a continuous variable. Your dependent variable, number of hours spent watching TV each day (tvhours in GSS), is also continuous. Based on the level of measurement of your variables, you decide to perform linear regression in SPSS to test whether mental health is significantly related to time spent watching TV. You want to be 95% confident that any relationship you might find in the GSS sample is also true for the US population in XXXXXXXXXX. Your null hypothesis is that there is no relationship in the population between mental health and time spent watching TV. Your alternative hypothesis is that individuals who have experienced a greater number of poor mental health days in the past month will spend more hours per day watching TV.
1. First you create a scatterplot to look for a relationship in your sample.
Looking at the scatterplot above which has a regression line through the data, does it look like there is a relationship between the two variables?
YES/NO (circle one)
If so, what direction is the relationship and how strong is it (give your rough guess by looking at the scatterplot)? ____________________________________________________________________
2. Next, you perform a linear regression to determine whether or not mental health is significantly tied to time spent watching TV in the US population.
Coefficients(a)

                                              Unstandardized Coefficients   Standardized Coefficients
Model                                         B       Std. Error            Beta                        t        Sig.
1   (Constant)                                2.827   .175                                              16.115   .000
    Days of poor mental health past 30 days   .050    .019                  .142                        2.705    .007

a. Dependent Variable: Hours per day watching TV

The box above shows the value for the y-intercept and slope of the best fit line through the data. The slope is your regression coefficient.
What is the regression coefficient for this model? b = ____________
This means that for each additional day of poor mental health someone experiences, they are predicted to watch, on average, an additional b hours of TV per day.
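The best-fit line can be used to predict TV hours from the number of poor mental health days. The Python sketch below takes its default intercept and slope from the (Constant) and slope rows of the Coefficients table. The function name predict_tv_hours and the example inputs (0 and 10 days) are assumptions made for illustration.

```python
def predict_tv_hours(poor_days, intercept=2.827, slope=0.050):
    """Predicted hours of TV per day: y = intercept + slope * x."""
    return intercept + slope * poor_days


# Illustrative inputs: no poor mental health days vs. ten of them.
print(round(predict_tv_hours(0), 3))   # 2.827
print(round(predict_tv_hours(10), 3))  # 3.327
```

The difference between the two predictions is ten times the slope, which is what “an additional b hours per additional day” means. These are predictions of an average, not statements about any one person.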
Is there a significant relationship between mental health and TV watching in the 2016 US population (to 95% confidence)? Explain your answer. ______________________________
____________________________________________________________________________
Is the relationship observed in the sample POSITIVE or NEGATIVE (circle one)?
Model Summary

Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .142(a)   .020       .017                2.912

The table above shows the measures of association.
What is the strength of the relationship (weak? moderate? strong?) and why?
______________________________________________________________________________
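In a regression with a single independent variable, R Square is simply R multiplied by itself, and it can be read as the proportion of variation in the dependent variable that the independent variable accounts for. A short Python sketch of that arithmetic follows. The function name variance_explained is an assumption, and the R value is the one shown in the Model Summary table.

```python
def variance_explained(r):
    """Return R squared, the proportion of variance accounted for."""
    return r ** 2


r = 0.142  # R from the Model Summary table
print(round(variance_explained(r), 3))  # 0.02
```

Multiplying the result by 100 expresses it as a percentage of the variation in daily TV hours.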