Post-hoc Tests and Two-way ANOVA
Follow-up Comparisons and Two-way ANOVA, Plus SPSS
Review One-way ANOVA
Can be used to examine the effect of an IV on a DV when there are 2+ independent groups for the IV
Assumptions
Homogeneity of variance (a quick check is sketched after this list)
DV is continuous and normally distributed
IV is categorical
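A minimal sketch of one way to check the homogeneity-of-variance assumption outside SPSS, using Levene's test in Python's scipy; the group names and scores below are made-up placeholders, not lecture data.

```python
# Quick homogeneity-of-variance check with Levene's test.
# The three score lists are made-up placeholders.
from scipy import stats

group1 = [12, 15, 14, 10, 13]
group2 = [11, 9, 12, 10, 8]
group3 = [14, 16, 15, 13, 17]

# A significant result (p < .05) would suggest the group variances differ,
# i.e., the homogeneity assumption is questionable.
stat, p = stats.levene(group1, group2, group3, center="median")
print(f"Levene W = {stat:.2f}, p = {p:.3f}")
```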
Hypothesis Being Tested with a One-way ANOVA
The null hypothesis is that there are no differences among groups
The alternative hypothesis is that there are differences
Hypotheses are set up to be non-directional in the F-test
Example using 3 groups:
Ho: μ1 = μ2 = μ3
Ha: μ1 ≠ μ2 ≠ μ3 (or said differently - at least two of the means are not equal)
Determining Significance
The F-test p-value is conceptually similar to the t-test's, except it always comes from the right tail of the distribution
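A minimal sketch of that right-tailed F-test in Python, using scipy's f_oneway on three made-up placeholder groups:

```python
# One-way ANOVA on three made-up groups (placeholder scores).
from scipy import stats

group1 = [12, 15, 14, 10, 13]
group2 = [11, 9, 12, 10, 8]
group3 = [14, 16, 15, 13, 17]

# f_oneway returns F and the right-tail p-value: the probability of an F
# at least this large if all population means were equal.
F, p = stats.f_oneway(group1, group2, group3)
print(f"F = {F:.2f}, p = {p:.4f}")
```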
Repeated Measures ANOVA
The repeated measures ANOVA works like the regular ANOVA but we further partition SSw into
SSs: variability that arises from individual differences between subjects
SSE: variability that arises from other sources of error in our measurement
We then subtract SSs from SSw to get SSE, which will go into our F-ratio equation.
SSw = SSs + SSE -> SSE = SSw – SSs
dfE = (n-1)(k-1)
We will not be calculating these by hand
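For anyone curious, a minimal sketch of how such a repeated measures ANOVA could be run in software, using statsmodels' AnovaRM in Python; the subjects, conditions, and scores below are invented placeholders.

```python
# Repeated-measures ANOVA: 4 subjects each measured under 3 conditions.
# All values and column names are placeholders, not lecture data.
import pandas as pd
from statsmodels.stats.anova import AnovaRM

data = pd.DataFrame({
    "subject":   [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4],
    "condition": ["a", "b", "c"] * 4,
    "score":     [5, 7, 9, 4, 6, 10, 6, 8, 9, 5, 7, 11],
})

# AnovaRM pulls between-subject variability (SSs) out of the error term,
# mirroring the SSw = SSs + SSE partition above; the denominator df is
# (n-1)(k-1) = (4-1)(3-1) = 6 here.
res = AnovaRM(data, depvar="score", subject="subject", within=["condition"]).fit()
print(res.anova_table)
```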
Follow up Tests
Limitations of F-test
F-test only tells you that IV had a significant effect (i.e., at least one mean is different from another)
It doesn't tell you which means are different from each other
In order to determine which differences between groups are significant, we need to carry out a follow-up test
A follow-up test is used to make direct comparisons between specific group means.
It's called post hoc because it happens after you run the ANOVA
Planned vs. Post Hoc Comparisons
If we have specific predictions about how the means will turn out, we can conduct planned comparisons (or planned contrasts)
In this case we just conduct independent samples t-tests (a sketch follows below)
No need to correct for multiple comparisons -> higher power
Technically, the planned contrasts need to be orthogonal (linearly independent comparisons)
We don't need to correct for multiple comparisons here because we are clearly not just looking for significant effects after the fact. That's a good way to find a false positive.
Post-Hoc Tests
When comparisons are not planned (i.e., we do not have good theoretical reasons to expect certain outcomes), we need to correct for multiple comparisons (to avoid inflating the Type I error rate)
There are several ways to do this:
Bonferroni correction
Tukey HSD to compare all pairwise comparisons
t-test with Bonferroni Correction
Multiple comparisons will inflate the Type I error rate such that
Type I error rate = 1 − (1 − alpha)^c, where c is the number of comparisons
For example, with alpha = .05 and c = 3, the inflated rate is 1 − .95³ ≈ .14
The Bonferroni correction divides alpha by the number of comparisons
So each p-value must meet a stricter (smaller) threshold to be significant
Bonferroni Example
Suppose you want to test all pairwise comparisons between 3 groups.
There are 3C2 = 3 comparisons.
The Bonferroni threshold is alpha / # comparisons
If alpha is .05, then .05/3 = .0167
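A minimal sketch of the same Bonferroni logic in Python, pairing scipy t-tests with statsmodels' multipletests; the three groups and their scores are made-up placeholders.

```python
# Bonferroni correction over all 3C2 = 3 pairwise t-tests among 3 groups.
from itertools import combinations
from scipy import stats
from statsmodels.stats.multitest import multipletests

groups = {
    "g1": [12, 15, 14, 10, 13],
    "g2": [11, 9, 12, 10, 8],
    "g3": [14, 16, 15, 13, 17],
}

pairs = list(combinations(groups, 2))
pvals = [stats.ttest_ind(groups[a], groups[b]).pvalue for a, b in pairs]

# method="bonferroni" multiplies each p-value by the number of comparisons,
# which is equivalent to testing each raw p against alpha / 3 = .0167.
reject, p_adj, _, _ = multipletests(pvals, alpha=0.05, method="bonferroni")
for (a, b), p, padj, r in zip(pairs, pvals, p_adj, reject):
    print(f"{a} vs {b}: raw p = {p:.4f}, adjusted p = {padj:.4f}, reject = {r}")
```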
Bonferroni Correction Characteristics
The Bonferroni correction is considered very conservative
Low power -> higher level of Type II error
Power is OK when the # of comparisons is relatively low (i.e., <= 3)
When the # of comparisons is high, it is too conservative
Example: If alpha is .05 and there are 4 groups, then there are 4C2 = 6 comparisons and the threshold is .05/6 ≈ .008
Tukey Honestly Significant Difference (HSD)
Tukey HSD corrects the Type I error rate for all possible pairwise comparisons
Works like a t-test, but with a difference in how error is calculated to control for multiple comparisons
Compute Qobt, then find Qcrit in a look-up table
Qobt = (X̄1 − X̄2) / √(MSw / n) when sample sizes are equal
Qobt = (X̄1 − X̄2) / √((MSw / 2)(1/n1 + 1/n2)) with unequal sample sizes
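A minimal sketch of Tukey HSD in Python with statsmodels' pairwise_tukeyhsd, which handles the Q computation and look-up internally; the scores and group labels are made-up placeholders.

```python
# Tukey HSD over all pairwise comparisons of three groups (placeholder scores).
import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

scores = np.array([12, 15, 14, 10, 13,   # group 1
                   11, 9, 12, 10, 8,     # group 2
                   14, 16, 15, 13, 17])  # group 3
labels = np.repeat(["g1", "g2", "g3"], 5)

# The output lists each mean difference, an adjusted p-value, a confidence
# interval, and whether the difference is significant at the familywise alpha.
result = pairwise_tukeyhsd(endog=scores, groups=labels, alpha=0.05)
print(result.summary())
```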
Tukey HSD Characteristics
Tukey HSD gives good results with moderately high power when sample sizes are equal
When sample sizes are unequal, the test is conservative
E.g., the confidence intervals go beyond 95%.
Summary of follow-up Comparisons
When possible, have a priori planned comparisons
higher power
When comparisons are post-hoc, choosing the appropriate post-hoc test depends upon a number of factors
In general it is best to use Tukey HSD, as it gives relatively high power
Two-way ANOVA
With a one-way ANOVA, you calculated an F-ratio to determine if group means differ
With a two-way ANOVA, you will be doing the F-test three times
Once for each main effect, using column means (one "way") and row means (the other "way")
Once for the interaction effect between your factors, using cell means
Example: Bandura's Bobo Doll Experiment
Do you think we humans need to be rewarded or punished to learn effectively or can just watching what happens to other people teach us how to behave? Albert Bandura, along with two colleagues, set up a now famous experiment to see just that (Bandura, Ross, & Ross, 1963).
They asked 40 boys and 40 girls, selected at random, to watch one of two movies. Both movies showed adults hitting, pounding on, pushing, and assaulting a balloon doll called a Bobo doll. Bobo has a weighted bottom so he always bounces back for more punishment.
In half of the movies, the adult assaulting the doll was the same sex as the observer child, and in the other half the adult beating up the doll was the opposite sex of the observer child. In the end, Bandura et al. had four groups:
(1) male children who saw the male adult model
(2) male children who saw the female adult model
(3) female children who saw the male model and
(4) female children who saw the female model.
The results of the study (rows: sex of the child observer; columns: sex of the adult model)

                 Male model                       Female model                     Row totals
Male child       106, 117, 108, 101, 97           41, 40, 34, 38, 42               Σx = 724, N = 10, mean = 72.4
                 Σx = 529, n = 5, mean = 105.8    Σx = 195, n = 5, mean = 39.0
Female child     51, 50, 49, 49, 45               58, 51, 60, 56, 62               Σx = 531, N = 10, mean = 53.1
                 Σx = 244, n = 5, mean = 48.8     Σx = 287, n = 5, mean = 57.4
Column totals    Σx = 773, N = 10, mean = 77.3    Σx = 482, N = 10, mean = 48.2    Overall: Σx = 1,255, N = 20, mean = 62.75

The column means give the main effect of the sex of the adult model; the row means give the main effect of the sex of the child observer.
We will not be doing Two-way ANOVAs by hand
Instead I will focus the rest of the lecture on reading SPSS output
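As an optional cross-check (not required for the course), a minimal sketch of the three F-tests on the Bobo-doll scores from the table above, using statsmodels in Python; the variable names child_sex, model_sex, and score are my own labels.

```python
# 2 x 2 ANOVA on the Bobo-doll scores from the table above.
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

cells = {
    ("male",   "male"):   [106, 117, 108, 101, 97],  # male child, male model
    ("male",   "female"): [41, 40, 34, 38, 42],      # male child, female model
    ("female", "male"):   [51, 50, 49, 49, 45],      # female child, male model
    ("female", "female"): [58, 51, 60, 56, 62],      # female child, female model
}
rows = [{"child_sex": c, "model_sex": m, "score": s}
        for (c, m), scores in cells.items() for s in scores]
df = pd.DataFrame(rows)

# Three F-tests: main effect of child_sex (row means), main effect of
# model_sex (column means), and the child_sex x model_sex interaction.
fit = smf.ols("score ~ C(child_sex) * C(model_sex)", data=df).fit()
print(anova_lm(fit, typ=2))
```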
SPSS ANOVA Example
Create Variables
For your between-subjects IVs, you need to create a grouping variable
Even though groups are categorical, it's best to label them with a number and then specify the category level that each number corresponds to (e.g., 1 for male and 2 for female)
Create a variable for your DV
If you have a within-subjects IV, then you will need to create multiple variables for your DV, one for each level of your within-subjects variable
Insert values to represent the groups (here, 6 infants in each group)
Note that there are 6 infants in group 1
If there are 6 infants in each group and 4 groups, N = 24
Input your Data
SPSS ANOVA
Analyze -> General Linear Model -> Univariate...
You can also use Compare Means -> One-Way ANOVA
If you have a Repeated Measures Design, need to go to Repeated Measures
Univariate
DV goes into Dependent Variable
IVs go into your Fixed Factor(s)
Random factors and Covariates are more advanced features you don't need to worry about yet
Post Hoc
Click Post Hoc...
Here you can decide which post hoc tests you want to run
We will mostly be using Tukey
Under Options and/or EM Means
In Display, pick Descriptive statistics, Estimates of effect size, and Homogeneity tests
Under EM Means or Options depending upon your version of SPSS
Move your OVERALL and IV over to get mean estimates
Can also Compare main effects (means), if you have planned comparisons
Double check your n and N
Means and SDs of all groups
Homogeneity of variance test: again, you do not want this to be significant
p = .807 > .05, not significant
df: 3 and 20
Examine the output for your IV: F = 7.43, p = .002. Is p < .05? Yes, so the effect is statistically significant. Partial eta squared of .53 is your effect size (partial eta squared = SSeffect / (SSeffect + SSerror), the proportion of variance accounted for by the effect).
Note: the 95% CI appears in the next box.
These are your post hoc results to determine where significant differences between means lie. An asterisk indicates the difference is significant at the .05 level.
Less than .05? Statistically significant difference!
So, for example, the difference between non and moderate smokers is 1.45, which is significant at p = .015 (and .015 < .05).
Notice there is no difference between non and light smokers: p = .993, which is greater than .05.
Practice Time
Formulas for reference:

F = MSb / MSw = (SSb / dfb) / (SSw / dfw)

Qobt = (X̄1 − X̄2) / √(MSw / n)   (equal sample sizes)

Qobt = (X̄1 − X̄2) / √((MSw / 2)(1/n1 + 1/n2))   (unequal sample sizes)
Lab Assignment #4
To test whether memory changes with age, a researcher conducts an experiment in which
there are four groups of 6 subjects each. The groups differ according to age. In group 1,
the subjects are 30 years old; group 2, 40 years old; group 3, 50 years old; and group 4,
60 years old. Each subject is shown a series of nonsense words at a rate of one word
every 4 seconds. The series is shown twice, after which the subjects are asked to write
down as many of the words as they can remember. The number of words remembered by
each subject is shown below.
30 years 40 years 50 years 60 years
(data values not reproduced here)
a. Conduct the ANOVA (include descriptive statistics and effect size). Use a Tukey
test for your post-hoc test. Remember to test for homogeneity of variance.