Analysis of Variance - One Way
The ANalysis Of VAriance (or ANOVA) is a powerful and common statistical procedure in the social sciences. It can handle a variety of situations. We will talk about the case of one between groups factor here and two between groups factors in the next section.
The example that follows is based on a study by Darley and Latané (1969). The authors were interested in whether the presence of other people has an influence on whether a person will help someone in distress. In this classic study, the experimenter (a female graduate student) had the subject wait in a room with either 0, 2, or 4 confederates. The experimenter announced that the study would begin shortly and walked into an adjacent room. A few moments later the person(s) in the waiting room heard her fall and complain of ankle pain. The dependent measure was the number of seconds it took the subject to help the experimenter.
How do we analyze this data? We could do a bunch of between groups t tests. However, this is not a good idea for three reasons.
| Number of Groups | Number of Pairs of Means |
|---|---|
| 3 | 3 |
| 4 | 6 |
| 5 | 10 |
| 6 | 15 |
| 7 | 21 |
| 8 | 28 |
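The counts in the table follow from the number of ways to pick 2 groups out of k, which is k(k - 1)/2. The sketch below reproduces the table and, as an illustration only, shows how the chance of at least one Type I error would grow if every pair were tested at alpha = .05. The function names are hypothetical, and the error calculation assumes the tests are independent, which pairwise t tests on shared groups are not, so treat that column as a rough upper-level picture rather than an exact rate.

```python
from math import comb


def number_of_pairs(k: int) -> int:
    """Number of distinct pairs of means among k groups: k(k - 1) / 2."""
    return comb(k, 2)


def familywise_error(k: int, alpha: float = 0.05) -> float:
    """Chance of at least one Type I error over all pairwise tests.

    Simplifying assumption: the tests are treated as independent.
    """
    return 1 - (1 - alpha) ** number_of_pairs(k)


for k in range(3, 9):
    print(k, number_of_pairs(k), round(familywise_error(k), 3))
```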
The reason this analysis is called ANOVA rather than multi-group means analysis (or something like that) is that it compares group means by comparing variance estimates. Consider:
We draw three samples. Why might these means differ? There are two reasons:
The ANOVA is based on the fact that two independent estimates of the population variance can be obtained from the sample data. A ratio is formed for the two estimates, where:
| one is sensitive to → | treatment effect & error | between groups estimate |
|---|---|---|
| and the other to → | error | within groups estimate |
Given the null hypothesis (in this case H0: μ1 = μ2 = μ3), the two variance estimates should be about equal. That is, since the null assumes no treatment effect, both variance estimates reflect only error and their ratio should be close to 1. To the extent that this ratio is larger than 1, it suggests a treatment effect (i.e., differences between the groups).
It turns out that the ratio of these two variance estimates is distributed as F when the null hypothesis is true.
Note:![]()
Using the F, we can compute the probability of the obtained result occurring due to chance. If this probability is low (p ≤ α), we will reject the null hypothesis.
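To see the logic in action, here is a minimal simulation sketch in which the null hypothesis is true by construction: every group is drawn from the same normal population. The population mean (50), standard deviation (10), group sizes, and number of replications are arbitrary assumptions, and `f_ratio` and `simulate_null` are hypothetical helper names. The ratio is built from the usual between groups and within groups sums of squares divided by their degrees of freedom.

```python
import random
from statistics import mean


def f_ratio(groups):
    """Between groups variance estimate divided by the within groups estimate."""
    scores = [x for g in groups for x in g]
    grand_mean = mean(scores)
    group_means = [mean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def simulate_null(reps=2000, p=3, n=10, seed=0):
    """F ratios for `reps` experiments with no treatment effect at all."""
    rng = random.Random(seed)
    return [
        f_ratio([[rng.gauss(50, 10) for _ in range(n)] for _ in range(p)])
        for _ in range(reps)
    ]


null_fs = simulate_null()
average_f = mean(null_fs)
share_large = sum(f > 3.35 for f in null_fs) / len(null_fs)  # 3.35 is roughly F.05(2, 27)
print(round(average_f, 2), round(share_large, 3))
```

With no treatment effect the average ratio should land near 1, and only a small share of the simulated ratios should exceed the tabled critical value.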
We already knew that:
What is new here is that:
Thus:
| Group 1 | Group 2 | Group j | Group p |
|---|---|---|---|
| X11 | X12 | X1j | X1p |
| X21 | X22 | X2j | X2p |
| Xi1 | Xi2 | Xij | Xip |
| Xn1 | Xn2 | Xnj | Xnp |
| T1 | T2 | Tj | Tp |
| n1 | n2 | nj | np |
And:
| 1. | ![]() |
|---|---|
| 2. | ![]() |
| 3. | ![]() |
| 4. | ![]() |
| 5. | ![]() |
So the variance is the mean of the squared deviations about the mean (MS) or the sum of the squared deviations about the mean (SS) divided by the degrees of freedom.
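To make this more concrete, consider a data set with 3 groups and 4 subjects in each. Thus, the possible deviations for the score X13 (the first score in the third group) are as follows: (#1) the score from its own group mean, (#2) the group mean from the grand mean, and (#3) the score from the grand mean.
As you can see, there are three deviations and:
total (#3) = within groups (#1) + between groups (#2)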
To obtain the Sum of the Squared Deviations about the Mean (the SS), we can square these deviations and sum them over all the scores.
Thus we have:
Note: the nj in the formula for SSBetween means that the squared deviation of a group mean from the grand mean is counted once for each score in that group.
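The partition of deviations, and of the sums of squares built from them, can be checked numerically. The sketch below uses made-up scores for 3 groups of 4 subjects (they are not from the study) and computes each sum of squares directly from its definition.

```python
from statistics import mean

# Hypothetical scores: 3 groups, 4 subjects each (not data from the study).
groups = [[2, 4, 6, 4], [5, 7, 9, 7], [8, 10, 12, 10]]

grand_mean = mean(x for g in groups for x in g)
group_means = [mean(g) for g in groups]

# The three deviations for X13, the first score in the third group.
x13 = groups[2][0]
dev_within = x13 - group_means[2]           # score from its group mean
dev_between = group_means[2] - grand_mean   # group mean from grand mean
dev_total = x13 - grand_mean                # score from grand mean

# Square the deviations and sum them over all the scores.
ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
ss_between = sum(len(g) * (m - grand_mean) ** 2
                 for g, m in zip(groups, group_means))  # len(g) plays the role of nj
ss_total = sum((x - grand_mean) ** 2 for g in groups for x in g)

print(dev_total, dev_within + dev_between)
print(ss_total, ss_within + ss_between)
```

Both printed pairs should match: the total deviation equals the sum of the other two, and the total SS equals SS within plus SS between.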
The F ratio is simply the ratio of the two variance estimates: F = MSBetween / MSWithin.
As usual, the critical values are given by a table. Going into the table, one needs to know the degrees of freedom for both the between and within groups variance estimates, as well as the alpha level.
For example, if we have 3 groups and 10 subjects in each, then:
| dfB | = p - 1 | = 3 - 1 = 2 |
|---|---|---|
| dfW | = p(n - 1), or with unequal n's: N - p | = 3 * (10 - 1) = 27 |
| dfT | = N - 1 | = 30 - 1 = 29 |
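The same degrees of freedom can be computed from a list of group sizes, which also covers unequal n's. This is a small sketch; `anova_df` is a hypothetical helper name, not part of any library.

```python
def anova_df(group_sizes):
    """Return (df between, df within, df total) for a one-way ANOVA."""
    p = len(group_sizes)                          # number of groups
    n_total = sum(group_sizes)                    # N, total number of scores
    df_between = p - 1
    df_within = sum(n - 1 for n in group_sizes)   # same as N - p
    df_total = n_total - 1
    return df_between, df_within, df_total


print(anova_df([10, 10, 10]))  # (2, 27, 29)
print(anova_df([4, 5, 5]))     # (2, 11, 13)
```

Note that df between and df within always add up to df total.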
| | In Symbols | In Words |
|---|---|---|
| H0 | μ1 = μ2 = μ3 | The presence of others does not influence helping. |
| HA | Not H0 | The presence of others does influence helping. |
Here is the data (i.e., the number of seconds it took for folks to help), by the number of people present:
| 0 | 2 | 4 |
|---|---|---|
| 25 | 30 | 32 |
| 30 | 33 | 39 |
| 20 | 29 | 35 |
| 32 | 40 | 41 |
| | 36 | 44 |
A good way to describe this data would be to plot the means:
For the analysis, we will use a grid as usual for most of the calculations:
| | 0 | X² | 2 | X² | 4 | X² | Total |
|---|---|---|---|---|---|---|---|
| | 25 | 625 | 30 | 900 | 32 | 1024 | |
| | 30 | 900 | 33 | 1089 | 39 | 1521 | |
| | 20 | 400 | 29 | 841 | 35 | 1225 | |
| | 32 | 1024 | 40 | 1600 | 41 | 1681 | |
| | | | 36 | 1296 | 44 | 1936 | |
| Tj (sum of X) | 107 | | 168 | | 191 | | = 466 |
| nj | 4 | | 5 | | 5 | | = 14 |
| Mean | 26.8 | | 33.6 | | 38.2 | | |
| Sum of X² | | 2949 | | 5726 | | 7387 | = 16062 |
| Tj² / nj | 2862.25 | | 5644.8 | | 7296.2 | | = 15803.25 |
Now we need the grand totals and the three intermediate quantities:
| I. | ![]() |
|---|---|
| II. | ![]() |
| III. | ![]() |
And:
Thus:
| Source | SS | df | MS | F | p |
|---|---|---|---|---|---|
| Between | 292.11 | 2 | 146.056 | 6.21 | <.05 |
| Within | 258.75 | 11 | 23.520 | | |
| Total | 550.86 | 13 | | | |
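The summary table can be reproduced from the raw scores with the computational approach used in the grid: form three intermediate quantities from the grand total, the squared scores, and the squared group totals, then subtract. The sketch below is one way to do this; the variable names are my own, and small differences in the last decimal place relative to the table come from rounding.

```python
# Seconds to help, by number of others present (0, 2, 4), from the grid.
groups = [[25, 30, 20, 32], [30, 33, 29, 40, 36], [32, 39, 35, 41, 44]]

n_total = sum(len(g) for g in groups)        # N = 14
grand_total = sum(sum(g) for g in groups)    # T = 466

# The three intermediate quantities.
correction = grand_total ** 2 / n_total                      # T^2 / N
sum_sq_scores = sum(x ** 2 for g in groups for x in g)       # sum of X^2
sum_sq_totals = sum(sum(g) ** 2 / len(g) for g in groups)    # sum of Tj^2 / nj

ss_between = sum_sq_totals - correction
ss_within = sum_sq_scores - sum_sq_totals
ss_total = sum_sq_scores - correction

df_between = len(groups) - 1
df_within = n_total - len(groups)

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_obs = ms_between / ms_within

print(round(ss_between, 2), round(ss_within, 2), round(ss_total, 2))
print(df_between, df_within, round(f_obs, 2))
```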
In the formal example presented above, we rejected the null and asserted that the groups were drawn from different populations. But which groups are different from which? A "comparison" compares the means of two groups. There are two kinds of comparisons that we can perform: "preplanned" and "post hoc". These are outlined below. Which approach is used should be based on our goals. In reality, the post hoc approach is the one that is most often taken.
| Preplanned | Post hoc |
|---|---|
| We have a theory (or some previous research) which suggests certain comparisons. | We have a significant overall (or omnibus) F and then want to localize the effect. |
| In this case, we might not even compute the omnibus F (this approach is somewhat analogous to a one-tailed test). | These are more commonly used than preplanned comparisons. |
In addition, there are "simple" (involving two means) and "complex" (involving more than two means) comparisons. With three groups (Groups 1, 2 & 3), the following 6 comparisons are possible.
| Simple | Complex |
|---|---|
| 1 vs. 2 | (1 + 2) vs. 3 |
| 1 vs. 3 | (1 + 3) vs. 2 |
| 2 vs. 3 | (2 + 3) vs. 1 |
As the number of groups increases, so does the number of comparisons that are possible. Some of these can tell us about trend (a description of the form of the relationship between the IV and DV).
The problem with post hoc tests is that the Type I error rate increases the more comparisons we perform. This is a somewhat controversial area and there are a number of methods currently in use to deal with this problem. We will consider one of the simpler methods below.
The protected t test
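The protected t test is an ordinary t test with two changes: it is carried out only when the overall F is significant, and it uses MSWithin as the error term instead of the variance pooled from just the two groups being compared, and as a result the df is greater (dfWithin).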
The formula is:

So, for our example the critical value of F is 4.84 (from the table) and:
Thus, the only comparison that is significant is that between the first and third groups.
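The three pairwise comparisons can be checked with a short sketch. It assumes the comparison statistic is the squared protected t, that is, the squared difference between two means divided by MSWithin × (1/ni + 1/nj), which is then compared with the critical F of 4.84 quoted above. The exact form of the original formula is not shown here, so treat this form and the helper name `comparison_f` as assumptions.

```python
from itertools import combinations

# Seconds to help, keyed by the number of others present.
groups = {0: [25, 30, 20, 32], 2: [30, 33, 29, 40, 36], 4: [32, 39, 35, 41, 44]}

MS_WITHIN = 258.75 / 11   # SS within / df within, from the summary table
F_CRIT = 4.84             # critical value quoted in the text


def comparison_f(a, b):
    """Squared protected t for two groups (assumed form, see lead-in)."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return (mean_a - mean_b) ** 2 / (MS_WITHIN * (1 / len(a) + 1 / len(b)))


results = {
    (i, j): comparison_f(groups[i], groups[j])
    for i, j in combinations(groups, 2)
}
significant = [pair for pair, f in results.items() if f > F_CRIT]

for pair, f in results.items():
    print(pair, round(f, 2), "significant" if f > F_CRIT else "not significant")
```

Under these assumptions only the 0 vs. 4 comparison exceeds the critical value, in line with the conclusion above.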
Since the F test is just an extension of the t test to more than two groups, they should be related and they are.
With two groups, F = t² (and this applies to both the critical and observed values).
For example, consider the critical values for df = (1, 15) with α = .05:
Fcrit(1, 15) = tcrit(15)²
Obtaining the values from the tables, we can see that this is true:
4.54 = 2.131²
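The same relationship holds for the observed values. As an illustration, the sketch below takes two of the groups from the helping example (0 and 4 others present), computes an ordinary pooled-variance t and a two-group one-way F, and compares t² with F. Picking these two groups is my own choice for the demonstration.

```python
from math import sqrt

a = [25, 30, 20, 32]        # 0 others present
b = [32, 39, 35, 41, 44]    # 4 others present


def sum_of_squares(xs):
    """Sum of squared deviations about the mean."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs)


mean_a = sum(a) / len(a)
mean_b = sum(b) / len(b)

# Independent groups t test with pooled variance.
pooled_var = (sum_of_squares(a) + sum_of_squares(b)) / (len(a) + len(b) - 2)
t_obs = (mean_a - mean_b) / sqrt(pooled_var * (1 / len(a) + 1 / len(b)))

# One-way ANOVA on the same two groups (df between = 1).
grand_mean = sum(a + b) / len(a + b)
ss_between = (len(a) * (mean_a - grand_mean) ** 2
              + len(b) * (mean_b - grand_mean) ** 2)
f_obs = (ss_between / 1) / pooled_var   # MS within equals the pooled variance

print(round(t_obs ** 2, 4), round(f_obs, 4))
print(round(2.131 ** 2, 2))  # 4.54, the tabled critical values
```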