Part V: Analysis of Variance (ANOVA)
Repeated Measures Designs (RMD)

 
 
The repeated measures design is a frequently used ANOVA design in which all subjects participate under all levels of the IV (hence subjects are repeatedly measured). It is also referred to as a totally within-subjects design. Whereas in the SRD ANOVA subjects are nested within each group, in the repeated measures design subjects are crossed with the levels of the IV, since all subjects participate under all levels. The decomposition of SS for each term is as follows.

[Figure: decomposition of the sums of squares (SS) for the repeated measures design]
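
For reference, the standard partition for a one-factor repeated measures design can be written out as below. The notation is an assumption and may differ from the original figure: A is the repeated factor with a levels, S is subjects with n subjects, and A x S is the treatment-by-subject interaction that serves as the error term.

```latex
\begin{align*}
SS_{T} &= SS_{A} + SS_{S} + SS_{A \times S} \\
an - 1 &= (a - 1) + (n - 1) + (a - 1)(n - 1)
  \qquad \text{(degrees of freedom)} \\
F &= \frac{MS_{A}}{MS_{A \times S}}
   = \frac{SS_{A} / (a - 1)}{SS_{A \times S} / \left[(a - 1)(n - 1)\right]}
\end{align*}
```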

 
 

The main advantage of the RMD is that it controls for subject heterogeneity (individual differences). In the SRD, individual differences between subjects within each group are superimposed on whatever treatment effects we may have produced in the experiment, and there is no way to tease apart these two sources of variation. In the RMD, since only one group of subjects serves in all levels of the IV, we reduce, but do not eliminate, the error component of the model. Subjects are still likely to respond differently over repeated measures due to changes in motivation, practice effects, etc., but these intrasubject fluctuations are likely to be smaller than the intersubject variations found in the SRD. This reduction in error variance represents a direct increase in economy and power. The time required to run the experiment may also be reduced, since instructions do not have to be given again to a new set of subjects in each group. The RMD is also the most common experimental design used to study learning, transfer and practice effects of various sorts. In these cases, the interest is in the changes in performance that result from successive experience with a task.
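
The reduction in error variance can be illustrated with a small sketch. The scores below are made up for illustration (4 subjects by 3 levels), and the function names are hypothetical. The sketch partitions the total SS and then compares two F ratios for the same data: one using the RMD error term (treatment by subject), and one pooling subject differences into the error term, as would happen if the same scores had come from independent groups.

```python
def rm_anova_ss(scores):
    """Partition the total SS for a subjects x levels table of scores."""
    n = len(scores)       # number of subjects
    a = len(scores[0])    # number of levels of the IV
    grand = sum(sum(row) for row in scores) / (n * a)
    level_means = [sum(row[j] for row in scores) / n for j in range(a)]
    subject_means = [sum(row) / a for row in scores]

    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_a = n * sum((m - grand) ** 2 for m in level_means)
    ss_s = a * sum((m - grand) ** 2 for m in subject_means)
    # Residual (treatment x subject) computed directly, not by subtraction.
    ss_axs = sum(
        (scores[i][j] - level_means[j] - subject_means[i] + grand) ** 2
        for i in range(n)
        for j in range(a)
    )
    return {"total": ss_total, "A": ss_a, "S": ss_s, "AxS": ss_axs, "n": n, "a": a}


def f_ratios(ss):
    """Return (F with RMD error term, F with subjects pooled into error)."""
    n, a = ss["n"], ss["a"]
    ms_a = ss["A"] / (a - 1)
    f_repeated = ms_a / (ss["AxS"] / ((a - 1) * (n - 1)))
    # Pooled error: individual differences are left in the error term.
    f_pooled = ms_a / ((ss["S"] + ss["AxS"]) / (a * (n - 1)))
    return f_repeated, f_pooled


# Hypothetical data: rows are subjects, columns are levels of the IV.
scores = [
    [3, 5, 7],
    [4, 6, 9],
    [6, 8, 10],
    [8, 9, 12],
]
ss = rm_anova_ss(scores)
f_repeated, f_pooled = f_ratios(ss)
print(ss)
print("F (RMD error term):", f_repeated)
print("F (pooled error term):", f_pooled)
```

In these made-up data the subjects differ a good deal from one another but respond to the treatments in much the same way, so removing the subject SS from the error term leaves a much smaller denominator. Note that the RMD error term also has fewer degrees of freedom, so the gain depends on how large the individual differences actually are.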

Three major disadvantages are associated with RMD. All of these are interrelated.

  • The first, called practice effects, concerns the fact that subjects will change systematically during the course of multiple testing. Only the treatment administered first is immune to practice effects. A common solution to this problem is to employ enough testing orders to ensure the equal occurrence of each experimental treatment at each stage of practice in the experiment. This is usually accomplished through counterbalancing, and the resulting experimental design becomes a form of Latin Square.

  • A second difficulty is the possibility of differential carry over effects, which counterbalancing will not control. Differential carry over effects are quite specific, in that the earlier administration of one treatment affects a subject's performance on a later condition one way and on a different condition another way. In contrast, practice effects affect all treatment conditions equally.

  • A final problem of RMD is statistical. The statistical model justifying the analysis is highly restrictive in the sense that the scores of the individual are supposed to exhibit certain mathematical properties. Even when carry over effects can be shown to be symmetrical and have caused no distortion of the effects of the IV, the data may not fit the assumptions of the model, producing complications in the statistical analyses.
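
The counterbalancing described in the first point can be sketched as follows. This is a minimal illustration that assumes a simple cyclic Latin square; the function name and treatment labels are hypothetical. Each row is one testing order, and each treatment appears exactly once at each stage of practice.

```python
def cyclic_latin_square(treatments):
    """Build k testing orders in which each treatment occupies each position once."""
    k = len(treatments)
    return [
        [treatments[(row + position) % k] for position in range(k)]
        for row in range(k)
    ]


# Hypothetical treatment labels; each row would be assigned to an
# equal number of subjects.
orders = cyclic_latin_square(["A", "B", "C"])
for order in orders:
    print(" -> ".join(order))
```

A cyclic square of this kind equates the position of each treatment, which addresses practice effects. It does not balance which treatment precedes which, and, as noted in the second point, no counterbalancing scheme controls differential carry over effects.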
 
 
We hold the assumptions of normality and homogeneity of within group variances. While most statisticians agree that the F test is robust and insensitive to violations of these assumptions in the SRD, the same cannot be said about the RMD. Of critical concern are the assumptions of homogeneity of within treatment variances and homogeneity of covariance between pairs of treatment levels (together more commonly referred to as the compound symmetry assumption). While tests for violations of this assumption exist, they are extremely sensitive to even small departures from it, and most experiments in the behavioral sciences violate the assumption anyway. The effect of such violations is to shift the sampling distribution of F to the right; that is, when violations are present, the critical values we are using are too small. The actual critical values we should be using, based on the correct sampling distribution, are larger than those listed in the F table. This results in an F test which is biased in a positive direction. Thus, we will reject H0 falsely a greater percentage of the time than our statements of significance would imply (we would make more Type I errors). To deal with such situations, the observed F can be evaluated against a new critical value which assumes the presence of maximal heterogeneity. This type of correction is called the Geisser-Greenhouse correction. Other, less stringent strategies that adjust the test according to the amount of heterogeneity actually present are given by Box and by Huynh and Feldt. These strategies can be found in Keppel.
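
The corrections mentioned above are usually applied by multiplying both degrees of freedom of the F test by a factor epsilon before looking up the critical value. The sketch below is an illustration under stated assumptions, not the procedure as given in Keppel: it uses the usual textbook formula for Box's epsilon computed from the sample covariance matrix of the treatment levels, and treats the maximal heterogeneity case as the lower bound epsilon = 1/(a - 1). The function names and the example matrices are hypothetical.

```python
def covariance_matrix(scores):
    """Sample covariance matrix of the treatment levels (columns of scores)."""
    n = len(scores)
    a = len(scores[0])
    means = [sum(row[j] for row in scores) / n for j in range(a)]
    return [
        [
            sum((row[j] - means[j]) * (row[k] - means[k]) for row in scores) / (n - 1)
            for k in range(a)
        ]
        for j in range(a)
    ]


def box_epsilon(cov):
    """Box's epsilon: 1.0 when the assumption holds, down to 1/(a - 1) at worst."""
    a = len(cov)
    mean_diag = sum(cov[j][j] for j in range(a)) / a
    grand_mean = sum(sum(row) for row in cov) / (a * a)
    row_means = [sum(row) / a for row in cov]
    sum_sq = sum(v * v for row in cov for v in row)
    numerator = (a * (mean_diag - grand_mean)) ** 2
    denominator = (a - 1) * (
        sum_sq - 2 * a * sum(m * m for m in row_means) + (a * grand_mean) ** 2
    )
    return numerator / denominator


def corrected_df(a, n, epsilon):
    """Degrees of freedom for the F test after multiplying both by epsilon."""
    return epsilon * (a - 1), epsilon * (a - 1) * (n - 1)


# Hypothetical covariance matrices for a = 3 treatment levels.
compound_symmetric = [[4, 1, 1], [1, 4, 1], [1, 1, 4]]
heterogeneous = [[1, 0, 0], [0, 1, 0], [0, 0, 10]]
print(box_epsilon(compound_symmetric))
print(box_epsilon(heterogeneous))
# Maximal heterogeneity (lower bound) with a = 3 levels and n = 10 subjects.
print(corrected_df(3, 10, 1 / (3 - 1)))
```

With the lower bound, the F for a levels and n subjects is evaluated on 1 and n - 1 degrees of freedom, which gives a larger critical value than the uncorrected (a - 1) and (a - 1)(n - 1). An estimated epsilon gives a critical value somewhere between the two.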

Note that the compound symmetry assumption can only be violated when the repeated measures factor has more than two levels. When there are only two levels, there is only one variance of differences and there is no problem of heterogeneity. It follows that single df comparisons conducted on repeated factors are immune to violations of compound symmetry, since they involve only two levels of the factor.
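
The "variance of differences" can be made explicit. In the notation assumed here (not taken from the original), the variance of the difference between scores at levels j and k is built from the two treatment variances and their covariance:

```latex
\begin{align*}
\sigma^{2}_{X_j - X_k} &= \sigma^{2}_{j} + \sigma^{2}_{k} - 2\,\sigma_{jk} \\
\text{number of distinct pairs } (j, k) &= \frac{a(a - 1)}{2}
\end{align*}
```

Heterogeneity means that these difference variances are unequal across pairs. With a = 2 levels there is a single pair, so there is nothing for its variance to be unequal to.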