Part X: Other General Linear Model Techniques
Overview of Canonical Variate Analysis (CVA)

 
 
Suppose that we have several continuous predictor variables and we wish to use the entire set to predict several criterion variables, each of which is also continuous. This problem could be approached in three basic ways:
  • Compute a simple bivariate correlation between each x and each y variable. However, if we had ten predictors and twenty criterion measures, we would need to calculate 200 bivariate correlations, and it is unclear how to interpret them as a whole.
  • Compute a separate multiple regression (MR) equation for each y variable. While this is more efficient than the first approach, we would still have to calculate a separate equation for each criterion, and these equations may be redundant.
  • Relate several x variables to the y variables simultaneously. In other words, use Canonical Variate Analysis.

CVA simultaneously calculates a linear composite of all x variables and a linear composite of all y variables. Unlike the composites in other multivariate techniques, these weighted composites are derived in pairs. Each linear combination is called a canonical variate and takes the general linear form of a weighted sum of the variables in its set.
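
As a sketch of that general linear form, the i-th pair of canonical variates can be written as below. The symbols are assumptions made for illustration: U and V name the x and y composites, p and q are the numbers of x and y variables, and the c weights are assumed to belong to the x set and the d weights to the y set.

```latex
% Notation assumed for illustration: U_i, V_i are the i-th pair of canonical variates.
\begin{align*}
U_i &= c_{i1} x_1 + c_{i2} x_2 + \dots + c_{ip} x_p \\
V_i &= d_{i1} y_1 + d_{i2} y_2 + \dots + d_{iq} y_q \\
R_{c,i} &= \operatorname{corr}(U_i, V_i)
\end{align*}
```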

The weights of these equations are analogous to standardized beta weights and are simultaneously calculated such that the Pearson correlation between the two composites is at a maximum value. This simple bivariate correlation between the x composite and the y composite is called the Canonical Correlation.

The number of dimensions (pairs of canonical variates) is determined by the smaller of the following: (a) the number of x variables, or (b) the number of y variables. The first pair of canonical variates is derived such that the correlation between the two composites is maximized; this is the first Canonical Correlation. Then, from the residual variance (variance not accounted for by the first pair), the next pair and its Canonical Correlation are formed, and so on. Therefore, each pair of canonical variates is orthogonal to (uncorrelated with) every other pair, and successive Canonical Correlations get smaller and smaller.
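
The extraction of successive pairs can be illustrated with a minimal NumPy sketch. It is one possible way to compute the solution (QR factorizations followed by a singular value decomposition), not necessarily how any particular statistics package does it. The function name canonical_correlations and the simulated data are hypothetical.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Return canonical correlations, x weights and y weights (one column per dimension)."""
    Xc = X - X.mean(axis=0)          # center each x variable
    Yc = Y - Y.mean(axis=0)          # center each y variable
    # Thin QR factorizations give an orthonormal basis for each centered set.
    Qx, Rx = np.linalg.qr(Xc)
    Qy, Ry = np.linalg.qr(Yc)
    # The singular values of Qx' Qy are the canonical correlations, largest first.
    U, s, Vt = np.linalg.svd(Qx.T @ Qy, full_matrices=False)
    C = np.linalg.solve(Rx, U)       # weights applied to the x variables
    D = np.linalg.solve(Ry, Vt.T)    # weights applied to the y variables
    return np.clip(s, 0.0, 1.0), C, D

# Simulated data (an assumption for illustration): three x and two y variables.
rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 3))
B = np.array([[0.8, 0.0], [0.0, 0.4], [0.3, 0.3]])
Y = X @ B + rng.normal(size=(n, 2))

r, C, D = canonical_correlations(X, Y)
U = (X - X.mean(axis=0)) @ C         # x canonical variates (centered, unit length)
V = (Y - Y.mean(axis=0)) @ D         # y canonical variates (centered, unit length)

print("number of dimensions:", len(r))            # min(3, 2)
print("canonical correlations:", np.round(r, 3))
# Because the variates are centered with unit length, U'V is their correlation
# matrix: canonical correlations on the diagonal, zeros elsewhere.
print(np.round(U.T @ V, 3))
```

With three x variables and two y variables the sketch yields two dimensions, and the off-diagonal zeros show that a variate from one pair is uncorrelated with the variates of the other pair.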

Canonical Variate Analysis is the most far-reaching analysis in this overview of techniques. In general, almost all other multivariate tests are special cases of CVA. For example, when only one dependent variable exists, the calculation of CVA is identical to that of Multiple Regression, and the single Canonical Correlation equals the multiple R. This is true for all techniques that assume linearity.
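
The Multiple Regression special case can be checked numerically. The sketch below uses simulated data (an assumption for illustration) with a single criterion variable and compares the multiple R from an ordinary least-squares fit with the single canonical correlation.

```python
import numpy as np

# Simulated data (hypothetical): three predictors, one criterion.
rng = np.random.default_rng(1)
n = 200
X = rng.normal(size=(n, 3))
y = X @ np.array([0.5, -0.3, 0.2]) + rng.normal(size=n)

# Multiple regression: R is the correlation between y and its fitted values.
Xd = np.column_stack([np.ones(n), X])            # add an intercept column
beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
multiple_R = np.corrcoef(y, Xd @ beta)[0, 1]

# CVA with one y variable: only min(3, 1) = 1 canonical correlation exists.
Xc = X - X.mean(axis=0)
yc = (y - y.mean()).reshape(-1, 1)
Qx, _ = np.linalg.qr(Xc)
Qy, _ = np.linalg.qr(yc)
canonical_R = np.linalg.svd(Qx.T @ Qy, compute_uv=False)[0]

print("multiple R: ", round(multiple_R, 6))
print("canonical R:", round(canonical_R, 6))
```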

 
 
A single CVA yields four sources of information:
  • R Canonical Values: These Canonical Correlations can be tested for overall significance (i.e., how well the set of x predicts the set of y). Wilks' Lambda is used in CVA because it combines the information from all CV pairs. That is, if we have calculated K pairs of CVs, the first Wilks' Lambda tests the null hypothesis that all K Canonical Correlations are zero. If this is rejected, we conclude that at least the first R Canonical > 0. A dimension reduction analysis can then be done: the largest correlations are removed from the null hypothesis one at a time, and Wilks' Lambda is used to test the remaining, smaller set of canonical variates.
  • C and D Weights: These indicate roughly the relative contributions of the original x and y variables to the relationship between the x set and the y set.
  • Structure Correlations: As in MANOVA and LDFA, a structure correlation is defined as the correlation between an original variable and the composite. This provides a more stable source of information about the relative contribution of a variable than the weights do.
  • Redundancy Analysis: This determines what percentage of variance in one set is accounted for by the variance in the other set. To see how redundant the sets are, examine the residual variance for the sets after all the pairs of Canonical Variates have been extracted. If one set has residual variance and the other does not, then the second is merely a subset of the first.
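
The sources of information listed above can be sketched in NumPy as follows. Several choices here are assumptions rather than something these notes specify: the helper names are hypothetical, the data are simulated, Bartlett's chi-square approximation is used to turn each Wilks' Lambda into a test statistic, and redundancy is computed as the mean squared structure correlation multiplied by the squared canonical correlation, summed over dimensions.

```python
import numpy as np

def cva(X, Y):
    """Canonical correlations plus the x and y canonical variates (unit length)."""
    Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)
    Qx, Rx = np.linalg.qr(Xc)
    Qy, Ry = np.linalg.qr(Yc)
    A, r, Bt = np.linalg.svd(Qx.T @ Qy, full_matrices=False)
    U = Xc @ np.linalg.solve(Rx, A)       # x variates
    V = Yc @ np.linalg.solve(Ry, Bt.T)    # y variates
    return np.clip(r, 0.0, 1.0), U, V

def wilks_sequence(r, n, p, q):
    """Dimension reduction: test dimensions k..K together, for k = 1, 2, ..., K."""
    rows = []
    for k in range(len(r)):
        lam = np.prod(1.0 - r[k:] ** 2)                    # Wilks' Lambda
        chi2 = -(n - 1 - (p + q + 1) / 2.0) * np.log(lam)  # Bartlett's approximation
        df = (p - k) * (q - k)
        rows.append((k + 1, lam, chi2, df))
    return rows

def structure_correlations(variables, variates):
    """Correlation of each original variable with each canonical variate."""
    k = variables.shape[1]
    return np.corrcoef(variables, variates, rowvar=False)[:k, k:]

# Simulated data (hypothetical): three x variables, two y variables.
rng = np.random.default_rng(2)
n, p, q = 300, 3, 2
X = rng.normal(size=(n, p))
Y = X @ np.array([[0.7, 0.1], [0.0, 0.5], [0.2, 0.0]]) + rng.normal(size=(n, q))

r, U, V = cva(X, Y)
tests = wilks_sequence(r, n, p, q)
for k, lam, chi2, df in tests:
    print(f"dimensions {k}..{len(r)}: Lambda={lam:.3f}  chi2={chi2:.1f}  df={df}")

Sx = structure_correlations(X, U)          # x variables with x variates
Sy = structure_correlations(Y, V)          # y variables with y variates

# Proportion of y-set variance accounted for by the x set, per dimension and in total.
redundancy_y = (Sy ** 2).mean(axis=0) * r ** 2
total_redundancy_y = redundancy_y.sum()
print("structure correlations (y):\n", np.round(Sy, 3))
print("redundancy of y given x:", np.round(redundancy_y, 3), "total:", round(total_redundancy_y, 3))
```

Each chi-square value would be referred to a chi-square distribution with the listed degrees of freedom; the first row corresponds to the overall test and later rows to the tests after the largest correlations have been removed.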

A couple of last words of caution about CVA: it is highly susceptible to sampling error. CVA estimates a large number of weights from the sample and therefore tends to capitalize on chance; this is very much like previous techniques and involves many of the issues raised in the discussion of shrinkage. In general, an N:k ratio of at least 20:1 should be used, and findings should be cross-validated with an independent sample.
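
Capitalization on chance can be illustrated with a small simulation. The sketch below makes several assumptions for illustration: the x and y sets are generated to be unrelated in the population, k is taken to count all variables in both sets, and the function name fit_first_pair is hypothetical. Weights are estimated in one sample and then applied, unchanged, to an independent sample.

```python
import numpy as np

def fit_first_pair(X, Y):
    """First canonical correlation and its x and y weights, estimated from one sample."""
    Qx, Rx = np.linalg.qr(X - X.mean(axis=0))
    Qy, Ry = np.linalg.qr(Y - Y.mean(axis=0))
    A, r, Bt = np.linalg.svd(Qx.T @ Qy, full_matrices=False)
    c = np.linalg.solve(Rx, A[:, 0])      # x weights for the first dimension
    d = np.linalg.solve(Ry, Bt[0])        # y weights for the first dimension
    return r[0], c, d

rng = np.random.default_rng(3)
n, p, q = 60, 8, 6                        # N:k of roughly 4:1, well below 20:1
X = rng.normal(size=(2 * n, p))           # x and y are unrelated in the population
Y = rng.normal(size=(2 * n, q))

# Estimate the weights in the first half of the data.
r_fit, c, d = fit_first_pair(X[:n], Y[:n])

# Apply the same weights to the independent second half (cross-validation).
r_holdout = np.corrcoef(X[n:] @ c, Y[n:] @ d)[0, 1]

print("first canonical correlation in the fitting sample:", round(r_fit, 3))
print("correlation of the same composites in the holdout:", round(r_holdout, 3))
```

Since the two sets are unrelated by construction, any sizable correlation in the fitting sample reflects sampling error, and the drop in the holdout sample is the shrinkage that cross-validation is meant to expose.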