Part X: Other General Linear Model Techniques

The Logic of CVA | Interpreting CVA Results
Suppose that we have several continuous predictor variables and we wish to use the entire set to predict several criterion variables, each of which is also continuous. This problem could be approached in three basic ways:
CVA simultaneously calculates a linear composite of all x variables and a linear composite of all y variables. Unlike other multivariate techniques, these weighted composites are derived in pairs. Each linear combination is called a canonical variate and takes the general linear form, that is, a weighted sum of the variables in its set. The weights of these equations are analogous to standardized beta weights and are calculated simultaneously such that the Pearson correlation between the two composites is at its maximum value. This simple bivariate correlation between the composite of the x variables and the composite of the y variables is called the Canonical Correlation.

The number of dimensions (pairs of canonical variates) is determined by the smaller of the following: (a) the number of x variables, or (b) the number of y variables. The first pair of canonical variates is derived such that the correlation between the two composites is maximized. Then, from the residual variance (variance not accounted for by the first pair), the next pair is formed, and so on. Therefore, each pair of canonical variates is orthogonal to (uncorrelated with) every other pair, and the successive Canonical Correlations get smaller and smaller.

Canonical Variate Analysis is the most far-reaching analysis in this overview of techniques. In general, almost all other multivariate tests are special cases of CVA. For example, when only one dependent variable exists, the calculation of CVA is identical to that of Multiple Regression. The same holds for the other techniques that assume linearity.
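The derivation described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the procedure used by any particular statistics package: the function names `zscore` and `canonical_correlations` are hypothetical, the variables are z-scored first so that the weights are on a standardized scale, and both variable sets are assumed to have full column rank. It uses a QR decomposition followed by a singular value decomposition, which is one common numerical route to the canonical correlations.

```python
import numpy as np

def zscore(M):
    """Center each column and scale it to unit sample standard deviation."""
    M = np.asarray(M, dtype=float)
    return (M - M.mean(axis=0)) / M.std(axis=0, ddof=1)

def canonical_correlations(X, Y):
    """Return (r, A, B) for predictor set X (n x p) and criterion set Y (n x q).

    r holds the min(p, q) canonical correlations in descending order;
    column i of A and of B holds the weights of the i-th pair of variates.
    Assumes X and Y have full column rank (an assumption of this sketch).
    """
    Xz, Yz = zscore(X), zscore(Y)
    # Orthonormal bases for the column spaces of the two sets
    Qx, Rx = np.linalg.qr(Xz)
    Qy, Ry = np.linalg.qr(Yz)
    # Singular values of Qx'Qy are the canonical correlations
    U, r, Vt = np.linalg.svd(Qx.T @ Qy, full_matrices=False)
    # Map the singular vectors back to weights on the z-scored variables
    A = np.linalg.solve(Rx, U)
    B = np.linalg.solve(Ry, Vt.T)
    return r, A, B
```

Applying the weights to the z-scored data (`zscore(X) @ A[:, 0]` and `zscore(Y) @ B[:, 0]`) gives the first pair of canonical variates, and their Pearson correlation equals `r[0]`. When `Y` has a single column, `r[0]` equals the multiple correlation R from regressing that column on `X`.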
A single CVA yields four sources of information:
A last word of caution about CVA: it is highly susceptible to sampling error. CVA determines a large number of weights from the sample and therefore tends to capitalize on chance; in this respect it is very much like the previous techniques and involves many of the issues raised in the discussion of shrinkage. In general, an N:k ratio (sample size to number of variables) of at least 20:1 should be used, and findings should be cross-validated with an independent sample.
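The capitalization on chance described here can be illustrated with a simple holdout check: derive the weights on one half of the sample, then apply them unchanged to the other half and see how far the correlation shrinks. The sketch below is an illustration under stated assumptions only. The helper names `first_canonical_weights` and `holdout_check` are hypothetical, the data are pure random noise (so the true canonical correlation is zero), and a single random split stands in for a genuinely independent sample.

```python
import numpy as np

def first_canonical_weights(X, Y):
    """Weights (a, b) and correlation r for the first pair of canonical variates."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Qx, Rx = np.linalg.qr(Xc)
    Qy, Ry = np.linalg.qr(Yc)
    U, s, Vt = np.linalg.svd(Qx.T @ Qy, full_matrices=False)
    a = np.linalg.solve(Rx, U[:, 0])   # weights for the x variables
    b = np.linalg.solve(Ry, Vt[0, :])  # weights for the y variables
    return a, b, s[0]

def holdout_check(X, Y, seed=0):
    """Fit weights on a random half of the sample, then score the other half."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(X.shape[0])
    half = X.shape[0] // 2
    train, test = idx[:half], idx[half:]
    a, b, r_train = first_canonical_weights(X[train], Y[train])
    # Apply the training weights, unchanged, to the holdout cases
    r_test = np.corrcoef(X[test] @ a, Y[test] @ b)[0, 1]
    return r_train, r_test

# Pure noise: 60 cases, 8 x variables and 8 y variables (a poor N:k ratio)
rng = np.random.default_rng(1)
X_noise = rng.normal(size=(60, 8))
Y_noise = rng.normal(size=(60, 8))
r_train, r_test = holdout_check(X_noise, Y_noise)
print(f"training r = {r_train:.2f}, holdout r = {r_test:.2f}")
```

With this few cases per variable, the training correlation is typically large even though the variables are unrelated, while the holdout correlation falls back toward zero. That gap is the shrinkage that cross-validation is meant to expose.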