Part II: Statistical Tests for Nominal Data
Goodness of Fit Tests | Tests of Independence
The general form of the Goodness of Fit null hypothesis specifies the expected proportion of sample outcomes that should fall within each category. Thus, as an index of departure, we must compare the expected frequency with the observed frequency in each category. Suppose we have H0: p1 = .25; p2 = .50; p3 = .25. That is, we expect 25% of outcomes to fall in the first group, 50% in the second, and 25% in the third. Also, say we have N = 100, so that the expected frequencies are 25, 50, and 25, and we observe the following frequencies:
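The number of degrees of freedom here is equal to the number of groups - 1, or 3 - 1 = 2. The formula for chi square sums, over the k categories, the squared difference between the observed frequency O_j and the expected frequency E_j, divided by the expected frequency:

$$\chi^2 = \sum_{j=1}^{k} \frac{(O_j - E_j)^2}{E_j}$$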
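As a minimal sketch of this calculation, the function below computes the chi-square statistic and its degrees of freedom for the hypothesis p1 = .25, p2 = .50, p3 = .25. The function name `chi_square_gof` and the observed frequencies are assumptions made for illustration; they are not the values from the original example.

```python
def chi_square_gof(observed, proportions):
    """Return (chi_square, degrees_of_freedom) for a goodness-of-fit test."""
    n = sum(observed)
    # Expected frequency in each category = hypothesized proportion * N.
    expected = [p * n for p in proportions]
    chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi_square, len(observed) - 1


# Hypothetical observed frequencies (illustrative only), N = 100.
observed = [20, 60, 20]
chi_square, df = chi_square_gof(observed, [0.25, 0.50, 0.25])
print(chi_square, df)
```

With these illustrative frequencies the statistic is 4.0 on 2 degrees of freedom, which falls below the .05 critical value of 5.991, so the null hypothesis would not be rejected.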
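The result of this calculation is then compared to the critical value for chi square with 2 degrees of freedom. When we have only two outcomes (making chi square equivalent to the binomial), we must apply Yates' correction for continuity, which reduces each absolute deviation by 0.5 before squaring, and the formula changes slightly (with O_j and E_j the observed and expected frequencies in category j):

$$\chi^2 = \sum_{j=1}^{2} \frac{(|O_j - E_j| - 0.5)^2}{E_j}$$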
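The effect of the continuity correction can be sketched for a two-outcome case. The function name `chi_square_yates`, the observed counts, and the hypothesized 50/50 split are all assumptions chosen for illustration.

```python
def chi_square_yates(observed, proportions):
    """Chi-square for two categories with Yates' correction for continuity."""
    n = sum(observed)
    expected = [p * n for p in proportions]
    # Reduce each absolute deviation by 0.5 before squaring.
    return sum((abs(o - e) - 0.5) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical two-outcome data (illustrative only), N = 100.
observed_two = [60, 40]
proportions_two = [0.5, 0.5]

corrected = chi_square_yates(observed_two, proportions_two)
# Uncorrected statistic for comparison.
uncorrected = sum(
    (o - p * sum(observed_two)) ** 2 / (p * sum(observed_two))
    for o, p in zip(observed_two, proportions_two)
)
print(corrected, uncorrected)
```

For these counts the uncorrected statistic is 4.0 and the corrected one is about 3.61. With 1 degree of freedom the .05 critical value is 3.841, so in this illustrative case the correction changes the decision.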
A test of independence applies when we have two nominal variables, X and Y (or Y1 and Y2), and we wish to know whether or not they are independent. For chi square tests of independence, we use a contingency table in which the number of outcomes falling in each cell is recorded. Each individual cell is the intersection of one X event class and one Y event class (remember that probability!). With this in mind, if X and Y are truly independent, then the probability of the intersection is equal to the product of the two marginal probabilities. This is how we form the expected proportions for each cell:

$$p(X_i \cap Y_j) = p(X_i)\,p(Y_j)$$
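Say, for example, we actually have observed values for some problem. We would calculate the expected frequency for each cell from the observed values (with marginal frequencies included) by multiplying the cell's row total by its column total and dividing by N:

$$E_{ij} = \frac{(\text{row}_i\ \text{total})(\text{column}_j\ \text{total})}{N}$$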
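A short sketch of how the expected frequencies follow from the marginal totals is given here. The function name `expected_frequencies` and the 2 by 2 table of counts are hypothetical and serve only as an illustration.

```python
def expected_frequencies(table):
    """Expected cell frequencies under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Hypothetical 2 x 2 table of observed counts (illustrative only), N = 100.
observed_table = [[10, 20],
                  [30, 40]]
expected_table = expected_frequencies(observed_table)
print(expected_table)
```

Here the row totals are 30 and 70 and the column totals are 40 and 60, so the expected frequency for the first cell is 30 * 40 / 100 = 12.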
The calculation of the chi-square statistic is the same as it was for the Goodness of Fit test. Now, however, the formula sums the (observed - expected)^2 / expected terms over every cell in all rows of the table, rather than over just the single row in the Goodness of Fit test. For a test of independence, the degrees of freedom is equal to (the number of rows - 1) multiplied by (the number of columns - 1). In the examples above, the degrees of freedom would be (3 - 1)(2 - 1) = 2, and (2 - 1)(2 - 1) = 1 respectively.
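Putting these pieces together, a test of independence might be sketched as follows. The function name `chi_square_independence` and both tables of counts are assumptions for illustration only; they are not the examples from the original text.

```python
def chi_square_independence(table):
    """Return (chi_square, degrees_of_freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi_square = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected frequency under independence for cell (i, j).
            expected = row_totals[i] * col_totals[j] / n
            chi_square += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi_square, df


# Hypothetical 3 x 2 table (illustrative only): df = (3 - 1)(2 - 1) = 2.
table_3x2 = [[10, 20],
             [15, 15],
             [20, 10]]
chi_3x2, df_3x2 = chi_square_independence(table_3x2)

# Hypothetical 2 x 2 table (illustrative only): df = (2 - 1)(2 - 1) = 1.
table_2x2 = [[10, 20],
             [30, 40]]
chi_2x2, df_2x2 = chi_square_independence(table_2x2)

print(chi_3x2, df_3x2)
print(chi_2x2, df_2x2)
```

In the 3 by 2 illustration every expected frequency is 15, and the statistic works out to 100 / 15, or about 6.67, on 2 degrees of freedom. Note that this sketch does not apply Yates' correction to the 2 by 2 table.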