Part II: Statistical Tests for Nominal Data
Testing Chi Square Hypotheses

 
 
The general form of the Goodness of Fit null hypothesis specifies the expected proportion of sample outcomes which should fall within each category. Thus, as an index of departure, we must compare some expected frequency with an observed frequency.

Suppose we have H0: p1 = .25; p2 = .50; p3 = .25. That is, we expect 25% of the outcomes to fall in the first group, 50% in the second, and 25% in the third. Suppose also that N = 100 and that we observe the following frequencies:

            Event 1   Event 2   Event 3
Observed       30        40        30
Expected       25        50        25

The number of degrees of freedom here is equal to the number of groups - 1, or 3 - 1 = 2. Following the formula for chi square:

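\[
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}
       = \frac{(30 - 25)^2}{25} + \frac{(40 - 50)^2}{50} + \frac{(30 - 25)^2}{25}
       = 1 + 2 + 1 = 4
\]

where O_i and E_i are the observed and expected frequencies in category i, and k is the number of categories.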
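The same arithmetic can be sketched in a few lines of Python. This is only an illustration using the observed counts and hypothesized proportions from the example; the variable names are assumptions introduced here, not part of the original text.

```python
# Observed counts and hypothesized proportions from the example above.
observed = [30, 40, 30]
null_proportions = [0.25, 0.50, 0.25]

n = sum(observed)                             # N = 100
expected = [p * n for p in null_proportions]  # expected frequencies under H0

# Sum of (O - E)^2 / E over the categories.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Degrees of freedom: number of groups - 1.
df = len(observed) - 1

print(f"chi-square = {chi_square:.2f}, df = {df}")
```

For these data the statistic works out to 4 with 2 degrees of freedom.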

The result of this calculation would then be compared to the critical value for chi square with 2 degrees of freedom. When we have only two outcomes (making chi square equivalent to the binomial), we must apply Yates' correction for continuity and the formula changes slightly:

\[
\chi^2 = \sum_{i=1}^{2} \frac{\left( |O_i - E_i| - 0.5 \right)^2}{E_i}
\]
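As a rough sketch of how the correction changes the result, the snippet below computes both versions of the statistic for a two-outcome case. The data (60 and 40 observed out of N = 100, with H0: p1 = p2 = .50) are hypothetical and chosen only for illustration, and the function name is an assumption.

```python
def yates_chi_square(observed, expected):
    """Chi square with Yates' correction: sum of (|O - E| - 0.5)^2 / E."""
    return sum((abs(o - e) - 0.5) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical two-outcome data (not from the text): N = 100, H0: p1 = p2 = .50.
observed = [60, 40]
expected = [50, 50]

uncorrected = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
corrected = yates_chi_square(observed, expected)

print(f"uncorrected = {uncorrected:.2f}, corrected = {corrected:.2f}")
```

With 1 degree of freedom, the tabled critical value at the .05 level is 3.84. In this hypothetical case the uncorrected statistic (4.00) falls just above it and the corrected statistic (3.61) falls just below it, which shows that the correction makes the test more conservative.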

 
 
A test of independence applies when we have two nominal variables, X and Y (or Y1 and Y2), and we wish to know whether or not they are independent. For chi square tests of independence, we use a contingency table in which the number of outcomes falling in each cell is recorded. Each individual cell is the intersection of one X event class and one Y event class (remember that probability!). With this in mind, if X and Y are truly independent, then the probability of the intersection is equal to the product of the two marginal probabilities. This is how we form the expected proportions for each cell:

            Event 1     Event 2
Event 1     X1 & Y1     X1 & Y2
Event 2     X2 & Y1     X2 & Y2

Suppose, for example, that we have the following observed values for some problem (marginal frequencies included). We would calculate the expected frequencies from them as follows:

 

            Event 1   Event 2   Sum
Event 1        10         5      15
Event 2        10        20      30
Sum            20        25      45

E(X1 & Y1) = (15/45)(20/45) * 45 = 6.67
E(X2 & Y1) = (30/45)(20/45) * 45 = 13.33
E(X1 & Y2) = (15/45)(25/45) * 45 = 8.33
E(X2 & Y2) = (30/45)(25/45) * 45 = 16.67
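The expected frequencies can also be obtained programmatically from the marginal totals. The sketch below assumes the table is stored as a list of rows (X1, X2), each holding the Y1 and Y2 counts; that layout and the variable names are illustrative choices, not something the text prescribes.

```python
# Observed 2 x 2 table from the example: rows are X1, X2; columns are Y1, Y2.
observed = [
    [10, 5],
    [10, 20],
]

row_totals = [sum(row) for row in observed]        # marginal frequencies for X
col_totals = [sum(col) for col in zip(*observed)]  # marginal frequencies for Y
n = sum(row_totals)                                # total N

# Under independence: E = P(row) * P(column) * N for every cell.
expected = [[(r / n) * (c / n) * n for c in col_totals] for r in row_totals]

for row in expected:
    print([round(e, 2) for e in row])
```

Each expected value simplifies to (row total)(column total) / N, which is often the more convenient form for hand calculation.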

The calculation of the chi-square statistic is the same as it was for the Goodness of Fit test. Now, however, the formula sums the squared deviations (each divided by its expected frequency) over every cell in every row of the table, rather than over the single row of the Goodness of Fit test.
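A minimal sketch of that summation for the 2 x 2 example follows. The function name is hypothetical, and the statistic is computed without any continuity correction, simply applying the same (O - E)^2 / E term to each cell.

```python
def chi_square_independence(table):
    """Sum (O - E)^2 / E over every cell of a contingency table (no correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed_count in enumerate(row):
            # Expected frequency under independence.
            expected_count = row_totals[i] * col_totals[j] / n
            statistic += (observed_count - expected_count) ** 2 / expected_count
    return statistic


chi_square = chi_square_independence([[10, 5], [10, 20]])
print(f"chi-square = {chi_square:.2f}")
```

For the observed table above this gives a statistic of 4.5.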

For a test of independence, the number of degrees of freedom is equal to (the number of rows - 1) multiplied by (the number of columns - 1). For the 2 x 2 table above, the degrees of freedom would be (2 - 1)(2 - 1) = 1; a table with 3 rows and 2 columns would have (3 - 1)(2 - 1) = 2.
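The degrees-of-freedom rule is easy to express as a small helper. This is only a sketch; the function name is an assumption, and the 3 x 2 table is a placeholder used solely for its shape.

```python
def independence_df(table):
    """Degrees of freedom for a test of independence: (rows - 1)(columns - 1)."""
    n_rows = len(table)
    n_cols = len(table[0])
    return (n_rows - 1) * (n_cols - 1)


df_2x2 = independence_df([[10, 5], [10, 20]])

# Placeholder 3 x 2 table; only its shape matters here.
df_3x2 = independence_df([[0, 0], [0, 0], [0, 0]])

print(df_2x2, df_3x2)
```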