Part II: Statistical Tests for Nominal Data
The Binomial and Its Approximation
Binomials are used
when our measure y is dichotomous (2 outcomes) and
the two outcomes are mutually exclusive and exhaustive.
Therefore, the probabilities of the two events sum
to 1 [p(event 1) + p(event 2) = 1.00]. This can then
be extended to situations where the same trial is repeated several
times. For example, suppose we have three coins. For each coin, the probability of obtaining a head is p = .5, and the probability of obtaining a tail is q = .5 (where q = 1 - p). When all three are flipped, there are four basic outcomes: 3 heads, 2 heads, 1 head, and 0 heads. The probability of 3 heads is simply p(3 heads) = ppp = (.50)(.50)(.50) = .125. Following similar logic, the probability of 3 tails is p(0 heads) = qqq = (.50)(.50)(.50) = .125. The probability of any one particular sequence of 2 heads and a tail is ppq = (.50)(.50)(.50) = .125. However, throwing 2 heads and 1 tail can happen three ways: HHT, HTH, and THH; therefore, p(2 heads) = (3)(ppq) = (3)(.125) = .375. The same logic gives the probability of 1 head and 2 tails: p(1 head) = (3)(pqq) = .375. Overall, the probabilities of the four outcomes sum to 1 (.125 + .125 + .375 + .375 = 1.00).
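The three-coin arithmetic can be checked by brute force. The following is a minimal Python sketch, not part of the original notes: it lists every ordered sequence of three flips and adds up the probability of each sequence by its number of heads. The variable names (`p`, `q`, `prob_by_heads`) are chosen here for illustration.

```python
from collections import Counter
from itertools import product

p = 0.5    # probability of a head on any one coin
q = 1 - p  # probability of a tail

# Enumerate all 8 ordered sequences (HHH, HHT, ..., TTT) and
# accumulate each sequence's probability under its head count.
prob_by_heads = Counter()
for flips in product("HT", repeat=3):
    heads = flips.count("H")
    prob_by_heads[heads] += p**heads * q**(3 - heads)

for heads in sorted(prob_by_heads, reverse=True):
    print(heads, "heads:", prob_by_heads[heads])

print("total:", sum(prob_by_heads.values()))
```

The counts of 2 heads and 1 head each collect three sequences of probability .125, which is where the factor of 3 in the text comes from.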
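The same result follows from the binomial formula, $\binom{n}{r} p^{r} q^{n-r} = \frac{n!}{(n-r)!\,r!}\, p^{r} q^{n-r}$, with n = 3 flips and r = 2 heads:

$$\frac{3!}{(3-2)!\,2!}\,(.5)^{2}(.5)^{1} = .375$$

Note that the binomial distribution is the exact sampling distribution for dichotomous situations. However, as the sample size grows, the binomial distribution approaches a normal curve. Thus, as a rule of thumb, when np > 5 and nq > 5, we can use z scores and the normal curve to approximate the binomial. One consideration must be made, however. Since the binomial is a discrete distribution and the normal curve is a continuous distribution, we must apply a correction for continuity (Yates' correction). Below is the formula for the normal approximation of the binomial, in its standard form, where X is the observed number of successes, np is the mean of the binomial, and $\sqrt{npq}$ is its standard deviation:

$$z = \frac{|X - np| - .5}{\sqrt{npq}}$$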
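To see how close the approximation is, the exact binomial tail can be compared with the continuity-corrected normal tail. This is a minimal Python sketch under stated assumptions: the function names are illustrative, the example values (n = 20, p = .5, 14 or more successes) are hypothetical, and the correction is written in its upper-tail form, z = ((r - .5) - np) / sqrt(npq).

```python
from math import comb, sqrt
from statistics import NormalDist

def binomial_pmf(r, n, p):
    """Exact probability of exactly r successes in n trials."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def upper_tail_exact(r, n, p):
    """Exact P(X >= r), summed from the binomial formula."""
    return sum(binomial_pmf(k, n, p) for k in range(r, n + 1))

def upper_tail_normal(r, n, p):
    """Normal approximation to P(X >= r) with the continuity correction."""
    q = 1 - p
    z = ((r - 0.5) - n * p) / sqrt(n * p * q)
    return 1 - NormalDist().cdf(z)

print(binomial_pmf(2, 3, 0.5))  # 0.375, the three-coin result

# Hypothetical example: 14 or more heads in 20 fair flips (np = nq = 10 > 5).
n, p, r = 20, 0.5, 14
print("exact: ", upper_tail_exact(r, n, p))
print("normal:", upper_tail_normal(r, n, p))
```

Subtracting .5 moves the cutoff to the lower edge of the bar for r, so the area under the continuous curve covers the whole discrete bar.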
Chi Square Distributions

Recall that the binomial
distribution can be used for only one variable with
mutually exclusive and exhaustive, dichotomous outcomes [Y=1,2].
What happens, however, if the possible outcomes of y
are still categorical but have more than two classes? That
is, suppose we have Y = 1, 2, 3, ..., k; what do we do? Just as the binomial is the exact sampling distribution for the Y = 1, 2 situation, the multinomial is the exact sampling distribution for the Y = 1, 2, 3, ..., k case. And, just as we can use a normal curve to approximate the binomial, we can use the chi square sampling distribution (along with the statistic you calculate from your data) to approximate the multinomial.
A small technical distinction: Chi Square is a continuous, theoretical sampling distribution. The sampling distribution of your statistic is discrete. Chi square approximates the sampling distribution of the discrete statistic just as the normal curve approximates the discrete binomial.
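The relationship between the exact multinomial and its chi square approximation can be illustrated with a small example. The sketch below is an illustration under several assumptions that are not in the original notes: it uses the Pearson statistic, the sum of (observed - expected)^2 / expected; it is limited to k = 3 categories; the observed counts are hypothetical; and it relies on the fact that for 2 degrees of freedom the chi square upper-tail probability is exactly exp(-x/2).

```python
from math import exp, factorial

def multinomial_pmf(counts, probs):
    """Exact multinomial probability of one particular set of counts."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)  # exact integer division at every step
    prob = float(coef)
    for c, pr in zip(counts, probs):
        prob *= pr**c
    return prob

def pearson_statistic(counts, probs):
    """Sum over categories of (observed - expected)^2 / expected."""
    n = sum(counts)
    return sum((c - n * pr) ** 2 / (n * pr) for c, pr in zip(counts, probs))

def exact_p_value(observed, probs):
    """Exact P(statistic >= observed statistic), k = 3 categories only."""
    n = sum(observed)
    cutoff = pearson_statistic(observed, probs)
    total = 0.0
    for a in range(n + 1):
        for b in range(n - a + 1):
            counts = (a, b, n - a - b)
            # Small tolerance so ties with the cutoff are counted.
            if pearson_statistic(counts, probs) >= cutoff - 1e-12:
                total += multinomial_pmf(counts, probs)
    return total

def chi_square_p_value_df2(statistic):
    """Continuous chi square upper tail for df = 2 (k - 1 with k = 3)."""
    return exp(-statistic / 2)

observed = (9, 6, 3)         # hypothetical counts, n = 18
probs = (1 / 3, 1 / 3, 1 / 3)  # equal expected proportions

stat = pearson_statistic(observed, probs)
print("statistic:     ", stat)
print("exact p:       ", exact_p_value(observed, probs))
print("chi square p:  ", chi_square_p_value_df2(stat))
```

The exact value comes from a finite list of possible count tables, so it moves in steps, while the chi square value comes from a smooth curve; the two agree more closely as the expected counts grow.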