Part I: Probability & Hypothesis Testing
Intermediate Probability

 
 
[Figure: 2 x 2 table of the number of male and female students scoring above or below 1100 on the GRE]

There are three basic types of probability with which we will be concerned: marginal, joint, and conditional. To best demonstrate these, consider the following example. The chart simply shows the number of individuals, male or female, who achieved a GRE score either above or below 1100.

The first type of probability is the marginal probability. This involves the numbers in the margins and the total number of students. For example, asking the question "What is the probability that a student scored above 1100?" involves the following: p(>1100) = 100/300 = 1/3. In effect, taking the marginal frequency and dividing by the total number of individuals will give you the marginal probability.

The second type of probability, joint probability, involves the numbers within particular cells. Thus, joint probability concerns the intersection between two classes. For example, the question "What is the probability of being male and scoring above 1100?" can be answered by the following: p(male & >1100) = 40/300 = 2/15. The joint probability, then, can be determined by dividing the number of individuals within the cell by the total number of individuals.

Lastly, conditional probability [p(A/B), read "the probability of A given B"] is used when you know that one event has occurred and want the probability of another. For example, asking "What is the probability that a student will score higher than 1100 given that the student is male?" can be answered by the following: p(>1100 / male) = (40/300) / (140/300) = 40/140 = 2/7. Thus, the conditional probability is simply the joint probability divided by the marginal probability of the given event. Note that the part that follows the "given" is always on the bottom of the fraction.
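
The three calculations above can be checked with a short Python sketch. The text quotes only some of the counts (300 students, 100 above 1100, 140 males, 40 males above 1100); the remaining cells are inferred here from those totals and are an assumption, not a copy of the original chart. The names `counts`, `marginal`, `joint`, and `conditional` are illustrative only.

```python
from fractions import Fraction

# Cell counts for the 2 x 2 table. Only 40 (male, above) is quoted directly;
# the other three cells are inferred from the stated totals
# (300 students, 100 above 1100, 140 males).
counts = {
    ("male", "above"): 40,
    ("male", "below"): 100,
    ("female", "above"): 60,
    ("female", "below"): 100,
}
total = sum(counts.values())  # 300


def marginal(sex=None, score=None):
    """Marginal probability: a margin total divided by the grand total."""
    n = sum(c for (s, g), c in counts.items()
            if (sex is None or s == sex) and (score is None or g == score))
    return Fraction(n, total)


def joint(sex, score):
    """Joint probability: one cell count divided by the grand total."""
    return Fraction(counts[(sex, score)], total)


def conditional(score, given_sex):
    """p(score / sex): joint probability over the marginal of the given event."""
    return joint(given_sex, score) / marginal(sex=given_sex)


print(marginal(score="above"))        # 1/3
print(joint("male", "above"))         # 2/15
print(conditional("above", "male"))   # 2/7
```

Exact fractions are used instead of floats so that the results match the hand calculations digit for digit.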

The notion of conditional probability is a very important one, perhaps one of the most useful in probability theory for research purposes. It is frequently the case that the occurrence of some event will affect, in some way, the occurrence of another event. In the above example, it may well be the case that the probability of scoring above 1100 depends, in some way, on the sex of the individual. This concern becomes central to the calculation of probabilities for event classes.

 
 
The concept of conditional probability leads us to consider two other terms: independence and dependence. Independence between two events exists when the occurrence of the first event does not change the probability of the occurrence of the second event. Thus, p(A) = p(A/B). For example, suppose that we know that within some sample of neurotic patients, the spontaneous recovery rate is 50% [p(recovery) = .50] and that now we want to administer psychotherapy to this sample. If independence exists between therapy and recovery [p(recovery) = p(recovery/therapy)], then we should not expect the occurrence of therapy in our sample to change the rate of recovery. On the other hand, dependence exists if the occurrence of one event does change the probability of a second event. If therapy had some beneficial impact in the sample, we might find that p(recovery) = .50 and p(recovery/therapy) = .80. In this case, p(A) does not equal p(A/B).
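
As a rough illustration of this comparison, the sketch below checks whether p(A) equals p(A/B), first with the GRE figures quoted earlier (100 of 300 students above 1100; 40 of 140 males above 1100) and then with the therapy figures. The helper `is_independent` is a hypothetical name. Exact equality is only sensible here because the probabilities are exact fractions; this is a sketch of the definition, not a procedure for sample data.

```python
from fractions import Fraction


def is_independent(p_a, p_a_given_b):
    """Events are independent when p(A) equals p(A/B)."""
    return p_a == p_a_given_b


# GRE example: A = scoring above 1100, B = being male.
p_above = Fraction(100, 300)
p_above_given_male = Fraction(40, 140)
print(is_independent(p_above, p_above_given_male))     # False: 1/3 != 2/7

# Therapy example: p(recovery) = .50 versus p(recovery/therapy) = .80.
print(is_independent(Fraction(1, 2), Fraction(4, 5)))  # False

# If therapy had no effect, both probabilities would be .50.
print(is_independent(Fraction(1, 2), Fraction(1, 2)))  # True
```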

The principle of conditional probability can be used to solve for intersections. Remember that the formula for conditional probability is as follows:

p(A/B) = p(A & B) / p(B)

Multiplying both sides by p(B) gives the intersection: p(A & B) = p(B) * p(A/B).

Note that when the events are independent (i.e., the occurrence of B does not change the probability of A), p(A) = p(A/B); thus, the marginal probability can be substituted for the conditional probability.

In general, then, intersections can be found using two separate multiplication rules, each of which is applicable to situations where the event classes have a particular relationship:

  • When A and B are Dependent: p(A & B) = p(A) * p(B/A) = p(B) * p(A/B).

For example, what is the probability of drawing two aces from a deck of cards when there is no replacement after each draw? In this case, the events are clearly dependent, since the probability of drawing a second ace is altered by the fact that an ace was drawn on the first try and not placed back into the deck. Therefore, this can be found by: p(Ace on First Draw) * p(Ace on Second Draw / Ace on First Draw) = 4/52 * 3/51 = 1/221.

  • When A and B are Independent: p(A & B) = p(A) * p(B).

Now consider the probability of drawing two aces when there is sampling with replacement. These two events are clearly independent because the first card is replaced after being drawn. In other words, p(Ace on Second Draw) = p(Ace on Second Draw / Ace on First Draw). Therefore, this is found by: p(Ace on First Draw) * p(Ace on Second Draw) = 4/52 * 4/52 = 1/169.
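
Both multiplication rules can be cross-checked by brute force. The sketch below, which assumes a standard 52-card deck with four aces, computes each product and then counts the ordered two-card draws directly. The names `deck` and `prob_two_aces` are illustrative.

```python
from fractions import Fraction
from itertools import permutations, product

# A 52-card deck in which only "ace" versus "not ace" matters.
deck = ["A"] * 4 + ["x"] * 48

# Multiplication rules.
without_replacement = Fraction(4, 52) * Fraction(3, 51)   # dependent draws
with_replacement = Fraction(4, 52) * Fraction(4, 52)      # independent draws


def prob_two_aces(draws):
    """Fraction of equally likely ordered two-card draws that are both aces."""
    draws = list(draws)
    hits = sum(1 for first, second in draws if first == "A" and second == "A")
    return Fraction(hits, len(draws))


# Index the cards so that each physical card is distinct.
cards = range(len(deck))
# Without replacement: the same card cannot be drawn twice.
no_repl = prob_two_aces((deck[i], deck[j]) for i, j in permutations(cards, 2))
# With replacement: the same card may be drawn twice.
repl = prob_two_aces((deck[i], deck[j]) for i, j in product(cards, repeat=2))

print(without_replacement, no_repl)   # 1/221 1/221
print(with_replacement, repl)         # 1/169 1/169
```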

 
 
Since the calculation of probability depends on knowing the total number of possible events, it is important at this point to discuss five counting rules for determining the size of the sample space:
  1. If any one of K mutually exclusive and exhaustive events (meaning that no two can occur together and that between them they cover every outcome in the sample space) can occur on each of N trials, then there are K^N different sequences that may result from a set of trials.
  2. If K1, K2, ..., KN are the numbers of distinct events that can occur on trials 1, 2, ..., N in a series, then the number of different sequences of N events that can occur is (K1)(K2)...(KN). This rule differs from the first in that each trial can have a different number of possible events.
  3. Permutations: The number of different ways that N distinct things can be arranged is N!, where N! = (N)(N-1)(N-2)...(1). (Note: 0! is defined to equal 1.) In other words, the number of permutations is the number of ways N objects can be arranged when taken N at a time.
  4. Order Permutations: The number of ways to select and arrange r objects from among N distinct objects is found by N! / (N-r)!. Order permutations are different from permutations in that they look at the number of ways N objects can be arranged when taken r at a time. In this case, the arrangement of two letters ("ab") is considered separate from other orderings of the same letters ("ba"). In other words, order matters!
  5. Combinations: The total number of ways of selecting r objects from among N distinct objects irrespective of order is N! / [(N-r)! r!]. In this case, the order of arrangement of two items does not matter ("ab" = "ba"). Combinations, therefore, are concerned with the number of ways objects can be selected when taken r at a time when order does not matter. Notice that the number of combinations is never larger than the number of order permutations, and is smaller whenever r is at least 2.
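
The five rules can be sketched in Python and checked against explicit enumeration. The examples chosen here (three coin tosses; a coin, a die, and a coin; the letters "abcd" taken two at a time) are illustrative assumptions, not part of the original text. `math.prod` requires Python 3.8 or later.

```python
import math
from itertools import combinations, permutations, product

items = "abcd"      # N = 4 distinct objects (illustrative)
n, r = len(items), 2

# Rule 1: K events on each of N trials gives K**N sequences (3 coin tosses).
rule1 = 2 ** 3
# Rule 2: differing numbers of events per trial multiply (coin, die, coin).
rule2 = math.prod([2, 6, 2])
# Rule 3: N distinct things can be arranged in N! ways.
rule3 = math.factorial(n)
# Rule 4: ordered selections of r from N: N! / (N-r)!
rule4 = math.factorial(n) // math.factorial(n - r)
# Rule 5: unordered selections of r from N: N! / [(N-r)! r!]
rule5 = math.factorial(n) // (math.factorial(n - r) * math.factorial(r))

# Cross-check each formula by listing the outcomes explicitly.
enumerated = [
    len(list(product("HT", repeat=3))),
    len(list(product("HT", range(1, 7), "HT"))),
    len(list(permutations(items))),
    len(list(permutations(items, r))),
    len(list(combinations(items, r))),
]

print([rule1, rule2, rule3, rule4, rule5])  # [8, 24, 24, 12, 6]
print(enumerated)                           # [8, 24, 24, 12, 6]
```

With N = 4 and r = 2, each of the 6 combinations corresponds to r! = 2 order permutations, which is why rule 4 gives 12 and rule 5 gives 6.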