THE IMPORTANCE OF DATA DISTRIBUTIONS   

    

A.  INTRODUCTION

        When data are collected to formulate or to test cause and effect
        hypotheses, there is typically some variability in the values that
        are measured. The range in values and how frequently specific
        values occur are important to a proper interpretation of data.

 

B.  FREQUENCY DISTRIBUTIONS

        There is an almost infinite variety of data frequency distributions,
        but we shall consider only three in our efforts to understand how 
        frequency distributions affect the interpretation of data. 
     
        1.  NORMAL DISTRIBUTIONS


             The normal distribution (sometimes called the "bell curve") is
             perhaps the best-known. Data are symmetrically distributed to
             either side of a central value, so the entire population can be
             represented equally well by the mean, the median, or the mode.
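             A minimal Python sketch of this point, using a small hand-made
             sample (hypothetical values, chosen only so that they are
             symmetric about 5) and the standard-library statistics module:

```python
import statistics

# Hypothetical sample, symmetric about the central value 5.
symmetric = [3, 4, 4, 5, 5, 5, 6, 6, 7]

mean_value = statistics.mean(symmetric)      # arithmetic average
median_value = statistics.median(symmetric)  # middle value of the sorted data
mode_value = statistics.mode(symmetric)      # most frequent value

# For symmetric, single-peaked data all three measures coincide.
print(mean_value, median_value, mode_value)
```

             Real measurements are rarely this tidy, so the three values
             would normally agree only approximately.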
 

        2.  SKEWED DISTRIBUTIONS

             Skewed distributions are similar to the normal distribution, but
             the data are not distributed symmetrically.  Extreme values in
             the longer tail pull the mean away from where most of the data
             lie, so the entire population is better represented by either
             the median or the mode than by the mean.
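             As an illustrative sketch (the sample below is hypothetical and
             was made up only to have a long tail of high values), Python's
             standard-library statistics module shows how the three measures
             separate for skewed data:

```python
import statistics

# Hypothetical right-skewed sample: most values are small, one is very large.
skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 29]

mean_value = statistics.mean(skewed)      # pulled upward by the value 29
median_value = statistics.median(skewed)  # middle of the sorted data
mode_value = statistics.mode(skewed)      # most frequent value

# The mean lies above the median, which lies above the mode.
print(mean_value, median_value, mode_value)
```

             Here the mean is larger than eight of the ten values, which is
             why it describes the bulk of the data poorly.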
    

        3.  UNIFORM DISTRIBUTIONS

             A uniform distribution means that there is no central value but
             that every value has the same likelihood of occurring.  In such
             cases, the mean, median, and mode do not describe a typical or
             most likely value.
            

C.  PROBABILITY DISTRIBUTIONS

        Probability distributions show the likelihood that a given value will
        occur as an increasing number of data points are collected.  The 
        best way to illustrate this is to contrast the probability distributions
        for two of the frequency distributions described above:

        1.  UNIFORM DATA

             When data have a uniform frequency distribution, it means that
             they are a result of random variations; and randomness means
             that no simple cause and effect relationship exists between the
             observed values and some other variable.

             Example:  probability of rolling a six with a six-sided die.
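
             The die example can be sketched in Python as follows.  The
             sketch assumes a fair die (an assumption, not something that
             can be checked from the code) and uses exact fractions so that
             the equal likelihood of every face is easy to see:

```python
from fractions import Fraction

faces = [1, 2, 3, 4, 5, 6]

# Assumption: the die is fair, so every face is equally likely.
probability = {face: Fraction(1, len(faces)) for face in faces}

p_six = probability[6]              # Fraction(1, 6)
total = sum(probability.values())   # the probabilities add up to 1

print(p_six, total)
```

             Because every face has the same probability, the distribution
             is flat and no face stands out as the most likely result.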

 

        2.  NORMAL DATA

             When data have a normal frequency distribution, it means that
             they are not the result of random variations, and certain values
             are more likely than others. This suggests that there is some
             reason (or cause) for the observed values (effects).

             Example:  probability of eating a meal during the day.

 

        Note that these two probability distributions have different shapes.

 

D.  EXAMPLE:  RECURRENCE INTERVALS

        1.  DEFINITION

             A recurrence interval is the average time period, usually in
             years, between "events" of the same magnitude (such as a
             Magnitude 7 earthquake or a 100-year flood).  But remember
             that average (or mean) values are meaningful only for data
             distributions that have a central tendency.

       

         2.  RANDOM EVENTS

             In this context, random events are those that occur without 
             any pattern to their sequence through time.  Therefore, the
             probability (or likelihood) of a truly random event occurring
             does not in any way depend on what occurred previously.
 

             a.  One-Year Probability

                   The probability that a random event will occur during 
                   any given year depends on the recurrence interval of 
                   that event (in years):

                   Risk or Probability  =   1 / (Recurrence Interval)
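
                   A short Python sketch of this relationship.  The function
                   name and the 250-year event are hypothetical and chosen
                   only for illustration:

```python
def annual_probability(recurrence_interval_years):
    """Return 1 / R.I., the chance of the event in any single year."""
    # Assumption: the recurrence interval is given in years and is at
    # least 1, so the result is a valid probability between 0 and 1.
    if recurrence_interval_years < 1:
        raise ValueError("recurrence interval must be at least 1 year")
    return 1.0 / recurrence_interval_years


p_flood = annual_probability(100)  # 100-year flood: 1% chance per year
p_event = annual_probability(250)  # hypothetical 250-year event: 0.4% per year
print(p_flood, p_event)
```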
 

              b.  Cumulative Probability 

                   The probability that a random event will occur during
                   any given time period depends on the time period of
                   interest and the recurrence interval of that event:

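                   Risk or Probability  =  1 - [1 - (1 / R.I.)]^n

                   where R.I. is the recurrence interval (in years) and n is
                   the length of the time period of interest (in years).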
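
                   A minimal Python sketch of the cumulative calculation,
                   assuming that n is the number of years in the time period
                   and that each year is independent of the others (the
                   random-event case described above).  The function name is
                   hypothetical:

```python
def cumulative_probability(recurrence_interval_years, n_years):
    """Chance of at least one event in n_years, for independent years."""
    if recurrence_interval_years < 1:
        raise ValueError("recurrence interval must be at least 1 year")
    if n_years < 0:
        raise ValueError("time period cannot be negative")
    p_one_year = 1.0 / recurrence_interval_years
    # (1 - p_one_year) ** n_years is the chance of no event in any year;
    # subtracting it from 1 gives the chance of at least one event.
    return 1.0 - (1.0 - p_one_year) ** n_years


p_30 = cumulative_probability(100, 30)    # about 0.26
p_100 = cumulative_probability(100, 100)  # about 0.63
print(round(p_30, 2), round(p_100, 2))
```

                   Under these assumptions a 100-year flood has roughly a
                   26% chance of occurring at least once in 30 years, and
                   it is not certain to occur even within 100 years.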

 

         3.  NON-RANDOM EVENTS 

             The probability of non-random events occurring during a
             given year or time period does depend on what happened
             previously (example:  normal data distribution).