DATA ANALYSIS:  CORRELATION

 

A.  BACKGROUND

       1.  CORRELATION

            Correlation means that two variables (sets of data) have some
            type of association with each other, such that as one variable
            increases, the other also increases (a positive correlation), or
            decreases (a negative correlation).              
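
            The strength and sign of a linear correlation are commonly
            summarized by the Pearson correlation coefficient r, which
            ranges from -1 to +1.  The sketch below is one minimal way to
            compute it; the function name and the data values are
            hypothetical and chosen only for illustration.

```python
import math

def pearson_r(x, y):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, for illustration only (not real measurements).
x = [1, 2, 3, 4, 5]
rises_with_x = [2.0, 4.1, 5.9, 8.2, 9.8]
falls_with_x = [9.9, 8.1, 6.2, 3.8, 2.1]

r_pos = pearson_r(x, rises_with_x)   # near +1: a positive correlation
r_neg = pearson_r(x, falls_with_x)   # near -1: a negative correlation
print(f"positive example: r = {r_pos:.3f}")
print(f"negative example: r = {r_neg:.3f}")
```

            A value of r near zero would indicate little or no linear
            association between the two variables.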
 

       2.  CAUSE AND EFFECT 

            It is tempting to assume that when two variables are correlated,
            one causes the other (i.e., the variables have a "cause and
            effect" relationship), but this is not always the case.

            The purpose of today's lecture is to learn how to establish cause
            and effect relationships from correlations, and why this can be
            a difficult task.
   
   

B.  TYPES OF CORRELATIONS

       In geology there are several different types of correlations that can
       be used to help establish cause and effect relationships:
 

       1.  SPATIAL PATTERNS

            Example:  the distribution of landslides and topography (U.S.)

            Spatial correlations could be coincidental, so there needs to be
            a reasonable causal mechanism that explains why the pattern
            reflects cause and effect.

 
       2.  TEMPORAL PATTERNS

            Example:  trends in mean sea level through time

            Note that time is not the cause of temporal trends, but temporal
            trends suggest that the variable which changes through time is
            not behaving randomly (i.e., there is a reason for the trend). 

 

       3.  PHYSICAL PROPERTIES

            Example:  sediment grain size and permeability

            Geologists look for correlations between the physical properties
            of earth materials when they already suspect that there is a reason
            for such a relationship to exist.  Thus the correlation becomes a
            test of the hypothesized cause and effect relationship.

 

       4.  PHYSICAL PROCESSES

            Example:  rainfall and surface water runoff

            Geologists look for correlations between earth processes when
            they suspect that there is a reason for such a relationship to exist.
            Thus the correlation becomes a test of the hypothesized cause
            and effect relationship.  

  

C.  DIFFICULTIES

       1.  COINCIDENTAL CORRELATION

            Just because two variables are correlated does not necessarily
            mean that one causes the other (e.g., the behavior of the stock
            market and its relationship to the winner of the NFL Super Bowl). 

            There always needs to be a reasonable causal mechanism that
            explains why the correlation reflects cause and effect.

               
       2.  APPARENT TRENDS

            Trends that are not statistically significant, either because they
            are too weak or because there are too few data, should not be
            used as evidence for cause and effect (e.g., trying to establish
            mean sea level trends with only one year of data).
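
            One way to check whether a trend could easily have arisen by
            chance is a permutation test, sketched below.  This is only
            one of several possible significance tests, used here for
            illustration; the function names and data are hypothetical.

```python
import itertools
import math

def pearson_r(x, y):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def permutation_p_value(x, y):
    """Exact two-sided permutation p-value for the correlation of x and y.

    Returns the fraction of all re-orderings of y whose |r| is at least
    as large as the observed |r|.  Only practical for small samples,
    because the number of re-orderings grows as n factorial.
    """
    observed = abs(pearson_r(x, y))
    count = 0
    total = 0
    for shuffled in itertools.permutations(y):
        total += 1
        # Small tolerance so floating-point ties are counted.
        if abs(pearson_r(x, shuffled)) >= observed - 1e-12:
            count += 1
    return count / total

# Hypothetical data, for illustration only (not real measurements).
few_x = [1, 2, 3]
few_y = [1.0, 2.0, 3.0]                      # a perfect line, but 3 points
more_x = [1, 2, 3, 4, 5, 6, 7]
more_y = [2.1, 2.9, 4.2, 4.8, 6.1, 7.2, 7.8]

p_few = permutation_p_value(few_x, few_y)
p_more = permutation_p_value(more_x, more_y)
print(f"3 points: p = {p_few:.3f}")
print(f"7 points: p = {p_more:.4f}")
```

            With only three points, even a perfect straight line gives
            p = 1/3, because two of the six possible orderings fit just
            as well.  The seven-point trend is far less likely to be a
            chance arrangement.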
 

       3.  CORRELATED EFFECTS

            Two variables might be correlated because both are effects of
            the same cause (e.g., the worldwide distributions of volcanoes
            and earthquakes are highly correlated, because both occur at
            subduction zone plate boundaries).
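
            The idea of a shared cause can be illustrated with a small
            simulation.  In the sketch below, two synthetic "effects"
            are each generated from the same synthetic "cause" plus
            independent noise.  All names, the noise level, and the
            random seed are assumptions made for illustration; the
            partial correlation formula used is the standard first-order
            one.

```python
import math
import random

def pearson_r(x, y):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_r(r_ab, r_ac, r_bc):
    """Correlation of a and b after removing the linear effect of c."""
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))

# Synthetic data: neither effect influences the other.
rng = random.Random(0)          # fixed seed so the run is repeatable
n = 2000
cause = [rng.gauss(0, 1) for _ in range(n)]
effect_a = [c + rng.gauss(0, 0.5) for c in cause]
effect_b = [c + rng.gauss(0, 0.5) for c in cause]

r_ab = pearson_r(effect_a, effect_b)
r_ac = pearson_r(effect_a, cause)
r_bc = pearson_r(effect_b, cause)
r_ab_given_cause = partial_r(r_ab, r_ac, r_bc)

print(f"effect A vs effect B:             r = {r_ab:.2f}")
print(f"same, controlling for the cause:  r = {r_ab_given_cause:.2f}")
```

            The two effects are strongly correlated with each other,
            yet the correlation largely disappears once the shared
            cause is accounted for.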

 

       4.  THRESHOLDS

            Cause and effect relationships may not become apparent until
            after a certain threshold has been reached (e.g., the effect of
            slope angle on landslide velocity).
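
            A threshold can hide a relationship if the data happen to
            fall below it.  The sketch below compares the correlation
            on either side of an assumed threshold.  The slope angles,
            velocities, and the implied threshold near 30 degrees are
            invented for illustration and are not real measurements.

```python
import math

def pearson_r(x, y):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical observations below the assumed threshold:
# essentially no movement, only small measurement scatter.
angle_below = [5, 10, 15, 20, 25, 30]
velocity_below = [0.02, 0.00, 0.01, 0.03, 0.00, 0.02]

# Hypothetical observations above the assumed threshold:
# velocity now rises with slope angle.
angle_above = [35, 40, 45, 50, 55]
velocity_above = [1.1, 2.0, 2.9, 4.2, 5.0]

r_below = pearson_r(angle_below, velocity_below)
r_above = pearson_r(angle_above, velocity_above)
print(f"below threshold: r = {r_below:.2f}")
print(f"above threshold: r = {r_above:.2f}")
```

            A study that sampled only the gentler slopes would see
            almost no correlation, even though slope angle strongly
            controls velocity once the threshold is exceeded.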

 

       5.  MULTIPLE FACTORS

            Relationships might be obscured by the fact that there is more
            than one factor controlling the "effect" (e.g., the occurrence of
            landslides is controlled by slope angle and cohesion).

 

       6.  NON-LINEAR CORRELATION

            Correlations that are not linear require more data to be defined
            accurately (e.g., river discharge and flood levels).
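
            A linear correlation coefficient can also understate a
            relationship that is curved.  The sketch below compares the
            Pearson coefficient with the Spearman rank coefficient,
            which measures only whether one variable consistently rises
            with the other.  The cubic relationship and the function
            names are assumptions chosen for illustration.

```python
import math

def pearson_r(x, y):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Rank values from 1 (smallest) upward; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_r(x, y):
    """Spearman rank correlation: Pearson r computed on the ranks."""
    return pearson_r(ranks(x), ranks(y))

# Hypothetical curved but steadily increasing relationship (y = x cubed).
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [v ** 3 for v in x]

r_linear = pearson_r(x, y)
r_rank = spearman_r(x, y)
print(f"Pearson  r = {r_linear:.3f}")
print(f"Spearman r = {r_rank:.3f}")
```

            Here y always increases with x, so the rank correlation is
            perfect, while the Pearson coefficient is lower because a
            straight line does not describe the curve.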


 

D.  ESTABLISHING CAUSE AND EFFECT

       In summary, to establish that a cause and effect relationship exists
       between correlated variables, one should ask the following:
 

       1.  ARE THERE ENOUGH DATA?

            Correlations based on two or three points and correlations that
            are not statistically significant do not establish cause and effect.

        
            

       2.  IS THERE A CAUSAL MECHANISM?

            A correlation is more suggestive of cause and effect if there is a
            causal mechanism which can explain the relationship.
 

 

       3.  HAS THE MODEL BEEN TESTED?

            A correlation is more suggestive of cause and effect if it can be
            used to predict a future event or condition. 

            However, one must always be cautious about extrapolation!
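
            The danger of extrapolation can be shown with a small
            numerical sketch.  A straight line is fitted by least
            squares to data from a limited range and then used to
            predict well outside that range.  The underlying relation
            y = x squared and the function names are assumptions made
            for illustration.

```python
def fit_line(x, y):
    """Return the least-squares slope and intercept of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def true_value(v):
    """Hypothetical 'true' relationship, unknown to the modeler."""
    return v ** 2

# Observations are available only for x between 1 and 5.
x_obs = [1, 2, 3, 4, 5]
y_obs = [true_value(v) for v in x_obs]

slope, intercept = fit_line(x_obs, y_obs)

def predict(v):
    """Prediction from the fitted straight line."""
    return slope * v + intercept

inside = predict(3)     # within the observed range
outside = predict(10)   # far outside the observed range
print(f"x = 3:  predicted {inside:.0f}, true {true_value(3)}")
print(f"x = 10: predicted {outside:.0f}, true {true_value(10)}")
```

            Within the observed range the line is off by a small
            amount (11 versus 9), but at x = 10 it predicts 53 where
            the assumed true value is 100.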

             

       EXAMPLE:  The Rocky Mountain Arsenal

       When the arsenal began injecting liquid wastes into the subsurface
       in 1962, there was a sudden increase in the number of earthquakes
       near Denver.  Geologists suspected that high pressures associated
       with waste injection might be causing these earthquakes, and they
       were able to test this hypothesis to establish a cause and effect
       relationship.