Discussion of Z or Standard Scores

Any set of scores for which you can calculate a mean and standard deviation can be changed into standard scores, also called "Z scores." Subtract the mean from each score. The result of the subtraction is called the score's "deviation." All scores greater than the mean will have a positive deviation, and all scores less than the mean will have a negative deviation. Keep track of the negative signs! Divide each deviation by the standard deviation to obtain the Z score. As explained in the video lessons, usually about 2/3 of the scores will be within one standard deviation of the mean, so they will have Z scores from about -1 to about +1. It is possible that all the scores will be within one standard deviation, or even that they are all equal to the mean, as in the set of scores: 5,5,5,5,5,5,5,5. (In that last case the standard deviation is 0, so Z scores cannot be calculated, because we cannot divide by zero.)
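For readers who like to check the arithmetic by computer, here is a minimal Python sketch of the Z score calculation. It is only an illustration, not part of the course materials: the function name z_scores and the eight scores are made up for this example, and the standard deviation is taken with Python's statistics.pstdev, which divides by N.

```python
import statistics

def z_scores(scores):
    """Return one Z score per score: (score - mean) / standard deviation."""
    mean = statistics.mean(scores)
    sd = statistics.pstdev(scores)  # population standard deviation (divides by N)
    if sd == 0:
        # Every score equals the mean, so there is nothing to divide by.
        raise ValueError("standard deviation is 0; Z scores are undefined")
    return [(x - mean) / sd for x in scores]

# Made-up scores for illustration: the mean is 5 and the standard deviation is 2.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(z_scores(scores))
```

Scores below the mean come out with negative Z scores, scores above the mean with positive ones, and a score equal to the mean gets a Z score of 0.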

In my basic statistics course, I find it helpful for the students if we always calculate the standard deviation with the following steps:

  1. Calculate the mean
  2. Subtract the mean from each score (resulting in the deviation for each score, mentioned above and needed to calculate the score's corresponding Z score)
  3. Square each deviation (that is, multiply each deviation by itself)
  4. Sum these squared deviations
  5. Divide that "sum of squares" by N, the number of scores (resulting in the "variance")
  6. Take the square root of that variance (that is, find a number that equals the variance when the number is multiplied by itself). This square root is the standard deviation.
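
The six steps above can be followed one at a time in a short Python sketch. The scores are made up for illustration, and the variable names are my own choices rather than anything required by the course.

```python
import math

# Made-up scores for illustration only.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(scores)

mean = sum(scores) / n                       # Step 1: calculate the mean
deviations = [x - mean for x in scores]      # Step 2: subtract the mean from each score
squared = [d * d for d in deviations]        # Step 3: square each deviation
sum_of_squares = sum(squared)                # Step 4: sum the squared deviations
variance = sum_of_squares / n                # Step 5: divide by N to get the variance
standard_deviation = math.sqrt(variance)     # Step 6: take the square root

print(mean, variance, standard_deviation)    # prints 5.0 4.0 2.0
```

Keeping the list of deviations from step 2 is handy, since dividing each one by the standard deviation gives that score's Z score.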

Why bother with all of this? Because it is required in the course, Silly! A better reason is that the transformation of scores into Z scores allows for better examination of oddly divergent scores, called "outliers" [out-lie-ers]. Any score too far from the mean may need checking or further verification or investigation. In this course, we define an outlier as any score more than 2 standard deviations from the mean, that is, a score with a Z score outside the range -2 to +2. Sometimes it is better to use 2.4 or 3 standard deviations as the warning tracks for outliers, but we will use 2.
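
As a rough illustration of the outlier rule, the Python sketch below flags any score whose Z score falls outside -2 to +2. The function name find_outliers, the cutoff parameter, and the scores are all assumptions made up for this example; the standard deviation again divides by N.

```python
import statistics

def find_outliers(scores, cutoff=2.0):
    """Return the scores whose Z score is below -cutoff or above +cutoff."""
    mean = statistics.mean(scores)
    sd = statistics.pstdev(scores)  # population standard deviation (divides by N)
    if sd == 0:
        return []  # all scores equal the mean, so none can be an outlier
    return [x for x in scores if abs((x - mean) / sd) > cutoff]

# Made-up scores: seven values close together and one far from the rest.
scores = [10, 12, 11, 13, 12, 11, 12, 30]
print(find_outliers(scores))  # prints [30]
```

Changing cutoff to 2.4 or 3 gives the stricter warning tracks; with these scores the Z score of 30 is about 2.6, so it is flagged at 2 and 2.4 but not at 3.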

Note that the process of calculating the standard deviation of a set of scores involves dividing by N. Statisticians have shown that dividing by N-1 (an adjustment often called Bessel's correction) gives a better estimate of the standard deviation of a parent population when you only have figures from a sample. Therefore, some courses, books, teachers and calculators automatically divide by N-1. In my course, we do not do that. We just divide by N. In Excel, you can obtain the standard deviation with N as the divisor by using the function =stdevp(firstcell:lastcell). The "p" stands for the word population. (Newer versions of Excel also offer the same calculation under the name =stdev.p.) We divide by N when we have the entire population at hand and are not trying to estimate the population value from a sample.
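
To see how much the choice of divisor matters, here is a small Python sketch comparing the two. It uses Python's standard statistics module, where pstdev divides by N and stdev divides by N-1. The scores are made up for illustration.

```python
import statistics

# Made-up scores for illustration only.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

pop_sd = statistics.pstdev(scores)    # divides by N (what we use in this course)
sample_sd = statistics.stdev(scores)  # divides by N-1 (the sample estimate)

print("Dividing by N:  ", pop_sd)
print("Dividing by N-1:", sample_sd)
```

The N-1 version is always a little larger, and the gap shrinks as the number of scores grows. If your calculator or software gives an answer slightly bigger than the one you worked out by hand, this is the first thing to check.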