Before introducing the mean, there is another term that you should know: a set of data is called a distribution. In the remainder of these notes, the term distribution will be used to describe a set of numbers that have been collected and that represent quantified observations of some phenomenon. The mean of a distribution is calculated in two steps:

- Add up all of the values (e.g., test scores)
- Divide the sum in Step 1 by the number of values (e.g., number of tests)

The mathematical formula looks like this:

$$\bar{X} = \frac{\sum X}{n}$$

The first step, adding all the numbers, occurs so frequently in statistical calculations that statisticians use the uppercase Greek letter sigma, ∑, to represent summation. To link the symbol with the process, just remember the S in sigma and the S in summation. The letter n represents the number of values that we're adding. The symbols used for the mean include M and $\bar{X}$, pronounced X-bar.

Now let's use the formula with a small distribution of test scores. Even though this example will only involve a few scores, the procedure for calculating the mean is the same no matter how many scores there are.

Scores: 7, 5, 8, 9, 7, 6, 9, 8, 7, 4

Step 1. Add the values. ∑X = 7 + 5 + 8 + 9 + 7 + 6 + 9 + 8 + 7 + 4 = 70

Step 2. Divide by the number of values. $\bar{X} = 70/10 = 7$

If you have Excel, you can use it to calculate the mean. First, enter the numbers in a column - one number in each cell. Let's say that the numbers are in cells A1 through A10. Then, enter the following formula in cell A11 (or any other empty cell): = average(A1:A10)
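For readers without Excel, the same two steps can be sketched in Python. This is only an illustration using the ten scores from the example; the variable names are chosen here and are not part of the original notes.

```python
# Test scores from the worked example above.
scores = [7, 5, 8, 9, 7, 6, 9, 8, 7, 4]

# Step 1: add up all of the values (the summation, sigma X).
total = sum(scores)

# Step 2: divide the sum by the number of values, n.
n = len(scores)
mean = total / n

print(f"Sum = {total}, n = {n}, mean = {mean}")
```

Python's standard library also provides `statistics.mean`, which should give the same result for these scores.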

That should be fairly straightforward. So, why isn't the mean sufficient for describing central tendency for all distributions? There are two reasons. The first reason is that the first step in the process, namely adding the values, doesn't always make sense. Think of a variable that represents ethnicity. Different ethnic classifications are arbitrarily assigned different numerical values. African American may be assigned a 1, while Asian is assigned a 3, and Caucasian is assigned a 5. Because these assignments are arbitrary, the actual values, as quantities, mean nothing. So adding them together would make no sense. Calculating a mean score for ethnicity would result in a meaningless number. Means make no sense for variables measured at a nominal level.
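The point about arbitrary codes can be illustrated with a short Python sketch. The group of six people and the second coding scheme below are hypothetical, invented only for this illustration; the first coding scheme uses the values mentioned in the text. The same group produces a different "mean" under each coding, which is why the number carries no meaning.

```python
# Hypothetical group of six people, recorded by category label (illustrative only).
group = ["African American", "Asian", "Caucasian", "Asian", "Caucasian", "Caucasian"]

# Two equally arbitrary numeric codings of the same categories.
coding_a = {"African American": 1, "Asian": 3, "Caucasian": 5}  # codes from the text
coding_b = {"African American": 5, "Asian": 1, "Caucasian": 3}  # hypothetical alternative


def coded_mean(labels, coding):
    """Mean of the numeric codes assigned to the labels."""
    codes = [coding[label] for label in labels]
    return sum(codes) / len(codes)


mean_a = coded_mean(group, coding_a)
mean_b = coded_mean(group, coding_b)

# The group has not changed, yet the two "means" disagree.
print(mean_a, mean_b)
```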

The second reason involves the mean's sensitivity to extreme values. What does this refer to? Recall the bricks and board example (or think of a teeter-totter). Placing a new brick near the middle (the balancing point) has only a small effect on the balance, but placing a new brick at one of the far ends has a great effect. This effect is larger when there is a small number of bricks. The brick on the far end is called an outlier. Outliers pull the mean toward themselves and away from where most of the values lie, which makes the mean a misleading measure of the center. An alternative measure of central tendency, the median, is needed in situations with outliers. It is found in two steps:

- Sort the values from smallest to largest
- If there is an odd number of values, choose the middle value. If there is an even number of values, choose the middle two and average them - add them and divide by 2.

Step 1. Sort the values. 4, 5, 6, 7, 7, 7, 8, 8, 9, 9

Step 2. Because there are 10 scores, pick the middle two scores, which are 7 and 7. Add them, 7 + 7 = 14, and then divide by 2, 14/2 = 7.

If you have Excel, you can use it to calculate the median. First, enter the numbers in a column - one number in each cell. Let's say that the numbers are in cells A1 through A10. Then, enter the following formula in cell A11 (or any other empty cell): = median(A1:A10)
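The two steps for the median can also be sketched in Python. This is a minimal illustration with the same ten scores; the variable names are assumptions made for this example.

```python
import statistics

scores = [7, 5, 8, 9, 7, 6, 9, 8, 7, 4]

# Step 1: sort the values from smallest to largest.
ordered = sorted(scores)
n = len(ordered)

# Step 2: pick the middle value (odd n) or average the middle two (even n).
middle = n // 2
if n % 2 == 1:
    median = ordered[middle]
else:
    median = (ordered[middle - 1] + ordered[middle]) / 2

print(ordered)
print(median)

# The standard library function should agree with the manual steps.
print(statistics.median(scores))
```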

The median is insensitive to extreme values. In the example, you could replace one of the 9s with 100 and the median would not change. Likewise, you could replace the 4 with a 0 and the median would not change either.

There are situations when neither the mean nor the median is appropriate to use. Because the first step in determining the median requires that the values be sorted, the measurement level of the variable must be at the ordinal level or higher. As with the mean, reporting a median score for a nominal variable is meaningless. In those situations the mode, the most frequently occurring value, is used instead.
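The insensitivity of the median to extreme values, described above, can be checked with a short Python sketch. It replaces one of the 9s with 100, as in the text, and compares the mean and the median before and after; the variable names are assumptions made for this illustration.

```python
import statistics

scores = [7, 5, 8, 9, 7, 6, 9, 8, 7, 4]

# Replace one of the 9s with 100, as described in the text.
with_outlier = scores.copy()
with_outlier[with_outlier.index(9)] = 100

# Compare both measures of center before and after the change.
print("original:    ", statistics.mean(scores), statistics.median(scores))
print("with outlier:", statistics.mean(with_outlier), statistics.median(with_outlier))
```

The median stays at 7 in both cases, while the mean rises from 7 to 16.1, a value larger than nine of the ten scores.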

- List the unique values that occur.
- Tally the number of occurrences for each value - the one with the largest tally is the mode.

Step 1. List the unique values that occur.

| Values |
| --- |
| 4 |
| 5 |
| 6 |
| 7 |
| 8 |
| 9 |

Step 2. Tally the occurrences.

| Values | Tally (frequency) |
| --- | --- |
| 4 | 1 |
| 5 | 1 |
| 6 | 1 |
| 7 | 3 |
| 8 | 2 |
| 9 | 2 |

So, what is the mode? Is it 7 or 3? The mode is the most frequently occurring value, which is 7. The number of times that the modal value occurs is 3. So, in this example, the mode is 7. Notice that for this distribution the mean, median, and mode are all the same. This happens in certain situations but in most cases these three descriptive statistics will be different. Notice that in this distribution there is only one mode. Sometimes there are two modes or even more. In the context of education, you may have heard a group of students referred to as being bimodal, which would indicate that there are two distinct groups - perhaps good readers and struggling readers, for example.

If you have Excel, you can use it to calculate the mode. First, enter the numbers in a column - one number in each cell. Let's say that the numbers are in cells A1 through A10. Then, enter the following formula in cell A11 (or any other empty cell): = mode(A1:A10)
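The tally procedure for the mode can be sketched in Python as well. This is one possible illustration using `collections.Counter` from the standard library; it returns a list of modes so that distributions with two or more modes are handled too. The variable names are assumptions made for this example.

```python
from collections import Counter

scores = [7, 5, 8, 9, 7, 6, 9, 8, 7, 4]

# Steps 1 and 2: list the unique values and tally how often each occurs.
tally = Counter(scores)
for value in sorted(tally):
    print(value, tally[value])

# The mode is the value with the largest tally; the tally itself is its frequency.
highest = max(tally.values())
modes = sorted(value for value, count in tally.items() if count == highest)

print("mode(s):", modes, "frequency:", highest)
```

For these scores the list contains a single mode, 7, with a frequency of 3.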

- There are three main measures of central tendency: mean, median, and mode.
- Each measure attempts to provide a summary for a distribution, namely where the center occurs.
- Which measure to use depends on the specific characteristics of the distribution.
- The only appropriate statistic for a nominal variable is the mode.
- Medians are used for ordinal variables and interval or ratio variables with outliers.
- The mean is used with interval or ratio variables without outliers.
- Each measure has a particular process for determining its value.
- The mode is determined from a tally of values.
- The median is determined from a sorted list of values.
- The mean is calculated by adding the values and dividing by the number of values.

The next section addresses the need to describe the spread of scores in addition to their central tendency. Reporting both a measure of center and a measure of spread gives a more complete summary of a distribution.

- Locate the maximum value (Max) and the minimum value (Min).
- Subtract the minimum value from the maximum value. Range = Max - Min.

Step 1. Max = 9 and Min = 4

Step 2. Range = 9 - 4 = 5

If you have Excel, you can use it to calculate the range. First, enter the numbers in a column - one number in each cell. Let's say that the numbers are in cells A1 through A10. Then, enter the following formula in cell A11 (or any other empty cell): = max(A1:A10) - min(A1:A10)
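The same range calculation can be sketched in Python, again with the ten scores from the example. The variable names are assumptions made for this illustration.

```python
scores = [7, 5, 8, 9, 7, 6, 9, 8, 7, 4]

# Step 1: locate the maximum and minimum values.
maximum = max(scores)
minimum = min(scores)

# Step 2: Range = Max - Min.
value_range = maximum - minimum

print(f"Max = {maximum}, Min = {minimum}, Range = {value_range}")
```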

First, a bit of terminology. What is a deviation and what might make it standard? A deviation is the difference between a value in a distribution and the mean. Every value in the distribution has a deviation associated with it. Because the positive and negative deviations cancel exactly, averaging the deviations always results in a value of 0, so a simple average deviation is uninformative. Instead, statisticians use a process of squaring the deviations, averaging them, and then taking the square root of the result to generate a standard deviation. [Squaring is multiplying a number by itself - 3 squared is 3 × 3, or 9. Taking the square root is the opposite operation - the square root of 9 is 3. Perfect squares include 4, 9, 16, 25, 36, 49, 64, 81, and 100, among infinitely many others; for numbers that are not perfect squares, it is handy to have a spreadsheet program or a calculator.] Here are the steps and the formula for the standard deviation. Refer back to these steps to understand the example presented after the formula.

- Calculate the mean, $\bar{X}$.
- Subtract the mean from each value - these are the deviations.
- Square the deviations.
- Sum the squared deviations.
- Divide the sum of squared deviations by n-1 - this is called the variance.
- Calculate the square root of the variance - this is the standard deviation.

The mathematical formula looks like this:

$$\text{standard deviation} = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$$

If you have Excel, you can use it to calculate the standard deviation. First, enter the numbers in a column - one number in each cell. Let's say that the numbers are in cells A1 through A10. Then, enter the following formula in cell A11 (or any other empty cell): = stdev(A1:A10)

Scores: 7, 5, 8, 9, 7, 6, 9, 8, 7, 4

Step 1. The mean that we calculated earlier is $\bar{X} = 7$.

Step 2. The deviations are shown in the second column below.

Step 3. The squared deviations are shown in the third column below.

Step 4. Summing the squared deviations gives 24, as shown in the third to last cell of the third column.

Step 5. Dividing 24 by n-1 (10-1=9) gives 2.67, as shown in the second to last cell of the third column.

Step 6. Calculating the square root of 2.67 gives 1.63, as shown in the last cell of the third column.

| $X$ | $X - \bar{X}$ | $(X - \bar{X})^2$ |
| --- | --- | --- |
| 7 | 7 - 7 = 0 | 0 × 0 = 0 |
| 5 | 5 - 7 = -2 | (-2) × (-2) = 4 |
| 8 | 8 - 7 = 1 | 1 × 1 = 1 |
| 9 | 9 - 7 = 2 | 2 × 2 = 4 |
| 7 | 7 - 7 = 0 | 0 × 0 = 0 |
| 6 | 6 - 7 = -1 | (-1) × (-1) = 1 |
| 9 | 9 - 7 = 2 | 2 × 2 = 4 |
| 8 | 8 - 7 = 1 | 1 × 1 = 1 |
| 7 | 7 - 7 = 0 | 0 × 0 = 0 |
| 4 | 4 - 7 = -3 | (-3) × (-3) = 9 |
| sum of squared deviations | | 24 |
| sum divided by n-1 (9) | | 2.67 |
| square root (standard deviation) | | 1.63 |
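The six steps can be followed in a short Python sketch, which may be easier to experiment with than the hand calculation. The variable names are assumptions made for this illustration. The sketch also prints the sum of the raw deviations to show that they cancel to 0, which is why they are squared before averaging.

```python
import math
import statistics

scores = [7, 5, 8, 9, 7, 6, 9, 8, 7, 4]
n = len(scores)

mean = sum(scores) / n                    # Step 1: the mean
deviations = [x - mean for x in scores]   # Step 2: subtract the mean from each value
squared = [d ** 2 for d in deviations]    # Step 3: square the deviations
sum_of_squares = sum(squared)             # Step 4: sum the squared deviations
variance = sum_of_squares / (n - 1)       # Step 5: divide by n-1 (the variance)
std_dev = math.sqrt(variance)             # Step 6: square root (standard deviation)

print("sum of raw deviations:", sum(deviations))
print("sum of squared deviations:", sum_of_squares)
print("variance:", round(variance, 2))
print("standard deviation:", round(std_dev, 2))
```

The standard library function `statistics.stdev` also divides by n-1, so it should agree with the value computed step by step.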

Just to reiterate, no one calculates the standard deviation by hand following the steps listed here; however, understanding that the standard deviation is a measure of the amount of variation (or spread) in a distribution is very important. For example, if you have two standard deviations for similar sets of data and one is quite a bit larger than the other, the means of the two distributions do not summarize them equally well. The mean with the smaller standard deviation is a better summary of its distribution than the mean with the larger standard deviation. Think of the standard deviation as a quality-control measure for the mean. In fact, means should never be reported without accompanying standard deviations.

Here is one way to remember what the standard deviation tells you. Suppose you are attending a conference and are offered your choice of two dorm rooms for your housing. The only difference between the two rooms has to do with the plumbing. Because it is farther from the water heater, the shower water temperature in one room fluctuates more than in the other. Let's say that the hot water in Room 652 has a mean temperature of 100 degrees and a standard deviation of 6 degrees. Room 247 also has a mean temperature of 100 degrees but its standard deviation is only 2 degrees. Assuming that the distribution of water temperatures resembles a bell-shaped curve (much more about this in the weeks ahead), the person showering in Room 247 will experience hot water temperatures between 96 and 104 degrees about 95% of the time - generally tolerable. Consider the person showering in Room 652. The hot water temperature in that shower will fluctuate between 88 and 112 degrees about 95% of the time. [The range of temperatures reflects plus or minus two standard deviations from the mean of 100 degrees.] Which room would you pick?
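The two temperature ranges can be reproduced with a small Python sketch. It assumes, as the text does, that the water temperatures follow a bell-shaped (normal) curve, and it uses `statistics.NormalDist` from the standard library (Python 3.8 or later) to estimate the share of time spent within two standard deviations of the mean. The dictionary layout and variable names are assumptions made for this illustration.

```python
from statistics import NormalDist

# (mean, standard deviation) of the hot water temperature, from the text.
rooms = {"Room 247": (100, 2), "Room 652": (100, 6)}

intervals = {}
shares = {}
for room, (mean, sd) in rooms.items():
    # Plus or minus two standard deviations around the mean.
    low, high = mean - 2 * sd, mean + 2 * sd
    intervals[room] = (low, high)

    # Share of a normal distribution that falls inside that interval.
    dist = NormalDist(mu=mean, sigma=sd)
    shares[room] = dist.cdf(high) - dist.cdf(low)

    print(f"{room}: {low} to {high} degrees, about {shares[room]:.1%} of the time")
```

Under the normal-curve assumption the share is the same for both rooms, a little over 95%; only the width of the interval differs.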

How does this apply to educational settings? Would you prefer to teach in a classroom where the standard deviation of students' reading scores is large or small? Of course, this is not a math question. Students with different skill levels can benefit from interaction with more and less capable peers. Large fluctuations in prerequisite skill levels can serve to frustrate the teacher as well as the students, however. If you are an administrator and you don't really understand standard deviations, you might create classrooms where the mean score levels of students are equal but where some classrooms have more variation than others. As we are focusing on raising mean scores of students, leaving no students behind, should we set goals to increase standard deviations as well or should these be decreasing?