Chapter 4 The Mean: The Most Popular Average

In this chapter, you will learn how to compute the most popular average,
what it really tells us, and when it should and should not be used.

The most widely used average is the arithmetic mean, usually called
the mean. Computing it is simple: just sum (add up) the scores and divide
by the number of scores. This is the formula:

M = ΣX / N

Where:
M = mean
X = scores
N = number of scores

Note that the Greek letter Σ (sigma) is pronounced “sum of” in statistics.
Although most applied researchers use M as the symbol for the mean,
many statisticians use this symbol, which is pronounced “X bar”: X̄
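As a minimal sketch, the formula can be written out in Python. The function name `mean` and the variable names are illustrative assumptions, not notation from the text; the scores are the seven donation amounts used in this chapter's example.

```python
def mean(scores):
    """Return M = ΣX / N for a non-empty list of scores."""
    if not scores:
        raise ValueError("the mean is undefined for an empty set of scores")
    total = sum(scores)   # ΣX: the sum of the scores
    n = len(scores)       # N: the number of scores
    return total / n


# Cents donated by the seven kindergarten children in the chapter's example.
donations = [4, 8, 8, 9, 11, 14, 16]
print(mean(donations))  # the scores sum to 70 and there are 7 of them
```

Python's standard library also provides `statistics.mean`, which performs the same calculation; the version above simply spells out each symbol in the formula.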
Although computing the mean is easy, understanding its meaning is a
bit harder. Since its meaning has important implications, we’ll consider it
carefully. The mean is defined as “the point around which the deviation
scores sum to zero.” Now that’s a mouthful!

Let’s see what it means with an example. Let’s suppose these scores are
the number of cents donated by kindergarten children to charity:

4, 8, 8, 9, 11, 14, 16

For these scores, the mean is 10.0 (since the scores sum to 70 and there
are 7 scores, 70/7 = 10.0). Now, if we subtract the mean from each score,
we have what are called the deviation scores, which indicate the number
of points that each score differs from the mean. The symbol for the
deviation scores is a lowercase x, to differentiate it from the scores
themselves, which are represented by an uppercase X. The scores we are
considering and the corresponding deviation scores are shown in Table 4.1.
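The subtraction just described can be sketched in Python as follows. This is an illustrative sketch only: the names `scores`, `m`, and `deviations` are assumptions of the example, not notation from the text. It computes each deviation score x = X − M and checks the definition given earlier, that the deviation scores sum to zero.

```python
import math

# Cents donated by the seven children (the scores, X).
scores = [4, 8, 8, 9, 11, 14, 16]

# The mean, M = ΣX / N.
m = sum(scores) / len(scores)

# Deviation scores: x = X - M for every score.
deviations = [score - m for score in scores]

for score, dev in zip(scores, deviations):
    print(f"{score:>2} - {m:.1f} = {dev:+.1f}")

# By the definition of the mean, the deviation scores sum to zero.
# When a mean is not a whole number, floating-point rounding can leave a
# tiny remainder, so the check uses a tolerance instead of ==.
total_deviation = sum(deviations)
print("Deviations sum to zero:", math.isclose(total_deviation, 0.0, abs_tol=1e-9))
```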
Table 4.1 Scores and Deviation Scores

Score (X)   Minus   Mean   Equals   Deviation Score (x)
   16         −     10.0     =             6.0
   14         −     10.0     =             4.0
   11         −     10.0     =             1.0
    9         −     10.0     =            −1.0
    8         −     10.0     =            −2.0
    8         −     10.0     =            −2.0
    4         −     10.0     =            −6.0
                                     Sum = 0.0

Notice that the deviation scores in the last column sum to 0.0. This result
is not unique to this set of scores. Indeed, for any set of scores, the sum of
the deviations from the mean equals zero. Thus, the mean is the value that
has an equal number of deviation points above it (1.0 + 4.0 + 6.0 = 11.0)
and below it (−6.0 + −2.0 + −2.0 + −1.0 = −11.0). That is, the sum of the
deviations above the mean equals, in absolute value, the sum of the
deviations below the mean.

It’s important to note that the mean is not defined as the value that
has an equal number of cases above and below it. Indeed in our example,
there are four cases below the mean (with scores of 4, 8, 8, and 9) and only
three cases above the mean (with scores of 11, 14, and 16).¹

¹ In a perfectly normal, symmetrical distribution, the mean has equal
numbers of cases on both sides of it. In a skewed distribution, it does not.

Why should we care that the mean has an equal number of deviation
points instead of an equal number of cases on both sides of it? Simply
because one case or a small percentage of cases with very extreme scores
can greatly affect the mean. Here’s an example: suppose the highest score
in our example was 80 cents instead of 16 cents. Then these are the scores
and their mean:

4, 8, 8, 9, 11, 14, 80
M = 19.1

Notice that the mean has been pulled up by the one case with a large
number of deviation points, the case with a score of 80. Is 19.1 a good or
representative average? No, because none of the children gave close to 19
cents and because 6 of the 7 children gave less than 19 cents.

A small number of cases that are very different from the others, either
because they are very high or very low, are called outliers. In the example
we just considered, a score of 80 is clearly an outlier. Outliers pull the
mean toward them and, if sufficiently different from the other values, may
make the mean unrepresentative. This occurs because the mean must
balance the number of deviation points (and not the number of cases) on
each side.

You may have already guessed that outliers create a tail that produces
a skewed distribution. You learned about skewed distributions in the last
chapter. Thus, when a distribution is highly skewed, the mean is not a
good choice as an average. We’ll explore alternative averages that should
be used for describing skewed distributions in the next chapter.

There’s one other important implication derived from the fact that the
mean is the balance point of the number of deviation points on each side
of it: the mean is appropriate only for use with equal interval data, which
we explored in Chapter 1. It makes no sense to talk about the balance point
among the deviation points if the differences among scores represent
different amounts of the variable we have measured. Thus, for nominal and
ordinal data, we will need to use one of the alternative averages presented
in the next chapter.

If the mean is subject to undue influence by skewness, why is it the
most popular? First, because the mean is appropriate for use with the
normal curve, which is widely found in research. In addition, the mean is
associated with other very important statistics such as the standard
deviation, which we’ll examine in Chapter 6. For this reason, even when a
distribution is only “approximately normal” because it has some skewness,
researchers tend to treat it as normal and compute the mean as the average.
Thus, for the emotional health scores of the foster care adolescent males
we examined in Chapters 1 and 3, a researcher would be likely to consider
the distribution as approximately normal and compute the mean. As it
turns out, the mean emotional health score of the males is 43.3. How does
this help us understand the emotional health of these boys? Well, it also
turns out that the scale used to measure emotional health had been
nationally standardized with a noninstitutionalized (not foster care home)
sample of adolescent males. For this norm group, the mean was 50.0.
Thus, by comparing the two means, we can see that the foster care males
have a lower average on emotional health than the national norm group.
Notice, however, that this comparison of means tells only part of the story.
If you refer to Table 3.1 on page 13, you will notice that a good percentage
of the foster care males had scores as high as or higher than the national
mean of 50.0. Thus, while means allow us to be very concise, they do not
convey as much information as frequency distributions, polygons, or
histograms. In Chapter 6, you’ll learn that we usually report the mean with
its first cousin, the standard deviation, in order to provide readers with a
better understanding of a distribution than they can get by considering
only its mean.

When analyzing a set of scores, the first thing naive people usually
think of is to compute the mean. You now know that the first thing they
should do is examine the shape of the distribution by preparing a
frequency distribution and polygon. If the shape is normal or approximately
normal and if the data are equal interval, it is appropriate to compute the
mean. Otherwise, one of the alternative averages discussed in the next
chapter should be used.

EXERCISE FOR CHAPTER 4

Factual Questions

1. To one decimal place, what is the mean of the scores in the box below?

   0, 2, 4, 6, 9, 9

2. From reading this chapter, you know that the deviations around the
   mean you computed in question 1 should sum to zero. Demonstrate that
   this is true by preparing for the scores for question 1 a table like
   Table 4.1 in this chapter.

3. Does a mean always have an equal number of cases on both sides of it?

4. What is the name for a score that is much higher or much lower than
   the other scores?

5. Is the mean a good choice as an average when a distribution is highly
   skewed?

6. For which type of data is the mean appropriate?
   A. nominal
   B. ordinal
   C. equal interval

Questions for Discussion

7. Name a variable for which the mean would be an appropriate measure of
   the average because the variable is equal interval and you do not
   think the distribution would be skewed.

8. Suppose a friend collected some equal interval data and computed a
   mean without first examining the shape of the distribution. How would
   you explain to your friend the desirability of first examining the
   shape?
 Spring '08
 Yanovitsky
