MEASURES OF CENTRAL TENDENCY
What is a measure of center? It is a value at the center, or middle, of the data set. How do
we find it? In order to answer that question we need to discuss the four different measures
of center.
1. The mean. This measure is commonly called the average, but please know that all four
measures are averages. To determine it, we add all of the data values in the set and divide
by the number of values. The upper case Greek letter sigma, Σ, is used for summation. To
readily distinguish between a sample and a population, we use the lower case n for the size
of a sample set and the upper case N for the size of a population.
We will continue to use the variable x to denote the observation/data values. The symbol
for the sample mean is x̄ (sometimes referred to as "x-bar"). The symbol for the population
mean is the lower case Greek letter mu, μ.
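
Putting this notation together, the two means can be written as shown here. The formulas themselves do not appear in the text, so this is the standard textbook form, with Σx standing for the sum of all the data values.

```latex
\[
\bar{x} = \frac{\sum x}{n} \quad \text{(sample mean)}
\qquad\qquad
\mu = \frac{\sum x}{N} \quad \text{(population mean)}
\]
```

The only difference between the two is which set is being summed and whether we divide by n or N.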
Wow! That one value really made the mean of the population different from the mean of the
sample! Our center measure is greatly affected by extreme values when using the mean to
describe data. We call extreme values outliers. They may be correct values reflecting an
abnormal characteristic, or they could be observation or recording errors. If a statistic is
not affected by outliers, the statistic is said to be robust. The mean is not robust; it is
affected by outliers.
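
To see how strongly one outlier can pull the mean, here is a small Python sketch. The two data sets are an assumption made for illustration: the six values from the trimmed-mean example in these notes, and the same values with 301 left out. The function name arithmetic_mean is hypothetical.

```python
def arithmetic_mean(values):
    """Add all of the data values and divide by the number of values."""
    return sum(values) / len(values)

# Assumed illustration data: the same set without and with the extreme value 301.
without_outlier = [3, 5, 5, 9, 13]
with_outlier = without_outlier + [301]

mean_without = arithmetic_mean(without_outlier)
mean_with = arithmetic_mean(with_outlier)

print(mean_without)  # 7.0
print(mean_with)     # 56.0
```

A single extreme value moves the mean from 7 to 56, far away from where most of the data sit.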
Because the mean is very sensitive to outliers, we sometimes use a trimmed mean. To
determine it, we arrange the data values in ascending order and then delete the bottom 10%
(or some other specified percentage) of the values and the same percentage of values from
the top. We then follow the usual procedure to find the mean of this revised data set. It is
imperative that we identify the result as a trimmed mean, with the percentage that was
trimmed, when we report our findings.
Example: Find the trimmed mean for the following data set: {3, 5, 5, 9, 13, 301}
Since this data set is so small, trimming one number from each end means trimming about
16.7% from each end (1 out of 6 = 0.16666667). The new trimmed data set will be:
{5, 5, 9, 13}, and the trimmed mean is (5 + 5 + 9 + 13)/4 = 8.
The trimmed mean of the population is much closer to the sample mean, and probably a
much better descriptive statistic of the data set than the mean computed with the outlier.
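
The trimming procedure in this example can be sketched in Python as follows. This is a minimal illustration, not a standard library routine: the function name trimmed_mean and its trim_count parameter (how many values to drop from each end) are assumptions introduced here.

```python
def trimmed_mean(values, trim_count):
    """Sort the data, drop trim_count values from each end, then average the rest.

    Returns the trimmed mean together with the values that were kept.
    """
    if trim_count < 0 or 2 * trim_count >= len(values):
        raise ValueError("trim_count must leave at least one value")
    ordered = sorted(values)  # ascending order
    kept = ordered[trim_count:len(ordered) - trim_count]
    return sum(kept) / len(kept), kept

data = [3, 5, 5, 9, 13, 301]
result, kept = trimmed_mean(data, 1)

# Percentage trimmed from each end, to report alongside the result.
percent_trimmed = 100 * 1 / len(data)

print(kept)                       # [5, 5, 9, 13]
print(result)                     # 8.0
print(round(percent_trimmed, 1))  # 16.7
```

Reporting the result as "the 16.7% trimmed mean is 8" keeps the percentage attached to the number, as the notes require.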
Notice the medians for both sets are about the same. The extreme value in the second set
did not affect the median (MD). The value of the last number in the data set could be any
value > 9 with the same result for MD. This tells us the median is robust; it is not
affected by outliers.
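
The robustness of the median can be checked with a short Python sketch. The data are an assumption for illustration: the values 3, 5, 5, 9, 13 with a sixth value appended, where the sixth value is varied over several numbers greater than 9. The median function comes from Python's standard statistics module.

```python
from statistics import median

# Assumed illustration data: five values, then a sixth that we vary.
first_five = [3, 5, 5, 9, 13]

# Try several choices for the last number, all greater than 9.
last_values = [10, 50, 301, 1_000_000]
medians = [median(first_five + [last]) for last in last_values]

print(median(first_five))  # 5
print(medians)             # [7.0, 7.0, 7.0, 7.0]
```

However large the last value becomes, the median of the six values stays at 7, the midpoint of the two middle values 5 and 9.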
3.
 Spring '10
 BARCUS
 Standard Deviation, Mean
