Chapter 12
Describing Distributions with Numbers
In this chapter
• Measures of central tendency
• Measures of dispersion
• Measures of position
• Box-and-whisker plot
Measures of central tendency
In this chapter we will mainly deal with the calculation of statistics. Remember that a
statistic is a numerical characteristic of a sample. The statistics in this chapter are
descriptive statistics, since they summarize the data in the sample.
A measure of central tendency is a measure of average or typical value. The three
measures of central tendency we will look at are the mean, median, and mode.
sample mean:
When most people use the word average, they are talking about the mean.
$$\bar{X} = \frac{\sum x}{n}$$
where
$\bar{X}$ is the sample mean
$\sum$ is the sum
$x$ is the data values
$n$ is the sample size
The mean is the sum of the data values divided by the sample size.
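As a minimal sketch of the definition above, the sum of the data values can be divided by the sample size in a few lines of Python. The function name `sample_mean` and the three data values are hypothetical and chosen only for illustration.

```python
def sample_mean(data):
    """Return the sum of the data values divided by the sample size."""
    n = len(data)  # n is the sample size
    if n == 0:
        raise ValueError("the mean of an empty sample is undefined")
    return sum(data) / n  # sum of x, divided by n


# Hypothetical data, not taken from the notes: (4 + 8 + 6) / 3
illustration = sample_mean([4, 8, 6])
```

The empty-sample check is a design choice of this sketch: dividing by a sample size of zero has no meaning, so the function reports the problem instead of failing with a less helpful error.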
Example 1
Select 4 students and ask, “How many brothers and sisters do you have?”
Suppose the sample yields the following data: 2,3,1,3
Calculate the mean.
Do you think the mean is a good measure of center for this data?
Example 2
Suppose we had selected a 5th person for our sample who had 10 siblings.
New Data: 2,3,1,3,10
Calculate the mean.
Do you think the mean is a good measure of center for this data?
Important characteristics of the mean are below:
• $\bar{X}$ is sensitive to extreme scores
• $\bar{X}$ is not necessarily a possible value
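Both characteristics can be checked on the sibling data from Examples 1 and 2. The sketch below is one way to do the arithmetic in Python; the helper `sample_mean` and the variable names are assumptions made for this illustration.

```python
def sample_mean(data):
    """Sum of the data values divided by the sample size."""
    return sum(data) / len(data)


example_1 = [2, 3, 1, 3]        # Example 1 data
example_2 = [2, 3, 1, 3, 10]    # Example 2 data, with the 5th person added

mean_1 = sample_mean(example_1)  # 9 / 4
mean_2 = sample_mean(example_2)  # 19 / 5

# Sensitivity: how far one extreme score moves the mean.
shift = mean_2 - mean_1

# Possible value: a number of siblings must be a whole number.
mean_1_is_whole = mean_1.is_integer()
```

Here the first mean is 2.25, which no student could report as a number of siblings, and adding the single value 10 raises the mean to 3.8, above four of the five data values.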
An applied example where the mean should not be used because extreme values are present
is income. When people talk about average income, they should report the median income.
The median is a much better measure of center in this case than the mean. Suppose I
wanted to estimate the average income for this class. If Bill Gates (a super rich guy) were
in the class, what effect would that have on the mean? It would make the mean very high,
so it would not be a good measure of a typical value.
sample median:
the middle score
Procedure for calculating $\tilde{X}$ (denotes the sample median) follows:
• rank data from smallest to largest
• if n is odd, median is the middle score
• if n is even, median is the average of two middle scores
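The three steps of the procedure can be sketched directly in Python. The function name `sample_median` is hypothetical; the logic follows the ranking rule and the odd and even cases listed above.

```python
def sample_median(data):
    """Median by the ranking procedure: sort, then take the middle."""
    ranked = sorted(data)  # rank data from smallest to largest
    n = len(ranked)
    if n == 0:
        raise ValueError("the median of an empty sample is undefined")
    middle = n // 2
    if n % 2 == 1:
        # n is odd: the median is the single middle score
        return ranked[middle]
    # n is even: the median is the average of the two middle scores
    return (ranked[middle - 1] + ranked[middle]) / 2
```

Sorting a copy with `sorted` leaves the original data in the order it was collected, which is convenient when the same list is reused for other statistics.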
Example 3
Back to number of siblings Data: 2,3,1,3
Solve for the median.
Example 4
New Data: 2,3,1,3,10
Solve for the median.
Important characteristics of the median are below:
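• $\tilde{X}$ is not sensitive to extreme scores
• at least half of the data is at or below $\tilde{X}$ and at least half of the data
is at or above $\tilde{X}$ (exactly half lies on each side when n is even and the
two middle scores differ)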
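These characteristics can be checked numerically. The sketch below uses Python's standard `statistics.median`; the helper `share_on_each_side` and the data set with 1000 in place of 10 are assumptions added for illustration. Note that when values are tied with the median, the shares at or below and at or above it can each exceed one half.

```python
import statistics


def share_on_each_side(data):
    """Return the median and the fractions of data at or below / at or above it."""
    m = statistics.median(data)
    at_or_below = sum(1 for x in data if x <= m) / len(data)
    at_or_above = sum(1 for x in data if x >= m) / len(data)
    return m, at_or_below, at_or_above


siblings = [2, 3, 1, 3, 10]    # sibling data with the 5th person
extreme = [2, 3, 1, 3, 1000]   # hypothetical: the 10 replaced by a far larger score

median_siblings, below, above = share_on_each_side(siblings)
median_extreme, _, _ = share_on_each_side(extreme)

# The median does not move when the largest score becomes more extreme.
unchanged = median_siblings == median_extreme
```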
Because of these characteristics of the mean and median, if extreme scores exist in a data
set, the median is a better measure of central tendency.
If extreme scores are unlikely, the mean varies less from sample to sample than the
median and is the better measure.
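The claim about sample-to-sample variation can be explored with a small simulation. The sketch below is only an illustration under stated assumptions: "extreme scores are unlikely" is modeled by drawing samples from a normal population, and the sample size, number of replications, seed, and the name `sampling_spread` are all hypothetical choices.

```python
import random
import statistics


def sampling_spread(n=25, replications=2000, seed=0):
    """Standard deviation of the mean and of the median across repeated samples."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means, medians = [], []
    for _ in range(replications):
        # Assumption: a normal population with mean 0 and standard deviation 1
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        means.append(statistics.mean(sample))
        medians.append(statistics.median(sample))
    return statistics.stdev(means), statistics.stdev(medians)


sd_of_means, sd_of_medians = sampling_spread()
```

Under these assumptions the spread of the sample means should come out smaller than the spread of the sample medians. A population that produces extreme scores more often could reverse that ordering, which is the situation described in the preceding paragraph.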
sample mode:
the most frequent score
Example 5
Data: 2,3,1,3
New Data: 2,3,1,3,10
Calculate the mode for the above data sets.
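One way to find the most frequent score is to tally the values and keep those with the highest count. The sketch below uses `collections.Counter` from the Python standard library; the function name `sample_modes` is hypothetical, and returning a list of every value tied for the highest count is a choice made for this sketch, not a convention stated in the notes.

```python
from collections import Counter


def sample_modes(data):
    """Return every value that occurs with the highest frequency, in sorted order."""
    counts = Counter(data)          # tally how often each value occurs
    highest = max(counts.values())  # the largest frequency in the sample
    return sorted(value for value, count in counts.items() if count == highest)


modes_data = sample_modes([2, 3, 1, 3])          # Data
modes_new_data = sample_modes([2, 3, 1, 3, 10])  # New Data
```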
There are some major weaknesses with the mode. For example, suppose that in the New
Data the 10 was changed to a 2. Then what is the mode? You can say it has two modes,
both 2 and 3, or you can say the mode does not exist. Even worse, suppose one of the
values of 3 was instead a 4. Then you can say the mode does not exist or that all the data
 Fall '10
 Bradley,W
 Statistics, Standard Deviation
