Statistics I
Chapter 3

Numerical Descriptive Measures
• If our ultimate goal is statistical inference, we want to use sample numerical descriptive measures to make inferences about the corresponding population measures
• Most statistical techniques rely on looking at 1 of 2 data characteristics:
– Central Tendency
– Variability

Numerical Descriptive Measures
• Central tendency is the tendency of the data to cluster around certain values
• Variability is the spread of the data

Numerical Measures of Central Tendency
• Mean
– Technically, there are 3 different types of mean
• Arithmetic
• Geometric
• Harmonic
– For the purposes of this course, we will focus on the arithmetic mean, and just call it the mean
• Median
• Mode
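To make the three named types of mean concrete, here is a minimal sketch using Python's standard `statistics` module (Python 3.8 or later). The sample values are chosen only for illustration and are an assumption, not part of the slide.

```python
from statistics import mean, geometric_mean, harmonic_mean

# Illustrative positive measurements (assumed for this sketch).
data = [4, 8, 9, 2, 5]

arithmetic = mean(data)          # sum of values / number of values
geometric = geometric_mean(data) # n-th root of the product of the values
harmonic = harmonic_mean(data)   # n / (sum of reciprocals of the values)

print("arithmetic:", arithmetic)
print("geometric: ", geometric)
print("harmonic:  ", harmonic)
```

For positive data that are not all equal, the harmonic mean is the smallest of the three and the arithmetic mean the largest.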

Mean
• The mean of a set of quantitative data is the sum of the measurements divided by the number of measurements. In everyday language this is called the average.
• Example
– Calculate the mean of the following measurements: 4, 8, 9, 2, 5
$$\bar{x} = \frac{\text{sum of the measurements}}{\text{number of measurements}} = \frac{4 + 8 + 9 + 2 + 5}{5} = \frac{28}{5} = 5.6$$
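The same calculation can be checked with a short Python sketch. The variable names are illustrative; `statistics.mean` is the standard-library function for the arithmetic mean.

```python
from statistics import mean

measurements = [4, 8, 9, 2, 5]

# Sum of the measurements divided by the number of measurements.
manual_mean = sum(measurements) / len(measurements)

# The standard library computes the same quantity.
library_mean = mean(measurements)

print(manual_mean)   # 5.6
print(library_mean)  # 5.6
```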

Median
• The median of a quantitative data set is the middle number when the measurements are put in ascending (or descending) order
• The median is the value such that exactly half of the measurements are below and half above the median
• The sample median is denoted $M$, and the population median is denoted $\eta$ (eta).

Median
• Examples:
• What is the median of the following measurements: 2, 5, 4, 8, 7, 100, 6?
– Reordering gives us 2, 4, 5, 6, 7, 8, 100. There are 7 observations, so $M$ is the middle number, 6.
• What about: 2, 5, 4, 7, 8, 100?
– Reordering gives us 2, 4, 5, 7, 8, 100. There are 6 observations, so $M$ is the mean of the middle 2 numbers, 5 and 7. Therefore
$$M = \frac{5 + 7}{2} = \frac{12}{2} = 6$$
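The two cases (odd and even number of observations) can be sketched in Python as follows. The helper name `sample_median` is hypothetical; the standard library's `statistics.median` applies the same rule and is used here only as a cross-check.

```python
from statistics import median

def sample_median(values):
    """Return the middle value of the sorted data, or the mean of the
    two middle values when the number of observations is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_sample = [2, 5, 4, 8, 7, 100, 6]   # 7 observations
even_sample = [2, 5, 4, 7, 8, 100]     # 6 observations

print(sample_median(odd_sample))   # 6
print(sample_median(even_sample))  # 6.0

# Cross-check against the standard library.
print(median(odd_sample), median(even_sample))
```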

Mode
• The mode is the measurement that occurs most frequently in the data set

Mode
• For some data sets, the mode may not be very meaningful.
– Consider the following data: 3, 3, 5, 5, 8, 8, 9
– Here, 3, 5, and 8 each occur twice, so all three are modes of the sample, and none of them (as a mode) is useful as a measure of central tendency
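A short Python sketch shows the tie for this data set. It assumes Python 3.8 or later, where `statistics.multimode` returns every value that shares the highest frequency.

```python
from collections import Counter
from statistics import multimode

data = [3, 3, 5, 5, 8, 8, 9]

# Count how often each measurement occurs.
counts = Counter(data)
print(counts[3], counts[5], counts[8], counts[9])  # 2 2 2 1

# All values tied for the highest count are reported as modes.
modes = multimode(data)
print(modes)  # [3, 5, 8]
```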

Example 1: Weekly Salary Data
[Tables not recovered: Data Set I and Data Set II, with the means, medians, and modes of salaries in Data Set I and Data Set II]

Numerical Measures of Central Tendency
• Sometimes the median is a better measure of central tendency than the mean.
– The median is a resistant measure; that is, it is not sensitive to the influence of extremely large or small values in the dataset. The mean is not a resistant measure.
– Consider the previous example, 2, 4, 5, 7, 8, 100. The median, 6, is unaffected by the single large value of 100, but that value pulls the mean, 21, to the right of all of the other measurements
– Sometimes we use a trimmed mean to improve the resistance of the mean. This involves removing a certain percentage of the smallest and largest observations before calculating the mean.
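The effect of the outlier, and of trimming, can be sketched in Python. The helper `trimmed_mean` is hypothetical, and it assumes one common convention: the given proportion is removed from each end of the sorted data, rounding the count down. The slide does not specify a convention or a percentage, so the 20% used here is only for illustration.

```python
from statistics import mean, median

def trimmed_mean(values, proportion):
    """Mean after dropping `proportion` of the observations from each
    end of the sorted data (assumed convention; count rounded down)."""
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    ordered = sorted(values)
    k = int(len(ordered) * proportion)  # observations dropped per end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

data = [2, 4, 5, 7, 8, 100]

print(mean(data))               # 21
print(median(data))             # 6.0
print(trimmed_mean(data, 0.2))  # 6.0  (drops 2 and 100)
```

With one observation trimmed from each end, the value 100 no longer influences the result, which is why the trimmed mean lands near the median here.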
