Chapter 3 – Numerically Summarizing Data
After we have become somewhat familiar with the data by representing it graphically and
observing the characteristics of its distribution, we want to describe those characteristics with numerical
values called descriptive statistics.
Recall from Chapter 1:
Defn: A parameter is a numerical characteristic of a population.
Defn: A statistic is a numerical characteristic of a sample. (Remember, a sample is a subset of a
population.)
We want to use the value of a statistic found from the sample data to gain knowledge about the value
of the corresponding parameter, which we would be able to get directly if we had access to the entire
population.
Measures of Central Tendency give us information about the location of the center (in some sense) of
the distribution of (numeric) data values. We will discuss four measures of central tendency: mean,
median, mode, and the midrange.
Defn: If we have a set of n sample data values, $x_1, x_2, \ldots, x_n$, the mean of these data values is their
arithmetic average:
$$\bar{x} = \frac{1}{n}\left(x_1 + x_2 + \cdots + x_n\right) = \frac{1}{n}\sum_{i=1}^{n} x_i.$$
If we have a set of N population data values, the mean of these values is:
$$\mu = \frac{1}{N}\left(x_1 + x_2 + \cdots + x_N\right) = \frac{1}{N}\sum_{i=1}^{N} x_i.$$
Note: $\bar{x}$ is a statistic; $\mu$ is a parameter.
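The definition of the mean can be sketched in a few lines of Python. This is only an illustration: the function name `mean` and the five data values are hypothetical and do not come from the text. The same arithmetic gives $\bar{x}$ when the values are a sample and $\mu$ when they are the whole population.

```python
def mean(values):
    """Arithmetic average: the sum of the values divided by how many there are."""
    if not values:
        raise ValueError("the mean is undefined for an empty data set")
    return sum(values) / len(values)

# Hypothetical sample of n = 5 values (made up for illustration, not from the text).
sample = [6.5, 7.0, 8.0, 7.5, 6.0]

x_bar = mean(sample)  # (6.5 + 7.0 + 8.0 + 7.5 + 6.0) / 5 = 35.0 / 5
print(x_bar)          # 7.0
```

Whether the result is called a statistic or a parameter depends only on where the values came from, not on how the average is computed.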
Example: p. 123, Example 6
1) Go to STAT, 1:Edit.
2) Enter the data, with a suitable variable name, such as BP.
3) Choose STAT, CALC, 1:1-Var Stats.
4) Enter the variable name, and press ENTER.
5) You will see a list of numerical values for the data, including $\bar{x} = 7.488$ and $\sum_{i=1}^{50} x_i = 374.4$.
The average, or mean, birthweight for the babies was found to be 7.488 pounds. The total weight for
the babies is 374.4 pounds.
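The two values from the calculator output are tied together by the definition of the mean: dividing the total by the number of values should give the mean. A quick check, assuming n = 50 (the upper limit of the sum in the calculator output) and using illustrative variable names:

```python
import math

n = 50         # number of babies, taken from the upper limit of the sum
total = 374.4  # total birthweight in pounds, as reported by 1-Var Stats

x_bar = total / n          # mean = total / n
print(round(x_bar, 3))     # 7.488

# Going the other way, n times the mean recovers the total.
print(math.isclose(n * x_bar, total))  # True
```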
Properties of the Mean:
1) One computes the mean by using all of the values of the data.
2) The mean varies less than the other two measures of central tendency when samples are taken
from the same population and all three measures are computed for these samples.
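Property 2 can be explored with a small simulation. The sketch below is an illustration under stated assumptions, not a proof: it uses a hypothetical normally distributed population (mean 100, standard deviation 15), draws repeated samples, and compares how much the sample mean and the sample median vary from sample to sample. Only the median is compared here; the mode is not meaningful for continuous simulated values, and the result can differ for populations with other shapes.

```python
import random
import statistics

def sampling_spread(num_samples=2000, sample_size=25, seed=0):
    """Return (sd of sample means, sd of sample medians) over repeated samples.

    The population is an assumed normal distribution with mean 100 and
    standard deviation 15, chosen only for illustration.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means, medians = [], []
    for _ in range(num_samples):
        sample = [rng.gauss(100, 15) for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
        medians.append(statistics.median(sample))
    return statistics.stdev(means), statistics.stdev(medians)

sd_of_means, sd_of_medians = sampling_spread()
print("spread of sample means:  ", round(sd_of_means, 2))
print("spread of sample medians:", round(sd_of_medians, 2))
```

For this assumed population, the sample means should cluster more tightly than the sample medians, which is what the property describes.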