determined by the longer tail. In both cases, we see that the right tail of the distribution is
longer than the left tail, so we say that these distributions are skewed to the right.
We often draw smooth curves to illustrate the general shape of a distribution. Smoothing a
histogram into a curve helps us see the shape of the distribution without the jagged edges at the
corners. When we describe a histogram with a smooth curve, we don't try to match every bump
and dip seen in a particular sample. Rather, we find a relatively simple curve that follows the
general pattern in the data.
Common Shapes for Distributions
A distribution shown in a histogram or dotplot is called:
Symmetric if the two sides approximately match when folded on a vertical center line
Skewed to the right if the data are piled up on the left and the tail extends relatively far out to
the right
Skewed to the left if the data are piled up on the right and the tail extends relatively far out to
the left
Bell-shaped if the data are symmetric and, in addition, have the shape shown in Figure 2.9(c)

Two summary statistics that describe the center or location of a distribution for a single
quantitative variable are the mean and the median.
Mean
The mean for a single quantitative variable is the numerical average of the data values:
$$\text{Mean} = \frac{\text{Sum of all data values}}{\text{Number of data values}}$$
To express the calculation of the mean in a mathematical formula, we let $n$ represent the
number of data cases in a dataset and $x_1, x_2, \ldots, x_n$ represent the numerical values for the
quantitative variable of interest.
$$\text{Mean} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum x}{n}$$
The Greek letter Σ is used as a shorthand for adding all the $x$ values.
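The calculation can be sketched in a few lines of Python. This is a minimal illustration, not part of the original text: the function name `mean` and the data values in `scores` are hypothetical, chosen only to show the sum of the values divided by $n$.

```python
def mean(values):
    """Return the numerical average: the sum of the values divided by n."""
    n = len(values)  # n is the number of data cases
    if n == 0:
        raise ValueError("the mean of an empty dataset is undefined")
    return sum(values) / n  # sum(values) plays the role of the Σ x term


# Hypothetical data values, made up for illustration only.
scores = [2, 4, 6, 8, 10]
print(mean(scores))  # (2 + 4 + 6 + 8 + 10) / 5 = 6.0
```

Python's standard library offers `statistics.mean`, which performs the same calculation; the explicit version above is written out only to mirror the formula.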
As with a proportion, we use different notation to indicate whether a mean summarizes the
data from a sample or a population.
Notation for a Mean
The mean of a sample is denoted $\bar{x}$ and read “x-bar.”
The mean of a population is denoted μ, which is the Greek letter “mu.”
(a) For a random sample of 50 seniors from a large high school, the average SAT (Scholastic
Aptitude Test) score was 582 on the Math portion of the test.
(b) Nearly 1.6 million students in the class of 2010 took the SAT,22 and the average score
overall on the Math portion was 516.
(a) The mean of 582 represents the mean of a sample, so we use the notation $\bar{x}$ for the
mean, and we have $\bar{x} = 582$.
(b) The mean of 516 represents the mean for everyone who took the exam in the class of 2010,
so we use the notation μ for the population mean, and we have μ = 516.
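The distinction between the two notations can also be illustrated with a short Python sketch. Everything here is an assumption made for illustration: the population is a made-up list of 1,000 scores (not real SAT data), and the variable names `mu` and `x_bar` simply mirror the notation μ and $\bar{x}$.

```python
import random


def mean(values):
    """Sum of the values divided by the number of values."""
    return sum(values) / len(values)


# Hypothetical population of 1,000 scores, generated only for illustration.
random.seed(0)
population = [random.randint(200, 800) for _ in range(1000)]

# The mean of every value in the population is the population mean (mu).
mu = mean(population)

# The mean of a random sample of 50 cases is a sample mean (x-bar).
sample = random.sample(population, 50)
x_bar = mean(sample)

print(f"population mean mu = {mu:.1f}")
print(f"sample mean x_bar  = {x_bar:.1f}")
```

The two values will usually be close but not identical, which is why separate notation is kept for each.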
The median is another statistic used to summarize the center of a set of numbers. If the
numbers in a dataset are arranged in order from smallest to largest, the median is the middle
value in the list. If there are an even number of values in the dataset, then there is not a unique
middle value and we use the average of the two middle values.
Median
The median of a set of data values for a single quantitative variable, denoted $m$, is
the middle entry if an ordered list of the data values contains an odd number of entries, or
the average of the middle two values if an ordered list contains an even number of entries.

The median splits the data in half.
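The two cases in the definition of the median can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `median` and the small lists of data values are hypothetical and not taken from the text.

```python
def median(values):
    """Return the middle value of the ordered data, or the average of the
    two middle values when the number of entries is even."""
    ordered = sorted(values)  # arrange from smallest to largest
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty dataset is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd number of entries: a unique middle entry exists.
        return ordered[mid]
    # Even number of entries: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical data values, made up for illustration only.
print(median([7, 1, 5]))     # ordered: 1, 5, 7    -> 5
print(median([7, 1, 5, 3]))  # ordered: 1, 3, 5, 7 -> (3 + 5) / 2 = 4.0
```

Sorting first matters: the median is the middle of the ordered list, not of the data in the order it was recorded.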