8/22/08
Examining Distributions

Displaying Distributions with Graphs
Section 1.1
Variables
In a study, we collect information (data) from individuals.
Individuals can be people, animals, plants, or any object of interest.
A variable is any characteristic of an individual. A variable varies among individuals.
Examples: height, blood pressure, ethnicity, dividend rate, annual spending
The distribution of a variable tells us what values the variable takes and how often it takes these values.
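As a rough illustration of "what values and how often", the sketch below tabulates a distribution with Python's standard library. The data (number of credit cards owned by ten individuals) and the helper name `distribution` are made up for this example and do not come from the slides.

```python
from collections import Counter

# Hypothetical data: number of credit cards owned by ten individuals.
cards_owned = [0, 1, 1, 2, 2, 2, 3, 3, 4, 2]


def distribution(values):
    """Map each observed value to (count, proportion of all observations)."""
    counts = Counter(values)
    total = len(values)
    return {value: (count, count / total) for value, count in counts.items()}


dist = distribution(cards_owned)
for value, (count, proportion) in sorted(dist.items()):
    print(f"{value} cards: count={count}, proportion={proportion:.2f}")
```

Each row of the printout is one value the variable takes, together with how often it occurs.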
Two types of variables
Variables can be either quantitative or categorical.
Quantitative: something that takes numerical values for which arithmetic operations such as adding and averaging make sense.
Example: how tall you are, your age, your blood cholesterol level, the number of credit cards you own
Categorical: something that falls into one of several categories. What we can record is the count or proportion of individuals in each category.
Example: your blood type (A, B, AB, O), your hair color, your ethnicity, whether you paid income tax last tax year or not
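One way to see the difference in practice: averaging is meaningful for a quantitative variable, while a categorical variable is summarized by counts and proportions. This is only a sketch, and the heights and hair colors are invented values.

```python
from collections import Counter
from statistics import mean

# Hypothetical observations on four individuals.
heights_cm = [162.0, 175.5, 168.0, 181.0]            # quantitative
hair_colors = ["brown", "black", "brown", "blond"]   # categorical

# Quantitative: arithmetic such as averaging makes sense.
average_height = mean(heights_cm)

# Categorical: count individuals per category, then convert to proportions.
hair_counts = Counter(hair_colors)
hair_proportions = {color: n / len(hair_colors) for color, n in hair_counts.items()}

print(f"average height: {average_height} cm")
print(f"hair color proportions: {hair_proportions}")
```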
Ways to chart categorical data
Because the variable is categorical, the data in the graph can be
ordered any way we want (alphabetical, by increasing value, by year,
by personal preference, etc.)
Bar graphs
Each category is represented by a bar.
Pie charts
The slices must represent the parts of one whole.
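The sketch below illustrates both ideas with plain text output: bars can be listed in any order we choose, and pie slices are shares of a single total, so their angles add up to 360 degrees. The category counts and the helper names `text_bar_graph` and `pie_angles` are assumptions made for this example.

```python
# Hypothetical category counts (made-up blood-type data).
counts = {"O": 4, "A": 3, "B": 2, "AB": 1}


def text_bar_graph(counts, order):
    """One row per category, in the given order; bar length equals the count."""
    return [f"{category:<3}{'#' * counts[category]} {counts[category]}" for category in order]


def pie_angles(counts):
    """Slice angle in degrees for each category; all slices share one whole."""
    total = sum(counts.values())
    return {category: 360 * n / total for category, n in counts.items()}


# Two of the many valid orderings for a categorical variable.
alphabetical = sorted(counts)
by_count = sorted(counts, key=counts.get, reverse=True)

for row in text_bar_graph(counts, by_count):
    print(row)

angles = pie_angles(counts)
print(angles)
```

If the categories did not partition one whole (for example, a separate rate measured in each of several groups), the angles computed this way would have no sensible interpretation.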
Child poverty before and after government intervention (UNICEF, 1996)
What does this chart tell you?
• The United States has the highest rate of child poverty among developed nations (22% of those under 18).
• Its government does the least, through taxes and subsidies, to remedy the problem (size of orange bars and percent difference between orange and blue bars).
Could you transform this bar graph to fit in one pie chart? In two pie charts? Why?
The poverty line is defined as 50% of national median income.
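To make the definition concrete, here is a minimal sketch that computes a poverty line as 50% of the median income and the share of incomes falling below it. The income figures are invented, and counting only incomes strictly below the line is an assumption of this example, not something the slide specifies.

```python
from statistics import median


def poverty_line(incomes):
    """Poverty line defined as 50% of the median income."""
    return 0.5 * median(incomes)


def poverty_rate(incomes):
    """Proportion of incomes strictly below the poverty line (assumed convention)."""
    line = poverty_line(incomes)
    below = sum(1 for income in incomes if income < line)
    return below / len(incomes)


# Hypothetical household incomes, not real data.
incomes = [8_000, 12_000, 30_000, 40_000, 50_000,
           60_000, 75_000, 90_000, 120_000, 200_000]

line = poverty_line(incomes)
rate = poverty_rate(incomes)
print(f"poverty line: {line}, poverty rate: {rate:.0%}")
```

Because the line is tied to the median, it is a relative measure: it moves whenever the middle of the income distribution moves.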
Ways to chart quantitative data
Histograms and stemplots
These are summary graphs for a single variable. They are very useful for understanding the pattern of variability in the data.
Line graphs: time plots
Use when there is a meaningful sequence, like time. The line connecting
the points helps emphasize any change over time.
Histograms
The range of values that a variable can take is divided into equal-size intervals.
The histogram shows the number of individual data points that fall in each interval.
Example: Histogram of the December 2004 unemployment rates in the 50 states and Puerto Rico.
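The binning step behind a histogram can be sketched as follows. The function name `histogram_counts`, the choice of range and number of intervals, and the rates themselves are all assumptions for illustration; the values are not the actual December 2004 unemployment data.

```python
def histogram_counts(values, low, high, n_bins):
    """Count how many values fall in each of n_bins equal-width intervals on [low, high]."""
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for v in values:
        if not low <= v <= high:
            raise ValueError(f"value {v} lies outside [{low}, {high}]")
        # A value equal to `high` is placed in the last interval.
        index = min(int((v - low) / width), n_bins - 1)
        counts[index] += 1
    return counts


# Hypothetical rates in percent (made-up values).
rates = [3.1, 3.6, 4.2, 4.4, 4.8, 5.0, 5.3, 5.5, 5.9, 6.2, 6.8, 7.4]

counts = histogram_counts(rates, low=3.0, high=8.0, n_bins=5)
for i, count in enumerate(counts):
    left = 3.0 + i
    print(f"{left:.0f}% to {left + 1:.0f}%: {'#' * count} ({count})")
```

Each bar height is simply the count for one interval, so the counts always add up to the number of data points.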
Interpreting histograms
When describing the distribution of a quantitative variable, we look for the overall pattern and for striking deviations from that pattern. We can describe the overall pattern of a histogram by its shape, center, and spread.
Histogram with a line connecting each column: too detailed
Histogram with a smoothed curve highlighting the overall pattern of the distribution
Most common distribution shapes
A distribution is symmetric if the right and left sides of the histogram are approximately mirror images of each other.
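One possible way to turn "approximately mirror images" into a check is to compare each histogram bar with its mirror-image partner. This is only a sketch: the function name, the 10% tolerance, and the example counts are assumptions, and the check presumes the intervals are centered on the middle of the distribution.

```python
def is_roughly_symmetric(bin_counts, tolerance=0.1):
    """True if mirrored bars differ by at most `tolerance` of the total count."""
    total = sum(bin_counts)
    if total == 0:
        raise ValueError("histogram has no data")
    # zip with the reversed list visits every mirrored pair twice, so halve the sum.
    mismatch = sum(abs(left - right)
                   for left, right in zip(bin_counts, reversed(bin_counts))) / 2
    return mismatch / total <= tolerance


# Hypothetical bar heights.
print(is_roughly_symmetric([2, 5, 9, 6, 2]))   # nearly mirror images
print(is_roughly_symmetric([8, 5, 3, 2, 1]))   # long tail on the right
```

In practice, symmetry is usually judged by eye from the histogram; a numeric rule like this depends heavily on the chosen intervals and tolerance.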
 Spring '08
 HOLT
 Normal Distribution, Standard Deviation, unemployment rates, µ, NCAA
