2.1 Types of Data
Def’n: A variable is any characteristic that is recorded for subjects in a study.
• Qualitative (categorical): cannot assume a numerical value but can be classified into two or more nonnumeric categories. e.g. gender, smell
• Quantitative (numerical): measured numerically.
  - Discrete: takes only certain values, with no intermediate values. e.g. integers, grades
  - Continuous: can take any numerical value over a certain interval or intervals. e.g. GPA, gas prices
Def’n: A frequency table (for qualitative data) is a listing of possible values for a variable, together with the number of observations for each value.
Major    | Frequency (f) | Relative frequency | Percentage (%)
Science  |               |                    |
Arts     |               |                    |
Business |               |                    |
Nursing  |               |                    |
Other    |               |                    |

Relative frequency = frequency / ∑f
Percentage = (Relative frequency) x 100
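The two formulas above can be sketched in a few lines of Python. The list of majors below is hypothetical (the counts used in class are not recorded in these notes), and the function name `frequency_table` is an illustrative choice, not something from the course.

```python
from collections import Counter

# Hypothetical survey responses; NOT the data used in class.
majors = ["Science", "Arts", "Science", "Business", "Nursing",
          "Science", "Arts", "Other", "Business", "Science"]

def frequency_table(values):
    """Return {category: (f, relative frequency, percentage)}."""
    counts = Counter(values)              # f for each category
    total = sum(counts.values())          # sum of all f
    return {cat: (f, f / total, 100 * f / total) for cat, f in counts.items()}

table = frequency_table(majors)
for cat, (f, rel, pct) in table.items():
    print(f"{cat:<10}{f:>3}{rel:>8.2f}{pct:>8.1f}")
```

The relative frequencies always sum to 1 and the percentages to 100, which is a quick check on any frequency table.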
2.2 Graphical Summaries
Def’n: A pie chart is a circle divided into portions that represent the relative frequencies of the different categories. e.g. (above table used in class)
Look for: categories that form large and small proportions of the data set.
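Since each portion of the circle matches a relative frequency, the angle of a sector is the relative frequency times 360 degrees. A minimal Python sketch, assuming hypothetical frequencies (not the class data) and an illustrative function name `sector_angles`:

```python
# Hypothetical frequencies; NOT the data used in class.
freqs = {"Science": 4, "Arts": 2, "Business": 2, "Nursing": 1, "Other": 1}

def sector_angles(freqs):
    """Return the pie-chart sector angle (in degrees) for each category."""
    total = sum(freqs.values())           # sum of all f
    # angle = relative frequency x 360
    return {cat: 360 * f / total for cat, f in freqs.items()}

angles = sector_angles(freqs)
for cat, angle in angles.items():
    print(f"{cat:<10}{angle:>6.1f} degrees")
```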
A bar graph displays vertical bars whose heights represent the frequencies of the respective categories. e.g. (above table used in class)
Look for: frequently and infrequently occurring categories.
Graphs for quantitative variables:
Def’n: A stem-and-leaf plot has each value divided into two portions: a stem and a leaf. The leaves for each stem are shown separately in a display. (Values should be ranked.)
Look for:
• typical values and corresponding spread
• gaps in the data or outliers
• presence of symmetry in the distribution
• number and location of peaks
Ex2.1) U.S. Box Office for weekend of January 2, 2011
25.8, 24.4, 18.8, 12.4, 10.3, 10.0, 9.8, 9.3, 8.9, 7.8

0 | 7.8 8.9 9.3 9.8
1 | 0.0 0.3 2.4 8.8
2 | 4.4 5.8
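The plot in Ex2.1 can be reproduced with a short Python sketch. It assumes, as the display above does, that the stem is the tens digit and the leaf is the rest of the value. The function name `stem_and_leaf` is illustrative only.

```python
from collections import defaultdict

# Weekend grosses from Ex2.1
grosses = [25.8, 24.4, 18.8, 12.4, 10.3, 10.0, 9.8, 9.3, 8.9, 7.8]

def stem_and_leaf(values):
    """Map each stem (tens digit) to its ranked leaves (units and tenths)."""
    rows = defaultdict(list)
    for v in sorted(values):              # values should be ranked
        tenths = round(v * 10)            # work in whole tenths to avoid float error
        stem, leaf = divmod(tenths, 100)  # e.g. 258 -> stem 2, leaf 58
        rows[stem].append(f"{leaf / 10:.1f}")
    # keep empty stems too, so gaps in the data stay visible
    return {s: rows.get(s, []) for s in range(min(rows), max(rows) + 1)}

plot = stem_and_leaf(grosses)
for stem, leaves in plot.items():
    print(f"{stem} | {' '.join(leaves)}")
```

Listing every stem between the smallest and the largest, even an empty one, is a deliberate choice here: an empty row is how a gap or an outlier shows up in the display.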
A comparative S-and-L plot has a common stem to compare two related distributions:

9.8 8.4 7.3 7.0 6.8 5.2 4.8 | 0 | 7.8 8.9 9.3 9.8
                5.0 3.8 0.7 | 1 | 0.0 0.3 2.4 8.8
                            | 2 | 4.4 5.8
Note: Dot plots also exist (see p. 33 in textbook), but “replace” the values with dots.
Def’n: A histogram, like a bar graph, graphically shows a frequency distribution. The data here, however, are quantitative.
Look for:
• central or typical value and corresponding spread
• gaps in the data or outliers
• presence of symmetry in the distribution
• number and location of peaks
The data are divided into intervals (normally of equal width), and the height of each bar shows the frequency of the values falling in that interval.
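As a rough sketch of how equal-width intervals lead to a histogram, the Python below counts the Ex2.1 values in classes of width 10. Several things are assumptions for illustration rather than part of the notes: the function name `class_counts`, the class width, and the convention that each class includes its lower endpoint but not its upper one.

```python
# Weekend grosses from Ex2.1
grosses = [25.8, 24.4, 18.8, 12.4, 10.3, 10.0, 9.8, 9.3, 8.9, 7.8]

def class_counts(values, start, width, n_classes):
    """Count values in equal-width classes [start + k*width, start + (k+1)*width)."""
    counts = [0] * n_classes
    for v in values:
        k = int((v - start) // width)     # index of the class containing v
        if 0 <= k < n_classes:            # values outside the classes are ignored
            counts[k] += 1
    return counts

counts = class_counts(grosses, start=0, width=10, n_classes=3)
for k, f in enumerate(counts):
    low = k * 10
    print(f"{low:>2} to <{low + 10:<3} {'*' * f}")   # one star per observation
```

With this class width the counts match the number of leaves on each stem of the Ex2.1 plot, so the histogram has the same shape as the stem-and-leaf display turned on its side.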
