Who:
what is being studied, individual cases
What:
the variables
Relative frequency table:
displays percentages instead of counts
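As an illustration of turning counts into a relative frequency table, here is a minimal Python sketch. The eye-color data and the function name `relative_frequencies` are made up for the example.

```python
from collections import Counter

def relative_frequencies(values):
    """Return each category's share of the total as a percentage."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: 100 * count / total for category, count in counts.items()}

# Hypothetical class survey of eye colors (made-up data).
eye_colors = ["brown", "brown", "blue", "green", "brown", "blue", "brown", "brown"]
table = relative_frequencies(eye_colors)
for category, percent in table.items():
    print(f"{category}: {percent:.1f}%")
```

The percentages in a relative frequency table should add up to 100% (apart from rounding).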
Area principle:
area occupied by a part of the graph should correspond to its value
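Bar chart:
displays the distribution of a categorical variable
Relative frequency:
the percentage (or proportion) of cases in each category, found by dividing the count by the total
Contingency table:
shows how individuals are distributed along each variable, depending on the value of the other variable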
Histogram:
shows distribution of quantitative variables
Stem and leaf:
shows individual values (having a key is very important)
Marginal distribution:
the frequency distribution of one variable in a contingency table, read from the row or column totals in the margins
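A small sketch of how marginal distributions come out of a contingency table: add up each row and each column. The table below (class year by preferred pet) uses made-up counts, and all names are hypothetical.

```python
# Hypothetical contingency table (made-up counts):
# rows = class year, columns = preferred pet.
table = {
    "freshman": {"dog": 12, "cat": 8},
    "senior": {"dog": 9, "cat": 11},
}

# Marginal distribution of the row variable: the total of each row.
row_totals = {row: sum(cells.values()) for row, cells in table.items()}

# Marginal distribution of the column variable: the total of each column.
column_totals = {}
for cells in table.values():
    for column, count in cells.items():
        column_totals[column] = column_totals.get(column, 0) + count

grand_total = sum(row_totals.values())
print(row_totals)     # {'freshman': 20, 'senior': 20}
print(column_totals)  # {'dog': 21, 'cat': 19}
print(grand_total)    # 40
```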
Dot plot:
a dot along the axis for each data point
Segmented Bar Chart:
treats each bar as a whole and divides it up proportionally into segments corresponding to the
percentage of each group
Simpson’s paradox:
unfair averaging: when averages or percentages are combined across groups of very different sizes or conditions, the overall comparison can reverse the one seen within each group
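A worked sketch of the reversal, using made-up on-time counts for two hypothetical pilots. Pilot B does better in each condition (day and night), yet Pilot A looks better overall because A flew mostly in the easier daytime condition.

```python
# Made-up on-time flight counts for two hypothetical pilots: (on_time, total).
flights = {
    "Pilot A": {"day": (90, 100), "night": (10, 20)},
    "Pilot B": {"day": (19, 20), "night": (75, 100)},
}

def rate(on_time, total):
    """On-time rate for a single condition."""
    return on_time / total

def overall_rate(record):
    """Pool all flights for one pilot, ignoring the condition."""
    on_time = sum(hits for hits, _ in record.values())
    total = sum(n for _, n in record.values())
    return on_time / total

for pilot, record in flights.items():
    day = rate(*record["day"])
    night = rate(*record["night"])
    print(f"{pilot}: day {day:.0%}, night {night:.0%}, overall {overall_rate(record):.1%}")
```

The fair comparison is within each condition; the pooled percentages mix groups of very different sizes.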
When describing a distribution, always mention shape, center, and spread
• humps in a histogram are called modes (unimodal, bimodal, multimodal); a histogram with no humps is uniform
• symmetric or skewed (a graph is skewed toward its longer tail)
• mention outliers
• the median is resistant to outliers
Interquartile Range (IQR)= Q3 – Q1
Measures of Center:
Use mean when graph is symmetric, median when skewed
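To see why the median is preferred for skewed data, here is a short sketch with made-up salaries (in thousands). Adding one extreme value pulls the mean far from the bulk of the data while the median barely moves.

```python
from statistics import mean, median

# Made-up salaries in thousands of dollars.
typical = [40, 42, 45, 47, 50]
# The same data with one extreme value added.
with_outlier = typical + [400]

print(mean(typical), median(typical))            # 44.8 45
print(mean(with_outlier), median(with_outlier))  # 104 46.0
```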
Measures of Spread:
Use standard deviation when the graph is symmetric and IQR when it is skewed; the range can also be used to describe the data in general terms
Fences of Box plot:
lower fence = Q1 - 1.5 × IQR
upper fence = Q3 + 1.5 × IQR
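A minimal sketch of the fence rule in Python. The function names and data are made up, and the quartiles are supplied by hand (Q1 = 11 and Q3 = 17 are the medians of the lower and upper halves of this data set).

```python
def fences(q1, q3):
    """Return (lower_fence, upper_fence) using the 1.5 * IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def outliers(values, q1, q3):
    """Values that fall outside the fences."""
    lower, upper = fences(q1, q3)
    return [v for v in values if v < lower or v > upper]

# Made-up data; quartiles computed by hand as medians of each half.
data = [3, 10, 12, 13, 15, 16, 18, 40]
low, high = fences(q1=11, q3=17)
print(low, high)                       # 2.0 26.0
print(outliers(data, q1=11, q3=17))    # [40]
```

Points beyond the fences are the ones a box plot draws individually as outliers.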
Z score:
z = (y - mean of y)/STD
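A short sketch of standardizing values into z scores. It assumes the sample standard deviation (dividing by n - 1) is the one intended; the exam scores are made up.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize each value: subtract the mean, divide by the sample SD."""
    center = mean(values)
    spread = stdev(values)  # sample standard deviation (divides by n - 1)
    return [(y - center) / spread for y in values]

scores = [70, 80, 90]  # made-up exam scores: mean 80, SD 10
print(z_scores(scores))  # [-1.0, 0.0, 1.0]
```

A z score tells how many standard deviations a value lies above or below the mean.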
• 68-95-99.7 Rule: in a Normal model, about 68% of values lie within 1 STD of the mean, about 95% within 2 STDs, and about 99.7% within 3 STDs
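The three percentages can be checked against a Normal model with Python's standard library. This is only a sketch; the helper name `share_within` is made up.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Fraction of a Normal model lying within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.1%}")
```

The printed values are close to 68%, 95%, and 99.7%, which is where the rule's name comes from.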
Standard deviation:
$s = \sqrt{\dfrac{\sum (y - \bar{y})^2}{n - 1}}$, where $\bar{y}$ is the mean of y and n is the number of values
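The standard deviation formula can be followed step by step in code. This sketch uses made-up data and a hypothetical function name, and compares the result with `statistics.stdev`, which also divides by n - 1.

```python
from math import sqrt
from statistics import stdev

def sample_sd(values):
    """Sample standard deviation: sqrt of summed squared deviations over n - 1."""
    n = len(values)
    y_bar = sum(values) / n
    squared_deviations = sum((y - y_bar) ** 2 for y in values)
    return sqrt(squared_deviations / (n - 1))

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data with mean 5
print(sample_sd(data), stdev(data))
```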
5 Number Summary:
Includes Min, Q1, Median, Q3, Max
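A sketch of computing the five-number summary. Quartile conventions differ between textbooks and software; this version assumes Q1 and Q3 are the medians of the lower and upper halves of the sorted data, leaving out the middle value when n is odd. The function name and data are made up.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using medians of the two halves."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half + n % 2:]  # skip the middle value when n is odd
    return min(data), median(lower), median(data), median(upper), max(data)

print(five_number_summary([3, 10, 12, 13, 15, 16, 18, 40]))
# (3, 11.0, 14.0, 17.0, 40)
```

Software that interpolates quartiles may report slightly different Q1 and Q3 for the same data.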
Fall '07
VELLEMANP
Normal Distribution, variable Relative frequency, Q3 Shifting data
