lec4v5_1up - Stat 104: Quantitative Methods for Economists...

Stat 104: Quantitative Methods for Economists
Class 4: Summarizing Data - Measures of Dispersion

What were we talking about?
• Dotplot
• Population versus Sample
• Mean, Median
• Quartiles
• Effect of outliers
Measures of Variation (Dispersion)
• Range
• Interquartile Range
• Variance: Population Variance, Sample Variance
• Standard Deviation: Population Standard Deviation, Sample Standard Deviation
• Coefficient of Variation

Measure of Dispersion
The mean and median give us information about the central tendency of a set of observations, but these numbers shed no light on the dispersion, or spread, of the data.
Example: Which data set is more variable?
5, 5, 5, 5, 5 (Mean = 5)
1, 3, 5, 8, 8 (Mean = 5)
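The example above can be checked with a short sketch. It uses only Python's standard library; the helper name `deviations` is an illustrative choice, not something from the lecture. Both data sets share the same mean, but their deviations from that mean differ.

```python
from statistics import mean

# The two data sets from the slide.
a = [5, 5, 5, 5, 5]
b = [1, 3, 5, 8, 8]

def deviations(xs):
    """Return each observation's distance from the mean of xs (hypothetical helper)."""
    m = mean(xs)
    return [x - m for x in xs]

mean_a, mean_b = mean(a), mean(b)
dev_a, dev_b = deviations(a), deviations(b)

print("means:", mean_a, mean_b)
print("deviations of a:", dev_a)
print("deviations of b:", dev_b)
```

Every deviation in the first data set is zero, while the second spreads from 4 below the mean to 3 above it. The mean alone cannot tell the two apart.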
Variation
• Measures of variation give information on the spread or variability of the data values.
[Figure: two distributions with the same center but different variation]

Range
• Simplest measure of variation
• Difference between the largest and the smallest observations: Range = x_maximum - x_minimum
Example (dot plot on an axis from 0 to 14): Range = 14 - 1 = 13
Disadvantages of the Range
• Ignores the way in which data are distributed
[Figure: two dot plots on an axis from 7 to 12 with different distributions; both have Range = 12 - 7 = 5]
• Sensitive to outliers
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5: Range = 5 - 1 = 4
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120: Range = 120 - 1 = 119
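A minimal sketch of the outlier example, assuming the two data sets are exactly as listed on the slide (eleven 1s, eight 2s, four 3s, one 4, then either 5 or 120). The function name `data_range` is an illustrative choice.

```python
def data_range(xs):
    """Range = largest observation minus smallest observation."""
    return max(xs) - min(xs)

# Shared part of both data sets from the slide.
base = [1] * 11 + [2] * 8 + [3] * 4 + [4]

with_5 = base + [5]      # last observation is 5
with_120 = base + [120]  # last observation replaced by an outlier

range_5 = data_range(with_5)
range_120 = data_range(with_120)

print("range with 5:  ", range_5)
print("range with 120:", range_120)
```

Changing a single observation out of 25 moves the range from 4 to 119, which is why the range is called sensitive to outliers.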

Interquartile Range (IQR)
• Can eliminate some outlier problems by using the interquartile range
• Eliminate some high- and low-valued observations and calculate the range from the remaining values.
• Interquartile range = 3rd quartile - 1st quartile
Interquartile Range
• Developed by John Tukey, the founder of EDA (exploratory data analysis)
• Doesn't take into account all your data, so it is not used that much
Example (each of the four segments holds 25% of the data):
minimum = 12, Q1 = 30, Median (Q2) = 45, Q3 = 57, maximum = 70
Interquartile range = 57 - 30 = 27
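The arithmetic in this example can be sketched as follows. The five values are read as minimum, Q1, median, Q3, and maximum; the `FiveNumberSummary` type and the `iqr` helper are illustrative names, not part of the lecture.

```python
from typing import NamedTuple

class FiveNumberSummary(NamedTuple):
    """Hypothetical container for the five values shown on the slide."""
    minimum: float
    q1: float
    median: float
    q3: float
    maximum: float

def iqr(s: FiveNumberSummary) -> float:
    """Interquartile range = 3rd quartile - 1st quartile."""
    return s.q3 - s.q1

summary = FiveNumberSummary(minimum=12, q1=30, median=45, q3=57, maximum=70)

iqr_value = iqr(summary)
full_range = summary.maximum - summary.minimum

print("IQR:  ", iqr_value)
print("Range:", full_range)
```

The IQR (27) covers only the middle half of the data, while the full range (58) also includes the lowest and highest quarters. When quartiles are computed from raw data rather than read off a summary, the result depends on the quartile convention used. For instance, Python's `statistics.quantiles` offers both `method='exclusive'` and `method='inclusive'`, which can give different answers on small samples.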

Tukey
• Invented the terms “bit” and “software”
John Tukey’s whole life was one of public service, and as the preceding quotes make clear, he had profound influence. He was a member of the President’s Scientific Advisory Committee for each of Presidents Eisenhower, Kennedy, and Johnson. He was special in many ways. He merged the scientific, governmental, technological, and industrial worlds more seamlessly than, perhaps, anyone else in the 1900s. His scientific knowledge, creativity, experience, calculating skills, and energy were prodigious. He was renowned for creating statistical concepts and words.
(Arbitrary) Outlier Detection
• Tukey deemed any observation in the data set an outlier if it was
  • Greater than Q3 + 1.5*IQR, or
  • Less than Q1 - 1.5*IQR
Paul Velleman, a statistician at Cornell University, was a student of John Tukey, who invented the boxplot and the 1.5*IQR Rule. When he asked Tukey, ‘Why 1.5?’, Tukey answered, ‘Because 1 is too small and 2 is too large.’
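The 1.5*IQR rule can be sketched as a pair of small functions. The names `tukey_fences` and `find_outliers` are illustrative, and the sample observations are hypothetical. The quartiles Q1 = 30 and Q3 = 57 are borrowed from the earlier interquartile range example and are passed in directly, which sidesteps the choice of quartile convention.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) cutoffs: Q1 - k*IQR and Q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def find_outliers(xs, q1, q3, k=1.5):
    """Return the observations that fall strictly outside the fences."""
    lower, upper = tukey_fences(q1, q3, k)
    return [x for x in xs if x < lower or x > upper]

# Quartiles from the earlier example; the observations are hypothetical.
q1, q3 = 30, 57
observations = [12, 30, 45, 57, 70, 150]

lower, upper = tukey_fences(q1, q3)
outliers = find_outliers(observations, q1, q3)

print("fences:", lower, upper)
print("outliers:", outliers)
```

With IQR = 27 the cutoffs are 30 - 40.5 = -10.5 and 57 + 40.5 = 97.5, so only the hypothetical value 150 is flagged. The multiplier `k` is left as a parameter because, as the anecdote suggests, 1.5 is a convention rather than a derived constant.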

Example: Haircut Data Again
IQR = 40 - 15 = 25
1.5*IQR = 1.5(25) = 37.5
Outlier if above 40 + 37.5 = 77.5 or below 15 - 37.5 = -22.5
These outlier definitions seem reasonable.