lec4v5_1up - Stat 104 Quantitative Methods for Economists...
Stat 104: Quantitative Methods for Economists
Class 4: Summarizing Data - Measures of Dispersion

What were we talking about?
• Dotplot
• Population versus Sample
• Mean, Median
• Quartiles
• Effect of outliers

Measures of Variation (Dispersion)
• Range
• Interquartile Range
• Variance: Population Variance, Sample Variance
• Standard Deviation: Population Standard Deviation, Sample Standard Deviation
• Coefficient of Variation

Measure of Dispersion
The mean and median give us information about the central tendency of a set of observations, but these numbers shed no light on the dispersion, or spread, of the data.
Example: Which data set is more variable?
5, 5, 5, 5, 5   Mean = 5
1, 3, 5, 8, 8   Mean = 5
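
The comparison in this example can be checked numerically. The following is a minimal Python sketch (the variable names are illustrative and not from the lecture) showing that the two data sets share a mean while covering very different stretches of the number line.

```python
from statistics import mean

# The two data sets from the slide; the names are illustrative only.
constant = [5, 5, 5, 5, 5]
spread_out = [1, 3, 5, 8, 8]

for name, values in [("constant", constant), ("spread_out", spread_out)]:
    # The mean is the same in both cases; the smallest and largest values are not.
    print(name, "mean =", mean(values), "min =", min(values), "max =", max(values))
```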

Variation
• Measures of variation give information on the spread or variability of the data values.
Same center, different variation

Range
• Simplest measure of variation
• Difference between the largest and the smallest observations:
  $\text{Range} = x_{\text{maximum}} - x_{\text{minimum}}$
Example (dot plot on an axis from 0 to 14): Range = 14 - 1 = 13

Disadvantages of the Range
• Ignores the way in which data are distributed
  Two dot plots on the same axis from 7 to 12: Range = 12 - 7 = 5 in both cases
• Sensitive to outliers
  1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5     Range = 5 - 1 = 4
  1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120   Range = 120 - 1 = 119
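
The outlier sensitivity of the range is easy to reproduce. The sketch below is one possible way to do it in Python; the helper name `data_range` is hypothetical, and the data are the two lists from the slide.

```python
def data_range(values):
    """Range = largest observation minus smallest observation."""
    return max(values) - min(values)

# Data from the slide: 11 ones, 8 twos, 4 threes, one 4, then a final value.
base = [1] * 11 + [2] * 8 + [3] * 4 + [4]
without_outlier = base + [5]
with_outlier = base + [120]

print(data_range(without_outlier))  # 5 - 1 = 4
print(data_range(with_outlier))     # 120 - 1 = 119
```

Changing a single observation out of 25 moves the range from 4 to 119, even though the other 24 values are identical.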

Interquartile Range (IQR)
• Can eliminate some outlier problems by using the interquartile range
• Eliminate some high- and low-valued observations and calculate the range from the remaining values.
• Interquartile range = 3rd quartile - 1st quartile

Interquartile Range
Example:
minimum = 12, Q1 = 30, Median (Q2) = 45, Q3 = 57, maximum = 70
(25% of the observations lie in each of the four intervals between these values)
Interquartile range = 57 - 30 = 27
• Developed by John Tukey, the founder of EDA (exploratory data analysis)
• Doesn't take into account all your data; not used that much
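
As a rough illustration, the interquartile range can be computed with Python's standard library. This sketch rests on two assumptions that are not from the lecture: the five summary values from the example are treated as if they were the entire sample, and quartiles are computed with the `"inclusive"` method of `statistics.quantiles`. Other quartile conventions can give different answers on other data. The function name `interquartile_range` is hypothetical.

```python
from statistics import quantiles

def interquartile_range(values):
    """Return Q3 - Q1 using the 'inclusive' quartile convention."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1

# Assumption: the slide's five summary values are treated as the whole sample.
sample = [12, 30, 45, 57, 70]
print(interquartile_range(sample))  # 57 - 30 = 27.0
```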

Tukey
• Invented the terms “bit” and “software”
John Tukey’s whole life was one of public service, and as the preceding quotes make clear, he had profound influence. He was a member of the President’s Scientific Advisory Committee for each of Presidents Eisenhower, Kennedy, and Johnson. He was special in many ways. He merged the scientific, governmental, technological, and industrial worlds more seamlessly than, perhaps, anyone else in the 1900s. His scientific knowledge, creativity, experience, calculating skills, and energy were prodigious. He was renowned for creating statistical concepts and words.

(Arbitrary) Outlier Detection
• Tukey deemed any observation in the data set an outlier if it was
  - Greater than Q3 + 1.5*IQR, or
  - Less than Q1 - 1.5*IQR
Paul Velleman, a statistician at Cornell University, was a student of John Tukey, who invented the boxplot and the 1.5*IQR Rule. When he asked Tukey, ‘Why 1.5?’, Tukey answered, ‘Because 1 is too small and 2 is too large.’
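
The 1.5*IQR rule translates directly into code. The following is a sketch under stated assumptions, not the lecture's own implementation: the function name `tukey_outliers` and the parameter `k` are hypothetical, the quartiles use the `"inclusive"` method of `statistics.quantiles`, and the data are reused from the range example earlier in the lecture.

```python
from statistics import quantiles

def tukey_outliers(values, k=1.5):
    """Return the observations outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # Assumption: quartiles follow the 'inclusive' convention.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x for x in values if x < lower or x > upper]

# Data reused from the range slide: the last value, 120, is extreme.
data = [1] * 11 + [2] * 8 + [3] * 4 + [4, 120]
print(tukey_outliers(data))
```

With this quartile convention Q1 = 1 and Q3 = 2, so the fences sit at -0.5 and 3.5 and the rule flags both 4 and 120. A different quartile convention can move the fences, which is one reason the rule is best read as a screening heuristic.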

Example: Haircut Data Again
IQR = 40 - 15 = 25
1.5*IQR = 1.5(25) = 37.5
Outlier if above 40 + 37.5 = 77.5 or below 15 - 37.5 = -22.5
These outlier definitions seem reasonable
The basic idea is to view variability in terms of distance between each measurement and the mean.
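
To make the idea of "distance from the mean" concrete, here is a minimal Python sketch using the two data sets from the earlier "which is more variable" example. The function name `deviations_from_mean` is hypothetical, and the sketch only computes signed deviations; it does not claim to be the measure the lecture goes on to define.

```python
def deviations_from_mean(values):
    """Signed distance of each observation from the sample mean."""
    center = sum(values) / len(values)
    return [x - center for x in values]

constant = [5, 5, 5, 5, 5]
spread_out = [1, 3, 5, 8, 8]

print(deviations_from_mean(constant))    # [0.0, 0.0, 0.0, 0.0, 0.0]
print(deviations_from_mean(spread_out))  # [-4.0, -2.0, 0.0, 3.0, 3.0]
# Signed deviations from the mean sum to zero (up to rounding),
# so they have to be made non-negative before they can be averaged.
print(sum(deviations_from_mean(spread_out)))  # 0.0
```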