Lecture 5

STOR 155 Introductory Statistics
Lecture 5: Displaying Distributions with Numbers (II)
The University of North Carolina at Chapel Hill
© 2008 Haipeng Shen, 1/24/08

Numerical Summary for Distributions
  Center: Mean, Median
  Spread: Quartiles, IQR, Five-number summary and Boxplot; Standard Deviation

Examples: 2004 Two-Seater Cars
  The highway mileages of the 21 two-seater cars:
    13 15 16 16 17 19 20 22 23 23 23 24 25 25 26 28 28 28 29 32 66
  Q1 = 18, Q3 = 28
  IQR = Q3 - Q1 = 10
  1.5 × IQR = 15
  Q3 + 1.5 × IQR = 43
  Q1 - 1.5 × IQR = 3
  66 lies above Q3 + 1.5 × IQR = 43, so it is a suspected outlier.
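
The 1.5 × IQR check can be reproduced with a short Python sketch. It assumes the quartile convention that matches the numbers on this slide: Q1 and Q3 are the medians of the lower and upper halves of the sorted data, with the overall median left out of both halves when the count is odd. The function names (quartiles, suspected_outliers) are illustrative and not from the lecture. Library routines such as numpy.percentile use other interpolation rules by default and may return slightly different quartiles.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, M, Q3) using medians of the lower and upper halves."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]           # observations before the median position
    upper = xs[half + n % 2:]   # observations after it (middle value skipped if n is odd)
    return median(lower), median(xs), median(upper)

def suspected_outliers(data, k=1.5):
    """Flag values more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = quartiles(data)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    flagged = [x for x in data if x < low or x > high]
    return flagged, (low, high)

# Highway mileages of the 21 two-seater cars from the slide
mileage = [13, 15, 16, 16, 17, 19, 20, 22, 23, 23, 23,
           24, 25, 25, 26, 28, 28, 28, 29, 32, 66]

q1, m, q3 = quartiles(mileage)
outliers, fences = suspected_outliers(mileage)
print(q1, m, q3)   # 18.0 23 28.0
print(fences)      # (3.0, 43.0)
print(outliers)    # [66]
```

The rule only flags candidates; whether 66 is an error or a genuinely unusual car still has to be judged from the context of the data.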

The five-number summary
  To get a quick summary of both center and spread, use the five-number summary:
    Minimum  Q1  M  Q3  Maximum
  where M is the median.

Example: Highway Gas Mileage of 2004 Two-seater/Mini Cars
  Two-seater five-number summary: 13, 18, 23, 27, 32
    (these values match the 20 cars that remain once the suspected outlier 66 is set aside)
  Mini-compact five-number summary: 19, 23, 26, 29, 32
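
As a rough check of the two-seater figures, the sketch below computes a five-number summary in Python. It assumes the same halves-based quartile convention used for Q1 and Q3 earlier in the lecture, and the helper name five_number_summary is hypothetical. It uses the 21 two-seater mileages listed earlier; the mini-compact data are not listed in these slides, so they are not reproduced here.

```python
from statistics import median

def five_number_summary(data):
    """Return (Minimum, Q1, M, Q3, Maximum) for a list of numbers."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    q1 = median(xs[:half])            # median of the lower half
    q3 = median(xs[half + n % 2:])    # median of the upper half
    return xs[0], q1, median(xs), q3, xs[-1]

# Highway mileages of the 21 two-seater cars
two_seaters = [13, 15, 16, 16, 17, 19, 20, 22, 23, 23, 23,
               24, 25, 25, 26, 28, 28, 28, 29, 32, 66]

with_outlier = five_number_summary(two_seaters)
without_outlier = five_number_summary([x for x in two_seaters if x != 66])

print(with_outlier)     # (13, 18.0, 23, 28.0, 66)
print(without_outlier)  # (13, 18.0, 23.0, 27.0, 32)
```

Dropping 66 reproduces the slide's 13, 18, 23, 27, 32. Keeping it moves Q3 to 28 and the maximum to 66, while the median stays at 23, which illustrates how little a single extreme value affects the median and quartiles compared with the maximum.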

[Figure: Boxplots of highway/city gas mileages (two-seaters/minicompacts)]

Pros and cons of Boxplots
  Location of the median line in the box indicates symmetry/asymmetry.
  Best used for side-by-side comparison of more than one distribution at a glance.
  Less detailed than histograms or stemplots.
  The box focuses attention on the central half of the data.

[Figure: Income for Different Education Levels]

Modified Boxplot
  The boxplot as drawn so far cannot reveal possible outliers. To modify it,

