
STATS 60, Spring 2008
April 3, Lecture 03

1 More on box plots (and histograms)

Example 1

Consider the sample

x = -2, -2, -1, -1, 0, 0, 0, 0, 1, 1, 1, 1, 2,

which has n = 13 elements. To create a histogram, stack the observations along the number line. To find quantiles, unstack the observations and put them in serial order. The 25th percentile (1st quartile) is at element number 0.25(n + 1), the 30th percentile is at element number 0.30(n + 1), and so on.
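The position rule from Example 1 can be sketched in a few lines of Python. The function names are hypothetical, and the linear interpolation used when the position is fractional is an assumption: the notes only say which element number to look at.

```python
def quantile_position(p, n):
    """1-based (possibly fractional) element number of the p-th quantile: p * (n + 1)."""
    return p * (n + 1)


def quantile(sorted_x, p):
    """Value at element number p * (n + 1).

    Assumption: when the position is fractional, interpolate linearly
    between the two neighbouring elements.
    """
    n = len(sorted_x)
    pos = min(max(quantile_position(p, n), 1), n)   # clamp to 1..n
    lower = int(pos)                                # element just below (1-based)
    frac = pos - lower
    if lower >= n:
        return sorted_x[n - 1]
    return sorted_x[lower - 1] + frac * (sorted_x[lower] - sorted_x[lower - 1])


# The sample from Example 1, in serial order.
x = sorted([-2, -2, -1, -1, 0, 0, 0, 0, 1, 1, 1, 1, 2])

q1_pos = quantile_position(0.25, len(x))   # 0.25 * 14 = 3.5
q1 = quantile(x, 0.25)                     # between the 3rd and 4th elements
median = quantile(x, 0.50)                 # the 7th element
```

Here the 3rd and 4th elements are both -1, so the 1st quartile is -1 whichever way the half step is resolved.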

Example 2

x = -3, -2.5, -2, -1.5, -1, -0.5, 0, 0.5, 1, 1.5, 2, 2.5, 3

min     1stQ    median  3rdQ    max     mean
-3      -1.5    0       1.5     3       -0.00

Notes:
• x is a sequence
• mean = median
• The scale for the density curve is not shown in the histogram.
• The density curve is a smoothed version of the histogram adjusted so that its total area equals 1.0.
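The "total area equals 1.0" adjustment can be illustrated on the histogram itself. The sketch below rescales bar heights to the density scale for the Example 2 data. The bin edges are an assumption chosen for illustration, not the bins used in the lecture's figure, and no smoothing is done here.

```python
x = [-3, -2.5, -2, -1.5, -1, -0.5, 0, 0.5, 1, 1.5, 2, 2.5, 3]

# Hypothetical bins: edges at -3.5, -2.5, ..., 3.5 (each of width 1).
width = 1.0
edges = [-3.5 + width * k for k in range(8)]

# Count the observations in each half-open bin [left, left + width).
counts = [sum(1 for v in x if left <= v < left + width) for left in edges[:-1]]

# Density scale: height = count / (n * width), so the bar areas sum to 1.
n = len(x)
heights = [c / (n * width) for c in counts]
total_area = sum(h * width for h in heights)

mean = sum(x) / n
median = sorted(x)[n // 2]   # n is odd, so the median is the middle element
```

Because the sequence is symmetric about 0, the mean and the median coincide, as the notes state.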
Example 3

x = -3, -2.5, -2, -1.5, -1, -0.5, 0, 0.5, 1, 1.5, 2, 2.5, 8

min     1stQ    median  3rdQ    max     mean
-3      -1.5    0       1.5     8       0.38

Notes:
• x is a sequence with last observation changed
• mean > median
• SD increased
• sample distribution is right-tailed
• one outlier
• upper whisker is the maximum with outliers skipped
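The notes for Example 3 can be checked numerically. The notes do not say how an outlier is identified, so the sketch below assumes the common 1.5 × IQR rule. The quartiles are taken from the table above.

```python
import statistics

x = [-3, -2.5, -2, -1.5, -1, -0.5, 0, 0.5, 1, 1.5, 2, 2.5, 8]

# Quartiles as listed in the Example 3 table.
q1, q3 = -1.5, 1.5
iqr = q3 - q1

# Assumed fences: 1.5 * IQR beyond each quartile (not stated in the notes).
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [v for v in x if v < lower_fence or v > upper_fence]
inside = [v for v in x if lower_fence <= v <= upper_fence]

# Whiskers end at the most extreme observations that are not outliers.
lower_whisker = min(inside)
upper_whisker = max(inside)

mean = statistics.mean(x)
median = statistics.median(x)

# Compare the spread with the unchanged sequence from Example 2.
sd_before = statistics.stdev(x[:-1] + [3])
sd_after = statistics.stdev(x)
```

Under this assumed rule the upper fence is 6, so 8 is the single outlier and the upper whisker stops at 2.5, the largest remaining observation.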

Example 4

x = -2.29, -1.66, 0.51, 0.79, 0.98, 0.99, 1.11, 1.14

min     1stQ    median  3rdQ    max     mean
-2.29   -0.03   0.89    1.02    1.14    0.20

Notes:
• sample distribution is left-tailed
• mean < median
• When R draws box plots, it uses the five number summary, where the upper/lower hinges are obtained as the median of the observations above/below the median. This is more complicated than simply using the 1st and 3rd quartiles. Using the quartiles is fine for this class. The 1st and 3rd quartiles are given in the box plot in parentheses.
• The heavy tail will have either a longer whisker or one or more outliers.

Example 5

x = -0.37, 0.09, 0.21, 0.26, 0.59, 0.59, 0.74, 1.69

min     1stQ    median  3rdQ    max     mean
-0.37   0.18    0.43    0.63    1.69    0.47

Notes:
• sample distribution is right-tailed
• mean > median
• one outlier
• upper whisker is the maximum with outliers skipped
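The note in Example 4 about hinges versus quartiles can be made concrete with the Example 4 data. The helper `hinges` is a hypothetical name. Its handling of odd n (the overall median is counted in both halves) is an assumption that does not matter here, since n = 8 is even. For comparison, the quartiles are computed with Python's `statistics.quantiles` using the "inclusive" method, which reproduces the 1stQ and 3rdQ values in the Example 4 table to two decimals.

```python
import statistics


def hinges(values):
    """Lower and upper hinges: medians of the lower and upper halves.

    Assumption: when n is odd, the overall median is included in both halves.
    """
    s = sorted(values)
    n = len(s)
    half = (n + 1) // 2          # size of each half
    lower_half = s[:half]
    upper_half = s[n - half:]
    return statistics.median(lower_half), statistics.median(upper_half)


x4 = [-2.29, -1.66, 0.51, 0.79, 0.98, 0.99, 1.11, 1.14]   # Example 4

lower_hinge, upper_hinge = hinges(x4)

# Quartiles by linear interpolation between order statistics.
q1, q2, q3 = statistics.quantiles(x4, n=4, method="inclusive")
```

For this sample the lower hinge is about -0.575 while the 1st quartile is about -0.03, so the two conventions can differ noticeably in a small sample with a heavy tail.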
