Section 2.3-2.4

STT 351 Sections 2.3-2.4

2.3 More Detailed Summary Quantities

The quartiles and percentiles yield more information about the location of a data set. In addition, the median and the IQR (interquartile range) are used to construct the boxplot, a visual summary of the data.

2.3.1 Quartiles and IQR

Let $x_1, \ldots, x_n$ denote the data set of size $n$. First order the observations.

(i) Compute $\tilde{x}$, the median.
(ii) If $n$ is even, the first $n/2$ ordered observations form the lower half and the remaining $n/2$ observations form the upper half (the median separates the data into two parts).
(iii) If $n$ is odd, $\tilde{x}$ is the $(n+1)/2$-th value of the ordered data; include it in both halves.

Quartiles:

(i) Lower quartile = $Q_1$ = median of the lower half of the data;

(ii) Upper quartile = $Q_3$ = median of the upper half of the data; and

(iii) Interquartile range: IQR = $Q_3 - Q_1$.

Example 1. Consider the following data: 5.2, 3.9, 4.8, 5.1, 3.7, 4.5, 4.2

Here $n = 7$.
Ordered data: 3.7, 3.9, 4.2, 4.5, 4.8, 5.1, 5.2
The median is 4.5. Since $n$ is odd, include the median in both the lower half and the upper half of the data.
Lower half: 3.7, 3.9, 4.2, 4.5
Upper half: 4.5, 4.8, 5.1, 5.2

$Q_1 = \frac{3.9 + 4.2}{2} = \frac{8.1}{2} = 4.05$

$Q_3 = \frac{4.8 + 5.1}{2} = \frac{9.9}{2} = 4.95$

Hence, IQR = $4.95 - 4.05 = 0.9$.
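
The halves rule above can be sketched in a few lines of Python. This is only an illustration of the procedure described in these notes; the function name `quartiles` is hypothetical, and statistical packages often use other quartile conventions that may give slightly different values.

```python
from statistics import median


def quartiles(data):
    """Return (Q1, median, Q3) using the halves rule from the notes."""
    xs = sorted(data)          # first order the observations
    n = len(xs)
    half = n // 2
    if n % 2 == 0:
        # n even: first n/2 values form the lower half, the rest the upper half
        lower, upper = xs[:half], xs[half:]
    else:
        # n odd: the median xs[half] is included in both halves
        lower, upper = xs[:half + 1], xs[half:]
    return median(lower), median(xs), median(upper)


# Data from Example 1
data = [5.2, 3.9, 4.8, 5.1, 3.7, 4.5, 4.2]
q1, med, q3 = quartiles(data)
iqr = q3 - q1
print(round(q1, 2), med, round(q3, 2), round(iqr, 2))
```

For the Example 1 data this gives $Q_1 \approx 4.05$, $Q_3 \approx 4.95$ and IQR $\approx 0.9$, up to floating-point rounding.
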
Population Quantities: Let $x$ have density $f(x)$. Then the median $\tilde{\mu}$ satisfies

$\int_{-\infty}^{\tilde{\mu}} f(x)\,dx = \frac{1}{2} = 0.5$

Similarly, the lower quartile $q_l$ and the upper quartile $q_u$ are defined by

$\int_{-\infty}^{q_l} f(x)\,dx = \frac{1}{4}$

$\int_{-\infty}^{q_u} f(x)\,dx = \frac{3}{4}$

Example 2. Let $x$ have the exponential distribution $E(\lambda)$ with density

$f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$, and $f(x) = 0$ otherwise.

Then, we know that for any $c \ge 0$,

$\int_{0}^{c} \lambda e^{-\lambda x}\,dx = 1 - e^{-\lambda c}$

Taking $c = q_l$:

$\int_{-\infty}^{q_l} f(x)\,dx = 1 - e^{-\lambda q_l} = \frac{1}{4}$

Hence $e^{-\lambda q_l} = \frac{3}{4}$, so $-\lambda q_l = \ln(0.75)$, that is, $q_l = -\ln(0.75)/\lambda$.

Similarly, $-\lambda q_u = \ln(0.25)$, that is, $q_u = -\ln(0.25)/\lambda$.

IQR Criteria for an Outlier. An observation that lies above $Q_3 + 1.5\,\mathrm{IQR}$ or below $Q_1 - 1.5\,\mathrm{IQR}$ may be suspected to be an outlier. An outlier is called extreme if it lies outside the interval $(Q_1 - 3\,\mathrm{IQR},\ Q_3 + 3\,\mathrm{IQR})$; otherwise it is called a mild outlier.

Boxplot: A boxplot is a visual display of the five-number summary:
$x_{(1)},\ Q_1,\ \tilde{x},\ Q_3,\ x_{(n)}$.

Procedure:
(i) The middle box marks $Q_1$, the median, and $Q_3$.
(ii) The whiskers extend above $Q_3$ or below $Q_1$ up to $Q_3 + 3\,\mathrm{IQR}$ or $Q_1 - 3\,\mathrm{IQR}$.
(iii) Outliers are denoted by special symbols.
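
The IQR criteria for outliers can be sketched as follows. This is a minimal illustration, assuming the quartiles have already been computed; the names `fences` and `classify` and the quartile values used here are hypothetical and are not taken from the notes.

```python
def fences(q1, q3):
    """Return the 1.5*IQR and 3*IQR cutoffs around the quartiles."""
    iqr = q3 - q1
    return {
        "mild": (q1 - 1.5 * iqr, q3 + 1.5 * iqr),
        "extreme": (q1 - 3.0 * iqr, q3 + 3.0 * iqr),
    }


def classify(x, q1, q3):
    """Label one observation using the IQR criteria."""
    f = fences(q1, q3)
    lo3, hi3 = f["extreme"]
    lo15, hi15 = f["mild"]
    # Extreme: outside the open interval (Q1 - 3 IQR, Q3 + 3 IQR)
    if x <= lo3 or x >= hi3:
        return "extreme outlier"
    # Suspected outlier: above Q3 + 1.5 IQR or below Q1 - 1.5 IQR
    if x < lo15 or x > hi15:
        return "mild outlier"
    return "not an outlier"


# Hypothetical quartiles chosen so that IQR = 1 (illustration only)
q1, q3 = 4.0, 5.0
labels = {x: classify(x, q1, q3) for x in [0.5, 2.0, 4.5, 7.0, 9.0]}
print(labels)
```

With these illustrative quartiles the 1.5 IQR cutoffs are 2.5 and 6.5 and the 3 IQR cutoffs are 1.0 and 8.0, so 2.0 and 7.0 are mild outliers while 0.5 and 9.0 are extreme.
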

Remarks:
(i) A boxplot is more compact than a stem plot or histogram.
(ii) The central box contains roughly 50% of the data.
(iii) It does not reveal the presence of "clusters".
(iv) It is very useful for comparing (similarities and differences) data sets on the same scale.
(v) Height of the box = IQR.
(vi) If the median is roughly in the middle of the box, then the distribution is symmetric; otherwise it is skewed.
(vii) Whiskers show skewness if they are not of the same length.
(viii)

This note was uploaded on 07/25/2008 for the course STT 351 taught by Professor Palaniappan during the Summer '08 term at Michigan State University.
