# STT 351 Sections 2.3-2.4

2.3 More Detailed Summary Quantities

The quartiles and percentiles yield more information about the location of a data set. Similarly, the median and the IQR (interquartile range) are used to construct a boxplot, a visual summary of the data.

2.3.1 Quartiles and IQR

Let $x_1, x_2, \ldots, x_n$ denote the data set of size $n$. First order the observations.
(i) Compute $\tilde{x}$, the median.
(ii) If $n$ is even, the first $n/2$ ordered observations form the lower half and the remaining $n/2$ observations form the upper half (the median separates the data into two parts).
(iii) If $n$ is odd, $\tilde{x}$ is the $(n+1)/2$-th value of the ordered data; include it in both halves.

Quartiles:
(i) The lower quartile $Q_1$ is the median of the lower half of the data;

(ii) The upper quartile $Q_3$ is the median of the upper half of the data; and
(iii) The interquartile range is IQR $= Q_3 - Q_1$.

Example 1. Consider the following data: 5.2, 3.9, 4.8, 5.1, 3.7, 4.5, 4.2. Here, $n = 7$.
Ordered data: 3.7, 3.9, 4.2, 4.5, 4.8, 5.1, 5.2
The median is 4.5. Since $n$ is odd, include the median in both the lower half and the upper half of the data.
Lower half: 3.7, 3.9, 4.2, 4.5
Upper half: 4.5, 4.8, 5.1, 5.2

$$Q_1 = \frac{3.9 + 4.2}{2} = \frac{8.1}{2} = 4.05, \qquad Q_3 = \frac{4.8 + 5.1}{2} = \frac{9.9}{2} = 4.95$$

Hence, IQR $= 4.95 - 4.05 = 0.9$.
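The halves rule used in Example 1 can be sketched in a few lines of Python. This is a minimal illustration, not part of the original notes: the function name `quartiles` is hypothetical, and only the standard-library `statistics.median` is used. Statistical packages often interpolate quartiles differently, so their results may not match this rule exactly.

```python
from statistics import median


def quartiles(data):
    """Return (Q1, median, Q3) using the lower-half / upper-half rule."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    if n % 2 == 0:
        # n even: the ordered data split cleanly into two halves
        lower, upper = xs[:half], xs[half:]
    else:
        # n odd: the median is included in both halves
        lower, upper = xs[:half + 1], xs[half:]
    return median(lower), median(xs), median(upper)


# Data from Example 1
data = [5.2, 3.9, 4.8, 5.1, 3.7, 4.5, 4.2]
q1, med, q3 = quartiles(data)
iqr = q3 - q1
print(f"Q1 = {q1:.2f}, median = {med:.2f}, Q3 = {q3:.2f}, IQR = {iqr:.2f}")
```

Working on the original scale of the data, the quartiles come out a little above 4 and a little below 5, so the IQR is less than 1.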
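Population Quantities: Let $x$ have density $f(x)$. Then the median $\tilde{\mu}$ satisfies

$$\int_{-\infty}^{\tilde{\mu}} f(x)\,dx = \frac{1}{2} = 0.5$$

Similarly, the lower quartile $q_l$ and the upper quartile $q_u$ are defined by

$$\int_{-\infty}^{q_l} f(x)\,dx = \frac{1}{4}, \qquad \int_{-\infty}^{q_u} f(x)\,dx = \frac{3}{4}$$

Example 2. Let $x$ have the exponential distribution $E(\lambda)$ with density

$$f(x) = \begin{cases} \lambda e^{-\lambda x}, & x \ge 0 \\ 0, & \text{otherwise.} \end{cases}$$

Then, we know that for any $c \ge 0$,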

$$\int_{0}^{c} \lambda e^{-\lambda x}\,dx = 1 - e^{-\lambda c}$$

Setting $c = q_l$,

$$\int_{-\infty}^{q_l} f(x)\,dx = 1 - e^{-\lambda q_l} = \frac{1}{4}$$

Hence, $e^{-\lambda q_l} = \frac{3}{4}$, so $q_l = -\frac{\ln(0.75)}{\lambda}$. Similarly, $q_u = -\frac{\ln(0.25)}{\lambda}$.

IQR Criteria for an Outlier. An observation that lies above $Q_3 + 1.5\,\text{IQR}$ or below $Q_1 - 1.5\,\text{IQR}$ may be suspected to be an outlier. An outlier is called extreme if it lies outside $(Q_1 - 3\,\text{IQR},\ Q_3 + 3\,\text{IQR})$; otherwise, it is called a mild outlier.

Boxplot: A boxplot is a visual display of the five-number summary

$$\left(x_{(1)},\ Q_1,\ \tilde{x},\ Q_3,\ x_{(n)}\right).$$

Procedure:
(i) The middle box marks $Q_1$, the median, and $Q_3$.
(ii) The whiskers extend above $Q_3$ and below $Q_1$, as far as $Q_3 + 3\,\text{IQR}$ and $Q_1 - 3\,\text{IQR}$.
(iii) The outliers are denoted by special symbols.
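The outlier criteria and the five-number summary can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the function name `classify` is hypothetical, the halves are those listed in Example 1, and the two extra test values are made up purely to exercise each branch (they do not come from the notes).

```python
from statistics import median


def classify(value, q1, q3):
    """Label a value using the 1.5 IQR and 3 IQR criteria."""
    iqr = q3 - q1
    # Extreme: outside the open interval (Q1 - 3 IQR, Q3 + 3 IQR)
    if not (q1 - 3 * iqr < value < q3 + 3 * iqr):
        return "extreme outlier"
    # Mild: beyond the 1.5 IQR limits but not extreme
    if value < q1 - 1.5 * iqr or value > q3 + 1.5 * iqr:
        return "mild outlier"
    return "not an outlier"


# Ordered data and halves from Example 1 (n is odd, so the median is in both halves)
data = [3.7, 3.9, 4.2, 4.5, 4.8, 5.1, 5.2]
lower_half = [3.7, 3.9, 4.2, 4.5]
upper_half = [4.5, 4.8, 5.1, 5.2]

q1, med, q3 = median(lower_half), median(data), median(upper_half)
five_number_summary = (min(data), q1, med, q3, max(data))
print("five-number summary:", five_number_summary)

# Every observation in Example 1 falls inside the 1.5 IQR limits
labels = [classify(x, q1, q3) for x in data]
print(labels)

# Hypothetical extra values (NOT from the notes) to show the other two labels
for x in (6.5, 8.0):
    print(x, "->", classify(x, q1, q3))
```

None of the Example 1 observations is flagged, since all of them lie well within 1.5 IQR of the quartiles.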

Remarks:
(i) A boxplot is more compact than a stem plot or histogram.
(ii) The central box contains roughly 50% of the data.
(iii) It does not reveal the presence of "clusters".
(iv) It is very useful in comparing (similarities and differences) data sets on the same scale.
(v) The height of the box equals the IQR.
(vi) If the median is roughly in the middle of the box, the distribution is roughly symmetric; otherwise it is skewed.
(vii) Whiskers of unequal length indicate skewness.
(viii) It is useful for detecting outliers.

The main use of boxplots is to compare groups.

Example 3. The following data give the shear strength (MPa) of a joint bonded in a particular manner.