STA6166 F05-5 Measures of Spread - Topic(5 SUMMARIZING DATA...

Topic (5) SUMMARIZING DATA – SPREAD OR VARIABILITY

How do we capture variability in a single summary statistic?

[Figure: three datasets, X, Y, and Z, each plotted on an axis running from 0 to 7]

Note how these datasets differ in their minimum and maximum values, and also in how the values are distributed between those extremes.

a) Range of a Variable

Defn: Range = Maximum value - Minimum value

e.g. fish lengths: range = 51 - 25 = 26 cm
     fish weights: range = 1763 - 441 = 1322 g

Question: is the range a robust measure of variability?
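The range definition can be sketched in a few lines of Python. The fish lengths are the twelve values from the worked example in these notes; the function name `value_range` and the extra 80 cm fish are hypothetical, added only to illustrate how a single extreme value affects the range.

```python
def value_range(values):
    """Range = maximum value - minimum value."""
    return max(values) - min(values)


# Fish lengths (cm) from the worked example in these notes.
lengths = [25, 25.5, 26, 28.5, 44, 44, 45, 46, 48, 49, 49, 51]

length_range = value_range(lengths)  # 51 - 25

# Hypothetical: one unusually long fish (80 cm) joins the sample.
# This value is NOT in the notes; it only illustrates sensitivity.
with_outlier = lengths + [80]
outlier_range = value_range(with_outlier)  # 80 - 25

print("range:", length_range)
print("range with hypothetical outlier:", outlier_range)
```

Because the range depends only on the two most extreme observations, changing one of them changes the range, however many other values the sample contains. This is worth keeping in mind when thinking about the robustness question.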

b) Standard Deviation of a set of data

The distance $x_i - \bar{x}$ is called the deviation of the $i$th value from the sample mean.

EXAMPLE: fish lengths ($\bar{x} = 40.08$)

[Figure: dot plot of the fish lengths on an axis from 25 to 50]

Deviations $x_i - \bar{x}$, with the mean rounded to 40.1:

25   - 40.1 = -15.1
25.5 - 40.1 = -14.6
26   - 40.1 = -14.1
28.5 - 40.1 = -11.6
44   - 40.1 =   3.9
44   - 40.1 =   3.9
45   - 40.1 =   4.9
46   - 40.1 =   5.9
48   - 40.1 =   7.9
49   - 40.1 =   8.9
49   - 40.1 =   8.9
51   - 40.1 =  10.9
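As a quick check of the deviations listed above, here is a minimal Python sketch. It uses the unrounded sample mean rather than 40.1, so the deviations agree with the notes only after rounding to one decimal place. The variable names are illustrative.

```python
# Fish lengths (cm) from the example above.
lengths = [25, 25.5, 26, 28.5, 44, 44, 45, 46, 48, 49, 49, 51]

n = len(lengths)
mean = sum(lengths) / n  # sample mean, x-bar

# Deviation of each value from the sample mean: x_i - x-bar
deviations = [x - mean for x in lengths]

print(f"n = {n}, mean = {mean:.2f}")
for x, d in zip(lengths, deviations):
    print(f"{x:>5} - {mean:.1f} = {d:6.1f}")
```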
Question: Might these deviations be useful information to describe the variability in a set of data?

The standard deviation is a measure of the average deviation of values in a set of data.

FACT: for any set of data, the deviations always sum to 0!

So to be useful, we do the following:

1) calculate the deviations, $x_i - \bar{x}$, $i = 1, \dots, n$
2) square each deviation, $(x_i - \bar{x})^2$, $i = 1, \dots, n$
3) sum up the squares, $\sum_{i=1}^{n} (x_i - \bar{x})^2$
4) divide by $(n - 1)$
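The steps above can be sketched in Python using the fish lengths from the example. Steps 1 to 4 follow the list as written. The final square root is the usual last step in computing a sample standard deviation, but it is not shown in this excerpt, so it is marked as an assumption in the code. Variable names are illustrative.

```python
import math

# Fish lengths (cm) from the example in these notes.
lengths = [25, 25.5, 26, 28.5, 44, 44, 45, 46, 48, 49, 49, 51]

n = len(lengths)
mean = sum(lengths) / n

# 1) deviations from the sample mean
deviations = [x - mean for x in lengths]

# FACT check: the deviations sum to 0 (up to floating-point rounding),
# which is why the raw deviations cannot simply be averaged.
dev_sum = sum(deviations)

# 2) square each deviation
squared = [d ** 2 for d in deviations]

# 3) sum up the squares
sum_sq = sum(squared)

# 4) divide by (n - 1)
variance = sum_sq / (n - 1)

# Assumed final step (not shown in this excerpt): take the square root
# to return to the original units (cm).
std_dev = math.sqrt(variance)

print(f"sum of deviations = {dev_sum:.10f}")
print(f"sum of squares    = {sum_sq:.2f}")
print(f"variance          = {variance:.2f}")
print(f"std deviation     = {std_dev:.2f}")
```

Squaring makes every term non-negative, so positive and negative deviations no longer cancel as they do in the plain sum.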


