
11-09-06 Class Notes

Data Set 1
{46, 49, 52, 57}
Mean = (46 + 49 + 52 + 57)/4 = 51
Standard deviation (sx): sx = sqrt[ sum of (x - mean)^2 / (n - 1) ] = 4.690

Data Set 2
{24, 34, 55, 91}
Mean = 51
Standard deviation = 29.631

SHARP EL 531
MODE STAT SD
46 M+, 49 M+, 52 M+, 57 M+
RCL (mean) gives 51
RCL sx gives 4.690

CASIO
MODE MODE SD
46 M+, 49 M+, 52 M+, 57 M+
[2nd function] (upper left) S-var

Exploratory Data Analysis (EDA)
John Tukey

The Median
The median of a data set is found by putting the numbers into numerical order and finding:
i. The middle number if n is odd
ii. The average of the two middle numbers if n is even
The median is called the 50th percentile.

Tukey states that the median divides the data set into a lower half and an upper half. If n is odd, Tukey considers the median to belong to both the lower half and the upper half. Tukey's lower hinge is the median of the lower half, and the upper hinge is the median of the upper half.

Lower hinge ≈ 25th percentile = Q1 = first quartile
Upper hinge ≈ 75th percentile = Q3 = third quartile
Tukey's h-spread = upper hinge - lower hinge ≈ interquartile range = Q3 - Q1
Because it ignores the extremes of the data, the interquartile range helps protect against outliers.

Application (to data sets from worksheet)
Question 2 (stem-and-leaf notation)
n = 45
mean = 87.04444
sx = 18.764
median (23rd number) = 85
mean > median. This reflects the positive skew.
NB: These numbers are meaningful for large data sets but are relatively meaningless for small data sets.
Lower hinge (12th number) = 74
Upper hinge (12th number from end) = 99
Tukey's h-spread = 99 - 74 = 25 (≈ interquartile range)

Tukey's Fences
The lower inner fence is: lower hinge minus 1.5 * h-spread
The upper inner fence is: upper hinge plus 1.5 * h-spread
NB: According to Prof., 1.5 is an arbitrary number. But Tukey claimed that, after spending 10 years in the computer lab, he found this to be a good number for his purposes.

A number in the data set is called a Tukey outlier if it is either greater than the upper inner fence or less than the lower inner fence.

For the data in example 2:
Lower inner fence = 74 - 1.5 * 25 = 36.5. No low outliers.
Upper inner fence = 99 + 1.5 * 25 = 136.5. 138 is an outlier.

Potential exam question: why do statisticians care about finding outliers?
Outlier data are fairly often incorrect data. Even when they are correct, statisticians sometimes choose to exclude them from their analyses because they distort summary statistics such as the mean. For this reason, average income has been replaced with median income when a city's incomes are reported.

Range = highest number minus lowest number = 138 - 51 = 87

Tukey's Adjacent Values
Tukey's lower adjacent value = lowest non-outlier in the data set
Tukey's upper adjacent value = highest non-outlier in the data set

Tukey's Five-Point Summary
The five-point summary:
Lower adjacent value
Lower hinge
Median
Upper hinge
Upper adjacent value

Box-and-Whisker Plot
See class handout page for example.
Boundaries of the box are the lower and upper hinges; the mark in the middle of the box is the median; whiskers extend out to the lower and upper adjacent values; outliers are indicated by asterisks.
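The hinge, fence, and adjacent-value rules in these notes can be sketched in Python. This is only an illustration under stated assumptions: the function names (tukey_hinges, tukey_fences, tukey_summary) are made up for this example, and because the Question 2 worksheet data are not reproduced in the notes, only the two small data sets and the Question 2 hinges (74 and 99) are used.

```python
from statistics import mean, median, stdev


def tukey_hinges(data):
    """Return (lower hinge, median, upper hinge).

    Follows the rule in the notes: when n is odd, the median
    belongs to both the lower half and the upper half.
    """
    xs = sorted(data)
    n = len(xs)
    half = (n + 1) // 2  # size of each half; includes the median when n is odd
    lower_half = xs[:half]
    upper_half = xs[n - half:]
    return median(lower_half), median(xs), median(upper_half)


def tukey_fences(lower_hinge, upper_hinge, k=1.5):
    """Return (lower inner fence, upper inner fence) for multiplier k."""
    h_spread = upper_hinge - lower_hinge
    return lower_hinge - k * h_spread, upper_hinge + k * h_spread


def tukey_summary(data, k=1.5):
    """Five-point summary plus the list of Tukey outliers."""
    xs = sorted(data)
    lower_hinge, med, upper_hinge = tukey_hinges(xs)
    lower_fence, upper_fence = tukey_fences(lower_hinge, upper_hinge, k)
    inside = [x for x in xs if lower_fence <= x <= upper_fence]
    outliers = [x for x in xs if x < lower_fence or x > upper_fence]
    return {
        "lower_adjacent": min(inside),  # lowest non-outlier
        "lower_hinge": lower_hinge,
        "median": med,
        "upper_hinge": upper_hinge,
        "upper_adjacent": max(inside),  # highest non-outlier
        "outliers": outliers,
    }


if __name__ == "__main__":
    for name, data in [("Data Set 1", [46, 49, 52, 57]),
                       ("Data Set 2", [24, 34, 55, 91])]:
        # stdev() divides by n - 1, matching the calculators' sx key
        print(name, "mean =", mean(data), "sx =", round(stdev(data), 3))
        print(tukey_summary(data))

    # Question 2 hinges from the notes: fences at 36.5 and 136.5
    print(tukey_fences(74, 99))
```

Both data sets have mean 51, but their standard deviations (4.690 and 29.631) differ widely, which is the point of the opening comparison. The multiplier k is left as a parameter since the notes describe 1.5 as a convention rather than a derived value.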

This note was uploaded on 02/27/2012 for the course BUSINESS 101 taught by Professor All during the Spring '09 term at McGill.
