1 STAT 5615 Statistics in Research (I)
Summary Statistics
Ott & Longnecker 3.4, 3.5

2 Summary Statistics: Numerical Descriptive Measures
• Summary statistics provide quantitative descriptions of data.
• They form the basis for statistical inference.
• They are used for exploratory analysis and data description.
• Sample summary measures, called statistics, tend to resemble their corresponding population constants, called parameters.

3 Categories of Summary Statistics
Basic data summarization concepts:
• Central tendency: where do the data tend to fall?
• Variability: how diverse are the data?
• Shape: are there any unusual features in the data, such as asymmetry or outliers?
Relative frequency statistics: mode. Calculated from the relative frequency table.
Rank-based statistics: quartiles, median, range, IQR. Calculated from percentiles.
Averaging statistics: mean. Calculated from summation.

4 Relative Frequency Statistics
• Mode: the measurement that occurs most frequently
* For data presented in a frequency distribution, the modal class is the most frequently occurring class interval listed in the relative frequency table. The mode is the midpoint of the modal class.
* The mode may not exist. For small samples, the mode may NOT be representative of an average or a typical value.
* There may be more than one mode.
* Unlike other measures of central tendency that are only appropriate for quantitative data, the mode may be used with qualitative data.
Ex. Favorite Letters
letter:  A     S     M     J     R     …
pct %:   9.67  9.50  7.83  5.00  5.00  …
Mode = A, but S occurs nearly as frequently as A.
Disadvantage: the mode does not clearly indicate what is most typical.

5 Ages of Students in Summer I STAT 5615 Class
21, 21, 21, 22, 22, 23, 23, 23, 24, 24, 24, 24, 25, 25, 25, 25, 26, 26, 26, 30, 31, 34, 39, 42, 44, 50
2 | 1 1 1 2 2 3 3 3 4 4 4 4
2 | 5 5 5 5 6 6 6
3 | 0 1 4
3 | 9
4 | 2 4
4 |
5 | 0
n = 26

6 Rank-based Statistics
• Median Q2: measurement of central tendency that locates the “middle” value of data
• First quartile Q1 and third quartile Q3: measurements of central tendency that locate the “middle 50%” of data
* Interquartile range, IQR = Q3 – Q1: measurement of variability that measures the spread of the “middle 50%” of the data set
* Range is the maximum value of the data set minus the minimum value; measures the spread of all the data
Note: the IQR is more robust than the range (less sensitive to outliers).

7 ...
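The mode and the rank-based statistics described above can be checked on the class ages data. The following is a minimal Python sketch, not part of the original slides. The helper names `quartiles` and `summarize` are hypothetical, and the quartiles use the median-of-halves convention, which is an assumption: textbooks and software packages use several slightly different quartile definitions, so other tools may report slightly different values for Q1 and Q3.

```python
from statistics import median, multimode

# Ages from the class example (n = 26).
ages = [21, 21, 21, 22, 22, 23, 23, 23, 24, 24, 24, 24, 25,
        25, 25, 25, 26, 26, 26, 30, 31, 34, 39, 42, 44, 50]


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention.

    Assumption: Q1 is the median of the lower half of the sorted data and
    Q3 is the median of the upper half; for odd n the middle value is
    left out of both halves.
    """
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two observations")
    half = len(data) // 2
    lower = data[:half]    # values below the median position
    upper = data[-half:]   # values above the median position
    return median(lower), median(data), median(upper)


def summarize(values):
    """Collect the mode(s) and the rank-based statistics in one dict."""
    q1, q2, q3 = quartiles(values)
    return {
        "modes": multimode(values),  # every most-frequent value
        "median": q2,
        "q1": q1,
        "q3": q3,
        "iqr": q3 - q1,
        "range": max(values) - min(values),
    }


summary = summarize(ages)
print(summary)

# Replace the largest age (50) with an extreme value to compare how the
# range and the IQR respond to a single outlier.
with_outlier = ages[:-1] + [90]
print(summarize(with_outlier))
```

On these data the sketch reports two modes (24 and 25), which illustrates the point that the mode need not be unique. Changing the largest age from 50 to 90 increases the range from 29 to 69 while the IQR stays at 7, consistent with the note that the IQR is less sensitive to outliers.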
This note was uploaded on 09/11/2008 for the course STAT 5615 taught by Professor Pdu during the Fall '08 term at Virginia Tech.