3_Descriptive Statistics_handouts


STAT 5615 Statistics in Research (I)
Summary Statistics
Ott & Longnecker 3.4, 3.5

Summary Statistics: Numerical Descriptive Measures
• Summary statistics provide quantitative descriptions of data.
• They form the basis for statistical inference.
• They are used for exploratory analysis and data description.
• Sample summary measures, called statistics, tend to resemble their corresponding population constants, called parameters.

Categories of Summary Statistics
Basic data summarization concepts:
• Central tendency: where do the data tend to fall?
• Variability: how diverse are the data?
• Shape: any unusual features in the data, such as asymmetry or outliers?
Relative frequency statistics: mode
- Calculated from relative frequency table
Rank-based statistics: quartiles, median, range, IQR
- Calculated from percentiles
Averaging statistics: mean
- Calculated from summation

Relative Frequency Statistics
• Mode - the measurement that occurs most frequently
* For data presented in a frequency distribution, the modal class is the most frequently occurring class interval listed in the relative frequency table. The mode is the midpoint of the modal class.
* The mode may not exist. For small samples, the mode may NOT be representative of an average or a typical value.
* There may be more than one mode.
* Unlike other measures of central tendency that are only appropriate for quantitative data, the mode may be used with qualitative data.

Ex. Favorite Letters
Mode = A, but S occurs nearly as frequently as A
Disadvantage: mode does not clearly indicate what is most typical
letter:  A     S     M     J     R     …
pct %:   9.67  9.50  7.83  5.00  5.00  …

Ages of Students in Summer I STAT 5615 Class
21, 21, 21, 22, 22, 23, 23, 23, 24, 24, 24, 24, 25, 25, 25, 25, 26, 26, 26, 30, 31, 34, 39, 42, 44, 50

2 | 1 1 1 2 2 3 3 3 4 4 4 4
2 | 5 5 5 5 6 6 6
3 | 0 1 4
3 | 9
4 | 2 4
4 |
5 | 0
n=26

Rank-based Statistics
• Median Q2 - measurement of central tendency that locates the “middle” value of data
• First quartile Q1 and third quartile Q3: measurements of central tendency that locate the “middle 50%” of data
* Interquartile range, IQR = Q3 – Q1: measurement of variability that measures the spread of the “middle 50%” of the data set
* Range is the maximum value of the data set minus the minimum value; measures the spread of all the data
Note: IQR is more robust than range; it is less sensitive to outliers
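The mode and the rank-based statistics defined above can be checked on the class ages data with a short Python sketch. This is an illustration only, not part of the handout: the variable names are made up here, and the quartile convention (`method="inclusive"` in Python's `statistics.quantiles`) is an assumption. The handout and Ott & Longnecker may use a different percentile rule, which can give different values for Q1 and Q3.

```python
from statistics import median, multimode, quantiles

# Ages of students, copied from the handout's data listing (n = 26).
ages = [21, 21, 21, 22, 22, 23, 23, 23, 24, 24, 24, 24,
        25, 25, 25, 25, 26, 26, 26, 30, 31, 34, 39, 42, 44, 50]

# Mode: multimode returns every value tied for the highest frequency,
# which covers the "more than one mode" case.
modes = multimode(ages)

# Median (Q2): the middle value of the sorted data; for even n it is
# the average of the two middle values.
q2 = median(ages)

# Quartiles: ASSUMPTION - the "inclusive" interpolation rule is used here;
# other conventions (e.g. method="exclusive") can give a different Q3.
q1, _, q3 = quantiles(ages, n=4, method="inclusive")

iqr = q3 - q1                       # spread of the middle 50% of the data
data_range = max(ages) - min(ages)  # spread of all the data

print("modes:", modes)
print("median:", q2)
print("Q1:", q1, "Q3:", q3, "IQR:", iqr)
print("range:", data_range)
```

Under this convention the data have two modes (24 and 25, four occurrences each), a median of 25, Q1 = 23, Q3 = 29, IQR = 6, and range = 29. The range is driven entirely by the single 50-year-old student, while the IQR is unaffected by that value, which is the robustness point made in the note above.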

This note was uploaded on 09/11/2008 for the course STAT 5615 taught by Professor Pdu during the Fall '08 term at Virginia Tech.

