Lecture 6: Descriptive Measures

Lecture 6: Outline
1. Questions
2. Review: Describe a numerical variable with 3 concepts: Central Tendency, Variability, and Shape
   1. Central Tendency (Mean, Median, Mode)
      Measures of Non-Central Tendency (1st and 3rd Quartiles)
   2. Variability (Range, Interquartile Range, Standard Deviation)
3. New Material
   Coefficient of Variation (one more measure of variability)
   3rd concept for description of numerical data: Shape
      . Symmetry vs. Skewness (Skewness Coefficient)
      . Normal shape vs. Flatness/Peakedness (Kurtosis Coefficient)
   Exploratory Data Analysis
      . 5-Number Summary, Box-and-Whisker Plot, Defining Outliers
   Z scores
      . Useful to identify outliers, especially in bell-shaped data

Chapter 3 Review: Summary Measures
Describing Data Numerically
   Central Tendency: Arithmetic Mean, Median, Mode
   Quartiles
   Variability: Range, Interquartile Range, Standard Deviation, Coefficient of Variation
   Shape: Skewness, Kurtosis

We looked at Central Tendency and Variability.
If the data are bell-shaped, Mean and SD tell you what you need to know.
If the data are not bell-shaped, the most informative measures are Median, IQR, and Range.

Measures of Central Tendency: the extent to which all values in a dataset group around a typical / central value

Mean (Average) = Sum of Values / Number of Values
   Most common measure of a typical value for a dataset
   Most useful with normally distributed data (bell-shaped curve); avoid for datasets with extreme values

Median: Middle Number
   Rank the data from smallest to largest; 50% of the data are above the median and 50% are below it
   Find the median position with the formula or the 2 rules, then see which value is at that position
   Not affected by extreme values; use as the typical value for datasets that are not normally distributed (not bell-shaped)

Measures of Variation: the amount of scattering of values in a dataset away from a typical / central value

Range: Total Spread
   Simplest measure of variation in a dataset
   Does not show how values are distributed between the extremes (can be misleading with an extreme value)

Standard Deviation: Roughly the Average Deviation from the Mean
   Typical deviation of a dataset from its mean
   Shows how data are clustered
   Affected by extreme values; use if datasets are normally distributed (bell-shaped)

Interquartile Range: Middle 50% of the data
   Difference between 3rd and 1st quartiles

Example: A systems manager tracks the number of server failures that occur in a day. Determine the typical number of server failures for the following data, which represent the number of server failures per day over the ...

   1 3 0 3 26 2 7 4 0 2 3 3 6 3

Mean (Average) = Sum of Values / Number of Values

\[
\bar{X} = \frac{\sum_{i=1}^{14} X_i}{14} = \frac{1 + 3 + 0 + 3 + 26 + 2 + 7 + 4 + 0 + 2 + 3 + 3 + 6 + 3}{14} = \frac{63}{14} = 4.5
\]

(L6.xlsx)

A systems manager tracks the number of server failures that occur in...
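The measures reviewed above can be checked against the 14 server-failure counts with a short Python sketch. It uses only the standard-library `statistics` module; the variable names are illustrative and not part of the lecture. Two assumptions are worth flagging: the standard deviation here is the sample version (dividing by n - 1), and `statistics.quantiles` interpolates between ranked values, so its quartiles can differ from those found with the lecture's position rules.

```python
import statistics

# Server failures per day (the 14 values from the example above)
failures = [1, 3, 0, 3, 26, 2, 7, 4, 0, 2, 3, 3, 6, 3]

# Central tendency
mean = statistics.mean(failures)      # sum of values / number of values = 63 / 14
median = statistics.median(failures)  # middle of the ranked data
mode = statistics.mode(failures)      # most frequent value

# Variability
value_range = max(failures) - min(failures)  # total spread
sample_sd = statistics.stdev(failures)       # assumption: sample SD (n - 1 divisor)

# Quartiles by interpolation (assumption: may not match the lecture's rounding rules)
q1, q2, q3 = statistics.quantiles(failures, n=4)
iqr = q3 - q1

print("mean:", mean)
print("median:", median)
print("mode:", mode)
print("range:", value_range)
print("sample SD:", round(sample_sd, 2))
print("IQR (interpolated):", iqr)
```

The single day with 26 failures pulls the mean (4.5) above the median (3) and inflates the range and standard deviation, which is why the median and IQR are the more informative summaries for data like these.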
This note was uploaded on 04/07/2011 for the course BUS 311 taught by Professor Reardon,j during the Spring '08 term at University of Hawaii, Manoa.
