Statistics I Chapter 3
Numerical Descriptive Measures • If our ultimate goal is statistical inference, we want to use sample numerical descriptive measures to make inferences about the corresponding population measures • Most statistical techniques rely on looking at 1 of 2 data characteristics: – Central Tendency – Variability
Numerical Descriptive Measures • Central tendency is the tendency of the data to cluster around certain values • Variability is the spread of the data
Numerical Measures of Central Tendency • Mean – Technically, there are 3 different types of mean • Arithmetic • Geometric • Harmonic – For the purposes of this course, we will focus on the arithmetic mean, and just call it the mean • Median • Mode
Mean • The mean of a set of quantitative data is the sum of the measurements divided by the number of measurements. In everyday language this is called the average. • Example – Calculate the mean of the following measurements: 4, 8, 9, 2, 5 – $\bar{x} = \frac{\text{sum of the measurements}}{\text{number of measurements}} = \frac{4 + 8 + 9 + 2 + 5}{5} = \frac{28}{5} = 5.6$
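The arithmetic in the example above can be checked with a few lines of Python. This is a minimal sketch; the variable names are illustrative and not part of the course material.

```python
# Measurements from the slide example
measurements = [4, 8, 9, 2, 5]

# Mean = sum of the measurements / number of measurements
total = sum(measurements)      # 28
count = len(measurements)      # 5
mean = total / count

print(mean)  # 5.6
```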
Median • The median of a quantitative data set is the middle number when the measurements are put in ascending (or descending) order • The median is the value such that exactly half of the measurements are below and half above the median • The sample median is denoted $M$, and the population median is denoted $\eta$ (eta).
Median • Examples: • What is the median of the following measurements: 2, 5, 4, 8, 7, 100, 6? – Reordering gives us 2, 4, 5, 6, 7, 8, 100. There are 7 observations, so $M$ is the middle number, 6. • What about: 2, 5, 4, 7, 8, 100? – Reordering gives us 2, 4, 5, 7, 8, 100. There are 6 observations, so $M$ is the mean of the middle 2 numbers, 5 and 7. Therefore $M = \frac{5 + 7}{2} = \frac{12}{2} = 6$
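The odd and even cases in these examples can be sketched as a small Python function. The name `sample_median` is hypothetical, chosen for illustration; it sorts the data, then returns the middle value or the mean of the two middle values.

```python
def sample_median(values):
    """Return the median: middle value if n is odd, mean of the middle two if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd number of observations: take the single middle value
        return ordered[mid]
    # Even number of observations: average the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2

print(sample_median([2, 5, 4, 8, 7, 100, 6]))  # 6
print(sample_median([2, 5, 4, 7, 8, 100]))     # 6.0
```

Python's standard library also provides `statistics.median`, which follows the same rule.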
Mode • For some data sets, the mode may not be very meaningful. – Consider the following data: 3, 3, 5, 5, 8, 8, 9 – Here, 3, 5, and 8 each occur twice, so all three are modes of the sample, and none of them (as a mode) is useful as a measure of central tendency
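A short Python sketch shows how a data set can have several modes. It assumes the usual definition of the mode as the value that occurs most often; the helper name `all_modes` is hypothetical.

```python
from collections import Counter

def all_modes(values):
    """Return every value that occurs with the highest frequency, in ascending order."""
    counts = Counter(values)            # value -> number of occurrences
    highest = max(counts.values())      # the largest frequency in the data
    return sorted(v for v, c in counts.items() if c == highest)

print(all_modes([3, 3, 5, 5, 8, 8, 9]))  # [3, 5, 8]
```

Three values tie for the highest frequency, so the mode does not single out a center for this sample.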
Example 1: Weekly Salary Data • [Tables not reproduced: Data Set I and Data Set II] • [Table not reproduced: means, medians, and modes of salaries in Data Set I and Data Set II]
Numerical Measures of Central Tendency • Sometimes the median is a better measure of central tendency than the mean. – The median is a resistant measure; that is, it is not sensitive to extremely large or small values in the data set. The mean is not a resistant measure. – Consider the previous example, 2, 4, 5, 7, 8, 100. The median, 6, is unaffected by the single large value of 100, but that value pulls the mean, 21, to the right of all of the other measurements. – Sometimes we use a trimmed mean to improve the resistance of the mean. This involves removing a certain percentage of the smallest and largest observations before calculating the mean.
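The effect of the extreme value, and of trimming, can be sketched in Python. The function `trimmed_mean` and its `proportion` parameter are hypothetical names. The sketch assumes the number of observations dropped from each end is the trimming proportion times n, rounded down; other rounding conventions exist.

```python
def trimmed_mean(values, proportion):
    """Mean after dropping floor(n * proportion) observations from each end (assumed convention)."""
    ordered = sorted(values)
    n = len(ordered)
    k = int(n * proportion)             # observations to drop from each end
    if 2 * k >= n:
        raise ValueError("trimming proportion removes every observation")
    kept = ordered[k:n - k]
    return sum(kept) / len(kept)

data = [2, 4, 5, 7, 8, 100]

print(sum(data) / len(data))     # 21.0  (ordinary mean, pulled up by 100)
print(trimmed_mean(data, 0.2))   # 6.0   (drops 2 and 100, averages 4, 5, 7, 8)
```

With one observation removed from each end, the trimmed mean matches the median of 6 for this data set, while the ordinary mean stays at 21.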