
Lesson 8: More on Shapes and the Normal Distribution

Motivation and objective: The shape of the data is an important aspect of statistical analysis. It tells us which measures of center and spread are best to use and which analysis procedures are appropriate. In this lesson, we will discuss 1) some more ideas relating to measures of center and spread, 2) how shape influences which measures of center and spread best summarize the data, 3) what a density curve is and the properties of density curves, and 4) the normal density curve (a very common shape of data) and its properties.

More on Measures of Spread and Center

1. Effect of shifting or rescaling data: Go to page 6-4 in the Lesson Book: Learn About Changing the Baseline
2. How shape influences which measures of center and spread to use: Go to page 6-3 in the Lesson Book: Work with Dotplots to Compare Centers and Spreads

Notes, comments, and other information: In previous lessons, some ActivStats tutorials mentioned the word resistant.

Question: List 2 situations when the median is a better measure of center than the mean.

Answer: 1) when there are outliers and 2) when the data are skewed. Outliers and skewness affect the mean more than the median: the mean is “pulled” in the direction of the skewness or the outliers. Because the median is not affected (or not affected as much), we say the median is resistant to skewness and outliers.

Example 8.1: Calculate the mean and median of the following 5 numbers: 1, 1, 1, 1, 1000. Notice how the one outlier (1000) affects the mean (mean = 200.8) while it does not affect the median at all (median = 1). The median does a better job of describing the center of the data, while the mean does not describe any value in this data set.

A statistic is resistant if it is not affected, or only slightly affected, by unusual values or skewness.
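The arithmetic in Example 8.1 can be checked with a short sketch. It assumes Python and its standard statistics module, neither of which is part of the lesson materials, and the variable names are chosen only for illustration. It also recomputes both measures with the outlier removed to show how far each one moves.

```python
from statistics import mean, median

# Data from Example 8.1: four ordinary values and one outlier.
data = [1, 1, 1, 1, 1000]

data_mean = mean(data)      # (1 + 1 + 1 + 1 + 1000) / 5 = 200.8
data_median = median(data)  # middle value of the sorted data = 1

# Same data with the outlier dropped (illustrative comparison only).
no_outlier = [x for x in data if x != 1000]
trimmed_mean = mean(no_outlier)      # 1
trimmed_median = median(no_outlier)  # 1

print("with outlier:    mean =", data_mean, " median =", data_median)
print("without outlier: mean =", trimmed_mean, " median =", trimmed_median)
```

Removing the single outlier moves the mean from 200.8 to 1, while the median stays at 1. That is what resistance means in practice.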
Question: Describe a situation when neither the median nor the mean should be used to describe the center of the data.

Answer: when the data are symmetric and bimodal, as in the following example.

Example 8.2: Suppose there are 6 people in a room with ages 8, 9, 10, 39, 40, and 41. Calculate the mean and the median. Clearly, neither the mean (mean = 24.5) nor the median (median = 24.5) is representative of any age in the room. These ages form a bimodal shape, with the first mode representing children and the second mode representing adults. Furthermore, this bimodal distribution is symmetric. (Note how the mean and median are equal.) With such a shape, neither the mean nor the median should be used to describe the center, as these summary measures could mislead people into thinking that most people in the room are around 25 years old.

A bimodal distribution often indicates that an additional categorical variable is influencing the analysis. In this case, that variable is age group, which splits the people into children and adults. So, instead of analyzing everyone together, it is best to report a mean (or median) for the children (mean = median = 9 years old) and a mean (or median) for the adults (mean = median = 40 years old).
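The numbers in Example 8.2 can be reproduced with a minimal sketch, again assuming Python and its standard statistics module. The cutoff of 18 years used to separate children from adults is a hypothetical choice made for this illustration; the lesson itself does not state one.

```python
from statistics import mean, median

# Ages from Example 8.2.
ages = [8, 9, 10, 39, 40, 41]

overall_mean = mean(ages)      # 147 / 6 = 24.5
overall_median = median(ages)  # (10 + 39) / 2 = 24.5

# Hypothetical cutoff (an assumption, not from the lesson): under 18 is a child.
ADULT_AGE = 18
children = [a for a in ages if a < ADULT_AGE]
adults = [a for a in ages if a >= ADULT_AGE]

child_mean, child_median = mean(children), median(children)  # 9 and 9
adult_mean, adult_median = mean(adults), median(adults)      # 40 and 40

print("everyone: mean =", overall_mean, " median =", overall_median)
print("children: mean =", child_mean, " median =", child_median)
print("adults:   mean =", adult_mean, " median =", adult_median)
```

The overall mean and median agree with each other, yet no one in the room is close to 24.5 years old. The per-group summaries of 9 and 40 describe the two clusters far better.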