M316 Chapter 1 Dr. Berg

Picturing Distributions with Graphs

Statistics is about analyzing data. We first need to set some terminology.

Terminology

Individuals: These are the people or other things being described by the data.

Variables: These are characteristics of the individuals described by the data.

Categorical variables: These put an individual in one of several categories.

Examples: The variable "gender" categorizes a person as male or female. The variable "college class" categorizes a person as freshman, sophomore, junior, or senior.

Quantitative variables: These measure some numerical characteristic of the individual.

Examples: The variable "height" takes real number values. The variable "number of siblings" takes whole number values.

Distribution: This shows what values a variable takes and how often it takes them. It may be shown graphically or in a table.

Pie Charts and Bar Graphs

Pie charts and bar graphs give us a graphic representation of the distribution of a categorical variable.

In a pie chart, the size of each slice or sector represents a portion of the whole. The relative area of each sector is determined by its angle as a portion of 360 degrees, so find the angle by multiplying that category's proportion (its percentage divided by 100) by 360. For example, a category that makes up 25% of the whole gets a sector of 0.25 × 360 = 90 degrees. Pie charts are very popular in magazines and newspapers, but are seldom used in scientific publications.

Example: This pie chart shows the distribution of native English speakers.

In a bar graph, the length of each bar represents proportion or magnitude, but not necessarily a percentage of the whole.

Example: This bar graph shows percentages of people in various age groups who own an MP3 player.

Note that a pie chart would not be suitable for these data, since the categories are not parts of a single whole.
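The angle rule for pie charts described above can be sketched in a few lines of Python. This is only an illustration: the function name `pie_angles` and the category shares are hypothetical and are not data from the notes. It assumes the percentages are given on a 0 to 100 scale, so each one is divided by 100 before multiplying by 360.

```python
def pie_angles(percentages):
    """Convert category percentages (0-100 scale) into sector angles in degrees."""
    return {category: pct / 100 * 360 for category, pct in percentages.items()}


# Hypothetical shares, for illustration only (not data from the notes).
shares = {"A": 50.0, "B": 25.0, "C": 25.0}
angles = pie_angles(shares)

for category, angle in angles.items():
    print(f"{category}: {angle:.1f} degrees")
```

When the percentages sum to 100, the angles sum to 360, which is a quick check that the categories really are parts of one whole.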

Histograms

A histogram is a (usually vertical) bar graph where the height of each bar represents the number (or percentage) of individuals that fall into an interval of values. We normally make the intervals the same size and choose enough intervals to give a useful representation of the data.

Example: Table 1.1 in chapter 1 of the textbook gives the percentage of the population aged 25 and older with a bachelor's degree for each state and the District of Columbia. We treat the states as individuals and the percentage of adults with a bachelor's degree as the variable of interest. Using intervals of width 5, we get this histogram.

Remember to specify the intervals exactly, so that every value falls into exactly one interval. In this case they are

15.0 < percent with bachelor's degree ≤ 20.0
20.0 < percent with bachelor's degree ≤ 25.0
...
40.0 < percent with bachelor's degree ≤ 45.0

Interpreting Histograms

Look for the overall pattern and any striking deviations from that pattern. The overall pattern is described by its shape (symmetric, skewed left, skewed right), center (average, midpoint, or most common value), and spread (bunched together or spread out). An important kind of deviation is an outlier, an individual value that falls outside the overall pattern.

Example: Using the same data as the previous example with narrower intervals, we get this histogram.

This distribution is not very symmetric but is somewhat skewed to the right. One crude measure of spread is the range, which in this case is from 15 to 45 percent. Note that the District of Columbia is an outlier.

Stemplots

A stemplot is a quick and dirty way to generate a graphic representation of a distribution. For each value, the last digit is the leaf and the preceding digits form the stem. Write the stems in a vertical column with the smallest at the top (usually) and draw a line to the right of this column. Then write each leaf in a row to the right of its stem (usually in increasing order).
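
The construction steps above can be sketched in Python. This is a minimal sketch under stated assumptions: the values are non-negative whole numbers, the list is not empty, and the function name `stemplot` and the sample values are hypothetical (they are not data from the notes). Stems with no leaves are still printed so that gaps in the distribution remain visible.

```python
from collections import defaultdict


def stemplot(values):
    """Return the lines of a stemplot for non-negative whole numbers."""
    leaves = defaultdict(list)
    # Sorting first means each stem's leaves end up in increasing order.
    for value in sorted(values):
        stem, leaf = divmod(value, 10)  # last digit is the leaf
        leaves[stem].append(leaf)

    lines = []
    # Include every stem from smallest to largest, even those with no leaves.
    for stem in range(min(leaves), max(leaves) + 1):
        row = "".join(str(leaf) for leaf in leaves.get(stem, []))
        lines.append(f"{stem:>2} | {row}".rstrip())
    return lines


# Hypothetical values, for illustration only.
sample = [23, 12, 40, 15, 21, 23]
plot_lines = stemplot(sample)
print("\n".join(plot_lines))
```

Using `divmod(value, 10)` is one simple way to separate the stem from the leaf; data with decimals would first need to be rounded or rescaled to whole numbers.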
It is sometimes advantageous to "split" each stem into two intervals (leaves 0 through 4 and leaves 5 through 9) to get a better picture.

Example: Using the same data:

Exercise: Here are highway mileages for 2002 model two-seater cars.

24, 28, 31, 25, 27, 27, 21, 25, 23, 16, 23, 56, 26, 13, 28, 23, 19, 30, 26, 22, 27, 30

Make a stemplot and then a split stemplot for these data.

Timeplots

A timeplot plots each observation against the time at which it was measured. We look for trends in a timeplot.

Example: Here is a timeplot for the retail price of fresh oranges.
- Fall '08