Lecture 2: CHAPTER 1: SAMPLING AND DESCRIPTIVE STATISTICS
Graphical Summaries (Section 1.3, page 25)
Stem-and-Leaf Plots

Most of us already know what a histogram looks like. A histogram tells us how many observations fall into each class, but we lose information about the individual values within a class. A stem-and-leaf plot is used to overcome some of these difficulties.

Let’s do a little digit review: recall that the position of an individual digit in a number tells us the value the digit represents. For example, in the number 962.78, 9 is the 100’s digit, 6 is the 10’s digit, 2 is the 1’s digit, 7 is the .1’s digit, and 8 is the .01’s digit.

The stem-and-leaf plot uses selected digits (called stems) to group the sample into classes. Each individual observation is then represented by its stem and the next significant digit (the leaf); any digits after the leaf are dropped. This is called the “truncation” method.
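The digit review and the truncation method can be sketched in a few lines of Python. This is only an illustration: the function name `split_stem_leaf` and its `stem_unit` parameter are made up for this example, and the sketch assumes a non-negative number and a stem unit that is a whole power of 10 no smaller than 10.

```python
def split_stem_leaf(value, stem_unit):
    """Split a non-negative number into (stem, leaf) by truncation.

    stem_unit is the place value of the last stem digit (e.g. 100).
    The leaf is the single digit in the next place to the right.
    Assumption: stem_unit is an integer power of 10, at least 10.
    """
    leaf_unit = stem_unit // 10          # place value of the leaf digit
    stem = int(value // stem_unit)       # digits at and above the stem place
    leaf = int(value // leaf_unit) % 10  # next digit; later digits are dropped
    return stem, leaf


# 962.78 with the 100's digit as the stem: stem 9, leaf 6 (2, 7 and 8 are truncated)
print(split_stem_leaf(962.78, 100))  # (9, 6)
```

Floor division is used rather than rounding, so the digits to the right of the leaf are simply discarded, which is what truncation means here.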

EXAMPLE: Consider the following sample data of English Scholastic Aptitude Test (SAT) scores.

638 574 627 621 705 690 522 612 594 581 640 653 638 760 491

The data range from 491 to 760, so it is reasonable to group the sample using the 100’s digit. The number “638” is plotted as 6 | 3 (the 8 is truncated).
“6” is called the stem. The stem unit (SU) is 100, since the stem digit is the 100’s digit.
“3” is called the leaf. The leaf unit (LU) is 10, since the leaf digit is the 10’s digit.
6 | 3 stands for 630, which approximates the actual value of 638.
Note: The leaf always consists of a single digit [anything more is truncated].

To illustrate these ideas, we plot the sample as follows:
Initial Plot:
Stem | Leaf
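
As a rough sketch, the stem-and-leaf rows for the SAT sample can be built in Python as shown here. The helper names `stem_leaf_rows` and `render` are hypothetical. The first version keeps the leaves in the order the scores are listed. The second sorts the leaves within each stem, which is assumed here to be the usual tidied form of the plot.

```python
from collections import defaultdict

scores = [638, 574, 627, 621, 705, 690, 522, 612,
          594, 581, 640, 653, 638, 760, 491]

STEM_UNIT = 100  # SU: the stem is the 100's digit
LEAF_UNIT = 10   # LU: the leaf is the 10's digit


def stem_leaf_rows(data, sort_leaves):
    """Return {stem: [leaves]} using truncation (the 1's digit is dropped)."""
    rows = defaultdict(list)
    for x in data:
        rows[x // STEM_UNIT].append((x // LEAF_UNIT) % 10)
    if sort_leaves:
        for leaves in rows.values():
            leaves.sort()
    return dict(sorted(rows.items()))  # stems in increasing order


def render(rows):
    """Format one 'stem | leaves' line per stem."""
    return "\n".join(f"{stem} | {' '.join(map(str, leaves))}"
                     for stem, leaves in rows.items())


initial_plot = render(stem_leaf_rows(scores, sort_leaves=False))
ordered_plot = render(stem_leaf_rows(scores, sort_leaves=True))

print(initial_plot)
# 4 | 9
# 5 | 7 2 9 8
# 6 | 3 2 2 9 1 4 5 3
# 7 | 0 6
print()
print(ordered_plot)
# 4 | 9
# 5 | 2 7 8 9
# 6 | 1 2 2 3 3 4 5 9
# 7 | 0 6
```

Each line holds one leaf per observation, so the 15 scores give 15 leaves in total, and a line such as 6 | 3 reads back as roughly 630.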
Final Plot:
Stem | Leaf
LU = 10, SU = 100.
This plot is called a one-leaf-category-per-stem (LCPS) plot. The number of LCPS is simply the number of lines that share the same stem value.

Example: a two-leaf-category-per-stem plot:
Stem | Leaf
LU = 10, SU = 100.
Note: The increment of a stem-and-leaf plot is the distance from one line to the next.
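
A two-leaf-category-per-stem plot for the same SAT scores might be sketched as follows. The function `split_stem_rows` is a hypothetical helper. The sketch assumes the common convention that leaves 0 to 4 go on the first line of a stem and leaves 5 to 9 on the second, and that empty lines are still printed. It also measures the distance from one line to the next directly, using the smallest value each line can hold.

```python
scores = [638, 574, 627, 621, 705, 690, 522, 612,
          594, 581, 640, 653, 638, 760, 491]

STEM_UNIT = 100  # SU
LEAF_UNIT = 10   # LU
LCPS = 2         # leaf categories (lines) per stem


def split_stem_rows(data, lcps):
    """Return (line_start, stem, leaves) for every line of the plot.

    Assumption: lcps divides 10 evenly, so each line covers the same
    number of leaf digits (5 digits per line when lcps is 2).
    """
    digits_per_line = 10 // lcps
    stems = [x // STEM_UNIT for x in data]
    rows = []
    for stem in range(min(stems), max(stems) + 1):
        for part in range(lcps):
            low = part * digits_per_line
            high = low + digits_per_line
            leaves = sorted((x // LEAF_UNIT) % 10 for x in data
                            if x // STEM_UNIT == stem
                            and low <= (x // LEAF_UNIT) % 10 < high)
            # Smallest value that this line can represent.
            line_start = stem * STEM_UNIT + low * LEAF_UNIT
            rows.append((line_start, stem, leaves))
    return rows


rows = split_stem_rows(scores, LCPS)
for line_start, stem, leaves in rows:
    print(f"{stem} | {' '.join(map(str, leaves))}")

# Distance from each line to the next one.
line_starts = [row[0] for row in rows]
increments = {b - a for a, b in zip(line_starts, line_starts[1:])}
print(increments)  # {50}
```

With two lines per stem, every line covers a range of 50 (for example 600 to 649, then 650 to 699), so the distance between consecutive lines is 50 throughout.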

INCR = SU / #LCPS
For the above example:

Histograms
A set of raw data gives us very little information as to how the observations are distributed. For example, consider the
