David S. Moore, George P. McCabe, Bruce Craig - Introduction to the Practice of Statistics, Instruct - Chapter 1 Solutions

Chapter 1 Solutions

1.1. Most students will prefer to work in seconds, to avoid having to work with decimals or fractions.

1.2. Who? The individuals in the data set are students in a statistics class. What? There are eight variables: ID (a label, with no units); Exam1, Exam2, Homework, Final, and Project (in units of “points,” scaled from 0 to 100); TotalPoints (in points, computed from the other scores, on a scale of 0 to 900); and Grade (A, B, C, D, and E). Why? The primary purpose of the data is to assign grades to the students in this class, and (presumably) the variables are appropriate for this purpose. (The data might also be useful for other purposes.)

1.3. Exam1 = 79, Exam2 = 88, Final = 88.

1.4. For this student, TotalPoints = 2 · 86 + 2 · 82 + 3 · 77 + 2 · 90 + 80 = 827, so the grade is B.

1.5. The cases are apartments. There are five variables: rent (quantitative), cable (categorical), pets (categorical), bedrooms (quantitative), distance to campus (quantitative).

1.6. (a) To find injuries per worker, divide the rates in Example 1.6 by 100,000 (or redo the computations without multiplying by 100,000). For wage and salary workers, there are 0.000034 fatal injuries per worker. For self-employed workers, there are 0.000099 fatal injuries per worker. (b) These rates are 1/10 the size of those in Example 1.6, or 10,000 times larger than those in part (a): 0.34 fatal injuries per 10,000 wage/salary workers, and 0.99 fatal injuries per 10,000 self-employed workers. (c) The rates in Example 1.6 would probably be more easily understood by most people, because numbers like 3.4 and 9.9 feel more “familiar.” (It might be even better to give rates per million workers: 34 and 99.)

1.7. Shown are two possible stemplots; the first uses split stems (described on page 11 of the text). The scores are slightly left-skewed; most range from 70 to the low 90s.
Split stems:
5 | 58
6 | 0
6 | 58
7 | 0023
7 | 5558
8 | 00003
8 | 5557
9 | 0002233
9 | 8

Single stems:
5 | 58
6 | 058
7 | 00235558
8 | 000035557
9 | 00022338

1.8. Preferences will vary. However, the stemplot in Figure 1.8 shows a bit more detail, which is useful for comparing the two distributions.

1.9. (a) The stemplot of the altered data is shown below. (b) Blank stems should always be retained (except at the beginning or end of the stemplot), because the gap in the distribution is an important piece of information about the data.

1 | 6
2 |
2 | 5568
3 | 34
3 | 55678
4 | 012233
4 | 8
5 | 1

1.10. Student preferences will vary. The stemplot has the advantage of showing each individual score. Note that this histogram has the same shape as the second histogram in Exercise 1.7.

[Histogram: frequency of first exam scores]

1.11. Student preferences may vary, but the larger classes in this histogram hide a lot of detail.

[Histogram: frequency of first exam scores, wider classes]

1.12. This histogram shows more details about the distribution (perhaps more detail than is useful). Note that this histogram has the same shape as the first histogram in the solution to Exercise 1.7.

[Histogram: frequency of first exam scores, narrower classes]

1.13. Using either a stemplot or histogram, we see that the distribution is left-skewed, centered near 80, and spread from 55 to 98. (Of course, a histogram would not show the exact values of the maximum and minimum.)

1.14. (a) The cases are the individual employees. (b) The first four (employee identification number, last name, first name, and middle initial) are labels. Department and education level are categorical variables; number of years with the company, salary, and age are quantitative variables. (c) Column headings in student spreadsheets will vary, as will sample cases.

1.15. A Web search for “city rankings” or “best cities” will yield lots of ideas, such as crime rates, income, cost of living, entertainment and cultural activities, taxes, climate, and school system quality. (Students should be encouraged to think carefully about how some of these might be quantitatively measured.)

1.16. Recall that categorical variables place individuals into groups or categories, while quantitative variables “take numerical values for which arithmetic operations. . . make sense.” Variables (a), (d), and (e), namely age, amount spent on food, and height, are quantitative. The answers to the other three questions (about dancing, musical instruments, and broccoli) are categorical variables.

1.18. Student answers will vary. A Web search for “college ranking methodology” gives some ideas; in a recent year, U.S. News and World Report used “16 measures of academic excellence,” including academic reputation (measured by surveying college and university administrators), retention rate, graduation rate, class sizes, faculty salaries, student-faculty ratio, percentage of faculty with highest degree in their fields, quality of entering students (ACT/SAT scores, high school class rank, enrollment-to-admission ratio), financial resources, and the percentage of alumni who give to the school.

1.19. For example, blue is by far the most popular choice; 70% of respondents chose 3 of the 10 options (blue, green, and purple).

[Bar graph: percent of respondents by favorite color]

1.20. For example, opinions about least-favorite color are somewhat more varied than favorite colors. Interestingly, purple is liked and disliked by about the same fractions of people.

[Bar graph: percent of respondents by least favorite color]

1.21. (a) There were 232 total respondents. The table that follows gives the percents; for example, 10/232 ≈ 4.31%. (b) The bar graph accompanies the table. (c) For example, 87.5% of the group were between 19 and 50. (d) The age-group classes do not have equal width: The first is 18 years wide, the second is 6 years wide, the third is 11 years wide, etc. Note: In order to produce a histogram from the given data, the bar for the first age group would have to be three times as wide as the second bar, the third bar would have to be wider than the second bar by a factor of 11/6, etc. Additionally, if we change a bar’s width by a factor of x, we would need to change that bar’s height by a factor of 1/x.

Age group (years)   Percent
1 to 18              4.31%
19 to 24            41.81%
25 to 35            30.17%
36 to 50            15.52%
51 to 69             6.03%
70 and over          2.16%

[Bar graph: percent by age group (years)]

1.22. (a) & (b) The bar graph and pie chart are shown below. (c) A clear majority (76%) agree or strongly agree that they browse more with the iPhone than with their previous phone. (d) Student preferences will vary. Some might prefer the pie chart because it is more familiar.

[Bar graph and pie chart: response percent for strongly agree, mildly agree, mildly disagree, and strongly disagree]

1.23. Ordering bars by decreasing height shows the models most affected by iPhone sales. However, because “other phone” and “replaced nothing” are different from the other categories, it makes sense to place those two bars last (in any order).

[Bar graph: replacement percent by previous phone model]

1.24. (a) The weights add to 254.2 million tons, and the percents add to 99.9.
(b) & (c) The bar graph and pie chart are shown below.

[Bar graph and pie chart: percent of total waste by source]

1.25. (a) & (b) Both bar graphs are shown below. (c) The ordered bars in the graph from (b) make it easier to identify those materials that are frequently recycled and those that are not. (d) Each percent represents part of a different whole. (For example, 2.6% of food scraps are recycled; 23.7% of glass is recycled, etc.)

[Two bar graphs: percent recycled by material]

1.26. (a) The bar graph is shown below. (b) The graph clearly illustrates the dominance of Google; its bar dwarfs those of the other search engines.

[Bar graph: market share (%) by search engine]

1.27. The two bar graphs are shown below.

[Two bar graphs: percent of all spam by type of spam]

1.28. (a) The bar graph is below. (b) The number of Facebook users trails off rapidly after the top seven or so. (Of course, this is due in part to the variation in the populations of these countries. For example, that Norway has nearly half as many Facebook users as France is remarkable, because the 2008 populations of France and Norway were about 62.3 million and 4.8 million, respectively.)

[Bar graph: Facebook users (millions) by country]

1.29. (a) Most countries had moderate (single- or double-digit) increases in Facebook usage. Chile (2197%) is an extreme outlier, as are (maybe) Venezuela (683%) and Colombia (246%). (b) In the stemplot below, Chile and Venezuela have been omitted, and stems are split five ways. (c) One observation is that, even without the outliers, the distribution is right-skewed. (d) The stemplot can show some of the detail of the low part of the distribution, if the outliers are omitted.

0 | 000
0 | 2333
0 | 4444
0 | 6
0 | 99
1 |
1 | 33
1 |
1 |
1 |
2 |
2 |
2 | 4

1.30. (a) The given percentages refer to nine distinct groups (all M.B.A. degrees, all M.Ed. degrees, and so on) rather than one single group. (b) The bar graph is shown below. Bars are ordered by height, as suggested by the text; students may forget to do this or might arrange in the opposite order (smallest to largest).

[Bar graph: degrees earned by women (%) by graduate degree]

1.31. (a) The luxury car bar graph is the first one below; bars are in decreasing order of size (the order given in the table). (b) The intermediate car bar graph is the second one below. For this stand-alone graph, it seemed appropriate to re-order the bars by decreasing size. Students may leave the bars in the order given in the table; this (admittedly) might make comparison of the two graphs simpler. (c) The third graph is one possible choice for comparing the two types of cars: for each color, we have one bar for each car type.

[Three bar graphs: percent of luxury cars by color; percent of intermediate cars by color; both car types side by side by color]

1.32. This distribution is skewed to the right, meaning that Shakespeare’s plays contain many short words (up to six letters) and fewer very long words.
We would probably expect most authors to have skewed distributions, although the exact shape and spread will vary.

1.33. Shown is the stemplot; as the text suggests, we have trimmed numbers (dropped the last digit) and split stems. 359 mg/dl appears to be an outlier. Overall, glucose levels are not under control: Only 4 of the 18 had levels in the desired range.

0 | 799
1 | 0134444
1 | 5577
2 | 0
2 | 57
3 |
3 | 5

1.34. The back-to-back stemplot below suggests that the individual-instruction group was more consistent (their numbers have less spread) but not more successful (only two had numbers in the desired range).

Individual |   | Class
           | 0 | 799
        22 | 1 | 0134444
  99866655 | 1 | 5577
     22222 | 2 | 0
         8 | 2 | 57
           | 3 |
           | 3 | 5

1.35. The distribution is roughly symmetric, centered near 7 (or “between 6 and 7”), and spread from 2 to 13.

1.36. (a) Total emissions would almost certainly be higher for very large countries; for example, we would expect that even with great attempts to control emissions, China (with over 1 billion people) would have higher total emissions than the smallest countries in the data set. (b) A stemplot is shown; a histogram would also be appropriate. We see a strong right skew with a peak from 0 to 0.2 metric tons per person and a smaller peak from 0.8 to 1. The three highest countries (the United States, Canada, and Australia) appear to be outliers; apart from those countries, the distribution is spread from 0 to 11 metric tons per person.

0 | 00000000000000011111
0 | 222233333
0 | 445
0 | 6677
0 | 888999
1 | 001
1 |
1 |
1 | 67
1 | 9

1.37. To display the distribution, use either a stemplot or a histogram. DT scores are skewed to the right, centered near 5 or 6, spread from 0 to 18. There are no outliers. We might also note that only 11 of these 264 women (about 4%) scored 15 or higher.

0 | 000000000000000000000000000000000000011111111111111111111
0 | 2222222222222222233333333333333333333333
0 | 444444444444444444445555555555555555555
0 | 666666666666666666667777777777777
0 | 888888888888888999999999999999999
1 | 000000000000111111111
1 | 22222222222233333333333
1 | 444444455
1 | 66666777
1 | 8

1.38. (a) The first histogram shows two modes: 5–5.2 and 5.6–5.8. (b) The second histogram has peaks in locations close to those of the first, but these peaks are much less pronounced, so they would usually not be viewed as distinct modes. (c) The results will vary with the software used.

[Two histograms: frequency of rainwater pH, with classes starting at 4.2 and at 4.14]

1.39. Graph (a) is studying time (Question 4); it is reasonable to expect this to be right-skewed (many students study little or not at all; a few study longer). Graph (d) is the histogram of student heights (Question 3): One would expect a fair amount of variation but no particular skewness to such a distribution. The other two graphs are (b) handedness and (c) gender, unless this was a particularly unusual class! We would expect that right-handed students should outnumber lefties substantially. (Roughly 10 to 15% of the population as a whole is left-handed.)

1.40. Sketches will vary. The distribution of coin years would be left-skewed because newer coins are more common than older coins.

Back-to-back stemplot for Exercise 1.41:

          Women |   | Men
                | 0 | 033334
             96 | 0 | 66679999
       22222221 | 1 | 2222222
888888888875555 | 1 | 558
           4440 | 2 | 00344
                | 2 |
                | 3 | 0
              6 | 3 |

1.41. (a) Not only are most responses multiples of 10; many are multiples of 30 and 60. Most people will “round” their answers when asked to give an estimate like this; in fact, the most striking answers are ones such as 115, 170, or 230. The students who claimed 360 minutes (6 hours) and 300 minutes (5 hours) may have been exaggerating. (Some students might also “consider suspicious” the student who claimed to study 0 minutes per night.
As a teacher, I can easily believe that such students exist, and I suspect that some of your students might easily accept that claim as well.) (b) The stemplots suggest that women (claim to) study more than men. The approximate centers are 175 minutes for women and 120 minutes for men.

1.42. The stemplot gives more information than a histogram (since all the original numbers can be read off the stemplot), but both give the same impression. The distribution is roughly symmetric with one value (4.88) that is somewhat low. The center of the distribution is between 5.4 and 5.5 (the median is 5.46, the mean is 5.448); if asked to give a single estimate for the “true” density of the earth, something in that range would be the best answer.

48 | 8
49 |
50 | 7
51 | 0
52 | 6799
53 | 04469
54 | 2467
55 | 03578
56 | 12358
57 | 59
58 | 5

1.43. (a) There are four variables: GPA, IQ, and self-concept are quantitative, while gender is categorical. (OBS is not a variable, since it is not really a “characteristic” of a student.) (b) Below. (c) The distribution is skewed to the left, with center (median) around 7.8. GPAs are spread from 0.5 to 10.8, with only 15 below 6. (d) There is more variability among the boys; in fact, there seems to be a subset of boys with GPAs from 0.5 to 4.9. Ignoring that group, the two distributions have similar shapes.

All students:
 0 | 5
 1 | 8
 2 | 4
 3 | 4689
 4 | 0679
 5 | 1259
 6 | 0112249
 7 | 22333556666666788899
 8 | 0000222223347899
 9 | 002223344556668
10 | 01678

  Female |    | Male
         |  0 | 5
         |  1 | 8
         |  2 | 4
       4 |  3 | 689
       7 |  4 | 069
     952 |  5 | 1
    4210 |  6 | 129
98866533 |  7 | 223566666789
  997320 |  8 | 0002222348
   65300 |  9 | 2223445668
     710 | 10 | 68

1.44. The stemplot is shown below, with split stems. The distribution is fairly symmetric (perhaps slightly left-skewed) with center around 110 (clearly above 100). IQs range from the low 70s to the high 130s, with a “gap” in the low 80s.

 7 | 24
 7 | 79
 8 |
 8 | 69
 9 | 0133
 9 | 6778
10 | 0022333344
10 | 555666777789
11 | 0000111122223334444
11 | 55688999
12 | 003344
12 | 677888
13 | 02
13 | 6

1.45. The stemplot is shown below, with split stems. The distribution is skewed to the left, with center around 59.5. Most self-concept scores are between 35 and 73, with a few below that, and one high score of 80 (but not really high enough to be an outlier).

2 | 01
2 | 8
3 | 0
3 | 5679
4 | 02344
4 | 6799
5 | 1111223344444
5 | 556668899
6 | 00001233344444
6 | 55666677777899
7 | 0000111223
7 |
8 | 0

1.46. The time plot shows that women’s times decreased quite rapidly from 1972 until the mid-1980s. Since that time, they have been fairly consistent: Almost all times since 1986 are between 141 and 147 minutes.

[Time plot: winning time (minutes) versus year, 1970 to 2005]

1.47. The total for the 24 countries was 897 days, so with Suriname, it is 897 + 694 = 1591 days, and the mean is x̄ = 1591/25 = 63.64 days.

1.48. The mean score is x̄ = 821/10 = 82.1.

1.49. To find the ordered list of times, start with the 24 times in Example 1.23, and add 694 to the end of the list. The ordered times (with the median in brackets) are

4, 11, 14, 23, 23, 23, 23, 24, 27, 29, 31, 33, [40], 42, 44, 44, 44, 46, 47, 60, 61, 62, 65, 77, 694

The outlier increases the median from 36.5 to 40 days, but the change is much less than the outlier’s effect on the mean.

1.50. The median of the service times is 103.5 seconds. (This is the average of the 40th and 41st numbers in the sorted list, but for a set of 80 numbers, we assume that most students will compute the median using software, which does not require that the data be sorted.)

1.51. In order, the scores are:

55, 73, 75, 80, [80], [85], 90, 92, 93, 98

The middle two scores are 80 and 85, so the median is M = (80 + 85)/2 = 82.5.

1.52. See the ordered list given in the previous solution. The first quartile is Q1 = 75, the median of the first five numbers: 55, 73, [75], 80, 80. Similarly, Q3 = 92, the median of the last five numbers: 85, 90, [92], 93, 98.

1.53. The maximum and minimum can be found by inspecting the list. The sorted list (read down each column in turn) is

 1   19   55   75  104  140  201   372
 2   25   56   76  106  141  203   386
 2   30   57   76  115  143  211   438
 3   35   59   77  116  148  225   465
 4   40   64   80  118  148  274   479
 9   44   67   88  121  157  277   700
 9   48   68   89  126  178  289   700
 9   51   73   90  128  179  290   951
11   52   73  102  137  182  325  1148
19   54   75  103  138  199  367  2631

The quartile and median locations fall between 54 and 55, between 103 and 104, and between 199 and 201. This confirms the five-number summary (1, 54.5, 103.5, 200, and 2631 seconds) given in Example 1.26. The sum of the 80 numbers is 15,726 seconds, so the mean is x̄ = 15,726/80 = 196.575 seconds (the value 197 in the text was rounded)...
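
The arithmetic in Solutions 1.47 through 1.52 can be checked with a few lines of software. What follows is a minimal sketch in Python, not part of the original solutions: the helper name quartiles and the variable names are illustrative assumptions, and the quartile rule coded here (medians of the lower and upper halves, leaving out the overall median when the count is odd) is the one these solutions appear to use. Other software may apply a different quartile rule and give slightly different values.

```python
from statistics import mean, median


def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    When the number of observations is odd, the overall median is left
    out of both halves (assumed convention; software defaults may differ).
    """
    data = sorted(values)
    half = len(data) // 2
    return median(data[:half]), median(data[len(data) - half:])


# Exam scores listed in Solution 1.51.
scores = [55, 73, 75, 80, 80, 85, 90, 92, 93, 98]

# The 24 times listed in Solution 1.49, then the same list with the outlier 694.
times_24 = [4, 11, 14, 23, 23, 23, 23, 24, 27, 29, 31, 33,
            40, 42, 44, 44, 44, 46, 47, 60, 61, 62, 65, 77]
times_25 = times_24 + [694]

# Mean, median, and quartiles of the exam scores (Solutions 1.48, 1.51, 1.52).
print(mean(scores), median(scores), quartiles(scores))

# Total and mean of the times, with and without the outlier (Solution 1.47).
print(sum(times_24), mean(times_25))

# Median before and after adding the outlier (Solution 1.49).
print(median(times_24), median(times_25))
```

Running the script should reproduce the values quoted in those solutions; comparing the two medians with the two means shows how much less the outlier moves the median.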