Introduction to the Practice of Statistics
NINTH EDITION

David S. Moore
George P. McCabe
Bruce A. Craig
Purdue University

Vice President, STEM: Ben Roberts
Publisher: Terri Ward
Senior Acquisitions Editor: Karen Carson
Marketing Manager: Tom DeMarco
Marketing Assistant: Cate McCaffery
Development Editor: Jorge Amaral
Senior Media Editor: Catriona Kaplan
Assistant Media Editor: Emily Tenenbaum
Director of Digital Production: Keri deManigold
Senior Media Producer: Alison Lorber
Associate Editor: Victoria Garvey
Editorial Assistant: Katharine Munz
Photo Editor: Cecilia Varas
Photo Researcher: Candice Cheesman
Director of Design, Content Management: Diana Blume
Text and Cover Designer: Blake Logan
Project Editor: Edward Dionne, MPS North America LLC
Illustrations: MPS North America LLC
Production Manager: Susan Wein
Composition: MPS North America LLC
Printing and Binding: LSC Communications
Cover Illustration: Drawing Water: Spring 2011 detail (Midwest) by David Wicks
“Look Back” Arrow: NewCorner/Shutterstock

Library of Congress Control Number: 2016946039

Student Edition Hardcover:
ISBN-13: 978-1-319-01338-7
ISBN-10: 1-319-01338-4

Student Edition Loose-leaf:
ISBN-13: 978-1-319-01362-2
ISBN-10: 1-319-01362-7

Instructor Complimentary Copy:
ISBN-13: 978-1-319-01428-5
ISBN-10: 1-319-01428-3

© 2017, 2014, 2012, 2009 by W. H. Freeman and Company
All rights reserved
Printed in the United States of America
First printing

W. H. Freeman and Company
One New York Plaza
Suite 4500
New York, NY 10004-1562

Brief Contents

To Teachers: About This Book
To Students: What Is Statistics?
About the Authors
Data Table Index
Beyond the Basics Index

PART I Looking at Data
CHAPTER 1 Looking at Data—Distributions
CHAPTER 2 Looking at Data—Relationships
CHAPTER 3 Producing Data

PART II Probability and Inference
CHAPTER 4 Probability: The Study of Randomness
CHAPTER 5 Sampling Distributions
CHAPTER 6 Introduction to Inference
CHAPTER 7 Inference for Means
CHAPTER 8 Inference for Proportions

PART III Topics in Inference
CHAPTER 9 Inference for Categorical Data
CHAPTER 10 Inference for Regression
CHAPTER 11 Multiple Regression
CHAPTER 12 One-Way Analysis of Variance
CHAPTER 13 Two-Way Analysis of Variance

Tables
Answers to Odd-Numbered Exercises
Notes and Data Sources
Index

Contents

To Teachers: About This Book
To Students: What Is Statistics?
About the Authors
Data Table Index
Beyond the Basics Index

PART I Looking at Data

CHAPTER 1 Looking at Data—Distributions
  Introduction
  1.1 Data
    Key characteristics of a data set
    Section 1.1 Summary
    Section 1.1 Exercises
  1.2 Displaying Distributions with Graphs
    Categorical variables: Bar graphs and pie charts
    Quantitative variables: Stemplots and histograms
    Histograms
    Data analysis in action: Don’t hang up on me
    Examining distributions
    Dealing with outliers
    Time plots
    Section 1.2 Summary
    Section 1.2 Exercises
  1.3 Describing Distributions with Numbers
    Measuring center: The mean
    Measuring center: The median
    Mean versus median
    Measuring spread: The quartiles
    The five-number summary and boxplots
    The 1.5 × IQR rule for suspected outliers
    Measuring spread: The standard deviation
    Properties of the standard deviation
    Choosing measures of center and spread
    Changing the unit of measurement
    Section 1.3 Summary
    Section 1.3 Exercises
  1.4 Density Curves and Normal Distributions
    Density curves
    Measuring center and spread for density curves
    Normal distributions
    The 68–95–99.7 rule
    Standardizing observations
    Normal distribution calculations
    Using the standard Normal table
    Inverse Normal calculations
    Normal quantile plots
    Beyond the Basics: Density estimation
    Section 1.4 Summary
    Section 1.4 Exercises
  Chapter 1 Exercises

CHAPTER 2 Looking at Data—Relationships
  Introduction
  2.1 Relationships
    Examining relationships
    Section 2.1 Summary
    Section 2.1 Exercises
  2.2 Scatterplots
    Interpreting scatterplots
    The log transformation
    Adding categorical variables to scatterplots
    Scatterplot smoothers
    Categorical explanatory variables
    Section 2.2 Summary
    Section 2.2 Exercises
  2.3 Correlation
    The correlation r
    Properties of correlation
    Section 2.3 Summary
    Section 2.3 Exercises
  2.4 Least-Squares Regression
    Fitting a line to data
    Prediction
    Least-squares regression
    Interpreting the regression line
    Facts about least-squares regression
    Correlation and regression
    Another view of r²
    Section 2.4 Summary
    Section 2.4 Exercises
  2.5 Cautions about Correlation and Regression
    Residuals
    Outliers and influential observations
    Beware of the lurking variable
    Beware of correlations based on averaged data
    Beware of restricted ranges
    Beyond the Basics: Data mining
    Section 2.5 Summary
    Section 2.5 Exercises
  2.6 Data Analysis for Two-Way Tables
    The two-way table
    Joint distribution
    Marginal distributions
    Describing relations in two-way tables
    Conditional distributions
    Simpson’s paradox
    Section 2.6 Summary
    Section 2.6 Exercises
  2.7 The Question of Causation
    Explaining association
    Establishing causation
    Section 2.7 Summary
    Section 2.7 Exercises
  Chapter 2 Exercises

CHAPTER 3 Producing Data
  Introduction
  3.1 Sources of Data
    Anecdotal data
    Available data
    Sample surveys and experiments
    Section 3.1 Summary
    Section 3.1 Exercises
  3.2 Design of Experiments
    Comparative experiments
    Randomization
    Randomized comparative experiments
    How to randomize
    Randomization using software
    Randomization using random digits
    Cautions about experimentation
    Matched pairs designs
    Block designs
    Section 3.2 Summary
    Section 3.2 Exercises
  3.3 Sampling Design
    Simple random samples
    How to select a simple random sample
    Stratified random samples
    Multistage random samples
    Cautions about sample surveys
    Beyond the Basics: Capture-recapture sampling
    Section 3.3 Summary
    Section 3.3 Exercises
  3.4 Ethics
    Institutional review boards
    Informed consent
    Confidentiality
    Clinical trials
    Behavioral and social science experiments
    Section 3.4 Summary
    Section 3.4 Exercises
  Chapter 3 Exercises

PART II Probability and Inference

CHAPTER 4 Probability: The Study of Randomness
  Introduction
  4.1 Randomness
    The language of probability
    Thinking about randomness
    The uses of probability
    Section 4.1 Summary
    Section 4.1 Exercises
  4.2 Probability Models
    Sample spaces
    Probability rules
    Assigning probabilities: Finite number of outcomes
    Assigning probabilities: Equally likely outcomes
    Independence and the multiplication rule
    Applying the probability rules
    Section 4.2 Summary
    Section 4.2 Exercises
  4.3 Random Variables
    Discrete random variables
    Continuous random variables
    Normal distributions as probability distributions
    Section 4.3 Summary
    Section 4.3 Exercises
  4.4 Means and Variances of Random Variables
    The mean of a random variable
    Statistical estimation and the law of large numbers
    Thinking about the law of large numbers
    Beyond the Basics: More laws of large numbers
    Rules for means
    The variance of a random variable
    Rules for variances and standard deviations
    Section 4.4 Summary
    Section 4.4 Exercises
  4.5 General Probability Rules
    General addition rules
    Conditional probability
    General multiplication rules
    Tree diagrams
    Bayes’s rule
    Independence again
    Section 4.5 Summary
    Section 4.5 Exercises
  Chapter 4 Exercises

CHAPTER 5 Sampling Distributions
  Introduction
  5.1 Toward Statistical Inference
    Sampling variability
    Sampling distributions
    Bias and variability
    Sampling from large populations
    Why randomize?
    Section 5.1 Summary
    Section 5.1 Exercises
  5.2 The Sampling Distribution of a Sample Mean
    The mean and standard deviation of x̅
    The central limit theorem
    A few more facts
    Beyond the Basics: Weibull distributions
    Section 5.2 Summary
    Section 5.2 Exercises
  5.3 Sampling Distributions for Counts and Proportions
    The binomial distributions for sample counts
    Binomial distributions in statistical sampling
    Finding binomial probabilities
    Binomial mean and standard deviation
    Sample proportions
    Normal approximation for counts and proportions
    The continuity correction
    Binomial formula
    The Poisson distributions
    Section 5.3 Summary
    Section 5.3 Exercises
  Chapter 5 Exercises

CHAPTER 6 Introduction to Inference
  Introduction
    Overview of inference
  6.1 Estimating with Confidence
    Statistical confidence
    Confidence intervals
    Confidence interval for a population mean
    How confidence intervals behave
    Choosing the sample size
    Some cautions
    Section 6.1 Summary
    Section 6.1 Exercises
  6.2 Tests of Significance
    The reasoning of significance tests
    Stating hypotheses
    Test statistics
    P-values
    Statistical significance
    Tests for a population mean
    Two-sided significance tests and confidence intervals
    The P-value versus a statement of significance
    Section 6.2 Summary
    Section 6.2 Exercises
  6.3 Use and Abuse of Tests
    Choosing a level of significance
    What statistical significance does not mean
    Don’t ignore lack of significance
    Statistical inference is not valid for all sets of data
    Beware of searching for significance
    Section 6.3 Summary
    Section 6.3 Exercises
  6.4 Power and Inference as a Decision
    Power
    Increasing the power
    Inference as decision
    Two types of error
    Error probabilities
    The common practice of testing hypotheses
    Section 6.4 Summary
    Section 6.4 Exercises
  Chapter 6 Exercises

CHAPTER 7 Inference for Means
  Introduction
  7.1 Inference for the Mean of a Population
    The t distributions
    The one-sample t confidence interval
    The one-sample t test
    Matched pairs t procedures
    Robustness of the t procedures
    Beyond the Basics: The bootstrap
    Section 7.1 Summary
    Section 7.1 Exercises
  7.2 Comparing Two Means
    The two-sample z statistic
    The two-sample t procedures
    The two-sample t confidence interval
    The two-sample t significance test
    Robustness of the two-sample procedures
    Inference for small samples
    Software approximation for the degrees of freedom
    The pooled two-sample t procedures
    Section 7.2 Summary
    Section 7.2 Exercises
  7.3 Additional Topics on Inference
    Choosing the sample size
    Inference for non-Normal populations
    Section 7.3 Summary
    Section 7.3 Exercises
  Chapter 7 Exercises

CHAPTER 8 Inference for Proportions
  Introduction
  8.1 Inference for a Single Proportion
    Large-sample confidence interval for a single proportion
    Beyond the Basics: The plus four confidence interval for a single proportion
    Significance test for a single proportion
    Choosing a sample size for a confidence interval
    Choosing a sample size for a significance test
    Section 8.1 Summary
    Section 8.1 Exercises
  8.2 Comparing Two Proportions
    Large-sample confidence interval for a difference in proportions
    Beyond the Basics: The plus four confidence interval for a difference in proportions
    Significance test for a difference in proportions
    Choosing a sample size for two sample proportions
    Beyond the Basics: Relative risk
    Section 8.2 Summary
    Section 8.2 Exercises
  Chapter 8 Exercises

PART III Topics in Inference

CHAPTER 9 Inference for Categorical Data
  Introduction
  9.1 Inference for Two-Way Tables
    The hypothesis: No association
    Expected cell counts
    The chi-square test
    Computations
    Computing conditional distributions
    The chi-square test and the z test
    Beyond the Basics: Meta-analysis
    Section 9.1 Summary
    Section 9.1 Exercises
  9.2 Goodness of Fit
    Section 9.2 Summary
    Section 9.2 Exercises
  Chapter 9 Exercises

CHAPTER 10 Inference for Regression
  Introduction
  10.1 Simple Linear Regression
    Statistical model for linear regression
    Preliminary data analysis and inference considerations
    Estimating the regression parameters
    Checking model assumptions
    Confidence intervals and significance tests
    Confidence intervals for mean response
    Prediction intervals
    Transforming variables
    Beyond the Basics: Nonlinear regression
    Section 10.1 Summary
    Section 10.1 Exercises
  10.2 More Detail about Simple Linear Regression
    Analysis of variance for regression
    The ANOVA F test
    Calculations for regression inference
    Inference for correlation
    Section 10.2 Summary
    Section 10.2 Exercises
  Chapter 10 Exercises

CHAPTER 11 Multiple Regression
  Introduction
  11.1 Inference for Multiple Regression
    Population multiple regression equation
    Data for multiple regression
    Multiple linear regression model
    Estimation of the multiple regression parameters
    Confidence intervals and significance tests for regression coefficients
    ANOVA table for multiple regression
    Squared multiple correlation R²
    Section 11.1 Summary
    Section 11.1 Exercises
  11.2 A Case Study
    Preliminary analysis
    Relationships between pairs of variables
    Regression on high school grades
    Interpretation of results
    Examining the residuals
    Refining the model
    Regression on SAT scores
    Regression using all variables
    Test for a collection of regression coefficients
    Beyond the Basics: Multiple logistic regression
    Section 11.2 Summary
    Section 11.2 Exercises
  Chapter 11 Exercises

CHAPTER 12 One-Way Analysis of Variance
  Introduction
  12.1 Inference for One-Way Analysis of Variance
    Data for one-way ANOVA
    Comparing means
    The two-sample t statistic
    An overview of ANOVA
    The ANOVA model
    Estimates of population parameters
    Testing hypotheses in one-way ANOVA
    The ANOVA table
    The F test
    Software
    Beyond the Basics: Testing the equality of spread
    Section 12.1 Summary
    Section 12.1 Exercises
  12.2 Comparing the Means
    Contrasts
    Multiple comparisons
    Power
    Section 12.2 Summary
    Section 12.2 Exercises
  Chapter 12 Exercises

CHAPTER 13 Two-Way Analysis of Variance
  Introduction
  13.1 The Two-Way ANOVA Model
    Advantages of two-way ANOVA
    The two-way ANOVA model
    Main effects and interactions
  13.2 Inference for Two-Way ANOVA
    The ANOVA table for two-way ANOVA
  Chapter 13 Summary
  Chapter 13 Exercises

Tables
Answers to Odd-Numbered Exercises
Notes and Data Sources
Index

To Teachers: About This Book

Statistics is the science of data. Introduction to the Practice of Statistics (IPS) is an introductory text based on this principle. We present methods of basic statistics in a way that emphasizes working with data and mastering statistical reasoning. IPS is elementary in mathematical level but conceptually rich in statistical ideas. After completing a course based on our text, we would like students to be able to think objectively about conclusions drawn from data and use statistical methods in their own work.

In IPS, we combine attention to basic statistical concepts with a comprehensive presentation of the elementary statistical methods that students will find useful in their work. IPS has been successful for several reasons:

1. IPS examines the nature of modern statistical practice at a level suitable for beginners. We focus on the production and analysis of data as well as the traditional topics of probability and inference.

2. IPS has a logical overall progression, so data production and data analysis are a major focus, while inference is treated as a tool that helps us draw conclusions from data in an appropriate way.

3. IPS presents data analysis as more than a collection of techniques for exploring data. We emphasize systematic ways of thinking about data. Simple principles guide the analysis: always plot your data; look for overall patterns and deviations from them; when looking at the overall pattern of a distribution for one variable, consider shape, center, and spread; for relations between two variables, consider form, direction, and strength; always ask whether a relationship between variables is influenced by other variables lurking in the background. We warn students about pitfalls in clear cautionary discussions.

4. IPS uses real examples to drive the exposition. Students learn the technique of least-squares regression and how to interpret the regression slope. But they also learn the conceptual ties between regression and correlation and the importance of looking for influential observations.

5. IPS is aware of current developments both in statistical science and in teaching statistics. Brief, optional Beyond the Basics sections give quick overviews of topics such as density estimation, scatterplot smoothers, data mining, nonlinear regression, and meta-analysis. Chapter 16 gives an elementary introduction to the bootstrap and other computer-intensive statistical methods.

The title of the book expresses our intent to introduce readers to statistics as it is used in practice. Statistics in practice is concerned with drawing conclusions from data. We focus on problem solving rather than on methods that may be useful in specific settings.

GAISE

The College Report of the Guidelines for Assessment and Instruction in Statistics Education (GAISE) Project was funded by the American Statistical Association to make recommendations for how introductory statistics courses should be taught. This report and its update contain many interesting teaching suggestions, and we strongly recommend that you read them. The philosophy and approach of IPS closely reflect the GAISE recommendations. Let’s examine each of the latest recommendations in the context of IPS.

1. Teach statistical thinking. Through our experiences as applied statisticians, we are very familiar with the components that are needed for the appropriate use of statistical methods. We focus on formulating questions, collecting and finding data, evaluating the quality of data, exploring the relationships among variables, performing statistical analyses, and drawing conclusions. In examples and exercises throughout the text, we emphasize putting the analysis in the proper context and translating numerical and graphical summaries into conclusions.

2. Focus on conceptual understanding. With the software available today, it is very easy for almost anyone to apply a wide variety of statistical procedures, both simple and complex, to a set of data. Without a firm grasp of the concepts, such applications are frequently meaningless. By using the methods that we present on real sets of data, we believe that students will gain an excellent understanding of these concepts. Our emphasis is on the input (questions of interest, collecting or finding data, examining data) and the output (conclusions) for a statistical analysis. Formulas are given only where they will provide some insight into concepts.

3. Integrate real data with a context and a purpose. Many of the examples and exercises in IPS include data that we have obtained from collaborators or consulting clients. Other data sets have come from research related to these activities. We have also used the Internet as a data source, particularly for data related to social media and other topics of interest to undergraduates. Our emphasis on real data, rather than artificial data chosen to illustrate a calculation, serves to motivate students and help them see the usefulness of statistics in everyday life. We also frequently encounter interesting statistical issues that we explore. These include outliers and nonlinear relationships. All data sets are available from the text website.

4. Foster active learning in the classroom. As we mentioned earlier, we believe that statistics is exciting as something to do rather than something to talk about. Throughout the text, we provide exercises in Use Your Knowledge sections that ask the students to perform some relatively simple tasks that reinforce the material just presented. Other exercises are particularly suited to being worked on and discussed within a classroom setting.

5. Use technology for developing concepts and analyzing data. Technology has altered statistical practice in a fundamental way. In the past, some of the calculations that we performed were particularly difficult and tedious. In other words, they were not fun. Today, freed from the burden of computation by software, we can concentrate our efforts on the big picture: what questions are we trying to address with a study and what can we conclude from our analysis?

6. Use assessments to improve and evaluate student learning. Our goal for students who complete a course based on IPS is that they are able to design and carry out a statistical study for a project in their capstone course or other setting. Our exercises are oriented toward this goal. Many ask about the design of a statistical study and the collection of data. Others ask...