Statistics -- MATH1020 -- Red River College -- Version 2015RevB -- DRAFT 2015-10-27

Collection edited by: Claude Laflamme
Content authors: OpenStax Business Statistics, Claude Laflamme, and OpenStax
Based on: Introductory Statistics

This selection and arrangement of content as a collection is copyrighted by Claude Laflamme.
Creative Commons Attribution License 4.0
Collection structure revised: 2015/10/27
PDF Generated: 2020/01/10 18:11:11
For copyright and attribution information for the modules contained in this collection, see the "Attributions" section at the end of the collection.

This OpenStax book is available for free at

Table of Contents

Preface -- RRC MATH1020 adaptation -- Version 2015 Revision B
Chapter 1: Sampling and Data
  1.1 Definitions of Statistics, Probability, and Key Terms
  1.2 Data, Sampling, and Variation in Data and Sampling
  1.3 Levels of Measurement
  1.4 Experimental Design and Ethics
Chapter 2: Descriptive Statistics
  2.1 Display Data
  2.2 Measures of the Location of the Data
  2.3 Measures of the Center of the Data
  2.4 Sigma Notation and Calculating the Arithmetic Mean
  2.5 Skewness and the Mean, Median, and Mode
  2.6 Measures of the Spread of the Data
Chapter 3: Probability Topics
  3.1 Terminology
  3.2 Independent and Mutually Exclusive Events
  3.3 Two Basic Rules of Probability
  3.4 Contingency Tables and Probability Trees
Chapter 4: Discrete Random Variables
  4.1 Binomial Distribution
  4.2 Poisson Distribution
Chapter 5: The Normal Distribution
  5.1 The Standard Normal Distribution
  5.2 Using the Normal Distribution
Chapter 6: The Central Limit Theorem
  6.1 The Central Limit Theorem for Sample Means
  6.2 Using the Central Limit Theorem
Chapter 7: Confidence Intervals
  7.1 A Confidence Interval for a Population Standard Deviation, Known or Large Sample Size
  7.2 A Confidence Interval for a Population Standard Deviation Unknown, Small Sample Case
  7.3 A Confidence Interval for A Population Proportion
  7.4 Calculating the Sample Size n: Continuous and Binary Random Variables
Chapter 8: Hypothesis Testing with One Sample
  8.1 Null and Alternative Hypotheses
  8.2 Outcomes and the Type I and Type II Errors
  8.3 Distribution Needed for Hypothesis Testing
  8.4 Full Hypothesis Test Examples
  8.5 Rare Events, the Sample, Decision and Conclusion
Chapter 9: Hypothesis Testing with Two Samples
  9.1 Comparing Two Independent Population Means
  9.2 Comparing Two Independent Population Proportions
  9.3 Two Population Means with Known Standard Deviations
  9.4 Matched or Paired Samples
Appendix A: Statistical Tables
Appendix B: Mathematical Phrases, Symbols, and Formulas
Index

PREFACE -- RRC MATH1020 ADAPTATION -- VERSION 2015 REVISION B

About Introductory Statistics

This text has been adapted specifically for MATH1020 at Red River College. It is designed for the one-semester, introduction to statistics course, geared toward business students. This text assumes students have been exposed to intermediate algebra, and it focuses on the applications of statistical knowledge rather than the theory behind it.
The foundation of this textbook is Introductory Statistics, by Barbara Illowsky and Susan Dean. Additional topics, examples, and ample opportunities for practice have been added to each chapter. The development choices for this textbook were made with the guidance of many faculty members who are deeply involved in teaching this course. These choices led to innovations in art, terminology, and practical applications, all with a goal of increasing relevance and accessibility for students. We strove to make the discipline meaningful, so that students can draw from it a working knowledge that will enrich their future studies and help them make sense of the world around them.

Coverage and Scope

Chapter 1 Sampling and Data
Chapter 2 Descriptive Statistics
Chapter 3 Probability Topics
Chapter 4 Discrete Random Variables
Chapter 5 The Normal Distribution
Chapter 6 The Central Limit Theorem
Chapter 7 Confidence Intervals
Chapter 8 Hypothesis Testing with One Sample
Chapter 9 Hypothesis Testing with Two Samples

Pedagogical Foundation and Features

• Examples are placed strategically throughout the text to show students the step-by-step process of interpreting and solving statistical problems. To keep the text relevant for students, the examples are drawn from a broad spectrum of practical topics; these include examples about college life and learning, health and medicine, retail and business, and sports and entertainment.
• Practice, Homework, and Bringing It Together problems give the students problems at various degrees of difficulty while also including real-world scenarios to engage students.
Ancillaries

• Elementary Business Statistics Course, janux.ou.edu

About Our Team

Senior Contributing Authors
Barbara Illowsky, De Anza College
Susan Dean, De Anza College

University of Oklahoma Contributors
Alexander Holmes, Regent's Professor of Economics, University of Oklahoma
Kevin Hadley, Analyst, Federal Reserve Bank of Kansas City
Mathew Price, Research Assistant, University of Oklahoma

Red River College Contributors
List contributors.

Preface to OpenStax College's Introductory Statistics: Red River Custom Edition

OpenStax College’s Introductory Statistics by senior contributing writers Barbara Illowsky and Susan Dean is a complete text in itself, and thus the creation of a custom edition requires some rationale for all the effort that went into it. This custom edition for Red River College builds from the University of Oklahoma's adaptation of Introductory Statistics and maintains, for the most part, the structure of the material. Only the order of the later chapters on the chi-squared distribution and the F distribution changes. The discrete probability density functions have been reordered in a way that is felt to provide a logical development of probability density functions, from simple counting formulas to more complex continuous distributions.

What has been preserved, and is a true foundation stone of both texts, are the homework assignments and examples. Many additional homework assignments have been added, and new examples that use a more mathematical approach are in the new text, but the wealth of examples, mostly with answers, is critical to student success and a keystone of this custom edition of Introductory Statistics.

What differentiates this text from its foundation document grows out of a difference in philosophy toward the use of mathematical formulas. The significant and important work of the foundation text to help students master the Texas Instruments calculator has been discarded.
All required calculations are within the capability of a $2.00 calculator, until regression, correlation, and ANOVA, of course. It is my belief that students lose much if they do not see the formulas in action and develop a “feel” for what they are doing with the data. This requires additional material that helps students understand the combinatorial formula and factorials, as well as sigma notation, otherwise carried by the calculator. This difference in perspective then changes the acceptance/rejection rule for hypothesis testing to comparisons between calculated test statistics versus p-values. The terminology of confidence intervals and the process of finding probabilities also change, now including reliance upon statistical tables.

Laying more emphasis on the development of the mathematical formulas requires a closer link to the fundamental theorem of inferential statistics, the Central Limit Theorem. This relationship is developed in the foundation text and given its proper critical role in statistical theory. This custom edition of Introductory Statistics repeats this link in each section for each test statistic developed: tests for proportions, for differences in means, and for differences in proportions.

This RRC custom edition of Introductory Statistics owes much to the work of Dr. Illowsky and Ms. Dean in OpenStax College’s Introductory Statistics, and to its subsequent adaptation by Dr. Alexander Holmes and his team at the University of Oklahoma.

1 | SAMPLING AND DATA

Figure 1.1 We encounter statistics in our daily lives more often than we probably realize and from many different sources, like the news. (credit: David Sim)

Introduction

You are probably asking yourself the question, "When and where will I use statistics?" If you read any newspaper, watch television, or use the Internet, you will see statistical information.
There are statistics about crime, sports, education, politics, and real estate. Typically, when you read a newspaper article or watch a television news program, you are given sample information. With this information, you may make a decision about the correctness of a statement, claim, or "fact." Statistical methods can help you make the "best educated guess."

Since you will undoubtedly be given statistical information at some point in your life, you need to know some techniques for analyzing the information thoughtfully. Think about buying a house or managing a budget. Think about your chosen profession. The fields of economics, business, psychology, education, biology, law, computer science, police science, and early childhood development require at least one course in statistics.

Included in this chapter are the basic ideas and words of probability and statistics. You will soon understand that statistics and probability work together. You will also learn how data are gathered and how "good" data can be distinguished from "bad."

1.1 | Definitions of Statistics, Probability, and Key Terms

The science of statistics deals with the collection, analysis, interpretation, and presentation of data. We see and use data in our everyday lives.

In this course, you will learn how to organize and summarize data. Organizing and summarizing data is called descriptive statistics. Two ways to summarize data are by graphing and by using numbers (for example, finding an average). After you have studied probability and probability distributions, you will use formal methods for drawing conclusions from "good" data. The formal methods are called inferential statistics. Statistical inference uses probability to determine how confident we can be that our conclusions are correct.

Effective interpretation of data (inference) is based on good procedures for producing data and thoughtful examination of the data.
You will encounter what will seem to be too many mathematical formulas for interpreting data. The goal of statistics is not to perform numerous calculations using the formulas, but to gain an understanding of your data. The calculations can be done using a calculator or a computer. The understanding must come from you. If you can thoroughly grasp the basics of statistics, you can be more confident in the decisions you make in life.

Probability

Probability is a mathematical tool used to study randomness. It deals with the chance (the likelihood) of an event occurring. For example, if you toss a fair coin four times, the outcomes may not be two heads and two tails. However, if you toss the same coin 4,000 times, the outcomes will be close to half heads and half tails. The expected theoretical probability of heads in any one toss is 1/2 or 0.5. Even though the outcomes of a few repetitions are uncertain, there is a regular pattern of outcomes when there are many repetitions. After reading about the English statistician Karl Pearson, who tossed a coin 24,000 times with a result of 12,012 heads, one of the authors tossed a coin 2,000 times. The results were 996 heads. The fraction 996/2000 is equal to 0.498, which is very close to 0.5, the expected probability.

The theory of probability began with the study of games of chance such as poker. Predictions take the form of probabilities. To predict the likelihood of an earthquake, of rain, or whether you will get an A in this course, we use probabilities. Doctors use probability to determine the chance of a vaccination causing the disease the vaccination is supposed to prevent. A stockbroker uses probability to determine the rate of return on a client's investments. You might use probability to decide to buy a lottery ticket or not. In your study of statistics, you will use the power of mathematics through probability calculations to analyze and interpret your data.
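The coin-toss discussion above can be checked with a short simulation. The sketch below is only an illustration, not part of the original text: the function name heads_fraction, the seed, and the toss counts in the loop are arbitrary assumptions, and Python's pseudo-random generator stands in for a physical coin. It first reproduces the two fractions reported in the text, then shows how the fraction of heads tends to settle near 0.5 as the number of tosses grows.

```python
import random


def heads_fraction(num_tosses, seed=0):
    """Simulate fair-coin tosses and return the fraction that land heads."""
    rng = random.Random(seed)  # seeded so a run is repeatable; the seed is arbitrary
    heads = sum(rng.random() < 0.5 for _ in range(num_tosses))
    return heads / num_tosses


# Fractions reported in the text.
pearson_fraction = 12012 / 24000  # Karl Pearson's 24,000 tosses
author_fraction = 996 / 2000      # the author's 2,000 tosses

print(round(pearson_fraction, 4))  # 0.5005
print(round(author_fraction, 3))   # 0.498

# Simulated tosses: few repetitions are erratic, many repetitions are regular.
for n in (4, 40, 4000, 400000):
    print(n, heads_fraction(n))
```

With only four tosses the fraction can easily be 0.25 or 0.75, while with hundreds of thousands of tosses it should differ from 0.5 only in the second or third decimal place.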
Key Terms

In statistics, we generally want to study a population. You can think of a population as a collection of persons, things, or objects under study. To study the population, we select a sample. The idea of sampling is to select a portion (or subset) of the larger population and study that portion (the sample) to gain information about the population. Data are the result of sampling from a population.

Because it takes a lot of time and money to examine an entire population, sampling is a very practical technique. If you wished to compute the overall grade point average at your school, it would make sense to select a sample of students who attend the school. The data collected from the sample would be the students' grade point averages. In presidential elections, opinion poll samples of 1,000–2,000 people are taken. The opinion poll is supposed to represent the views of the people in the entire country. Manufacturers of canned carbonated drinks take samples to determine if a 16-ounce can contains 16 ounces of carbonated drink.

From the sample data, we can calculate a statistic. A statistic is a number that represents a property of the sample. For example, if we consider one math class to be a sample of the population of all math classes, then the average number of points earned by students in that one math class at the end of the term is an example of a statistic. The statistic is an estimate of a population parameter, in this case the mean. A parameter is a numerical characteristic of the whole population that can be estimated by a statistic. Since we considered all math classes to be the population, then the average number of points earned per student over all the math classes is an example of a parameter.

One of the main concerns in the field of statistics is how accurately a statistic estimates a parameter. The accuracy really depends on how well the sample represents the population.
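The distinction between a statistic and a parameter can be made concrete with invented numbers. The sketch below is an assumption-laden illustration, not data from the text: it makes up a population of 20 math classes with 30 students each, assigns random point totals, and then compares the mean of one class (the statistic) with the mean over all classes (the parameter).

```python
import random
from statistics import mean

rng = random.Random(1)  # arbitrary seed so the made-up data is repeatable

# Hypothetical population: 20 math classes of 30 students each, with
# invented end-of-term point totals between 40 and 100.
population = [[rng.randint(40, 100) for _ in range(30)] for _ in range(20)]

# Parameter: mean points per student over ALL math classes.
all_points = [points for math_class in population for points in math_class]
parameter = mean(all_points)

# Statistic: mean points in ONE class, treated as the sample.
sample = population[0]
statistic = mean(sample)

print(f"parameter (all classes) = {parameter:.2f}")
print(f"statistic (one class)   = {statistic:.2f}")
print(f"estimation error        = {statistic - parameter:.2f}")
```

In practice the parameter is usually unknown, which is why the statistic is used to estimate it; the simulation can show both only because the whole population was invented.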
The sample must contain the characteristics of the population in order to be a representative sample. We are interested in both the sample statistic and the population parameter in inferential statistics. In a later chapter, we will use the sample statistic to test the validity of the established population parameter.

A variable, or random variable, usually notated by capital letters such as X and Y, is a characteristic or measurement that can be determined for each member of a population. Variables may be numerical or categorical. Numerical variables take on values with equal units such as weight in pounds and time in hours. Categorical variables place the person or thing into a category. If we let X equal the number of points earned by one math student at the end of a term, then X is a numerical variable. If we let Y be a person's party affiliation, then some examples of Y include Republican, Democrat, and Independent. Y is a categorical variable. We could do some math with values of X (calculate the average number of points earned, for example), but it makes no sense to do math with values of Y (calculating an average party affiliation makes no sense).

Data are the actual values of the variable. They may be numbers or they may be words. Datum is a single value.

Two words that come up often in statistics are mean and proportion. If you were to take three exams in your math classes and obtain scores of 86, 75, and 92, you would calculate your mean score by adding the three exam scores and dividing by three (your mean score would be 84.3 to one decimal place). If, in your math class, there are 40 students and 22 are men and 18 are women, then the proportion of men students is 22/40 and the proportion of women students is 18/40. Mean and proportion are discussed in more detail in later chapters.

NOTE
The words "mean" and "average" are often used interchangeably.
The substitution of one word for the other is common practice. The technical term is "arithmetic mean," and "average" is technically a center location. However, in practice among non-statisticians, "average" is commonly accepted for "arithmetic mean."

Example 1.1

Determine what the key terms refer to in the following study. We want to know the average (mean) amount of money first year college students spend at ABC College on school supplies that do not include books. We randomly surveyed 100 first year students at the college. Three of those students spent $150, $200, and $225, respectively.

Solution 1.1

The population is all first year students attending ABC College this term. The sample could be all students enrolled in one section of a beginning statistics course at ABC College (although...
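The mean and proportion arithmetic described earlier under Key Terms can be reproduced in a few lines. This is a minimal sketch using only the numbers given in that passage (exam scores of 86, 75, and 92, and a class of 40 students with 22 men and 18 women); the variable names are illustrative choices, not notation from the text.

```python
# Mean: add the three exam scores and divide by three.
scores = [86, 75, 92]
mean_score = sum(scores) / len(scores)
print(round(mean_score, 1))  # 84.3

# Proportion: count in the group divided by the total count.
class_size = 40
men, women = 22, 18
proportion_men = men / class_size
proportion_women = women / class_size
print(proportion_men)    # 0.55
print(proportion_women)  # 0.45
```

Because every student in this example falls into exactly one of the two groups, the two proportions add up to 1.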