M316 Chapter 13 Dr. Berg

Binomial Distributions

Many of the variables we are interested in have only two outcomes: heads or tails, yes or no, survive or don't survive, etc. A random process with only two outcomes is called a Bernoulli trial.

The Binomial Setting

1. There are a fixed number n of observations (Bernoulli trials).
2. The n observations are all independent.
3. Each observation falls into one of just two categories, which for convenience we will call "success" and "failure."
4. The probability of success, call it p, is the same for each observation.

We are often interested in the number of successes rather than the exact sequence. An example is tossing a coin ten times and noting the number of heads observed.

Binomial Distribution

The count X of successes in the binomial setting has the binomial distribution with parameters n and p. The parameter n is the number of observations, and p is the probability of success on any one observation. The possible values of X (the sample space) are the whole numbers from 0 to n.

Example (13.1) Blood Types

Genetics says that children receive genes from their parents independently. Each child of a particular pair of parents has probability 0.25 of having type O blood. If these parents have 5 children, the number having type O blood is the count X of successes in 5 independent observations with probability 0.25 of a success on each observation. So X has the binomial distribution with n = 5 and p = 0.25.

Example (13.2) Dealing Cards

Deal 10 cards from a shuffled deck and count the number X of red cards. There are 10 observations, and each gives either a red or a black card. A "success" is a red card. But the observations are not independent. If the first card is black, the second card is more likely to be red because there are more red cards than black cards left in the deck. The count X does not have a binomial distribution.
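The failure of independence in Example 13.2 can be made concrete with a short calculation. The sketch below assumes a standard 52-card deck with 26 red and 26 black cards and compares the chance that the second card is red for each possible color of the first card. The function name and its parameters are illustrative only.

```python
from fractions import Fraction

def prob_second_red(first_is_red, reds=26, blacks=26):
    """P(second card is red | color of first card), dealing without replacement.

    Assumes a standard deck: 26 red and 26 black cards.
    """
    # One card has already been dealt; it removes a red card only if it was red.
    reds_left = reds - 1 if first_is_red else reds
    cards_left = reds + blacks - 1
    return Fraction(reds_left, cards_left)

after_black = prob_second_red(first_is_red=False)
after_red = prob_second_red(first_is_red=True)

print("P(2nd red | 1st black) =", after_black)
print("P(2nd red | 1st red)   =", after_red)
```

The two conditional probabilities differ, so the outcome of the first card changes the probability of success on the second. In a binomial setting these two numbers would be equal.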
Binomial Distributions in Statistical Sampling

The binomial distributions are important in statistics when we wish to make inferences about the proportion p of "successes" in a population. Here is an example.

Example (13.3) Choosing an SRS of CDs

A music distributor inspects an SRS of 10 CDs from a shipment of 10,000 music CDs. Suppose that it is known that 10% of the CDs in the shipment have defective copy-protection schemes that will harm personal computers. Count the number X of bad CDs in the sample.

This is not a binomial setting. Just as removing one card in Example 13.2 changes the makeup of the deck, removing one CD from a shipment of 10,000 changes the proportion of bad CDs remaining in the shipment. But the makeup of the remaining 9,999 CDs changes very little. In practice, the distribution of X is very close to the binomial distribution with n = 10 and p = 0.1.

Sampling Distribution of a Count

Choose an SRS of size n from a population with proportion p of successes. When the population is much larger than the sample, the count X of successes in the sample has approximately the binomial distribution with parameters n and p.

Exercise (13.4) I Can't Relax

Opinion polls find that 14% of Americans "never have time to relax." If you take an SRS of 500 adults, what is the approximate distribution of the number in your sample who say that they never have time to relax?

Binomial Probabilities

We can find a formula for the probability that a binomial random variable takes any value by adding probabilities for the different ways of getting exactly that many successes in n observations. Here is an example.

Example (13.4) Inheriting Blood Type

Each child born to a particular set of parents has probability 0.25 of having blood type O. If these parents have 5 children, what is the probability that exactly 2 of them have type O blood?
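This question can also be answered by brute force, which is a useful check on the counting argument that follows. The sketch below lists all 2^5 = 32 sequences of successes (S) and failures (F), keeps those with exactly two S's, and adds their probabilities. The variable names are illustrative.

```python
from itertools import product

p = 0.25   # probability that a child has type O blood (from the example)
n = 5      # number of children

arrangements = 0   # number of sequences with exactly 2 successes
total = 0.0        # sum of the probabilities of those sequences

for seq in product("SF", repeat=n):
    if seq.count("S") != 2:
        continue
    # Independence: multiply p for each S and (1 - p) for each F.
    prob = 1.0
    for outcome in seq:
        prob *= p if outcome == "S" else 1 - p
    arrangements += 1
    total += prob

print(arrangements, "arrangements, P(X = 2) =", round(total, 4))
```

Enumeration is only practical for small n, since the number of sequences doubles with each added observation.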
The count of children with type O blood is a binomial random variable X with n = 5 and probability p = 0.25 of success on each try. We want P(X = 2). We use S for success and F for failure. Any one arrangement of 2 S's and 3 F's has probability

$(0.25)^2 (0.75)^3 = 0.0263671875.$

There are 10 arrangements of 2 S's and 3 F's, namely:

SSFFF SFSFF SFFSF SFFFS FSSFF FSFSF FSFFS FFSSF FFSFS FFFSS.

Hence, the probability of getting exactly 2 successes in 5 tries is

$P(X = 2) = 10\,(0.25)^2 (0.75)^3 \approx 0.2637.$

We need a method for counting the number of arrangements of k successes in n tries.

Definition

The factorial function is $n! = n(n-1)(n-2)\cdots(3)(2)(1)$ for n > 0, and 0! = 1.

The factorial function is used to count the number of arrangements of n distinct objects in a line. As an example, the number of ways to arrange 5 books on a shelf is 5! = (5)(4)(3)(2)(1) = 120.

Binomial Coefficient

The number of ways of arranging k successes among n observations is given by the binomial coefficient

$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$

for k = 0, 1, 2, 3, ..., n.

Exercise

Find the number of ways to get 4 successes in 6 tries.

Binomial Probability

If X has the binomial distribution with n observations and probability p of success on each observation, the possible values of X are 0, 1, 2, ..., n. If k is any of these values,

$P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}.$

Example (13.5) Inspecting CDs

The number X of CDs with defective copy protection in Example 13.3 has approximately the binomial distribution with n = 10 and p = 0.1. The probability that the sample contains no more than one defective CD is

$P(X \le 1) = P(X = 1) + P(X = 0)$
$= \binom{10}{1}(0.1)^1(0.9)^9 + \binom{10}{0}(0.1)^0(0.9)^{10}$
$= \frac{10!}{1!\,9!}(0.1)(0.3874) + \frac{10!}{0!\,10!}(1)(0.3487)$
$= (10)(0.03874) + (1)(0.3487)$
$= 0.7361.$

Thus, about 74% of all samples will contain no more than one bad CD. In fact, about 35% of samples will have no bad CDs. A sample of size 10 cannot be trusted to alert the distributor to the presence of unacceptable CDs in the shipment.

Using Technology

The binomial probability formula is difficult to use if the event contains many outcomes. You can find tables of binomial probabilities $P(X = k)$ and cumulative probabilities $P(X \le k)$ for selected values of n and p. Software can also be used for this. Here is a graphic of some outputs.

[Figure: software output for binomial probabilities]

The Mean and Standard Deviation of a Binomial Distribution

If a basketball player makes 80% of her free throws, the mean number made in 10 tries should be 80% of 10, or 8. The standard deviation is more complicated.

Binomial Mean and Standard Deviation

If a count X has the binomial distribution with number of observations n and probability of success p, the mean and standard deviation of X are

$\mu = np$ and $\sigma = \sqrt{np(1-p)}.$

Example (13.6) Inspecting CDs

The count X of bad CDs from Example 13.5 is binomial with n = 10 and p = 0.1. The mean of this distribution is

$\mu = (10)(0.1) = 1$

and the standard deviation is

$\sigma = \sqrt{(10)(0.1)(0.9)} = \sqrt{0.9} = 0.9487.$

Here is a histogram for this distribution.

[Figure: probability histogram of the binomial distribution with n = 10 and p = 0.1]

Exercise

Let X be the count of aces (ones) observed in 100 die rolls. What are the mean and standard deviation?

The Normal Approximation to Binomial Distributions

The formula for binomial probabilities becomes awkward as the number of observations n increases. When n is large, the binomial distribution is approximately Normal, so it is common to use Normal approximations.

Example (13.7) Attitudes Towards Shopping

Are attitudes toward shopping changing? Sample surveys show that fewer people enjoy shopping than in the past. A survey asked a nationwide random sample of 2500 adults if they agreed or disagreed that "I like buying new clothes, but shopping is often frustrating and time-consuming." The population that the poll wants to draw conclusions about is all U.S. residents aged 18 and over. Suppose that in fact 60% of all adult U.S. residents would say "Agree" if asked the same question. What is the probability that 1520 or more of the sample agree?
Because the sample size 2500 is small compared to the 225 million adults in the United States, we can assume that the distribution of the count X is binomial with n = 2500 and p = 0.6. We must add the binomial probabilities of all outcomes from X = 1520 to X = 2500. Here are 3 methods to do this.

1. Using technology to do the calculations, the answer is $P(X \ge 1520) = 0.2131$. This answer is correct to four decimal places.
2. Simulate a large number of samples. One such simulation counts X from 1000 samples of size 2500, finding that 221 of these have X at least 1520. The estimated probability is $P(X \ge 1520) = 221/1000 = 0.221$. Here is the histogram.

[Figure: histogram of the simulated counts X from 1000 samples of size 2500]

3. Both of the previous methods require the use of software. We can instead use Table A by treating the distribution as Normal with mean $\mu = np = (2500)(0.6) = 1500$ and standard deviation $\sigma = \sqrt{np(1-p)} = \sqrt{(2500)(0.6)(0.4)} = 24.49$.

Example (13.8) Normal Calculation for a Binomial Probability

Treating X as having the N(1500, 24.49) distribution, we estimate

$P(X \ge 1520) = P\left(Z \ge \frac{1520 - 1500}{24.49}\right) = P(Z \ge 0.82) = 1 - 0.7939 = 0.2061.$

Normal Approximation for Binomial Distributions

Suppose the count X has the binomial distribution with n observations and success probability p. When n is large, the distribution of X is approximately Normal, $N\left(np, \sqrt{np(1-p)}\right)$. As a rule of thumb, we will use the Normal approximation when n is so large that $np \ge 10$ and $n(1-p) \ge 10$.

Exercise (13.10) Using Benford's Law

According to Benford's law, the probability that the first digit of the amount of a randomly chosen invoice is a 1 or a 2 is 0.477. You examine 90 invoices from a vendor and find that 29 have first digits 1 or 2. If Benford's law holds, the count will have the binomial distribution with n = 90 and p = 0.477. Too few 1s and 2s suggest possible fraud. What is the approximate probability of a count of 29 or fewer if the invoices follow Benford's law?

Exercise

What is the approximate probability that in 100 rolls of a fair die you would see not more than 10 aces (ones)?
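The exact and Normal calculations for the shopping survey (Examples 13.7 and 13.8) can be reproduced with the Python standard library. This is a sketch rather than the software used in the notes: it relies on the fact that p = 0.6 = 3/5, so each binomial probability can be written with whole numbers, and it evaluates the Normal cumulative probability with the error function instead of Table A. Because z is not rounded to 0.82 here, the Normal estimate comes out near 0.207 rather than the table-based 0.2061.

```python
from math import comb, erf, sqrt

n, p = 2500, 0.6
threshold = 1520

# Exact tail probability P(X >= 1520).
# With p = 3/5, P(X = k) = C(n, k) * 3^k * 2^(n-k) / 5^n, so the sum can be
# done in exact integer arithmetic and divided once at the end.
numerator = sum(comb(n, k) * 3**k * 2**(n - k) for k in range(threshold, n + 1))
exact = numerator / 5**n

# Normal approximation with mean np and standard deviation sqrt(np(1-p)).
mu = n * p
sigma = sqrt(n * p * (1 - p))
z = (threshold - mu) / sigma

def standard_normal_cdf(x):
    """P(Z <= x) for a standard Normal variable, via the error function."""
    return 0.5 * (1 + erf(x / sqrt(2)))

normal_approx = 1 - standard_normal_cdf(z)

print(f"exact binomial  P(X >= {threshold}) = {exact:.4f}")
print(f"Normal approx.  P(X >= {threshold}) = {normal_approx:.4f}  (z = {z:.4f})")
```

The gap between the two values is the approximation error that the rule of thumb $np \ge 10$ and $n(1-p) \ge 10$ is meant to keep small.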
This note was uploaded on 09/14/2009 for the course CH 310 N taught by Professor Blocknack during the Fall '08 term at University of Texas.