M316 Chapter 14 Dr. Berg

Confidence Intervals: The Basics

We use the results of a sample to make an inference about the population. How likely is it that our inference is approximately correct? We will now concentrate on the reasoning behind statistical inference.

Simple Conditions for Inference About a Mean

1. We have an SRS from the population of interest. There is no nonresponse or other practical difficulty.
2. The variable we measure has a perfectly Normal distribution $N(\mu, \sigma)$ in the population.
3. We don't know the population mean $\mu$, but we do know the population standard deviation $\sigma$.

These conditions are of course unrealistic, but we have to start with a simple case.

Estimating With Confidence

We start with an example involving the National Assessment of Educational Progress (NAEP) Young Adult Literacy Assessment Survey, which is based on a nationwide sample of households.

Example (14.1) NAEP Quantitative Scores

The NAEP survey includes a short test of quantitative skills, covering basic arithmetic and the ability to apply it to realistic problems. Scores on the test range from 0 to 500. A person who scores 233 can add the amounts of two checks appearing on a bank deposit slip; someone scoring 325 can determine the price of a meal from a menu; a person scoring 375 can transform a price in cents per ounce into dollars per pound.

In a recent year, 840 men 21 to 25 years of age were in the NAEP sample. Their mean quantitative score was $\bar{x} = 272$. On the basis of this sample, we want to estimate the mean score $\mu$ in the population of more than 10 million young men of these ages.

To meet the "simple conditions," we will treat the NAEP sample as a perfect SRS of young men and NAEP scores in the population of all young men as having an exactly Normal distribution with standard deviation $\sigma = 60$.

Here is the reasoning.

1. To estimate the unknown population mean $\mu$, use the mean $\bar{x} = 272$ of the random sample. We want to know how accurate this estimate is.
2. We know that the sampling distribution of $\bar{x}$ is Normal with mean $\mu$ and standard deviation $\sigma/\sqrt{n} = 60/\sqrt{840} \approx 2.1$.
3. The 95 part of the 68-95-99.7 rule says that $\bar{x}$ and its mean $\mu$ are within two standard deviations of each other, that is, within 4.2, in 95% of all samples. Thus, if we estimate that $\mu$ lies somewhere in the interval from $\bar{x} - 4.2$ to $\bar{x} + 4.2$, we will be right 95% of the time.

For this particular example the interval is from 272 - 4.2 = 267.8 to 272 + 4.2 = 276.2. We say that $\bar{x} \pm 4.2$ is a 95% confidence interval. If the sampling is repeated many times, about 95% of the intervals computed this way will contain $\mu$.

Definition

A level C confidence interval for a parameter has two parts:

1. An interval calculated from the data, usually of the form estimate $\pm$ margin of error.
2. A confidence level C, which gives the probability that the interval will capture the true parameter value in repeated samples. That is, the confidence level is the success rate for the method.

The following illustrates the idea for 25 samples.

Exercise (14.2) Losing Weight

A Gallup Poll found that 51% of the people in its sample said "Yes" when asked, "Would you like to lose weight?" Gallup announced: "For results based on the total sample of national adults, one can say with 95% confidence that the margin of sampling error is $\pm 3$ percentage points."

a) What is the 95% confidence interval for the percent of all adults who want to lose weight?
b) What does it mean to say that we have "95% confidence" in this interval?

Confidence Intervals for the Mean

To find the 95% confidence interval, we went two standard deviations of $\bar{x}$ in both directions from the sample mean. To find a level C confidence interval, we capture the central area C under the Normal curve, marked off by two points $-z^*$ and $z^*$. These numbers are called critical values. Values of $z^*$ for many choices of C appear in the bottom row of table C in the back of the book. Here are the most common values.
Confidence level C      90%      95%      99%
Critical value z*       1.645    1.960    2.576

Confidence Interval for the Mean of a Normal Population

Draw an SRS of size $n$ from a Normal population having unknown mean $\mu$ and known standard deviation $\sigma$. A level C confidence interval for $\mu$ is

$$\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}$$

where $z^*$ is found in table C.

Confidence Intervals: The Four Step Process

State: What is the practical question that requires estimating a parameter?
Formulate: Identify the parameter and choose a level of confidence.
Solve: Carry out the work in two phases:
a) Check the conditions for the interval you plan to use.
b) Calculate the confidence interval.
Conclude: Return to the practical question to describe your results in this setting.

Example (14.3) Healing of Skin Wounds

State: Biologists studying the healing of skin wounds measured the rate at which new cells closed a razor cut made in the skin of an anesthetized newt. Here are the data from 18 newts, measured in micrometers (millionths of a meter) per hour.

29 27 34 40 22 28 14 35 26 35 12 30 23 18 11 22 23 33

This is one of several sets of measurements made under different conditions. We want to estimate the mean healing rate for comparison with rates under other conditions.

Formulate: We will estimate the mean rate $\mu$ for all newts of this species by giving a 95% confidence interval.

Solve: We should start by checking the conditions for inference. For this first example, we will find the interval, then discuss how statistical practice deals with conditions that are never perfectly satisfied. The mean of the sample is

$$\bar{x} = \frac{29 + 27 + 34 + \cdots + 33}{18} = 25.67.$$

As part of the "simple conditions," suppose that from past experience with this species of newts we know that the standard deviation of healing rates is $\sigma = 8$ micrometers per hour. For 95% confidence, the critical value is $z^* = 1.960$. A 95% confidence interval for $\mu$ is therefore

$$\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}} = 25.67 \pm 1.960 \frac{8}{\sqrt{18}} = 25.67 \pm 3.70,$$

so the interval is from 21.97 to 29.37.
Conclude: We are 95% confident that the mean healing rate for all newts of this species is between 21.97 and 29.37 micrometers per hour.

In practice, the first part of the Solve step is to check the conditions for inference. The "simple conditions" are:

1. SRS: We don't have an actual SRS from the population of all newts of this species. These 18 were randomly selected from among newts raised for laboratory experiments.
2. Normal distribution: The biologists expect from past experience that measurements like this will follow approximately a Normal distribution. A stemplot of the measurements shows no outliers or strong skewness.
3. Known $\sigma$: It really is unrealistic to suppose that we know that $\sigma = 8$. In chapter 18 we will see how to estimate the standard deviation.

Exercise (14.5) Analyzing Pharmaceuticals

A manufacturer of pharmaceutical products analyzes each batch to verify the concentration of the active ingredient. The chemical analysis is not perfect. In fact, repeated measurements follow a Normal distribution with mean $\mu$ equal to the true concentration and standard deviation $\sigma = 0.0068$ grams per liter. Three analyses of one batch give concentrations 0.8403, 0.8363, and 0.8447 grams per liter. To estimate the true concentration, give a 95% confidence interval for $\mu$. Follow the four step process.

How Confidence Intervals Behave

Because the confidence interval has the form $\bar{x} \pm z^* \sigma/\sqrt{n}$, we can make the interval narrower by making the margin of error $z^* \sigma/\sqrt{n}$ smaller. This can be done by making

1. $z^*$ smaller (which makes the confidence level smaller),
2. $\sigma$ smaller (over which we have no control),
3. $n$ larger (which means a bigger sample size).

Example (14.4) Changing the Margin of Error

Example 14.3 gives the 95% confidence interval $25.67 \pm 3.70$ for the mean healing rate of newt skin. The 90% confidence interval based on the same data replaces the 95% critical value $z^* = 1.960$ by the 90% critical value $z^* = 1.645$. The interval is

$$25.67 \pm 1.645 \frac{8}{\sqrt{18}} = 25.67 \pm 3.10.$$

Lower confidence results in a smaller margin of error, $\pm 3.10$ in place of $\pm 3.70$. Similarly, the margin of error for a 99% confidence interval is larger, $\pm 4.86$. Were we to cut the sample size from 18 to 9, the margin of error for 95% confidence would increase from $\pm 3.70$ to $\pm 5.23$.

Exercise (14.8) SAT Scores

High school students who take the SAT mathematics exam a second time generally score higher than on the first try. The change in score has a Normal distribution with standard deviation $\sigma = 50$. A random sample of 1000 students gains an average of $\bar{x} = 22$. Find the 90%, 95%, and 99% confidence intervals.

Choosing the Sample Size

It is prudent to plan the inference at the same time as the data collection when designing a statistical study. You can arrange to have both high confidence and a small margin of error by taking enough observations.

Sample Size for Desired Margin of Error

The confidence interval for the mean of a Normal population will have a specified margin of error $m$ when the sample size is at least

$$n = \left( \frac{z^* \sigma}{m} \right)^2.$$

Example (14.5) How Many Observations?

The biologists in example 14.3 would like to estimate the mean healing rate to within 3 micrometers per hour with 90% confidence. How many newts must they measure? For 90% confidence, table C gives $z^* = 1.645$. We know that $\sigma = 8$. Therefore,

$$n = \left( \frac{z^* \sigma}{m} \right)^2 = \left( \frac{1.645 \times 8}{3} \right)^2 = 19.2.$$

Because the sample size must be a whole number no smaller than 19.2, we round up. Thus the biologists must measure 20 newts.

Exercise (14.9) Improving SAT Scores

How large a sample of high school students in exercise 14.8 would be needed to estimate the mean change in SAT score to within 2 points with 95% confidence?
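The arithmetic in examples 14.3, 14.4, and 14.5 can be checked with a short Python sketch. This is only an illustration under the "simple conditions" (Normal population, known standard deviation). The function names critical_value, margin_of_error, z_interval, and sample_size are hypothetical names chosen here and do not come from the notes or the textbook. The inputs are the rounded sample mean 25.67 and the table C critical values used above.

```python
import math
from statistics import NormalDist


def critical_value(confidence):
    """z* such that the area between -z* and z* under N(0, 1) equals `confidence`."""
    return NormalDist().inv_cdf((1 + confidence) / 2)


def margin_of_error(z_star, sigma, n):
    """Margin of error z* * sigma / sqrt(n) for a known population sigma."""
    return z_star * sigma / math.sqrt(n)


def z_interval(xbar, sigma, n, z_star):
    """Confidence interval xbar +/- z* * sigma / sqrt(n), returned as (low, high)."""
    m = margin_of_error(z_star, sigma, n)
    return xbar - m, xbar + m


def sample_size(z_star, sigma, m):
    """Smallest whole n whose margin of error is at most m (always rounds up)."""
    return math.ceil((z_star * sigma / m) ** 2)


if __name__ == "__main__":
    # Example 14.3: newt healing rates, sigma = 8, n = 18, 95% confidence.
    low, high = z_interval(25.67, sigma=8, n=18, z_star=1.960)
    print(f"95% interval: {low:.2f} to {high:.2f}")

    # Example 14.4: the same data at 90% and 99%, then 95% with n = 9.
    print(f"90% margin: {margin_of_error(1.645, 8, 18):.2f}")
    print(f"99% margin: {margin_of_error(2.576, 8, 18):.2f}")
    print(f"95% margin with n = 9: {margin_of_error(1.960, 8, 9):.2f}")

    # Example 14.5: margin of error 3 with 90% confidence.
    print(f"newts needed: {sample_size(1.645, 8, 3)}")

    # Critical values computed from the Normal distribution instead of table C.
    print(f"z* for 95%: {critical_value(0.95):.3f}")
```

The same functions can be applied to exercises 14.5, 14.8, and 14.9 by substituting the values given there. Computing the sample mean from the raw data instead of using the rounded 25.67 can shift the last digit of the interval endpoints.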
This note was uploaded on 09/14/2009 for the course CH 310 N taught by Professor Blocknack during the Fall '08 term at University of Texas.