CH6 Spring2010 - Chapter 6 Sections 6.1 6.2 6.3 Statistical...

Chapter 6, Sections 6.1, 6.2, 6.3 (Lecture 7)

Statistical inference is the next topic we will cover in this course. We have been preparing for this by:

1) describing and analyzing data (graphs and plots, descriptive statistics),
2) discussing the ways to find and/or generate data (studies, samples, experiments), and
3) studying sampling distributions.

We are now ready for statistical inference. The purpose of statistical inference is to draw conclusions about population parameters based on data that came from a random sample or from a randomized experiment. Data (statistics) are used to infer population parameters.

In this chapter we will learn the two most prominent types of statistical inference:

• confidence intervals, which estimate the value of a population mean, and
• tests of significance, which weigh the evidence for a claim concerning the population mean.

Section 6.1 Estimating the population mean, \(\mu\), with a stated confidence

A confidence interval for the population mean \(\mu\) consists of a point estimate and a margin of error.

• The point estimate is a single statistic calculated from a random sample of units. For example, the sample mean \(\bar{x}\) is a point estimate of the population mean \(\mu\).

However, sample means fluctuate from sample to sample, so we adjust the point estimate by adding and subtracting a margin of error, thus creating a confidence interval. The following general formula may be used when the population standard deviation \(\sigma\) is known:

Confidence interval: \(\bar{x} \pm \text{margin of error}\), where

\[ \text{margin of error} = z^* \frac{\sigma}{\sqrt{n}} \]

For the Normal distribution the following values of \(z^*\) apply:

For a 90% confidence interval, \(z^* = 1.645\)
For a 95% confidence interval, \(z^* = 1.960\)
For a 99% confidence interval, \(z^* = 2.576\)

Example: You want to estimate the population mean SAT Math score for the high school seniors in California. You give a test to a simple random sample of 500 high school seniors in CA. The mean score for your sample is \(\bar{x} = 461\).
The population standard deviation is known to be \(\sigma = 100\). For the SAT Math scores, a 95% confidence interval for \(\mu\) would be

\[ 461 \pm 1.960 \cdot \frac{100}{\sqrt{500}} \]

which works out to (452.2, 469.8). This confidence interval was calculated using a procedure that gives a "correct" interval (one that contains the population mean) 95% of the times it is used.

Another example:

Suppose we wish to estimate \(\mu\), the population mean driving time between Lafayette and Indianapolis. We select an SRS of \(n = 25\) drivers. The observed sample mean is \(\bar{x} = 1.10\) hours. Let's assume that we know the population standard deviation of \(X\) is \(\sigma = 0.5\) hours. A 95% confidence interval for \(\mu\) would be

\[ 1.10 \pm 1.96 \cdot \frac{0.5}{\sqrt{25}} \]

which works out to (0.904, 1.296). Again, the procedure gives a "correct" interval 95% of the times it is used.

Suppose we selected another sample of 25 driving times and obtained \(\bar{x} = 1.00\) hours. If we calculate a 95% confidence interval for \(\mu\) based on this \(\bar{x}\), we get (0.804, 1.196), which is a different interval estimate. Which interval is correct? We don't know.

If we repeatedly selected SRSs of 25 drivers, and for each SRS calculated a 95% confidence interval for the population mean \(\mu\), then in the long run 95% of the intervals would contain the true value of \(\mu\). The intervals will all be different, but 95% of them will include the true mean and will therefore be considered correct. The other 5% will not include the true mean. We have no way to know whether a given confidence interval is "correct" or "incorrect".

Confidence level vs. width of confidence interval:

Suppose \(X\), Bob's golf scores, have a normal distribution with unknown population mean, but we believe the population standard deviation is \(\sigma = 3\). An SRS of \(n = 16\) units is selected and a sample mean of \(\bar{x} = 77\) is observed.

a. Calculate a 90% confidence interval for \(\mu\). Use \(z^* = 1.645\).

90% CI for \(\mu\): \(77 \pm 1.645 \left(\frac{3}{\sqrt{16}}\right) = 77 \pm 1.23 = (75.77, 78.23)\)

b. Calculate a 95% confidence interval for \(\mu\).
Use \(z^* = 1.960\).

95% CI for \(\mu\): \(77 \pm 1.960 \left(\frac{3}{\sqrt{16}}\right) = 77 \pm 1.47 = (75.53, 78.47)\)

c. Calculate a 99% confidence interval for \(\mu\). Use \(z^* = 2.576\).

99% CI for \(\mu\): \(77 \pm 2.576 \left(\frac{3}{\sqrt{16}}\right) = 77 \pm 1.93 = (75.07, 78.93)\)

As you can see from these calculations, raising the confidence level requires a larger \(z^*\) value, which increases the margin of error and produces a wider confidence interval. There is a trade-off between the precision of our estimate and the confidence we have in the result. A higher confidence level requires a wider interval.

The margin of error also depends on sample size. A larger sample size will result in a smaller margin of error. In fact, because the margin of error is proportional to \(1/\sqrt{n}\), quadrupling the sample size will cut the margin of error in half.

Calculating the sample size for a desired margin of error:

The confidence interval for a population mean will have a specified margin of error \(m\) when the sample size is

\[ n = \left( \frac{z^* \sigma}{m} \right)^2 \]

Example:

1. You are planning a survey of starting salaries for recent liberal arts major graduates from your college. From a pilot study you estimate that the population standard deviation is about $8000. What sample size do you need to develop a 95% confidence interval with a margin of error of $500?

   \(n = \left( \frac{1.960 \times 8000}{500} \right)^2 = 31.36^2 \approx 983.4\), so a sample of 984 graduates is needed.

2. Always round a sample size number with any decimals up to the next whole number. Never drop the decimals and round down.

Some Cautions:

• The above formulas do not correct the data for any unknown bias. Consequently, if the data are biased, then ANY inferences based on those data are also biased. This includes biases arising from nonresponse, undercoverage, response error, or hidden bias in experiments.
• Because the sample mean is not resistant, confidence intervals are not resistant to outliers.
• When the population being sampled is not normally distributed, the sample size should be at least about 30 for the sample mean to be approximately normally distributed. This rule of thumb follows from the Central Limit Theorem. Always plot the data to check normality.
• Typically we do not know the population standard deviation, \(\sigma\). When \(\sigma\) is not known we will use the t procedures, which will be introduced in Chapter 7.

Interpretation of a confidence interval:

Any value in a confidence interval is considered a possible value for \(\mu\), including the end points. Any value not included in the confidence interval is considered an unlikely value for \(\mu\).

Section 6.2 TESTS OF SIGNIFICANCE (HYPOTHESIS TESTING)

The second type of statistical inference is a significance test, which assesses the evidence provided by data regarding some claim about the population mean. Based on a random sample from the population, we want to determine whether the population mean has changed upward, downward, or in either direction. Because the null hypothesis represents the established or accepted mean value, we want to use the data to determine, statistically, whether we can reject the null hypothesis in favor of the alternative hypothesis.

The four steps for a test of significance (hypothesis test):

Step 1. State the null and alternative hypotheses.

Null hypothesis \(H_0\): The statement being tested in a statistical test is called the null hypothesis. The test is designed to assess the strength of the evidence against the null hypothesis. Usually the null hypothesis is a statement of "no effect", "no difference", or "status quo".

\(H_0: \mu = \mu_0\)

Alternative hypothesis \(H_a\): The claim about the population mean that we are trying to find evidence for. Choose one of the following hypotheses.

\(H_a: \mu > \mu_0\) (one-sided, right)
\(H_a: \mu < \mu_0\) (one-sided, left)
\(H_a: \mu \neq \mu_0\) (two-sided)

Step 2. Find the test statistic.

If \(\mu_0\) is the value of the population mean \(\mu\) specified by the null hypothesis, the one-sample z statistic is

\[ z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} \]

Step 3. Calculate the P-value.

For one-sided left tests: P-value = \(P(Z \le z)\), the area in the left tail.
For one-sided right tests: P-value = \(P(Z \ge z)\), the area in the right tail.
For two-sided tests: P-value = \(2P(Z \ge |z|)\), the area in the right and left tails combined.

Step 4. State conclusions in terms of the problem.

The value of \(\alpha\), which is usually stated in the problem, defines how much evidence we require to reject \(H_0\). Compare the P-value to the \(\alpha\) level.

If P-value \(\le \alpha\), then reject \(H_0\). Strong evidence exists against \(H_0\).
If P-value \(> \alpha\), then fail to reject \(H_0\). Insufficient evidence exists against \(H_0\).

Generally the value chosen for \(\alpha\) is one of the following three:

• \(\alpha = .01\), the strictest burden of proof of these three values.
• \(\alpha = .05\)
• \(\alpha = .10\), the easiest burden of proof of these three values.

Your conclusion should be in this form:

If the P-value \(\le \alpha\), we say that we have sufficient evidence to reject the null hypothesis in favor of the alternative hypothesis, using the words of the original problem.
If the P-value \(> \alpha\), we say that we do not have sufficient evidence to reject the null hypothesis, using the words of the original problem.

Even though \(H_a\) is what we hope or believe to be true, our test gives evidence for or against \(H_0\) only. We never prove \(H_0\) true; we can only state whether we have enough evidence to reject \(H_0\) (which is evidence in favor of \(H_a\), but not proof that \(H_a\) is true) or that we don't have enough evidence to reject \(H_0\).

Example 1, a one-sided hypothesis test:

Bob's golf scores are historically normally distributed with \(\mu = 77\) strokes and \(\sigma = 3\) strokes. Bob has recently made two "improvements" to his game, and he thinks his scores should be lower. Bob has played 9 rounds since these improvements. His scores are:

77 73 74 78 78 75 75 74 71

Sample mean = 75 = \(\bar{x}\)

Do these data provide sufficient evidence to conclude that Bob's population mean is reduced, i.e., \(\mu < 77\)?
Null hypothesis: \(H_0: \mu = 77\) (status quo or established norm)
Alternative hypothesis: \(H_a: \mu < 77\) (improvement)

Since we know that Bob's golf scores are normally distributed, the sampling distribution of the sample mean of 9 rounds must also be normally distributed. The standard deviation of \(\bar{x}\) is \(\sigma_{\bar{x}} = 3/\sqrt{9} = 1\) stroke.

The logic of the hypothesis test:

If \(H_0\) is true, then \(\bar{X} \sim N(77, 1)\).
If \(H_0\) is false and \(H_a\) is true, then \(\bar{X} \sim N(\mu, 1)\) for some value \(\mu < 77\).

• Values of \(\bar{x}\) close to 77 would tend to support \(H_0\), and values that are much lower than 77 would provide evidence against \(H_0\) and in favor of \(H_a\).
• From the sample of Bob's last 9 scores we get a sample mean of \(\bar{x} = 75\). Can we conclude that we should reject \(H_0\) in favor of \(H_a\)?
• We need to calculate the P-value. Assuming that \(H_0\) is true, we calculate the probability

  \[ P(\bar{X} < 75) = P\left(Z < \frac{75 - 77}{1}\right) = P(Z < -2.00) = 0.0228 \]

• The calculation says that if \(H_0\) is true, the probability that \(\bar{x}\) would be below 75 solely due to random chance is .0228, or 2.28%.
• For this example let us use \(\alpha = .05\).
• Because the probability of obtaining a sample mean \(\bar{x} < 75\) is less than \(\alpha\), we would reject \(H_0\) and conclude it is more likely that \(\mu < 77\).

Example 2.

What if Bob had obtained only the first 5 scores? In this case, we get a sample mean of \(\bar{x} = 76\) and \(\sigma_{\bar{x}} = 3/\sqrt{5} = 1.342\). Then the P-value is

\[ P(\bar{X} < 76) = P\left(Z < \frac{76 - 77}{1.342}\right) \approx P(Z < -0.75) = 0.2266 \]

which is much greater than \(\alpha\), meaning that an \(\bar{x}\) value of 76 or lower could occur by chance alone 22.66% of the time when \(H_0\) is true. This is not strong evidence that the population mean has changed.

• So we would fail to reject \(H_0\).

Example 3, a two-sided hypothesis test:

Bob has a driver's license that gives his weight as 190 pounds. Bob's license is coming up for renewal. Let's test whether Bob's weight is different from 190 pounds using a test of significance. Let's assume that Bob's weight is approximately normally distributed with a population standard deviation of 3 pounds.
Bob's last four weekly weights are:

193 194 192 191

Sample mean = 192.5 = \(\bar{x}\)

We ask whether these data provide sufficient evidence to say that Bob's weight has changed. (The wording calls for a two-sided hypothesis because no direction is implied.)

Null hypothesis: \(H_0: \mu = 190\) (no change)
Alternative hypothesis: \(H_a: \mu \neq 190\) (changed; two-sided hypothesis)

Again, the parameter value specified in the null hypothesis usually represents no change, or the status quo. The suspected change in the parameter value is stated by the alternative hypothesis.

Calculate z:

\[ z = \frac{192.5 - 190}{3/\sqrt{4}} = 1.67 \]

Calculate the P-value: the tail probability is .0475. For two-sided tests we must double the tail probability, so P-value = 2(.0475) = .0950, which is the probability in both tails.

Reach a conclusion using \(\alpha = .05\): Since the P-value is greater than \(\alpha\), we have insufficient evidence to reject \(H_0\). We lack sufficient evidence to say Bob's weight has changed; it could still be 190. If \(H_0\) were true, a sample mean at least this far from 190 would occur by random chance with probability .0950, or 9.50% of the time.

What if \(\alpha\) had been .10 instead of .05? We would then have sufficient evidence to reject \(H_0\) and say Bob's weight has changed.

Example 4:

A shipment of machined parts has a critical dimension that is normally distributed with mean 12 centimeters and standard deviation 0.1 centimeters. The acceptance sampling team suspects that the dimension is less than 12 centimeters. They take a simple random sample of 25 of these parts and obtain a mean of 11.99. Is the acceptance sampling team correct in their assertions? Use an \(\alpha\) level of 0.01.

\(H_0: \mu = 12\), \(H_a: \mu < 12\), with \(n = 25\) and \(\bar{x} = 11.99\)

\[ z = \frac{11.99 - 12}{0.1/\sqrt{25}} = \frac{-0.01}{0.02} = -0.5 \]

P-value = \(P(Z \le -0.5)\) = .3085. Since .3085 > .01, we cannot reject \(H_0\).

Confidence Intervals and Two-Sided Tests:

A level \(\alpha\) two-sided significance test rejects a hypothesis \(H_0: \mu = \mu_0\) exactly when the value \(\mu_0\) falls outside a level \(1 - \alpha\) confidence interval for \(\mu\).
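The equivalence between a two-sided test and a confidence interval can be checked numerically. The following is a minimal Python sketch using Bob's four weights from Example 3. The function names `z_interval` and `two_sided_p_value` are illustrative choices, not part of the lecture notes, and the code uses the standard library's `statistics.NormalDist` in place of a printed Normal table.

```python
from math import sqrt
from statistics import NormalDist, mean

STD_NORMAL = NormalDist()  # standard Normal distribution: mean 0, sd 1


def z_interval(xbar, sigma, n, confidence):
    """Return the (low, high) z confidence interval for mu when sigma is known."""
    z_star = STD_NORMAL.inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sigma / sqrt(n)
    return xbar - margin, xbar + margin


def two_sided_p_value(xbar, mu0, sigma, n):
    """Return the two-sided P-value 2 * P(Z >= |z|) for H0: mu = mu0."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    return 2 * (1 - STD_NORMAL.cdf(abs(z)))


weights = [193, 194, 192, 191]  # Bob's four weekly weights (Example 3)
sigma, mu0 = 3, 190             # assumed population sd and license weight
xbar, n = mean(weights), len(weights)

p_value = two_sided_p_value(xbar, mu0, sigma, n)
results = {}
for alpha in (0.05, 0.10):
    low, high = z_interval(xbar, sigma, n, 1 - alpha)
    reject = p_value <= alpha            # decision from the significance test
    outside = not (low <= mu0 <= high)   # is mu0 outside the level 1 - alpha CI?
    results[alpha] = (reject, outside)
    print(f"alpha={alpha:.2f}: CI=({low:.2f}, {high:.2f}), "
          f"p={p_value:.4f}, reject H0={reject}, mu0 outside CI={outside}")
```

For each \(\alpha\), the test decision and the "is 190 outside the interval" check agree. The computed P-value is about .096 rather than the .0950 found by hand, because the code uses the unrounded z = 1.6667 instead of the table value z = 1.67.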
Example using the weight on Bob's driver's license:

Bob's data: 193 194 192 191

Sample mean = 192.5 = \(\bar{x}\)

It was assumed that \(\sigma = 3\) for the population of Bob's weights.

Calculate the 95% confidence interval:

\[ 192.5 \pm 1.960 \left( \frac{3}{\sqrt{4}} \right) = (189.56, 195.44) \]

In Example 3, a two-sided hypothesis test using \(\alpha = .05\) failed to reject \(H_0\), meaning that Bob's weight could still be 190. You can see that this result is consistent with the 95% confidence interval above, since 190 is included in the confidence interval.

If we repeated this example and calculated a 90% confidence interval we would get (190.03, 194.97). With \(\alpha = .10\) the two-sided test would reject \(H_0\), meaning that we have evidence that Bob's weight is different from 190. You can see that this result is consistent with the 90% confidence interval, since 190 is not included in the interval.

A two-sided test will reject \(H_0\) when the confidence interval does not include \(\mu_0\), provided that \(\alpha\) and the confidence level are equivalent.

A 99% confidence level is equivalent to \(\alpha = .01\)
A 95% confidence level is equivalent to \(\alpha = .05\)
A 90% confidence level is equivalent to \(\alpha = .10\)

Section 6.3 Use and Abuse of Tests

Choosing a level of significance:

If we want to make a decision based on our test, we choose a level of significance in advance. We do not have to do this, however, if we are only interested in describing the strength of our evidence. If we do choose a level of significance in advance, we must choose \(\alpha\) by asking how much evidence is required to reject \(H_0\). The choice of \(\alpha\) depends on the type of study we are doing. If the value for \(\alpha\) is not given, use \(\alpha = .05\).

Some cautions about statistical tests:

• As with CIs, badly designed surveys or experiments often produce invalid results. Formal statistical inference cannot correct basic flaws in data collection.
• As with CIs, tests of significance are based on laws of probability.
Random sampling or random assignment of subjects to treatments ensures that these laws apply.
• Statistical significance is not the same thing as practical significance.
• There is no sharp border between "significant" and "non-significant", only increasingly strong evidence as the P-value gets smaller.
• It is possible that a non-significant result is due to the sample size being too small. Larger sample sizes are capable of detecting smaller shifts.
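The last caution can be illustrated with a short Python sketch. It reuses the golf setting from Example 2 (\(\mu_0 = 77\), \(\sigma = 3\), observed \(\bar{x} = 76\)) and recomputes the one-sided P-value for several sample sizes. Only \(n = 5\) comes from the notes; the other sample sizes, and the assumption that the same sample mean of 76 is observed each time, are hypothetical and chosen purely for illustration. The helper name `left_tail_p_value` is likewise an assumption.

```python
from math import sqrt
from statistics import NormalDist


def left_tail_p_value(xbar, mu0, sigma, n):
    """P-value P(Z <= z) for H0: mu = mu0 against Ha: mu < mu0, sigma known."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    return NormalDist().cdf(z)


mu0, sigma, xbar = 77, 3, 76    # golf setting from Example 2: a 1-stroke drop
sample_sizes = [5, 9, 25, 100]  # 5 is Example 2; the others are hypothetical

# Same observed shift, different sample sizes: only n changes.
p_values = {n: left_tail_p_value(xbar, mu0, sigma, n) for n in sample_sizes}
for n, p in p_values.items():
    print(f"n={n:3d}  p-value={p:.4f}")
```

Under these assumptions the same one-stroke shift gives a P-value of about .23 with 5 rounds but falls below .05 with 25 rounds, which is the sense in which larger samples can detect smaller shifts.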