M 562 Section-10-3

Course: MTH 562, Spring 2008
562 MTH/STA COMMON LARGE-SAMPLE STATISTICAL TESTS

In the preceding section, we laid the groundwork for hypothesis testing, which includes four basic components: the null hypothesis, the alternative hypothesis, the test statistic, and the rejection region. In practice, the null hypothesis is usually stated so as to specify an exact value of the population parameter $\theta$, namely $H_0: \theta = \theta_0$, where $\theta_0$ is a specified value of $\theta$. There are, however, three different kinds of alternative hypotheses associated with this null hypothesis: the upper-tail alternative $H_a: \theta > \theta_0$, the lower-tail alternative $H_a: \theta < \theta_0$, and the two-tailed alternative $H_a: \theta \neq \theta_0$. In the present section, we formally develop the large-sample testing procedure for each of these three alternatives.

For any single population, the sample mean $\bar{Y}$ and the sample proportion $\hat{p}$ are unbiased estimators of the population mean $\mu$ and the population proportion $p$, respectively, with standard errors

$$\sigma_{\bar{Y}} = \frac{\sigma}{\sqrt{n}} \quad \text{and} \quad \sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}},$$

where $\sigma$ is the population standard deviation and $n$ is the sample size. Similarly, for comparing two populations, the difference between two sample means, $\bar{Y}_1 - \bar{Y}_2$, is an unbiased estimator of the difference between the two population means, $\mu_1 - \mu_2$, with standard error

$$\sigma_{\bar{Y}_1 - \bar{Y}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}},$$

whereas the difference between two sample proportions, $\hat{p}_1 - \hat{p}_2$, is an unbiased estimator of the difference between the two population proportions, $p_1 - p_2$, with standard error

$$\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}},$$

where $\sigma_1^2$ and $\sigma_2^2$ are the respective population variances and $n_1$ and $n_2$ are the respective sample sizes. By the Central Limit Theorem, each of these four unbiased estimators has an approximately normal distribution when the samples are large.
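As a rough illustration of the four standard errors above, the following Python sketch evaluates each formula directly. The function names and the numerical inputs are hypothetical and chosen only for illustration. In practice the population values $\sigma$ and $p$ are unknown and are replaced by sample estimates when the samples are large.

```python
import math

def se_mean(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

def se_proportion(p, n):
    """Standard error of the sample proportion: sqrt(p(1-p)/n)."""
    return math.sqrt(p * (1 - p) / n)

def se_diff_means(sigma1, n1, sigma2, n2):
    """Standard error of the difference between two sample means."""
    return math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)

def se_diff_proportions(p1, n1, p2, n2):
    """Standard error of the difference between two sample proportions."""
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# Hypothetical inputs, for illustration only.
print(se_mean(24, 36))
print(se_proportion(0.5, 100))
print(se_diff_means(3, 9, 4, 16))
print(se_diff_proportions(0.5, 100, 0.5, 100))
```

Each function is a direct transcription of the corresponding formula, so the same helpers can be reused when building the test statistics in the rest of the section.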
If the unknown population parameter $\theta$ is taken to be one of the four population parameters $\mu$, $p$, $\mu_1 - \mu_2$, or $p_1 - p_2$, then for large samples the random quantity

$$Z = \frac{\hat{\theta} - \theta}{\sigma_{\hat{\theta}}} \qquad (10.3.1)$$

has approximately a standard normal distribution.

Consider the problem of testing the hypothesis that the unknown population parameter $\theta$ is equal to a specified value $\theta_0$ on the basis of a large random sample $Y_1, Y_2, \ldots, Y_n$. Throughout this section, it is assumed that the unbiased estimator $\hat{\theta}$ has an approximately normal distribution with mean $\theta$ and standard error $\sigma_{\hat{\theta}} = \sqrt{\mathrm{Var}(\hat{\theta})}$. This is a reasonable assumption, as explained in the preceding paragraph. The statistic defined in (10.3.1) will serve, at least approximately, as the test statistic in the following large-sample procedures for hypothesis testing.

I. Upper-Tail Test

Consider testing the null hypothesis $H_0: \theta = \theta_0$ versus the upper-tail alternative hypothesis $H_a: \theta > \theta_0$, where $\theta_0$ is a specified value of $\theta$. We first need to decide the form of the rejection region. Intuitively, large values of $\hat{\theta}$ tend to support the alternative hypothesis $H_a$. Thus, we consider the test statistic $\hat{\theta}$ (an unbiased estimator of $\theta$) with the rejection region $\hat{\theta} > K$ for some choice of $K$. By definition,

$$\begin{aligned}
\alpha &= P\{\text{rejecting } H_0 \text{ when } H_0 \text{ is true}\} \\
&= P\{\hat{\theta} > K \text{ when } \theta = \theta_0\} \\
&= P\left\{\frac{\hat{\theta} - \theta_0}{\sigma_{\hat{\theta}}} > \frac{K - \theta_0}{\sigma_{\hat{\theta}}}\right\} \\
&\approx P\left\{Z > \frac{K - \theta_0}{\sigma_{\hat{\theta}}}\right\}.
\end{aligned}$$

Thus,

$$\frac{K - \theta_0}{\sigma_{\hat{\theta}}} = z_\alpha \quad \text{or} \quad K = \theta_0 + z_\alpha \sigma_{\hat{\theta}},$$

where $z_\alpha$ is the value of the standard normal random variable $Z$ such that $P\{Z > z_\alpha\} = \alpha$, as shown in Figures 10.3.1 and 10.3.2.

Hence, an equivalent form of the upper-tail test of hypotheses with level of significance $\alpha$ is given as follows:

Null hypothesis: $H_0: \theta = \theta_0$
Alternative hypothesis: $H_a: \theta > \theta_0$
Test statistic: $Z_0 = \dfrac{\hat{\theta} - \theta_0}{\sigma_{\hat{\theta}}}$
Rejection region: $Z_0 > z_\alpha$

The subscript of the test statistic $Z_0$ emphasizes that the test statistic is computed under the null hypothesis.
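The general upper-tail procedure can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function names `upper_tail_z_test` and `critical_value_K` and all numerical inputs are hypothetical, and the standard error is assumed to be known or already estimated. The critical value $z_\alpha$ is obtained from the standard library's `statistics.NormalDist`.

```python
from statistics import NormalDist

def upper_tail_z_test(estimate, theta0, std_error, alpha):
    """Test H0: theta = theta0 vs Ha: theta > theta0.

    Returns the statistic Z0, the critical value z_alpha, and the decision.
    """
    z0 = (estimate - theta0) / std_error
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # P(Z > z_alpha) = alpha
    return z0, z_alpha, z0 > z_alpha

def critical_value_K(theta0, std_error, alpha):
    """Cutoff K = theta0 + z_alpha * std_error on the scale of the estimator."""
    return theta0 + NormalDist().inv_cdf(1 - alpha) * std_error

# Hypothetical inputs, for illustration only.
estimate, theta0, std_error, alpha = 103.0, 100.0, 1.5, 0.05
z0, z_alpha, reject = upper_tail_z_test(estimate, theta0, std_error, alpha)
K = critical_value_K(theta0, std_error, alpha)

# The two forms of the rejection region lead to the same decision.
print(z0, round(z_alpha, 3), reject, estimate > K)
```

Comparing `estimate > K` with `z0 > z_alpha` shows that rejecting on the scale of the estimator and rejecting on the standardized scale are the same rule.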
Following this general rule, four special upper-tail tests are presented below.

(1) For Estimating $\mu$:

Null hypothesis: $H_0: \mu = \mu_0$
Alternative hypothesis: $H_a: \mu > \mu_0$
Test statistic: $Z_0 = \dfrac{\bar{Y} - \mu_0}{\sigma/\sqrt{n}}$
Rejection region: $Z_0 > z_\alpha$

where $\mu_0$ is a specified value of $\mu$.

(2) For Estimating $p$:

Null hypothesis: $H_0: p = p_0$
Alternative hypothesis: $H_a: p > p_0$
Test statistic: $Z_0 = \dfrac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$
Rejection region: $Z_0 > z_\alpha$

where $p_0$ is a specified value of $p$.

(3) For Estimating $\mu_1 - \mu_2$:

Null hypothesis: $H_0: \mu_1 - \mu_2 = d_0$
Alternative hypothesis: $H_a: \mu_1 - \mu_2 > d_0$
Test statistic: $Z_0 = \dfrac{(\bar{Y}_1 - \bar{Y}_2) - d_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$
Rejection region: $Z_0 > z_\alpha$

where $d_0$ is a specified value of $\mu_1 - \mu_2$.

(4) For Estimating $p_1 - p_2$:

Null hypothesis: $H_0: p_1 - p_2 = d_0$
Alternative hypothesis: $H_a: p_1 - p_2 > d_0$
Test statistic: $Z_0 = \dfrac{(\hat{p}_1 - \hat{p}_2) - d_0}{\sqrt{\hat{p}(1-\hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$
Rejection region: $Z_0 > z_\alpha$

where $d_0$ is a specified value of $p_1 - p_2$ and $\hat{p} = \dfrac{Y_1 + Y_2}{n_1 + n_2}$ (a pooled proportion from the two samples).

Numerical computations for the above hypothesis testing procedures are simple and straightforward. For the purpose of demonstration, one numerical example should be enough.

Example 1. Suppose that we wish to test the claim that the average daily yield of the chemical manufactured in a particular chemical plant is more than 660 tons per day. To check this claim, a sample of $n = 36$ measurements of the daily yield is obtained. The mean and standard deviation of the 36 observations were found to be 670 and 24 tons, respectively. Does the evidence contradict the claim? Use a level of significance $\alpha = 0.05$.

Solution. Let $\mu$ denote the average daily yield of the chemical. The null and alternative hypotheses are

$$H_0: \mu = 660 \quad \text{versus} \quad H_a: \mu > 660.$$

The test statistic is

$$Z_0 = \frac{\bar{Y} - \mu_0}{\sigma/\sqrt{n}} = \frac{670 - 660}{24/\sqrt{36}} = 2.5,$$

where $\sigma$ is replaced by $S = 24$. The rejection region, with $\alpha = 0.05$, is given by $Z_0 > z_{0.05} = 1.645$. Since $Z_0 = 2.5 > 1.645$, we reject $H_0: \mu = 660$ in favor of $H_a: \mu > 660$.
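That is, the evidence is sufficient to indicate that the average daily yield of the chemical exceeds 660 tons, which supports the claim rather than contradicting it.

II. Lower-Tail Test

Consider testing the null hypothesis $H_0: \theta = \theta_0$ versus the lower-tail alternative hypothesis $H_a: \theta < \theta_0$, where $\theta_0$ is a specified value of $\theta$. To determine a proper form for the rejection region in this case, note that the smaller the value of $\hat{\theta}$, the stronger the evidence in support of the lower-tail alternative hypothesis $H_a$. Thus, we use the test statistic $\hat{\theta}$ (an unbiased estimator of $\theta$) with the rejection region $\hat{\theta} < K$ for some choice of $K$. By definition,

$$\begin{aligned}
\alpha &= P\{\text{rejecting } H_0 \text{ when } H_0 \text{ is true}\} \\
&= P\{\hat{\theta} < K \text{ when } \theta = \theta_0\} \\
&= P\left\{\frac{\hat{\theta} - \theta_0}{\sigma_{\hat{\theta}}} < \frac{K - \theta_0}{\sigma_{\hat{\theta}}}\right\} \\
&\approx P\left\{Z < \frac{K - \theta_0}{\sigma_{\hat{\theta}}}\right\}.
\end{aligned}$$

Thus,

$$\frac{K - \theta_0}{\sigma_{\hat{\theta}}} = -z_\alpha \quad \text{or} \quad K = \theta_0 - z_\alpha \sigma_{\hat{\theta}},$$

where $z_\alpha$ is the value of the standard normal random variable $Z$ such that $P\{Z < -z_\alpha\} = \alpha$ (see Figures 10.3.3 and 10.3.4).

An equivalent form of the lower-tail test of hypotheses with level of significance $\alpha$ is thus given as follows:

Null hypothesis: $H_0: \theta = \theta_0$
Alternative hypothesis: $H_a: \theta < \theta_0$
Test statistic: $Z_0 = \dfrac{\hat{\theta} - \theta_0}{\sigma_{\hat{\theta}}}$
Rejection region: $Z_0 < -z_\alpha$

In accordance with this general rule, four common lower-tail tests are summarized as follows.

(1) For Estimating $\mu$:

Null hypothesis: $H_0: \mu = \mu_0$
Alternative hypothesis: $H_a: \mu < \mu_0$
Test statistic: $Z_0 = \dfrac{\bar{Y} - \mu_0}{\sigma/\sqrt{n}}$
Rejection region: $Z_0 < -z_\alpha$

where $\mu_0$ is a specified value of $\mu$.

(2) For Estimating $p$:

Null hypothesis: $H_0: p = p_0$
Alternative hypothesis: $H_a: p < p_0$
Test statistic: $Z_0 = \dfrac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$
Rejection region: $Z_0 < -z_\alpha$

where $p_0$ is a specified value of $p$.

(3) For Estimating $\mu_1 - \mu_2$:

Null hypothesis: $H_0: \mu_1 - \mu_2 = d_0$
Alternative hypothesis: $H_a: \mu_1 - \mu_2 < d_0$
Test statistic: $Z_0 = \dfrac{(\bar{Y}_1 - \bar{Y}_2) - d_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$
Rejection region: $Z_0 < -z_\alpha$

where $d_0$ is a specified value of $\mu_1 - \mu_2$.

(4) For Estimating $p_1 - p_2$:

Null hypothesis: $H_0: p_1 - p_2 = d_0$
Alternative hypothesis: $H_a: p_1 - p_2 < d_0$
Test statistic: $Z_0 = \dfrac{(\hat{p}_1 - \hat{p}_2) - d_0}{\sqrt{\hat{p}(1-\hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$
Rejection region: $Z_0 < -z_\alpha$

where $d_0$ is a specified value of $p_1 - p_2$ and $\hat{p} = \dfrac{Y_1 + Y_2}{n_1 + n_2}$ (a pooled proportion from the two samples).

Again, only one numerical example will be presented for lower-tail tests.

Example 2. A hospital claims that 12% of all appointments are canceled. Over a six-week period, 21 of the hospital's 200 appointments were canceled. Determine whether the true proportion of all appointments that are canceled is less than 12%. Use a level of significance $\alpha = 0.01$.

Solution. Let $Y$ denote the number of appointments that were canceled and $p$ denote the true proportion of all appointments that are canceled. Then we want to test

$$H_0: p = 0.12 \quad \text{versus} \quad H_a: p < 0.12.$$

Since $\hat{p} = Y/n = 21/200 = 0.105$, the test statistic is

$$Z_0 = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}} = \frac{0.105 - 0.12}{\sqrt{(0.12)(0.88)/200}} = -0.65.$$

The rejection region, with $\alpha = 0.01$, is given by $Z_0 < -z_{0.01} = -2.33$. Since $Z_0 = -0.65$ does not fall below $-2.33$, we cannot reject $H_0: p = 0.12$. That is, the evidence is not sufficient to conclude that the true proportion of canceled appointments is less than 12%.

III. Two-Tailed Test

Consider testing the null hypothesis $H_0: \theta = \theta_0$ versus the two-tailed alternative hypothesis $H_a: \theta \neq \theta_0$, where $\theta_0$ is a specified value of $\theta$. For the two-tailed test, a value of $\hat{\theta}$ that is either much larger or much smaller than $\theta_0$ tends to support the alternative hypothesis. As a result, we consider the test statistic $\hat{\theta}$ (an unbiased estimator of $\theta$) with the two-sided rejection region $|\hat{\theta} - \theta_0| > K$ for some choice of $K$. By definition,

$$\begin{aligned}
\alpha &= P\{\text{rejecting } H_0 \text{ when } H_0 \text{ is true}\} \\
&= P\{|\hat{\theta} - \theta_0| > K \text{ when } \theta = \theta_0\} \\
&= P\left\{\frac{|\hat{\theta} - \theta_0|}{\sigma_{\hat{\theta}}} > \frac{K}{\sigma_{\hat{\theta}}}\right\} \\
&\approx P\left\{|Z| > \frac{K}{\sigma_{\hat{\theta}}}\right\}.
\end{aligned}$$

Thus,

$$\frac{K}{\sigma_{\hat{\theta}}} = z_{\alpha/2} \quad \text{or} \quad K = z_{\alpha/2}\,\sigma_{\hat{\theta}},$$

so that $H_0$ is rejected when $\hat{\theta} > \theta_0 + z_{\alpha/2}\,\sigma_{\hat{\theta}}$ or $\hat{\theta} < \theta_0 - z_{\alpha/2}\,\sigma_{\hat{\theta}}$. Here $z_{\alpha/2}$ is the value of the standard normal random variable $Z$ such that $P\{Z < -z_{\alpha/2}\} = P\{Z > z_{\alpha/2}\} = \alpha/2$, as illustrated in Figures 10.3.5 and 10.3.6.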
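The definition of $z_{\alpha/2}$ can be checked numerically. The short Python sketch below computes the one-sided value $z_\alpha$ and the two-sided value $z_{\alpha/2}$ for $\alpha = 0.05$ using the standard library's `statistics.NormalDist`, and confirms that the two tails beyond $\pm z_{\alpha/2}$ together carry probability $\alpha$. The helper names `z_upper` and `two_tail_probability` are hypothetical, introduced only for this illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1

def z_upper(tail_area):
    """Value z with P(Z > z) = tail_area for a standard normal Z."""
    return STD_NORMAL.inv_cdf(1 - tail_area)

def two_tail_probability(z):
    """P(Z < -z) + P(Z > z) for a standard normal Z."""
    return STD_NORMAL.cdf(-z) + (1 - STD_NORMAL.cdf(z))

alpha = 0.05
z_one_sided = z_upper(alpha)      # used in the upper- and lower-tail tests
z_two_sided = z_upper(alpha / 2)  # used in the two-tailed test

print(round(z_one_sided, 3), round(z_two_sided, 2))
print(round(two_tail_probability(z_two_sided), 4))
```

At the same level $\alpha$, the two-sided cutoff is larger than the one-sided cutoff because each tail receives only $\alpha/2$.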
Accordingly, an equivalent form of the two-tailed test of hypotheses with level of significance $\alpha$ is given as follows:

Null hypothesis: $H_0: \theta = \theta_0$
Alternative hypothesis: $H_a: \theta \neq \theta_0$
Test statistic: $Z_0 = \dfrac{\hat{\theta} - \theta_0}{\sigma_{\hat{\theta}}}$
Rejection region: $|Z_0| > z_{\alpha/2}$

Similarly, there are four particular two-tailed tests in accordance with this rule.

(1) For Estimating $\mu$:

Null hypothesis: $H_0: \mu = \mu_0$
Alternative hypothesis: $H_a: \mu \neq \mu_0$
Test statistic: $Z_0 = \dfrac{\bar{Y} - \mu_0}{\sigma/\sqrt{n}}$
Rejection region: $|Z_0| > z_{\alpha/2}$

where $\mu_0$ is a specified value of $\mu$.

(2) For Estimating $p$:

Null hypothesis: $H_0: p = p_0$
Alternative hypothesis: $H_a: p \neq p_0$
Test statistic: $Z_0 = \dfrac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$
Rejection region: $|Z_0| > z_{\alpha/2}$

where $p_0$ is a specified value of $p$.

(3) For Estimating $\mu_1 - \mu_2$:

Null hypothesis: $H_0: \mu_1 - \mu_2 = d_0$
Alternative hypothesis: $H_a: \mu_1 - \mu_2 \neq d_0$
Test statistic: $Z_0 = \dfrac{(\bar{Y}_1 - \bar{Y}_2) - d_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$
Rejection region: $|Z_0| > z_{\alpha/2}$

where $d_0$ is a specified value of $\mu_1 - \mu_2$.

(4) For Estimating $p_1 - p_2$:

Null hypothesis: $H_0: p_1 - p_2 = d_0$
Alternative hypothesis: $H_a: p_1 - p_2 \neq d_0$
Test statistic: $Z_0 = \dfrac{(\hat{p}_1 - \hat{p}_2) - d_0}{\sqrt{\hat{p}(1-\hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$
Rejection region: $|Z_0| > z_{\alpha/2}$

where $d_0$ is a specified value of $p_1 - p_2$ and $\hat{p} = \dfrac{Y_1 + Y_2}{n_1 + n_2}$ (a pooled proportion from the two samples).

Finally, we close the discussion by presenting two more numerical examples.

Example 3. Two work methods are being compared for assembly of a certain machine. For a random sample of 64 employees working by the standard method, the mean production was 68.8 units with a standard deviation of 5.2 units. For a random sample of 80 employees who used a new method, the mean production was 70.5 units with a standard deviation of 5.6 units. Do the data present sufficient evidence to suggest a difference in production between the two methods? Use a level of significance $\alpha = 0.05$.

Solution. Let $\mu_1$ and $\mu_2$ denote the true mean productions for the standard and new methods, respectively.
We must test

$$H_0: \mu_1 - \mu_2 = 0 \quad \text{against} \quad H_a: \mu_1 - \mu_2 \neq 0.$$

The test statistic is

$$Z_0 = \frac{68.8 - 70.5}{\sqrt{\dfrac{(5.2)^2}{64} + \dfrac{(5.6)^2}{80}}} \approx -1.88,$$

where $\sigma_1$ and $\sigma_2$ are replaced by $S_1 = 5.2$ and $S_2 = 5.6$. The rejection region, with $\alpha = 0.05$, is given by $|Z_0| > z_{\alpha/2} = z_{0.025} = 1.96$. Since $|Z_0| = 1.88 < 1.96$, we cannot reject $H_0: \mu_1 - \mu_2 = 0$. That is, there is not sufficient evidence to show a significant difference in production between the two methods.

Example 4. A large personal computer company has major outlets in several cities, including Omaha and Des Moines. The top management of the company is considering entering into an exclusive dealership arrangement with a manufacturer of printers. They want to know whether potential customers in Omaha and Des Moines have the same preference for a certain type of printer. In a random sample of 220 Omaha customers, 62 prefer this type of printer, whereas in a random sample of 180 Des Moines customers, 46 prefer this type of printer. Do the data present sufficient evidence to suggest a difference in the proportions preferring this type of printer in Omaha and Des Moines? Use a level of significance $\alpha = 0.05$.

Solution. Let $p_1$ and $p_2$ denote the true proportions preferring this type of printer in Omaha and Des Moines, respectively. We need to test

$$H_0: p_1 - p_2 = 0 \quad \text{versus} \quad H_a: p_1 - p_2 \neq 0.$$

We first calculate the sample proportion for each city and the pooled proportion:

$$\hat{p}_1 = \frac{Y_1}{n_1} = \frac{62}{220} = 0.282, \qquad \hat{p}_2 = \frac{Y_2}{n_2} = \frac{46}{180} = 0.256, \qquad \hat{p} = \frac{Y_1 + Y_2}{n_1 + n_2} = \frac{62 + 46}{220 + 180} = 0.27.$$

Then the test statistic is

$$Z_0 = \frac{0.282 - 0.256}{\sqrt{(0.27)(0.73)\left(\dfrac{1}{220} + \dfrac{1}{180}\right)}} = 0.58.$$

The rejection region with level of significance $\alpha = 0.05$ is $|Z_0| > z_{\alpha/2} = z_{0.025} = 1.96$. Since $|Z_0| = 0.58 < 1.96$, we cannot reject $H_0: p_1 - p_2 = 0$. That is, the sample data do not provide enough evidence to support a difference in the proportions preferring this type of printer in Omaha and Des Moines.
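The arithmetic in Examples 3 and 4 can be reproduced with a short Python sketch. The function names below are hypothetical, and the code is only a check on the hand computations under the same assumptions as the text: large samples, sample standard deviations substituted for $\sigma_1$ and $\sigma_2$, and the pooled proportion used for testing $p_1 - p_2 = 0$.

```python
import math
from statistics import NormalDist

def two_sample_mean_z(ybar1, s1, n1, ybar2, s2, n2, d0=0.0):
    """Z0 for H0: mu1 - mu2 = d0, with s1, s2 substituted for sigma1, sigma2."""
    std_error = math.sqrt(s1**2 / n1 + s2**2 / n2)
    return ((ybar1 - ybar2) - d0) / std_error

def two_sample_proportion_z(y1, n1, y2, n2):
    """Z0 for H0: p1 - p2 = 0, using the pooled proportion."""
    p1_hat, p2_hat = y1 / n1, y2 / n2
    pooled = (y1 + y2) / (n1 + n2)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / std_error

alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}, about 1.96

z_example3 = two_sample_mean_z(68.8, 5.2, 64, 70.5, 5.6, 80)
z_example4 = two_sample_proportion_z(62, 220, 46, 180)

for name, z0 in (("Example 3", z_example3), ("Example 4", z_example4)):
    decision = "reject H0" if abs(z0) > z_crit else "cannot reject H0"
    print(name, round(z0, 2), decision)
```

Because the code carries the unrounded sample proportions through the calculation, its value for Example 4 can differ in the second decimal place from a hand computation that first rounds the proportions to three decimals. The decision is the same either way, since both statistics are well inside $\pm 1.96$.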
