stats 363 Chapter-10a notes


CHAPTER 10 INFERENCE FROM SMALL SAMPLES

10.2 Student's t Distribution

From the discussion of the sampling distribution of $\bar{x}$ in the preceding chapters, a few points were established:

* When the original sampled population is normal, the statistics $\bar{x}$ and $z = \dfrac{\bar{x}-\mu}{\sigma/\sqrt{n}}$ are both normally distributed regardless of the sample size.
* When the original sampled population is not normal, the statistics $\bar{x}$, $\dfrac{\bar{x}-\mu}{\sigma/\sqrt{n}}$, and $\dfrac{\bar{x}-\mu}{s/\sqrt{n}}$ all have approximately normal distributions as long as the sample size is large.
* When the sample size $n$ is small, the statistic $\dfrac{\bar{x}-\mu}{s/\sqrt{n}}$ does not have a normal distribution.

In 1908, W. S. Gosset derived a complicated formula for the density function of
$$t = \frac{\bar{x}-\mu}{s/\sqrt{n}}$$
for random samples of size $n$ from a normal population, and he published his results under the pen name "Student".

Definition 1. The distribution of the statistic
$$t = \frac{\bar{x}-\mu}{s/\sqrt{n}}$$
is called the Student's t distribution with $n-1$ degrees of freedom (df).

The Student's t distribution has the following characteristics:

1. It is mound-shaped and symmetric about $t = 0$, just like the standard normal $z$.
2. It is more variable than $z$, with "heavier tails"; that is, the t curve does not approach the horizontal axis as quickly as $z$ does. This is because the t statistic involves two random quantities, $\bar{x}$ and $s$, whereas the z statistic involves only the sample mean $\bar{x}$.
3. The shape of the t distribution depends on the sample size $n$. As $n$ increases, the variability of $t$ decreases because the estimate $s$ of $\sigma$ is based on more and more information. Eventually, when $n$ is infinitely large, the t and z distributions are identical.

[Figure: the standard normal density compared with a t density]

The table of probabilities for the standard normal z distribution is no longer useful in calculating critical values or p-values for the t statistic. Instead, we will use Table 4 in Appendix I.
When we index a particular number of degrees of freedom, the table records $t_\alpha$, a value of $t$ that has tail area $\alpha$ to its right, as shown in the figure below.

[Figure: t density $f(t)$ with tail area $\alpha$ to the right of $t_\alpha$]

Example 1. For a t distribution with 5 degrees of freedom, the value of $t$ that has area 0.05 to its right is found in row 5 in the column marked $t_{0.05}$. For this particular t distribution, the area to the right of $t = 2.015$ is 0.05; only 5% of all values of the t statistic will exceed this value.

Example 2. Suppose we have a sample of size $n = 10$ from a normal distribution. Find a value of $t$ such that only 1% of all values of $t$ will be smaller.

Solution. The df is $n - 1 = 10 - 1 = 9$. The necessary t-value must be in the lower portion of the t distribution with area 0.01 to its left, as shown in the figure below. Since the t distribution is symmetric about 0, this value is simply the negative of the value on the right-hand side with area 0.01 to its right, or $-t_{0.01} = -2.821$.

[Figure: t density $f(t)$ with area 0.01 to the left of $-2.821$]

Requirements for Student's t Distribution

* The sample must be selected at random from the population.
* The population from which the sample is drawn must be normally distributed.

10.3 Small-Sample Inferences Concerning a Population Mean

Just as the (standard) normal distribution was the underlying probability distribution for making large-sample inferences, the Student's t distribution plays a central role in small-sample inferences.

Small-Sample Hypothesis Test for $\mu$

1. Null hypothesis: $H_0: \mu = \mu_0$
2. Alternative hypothesis:
   $H_a: \mu > \mu_0$ (Upper-Tailed Test)
   $H_a: \mu < \mu_0$ (Lower-Tailed Test)
   $H_a: \mu \neq \mu_0$ (Two-Tailed Test)
3. Test statistic:
   $$t = \frac{\bar{x}-\mu_0}{s/\sqrt{n}}$$
4. Rejection region: Reject $H_0$ when
   $t > t_\alpha$ for $H_a: \mu > \mu_0$ (Upper-Tailed Test)
   $t < -t_\alpha$ for $H_a: \mu < \mu_0$ (Lower-Tailed Test)
   $t > t_{\alpha/2}$ or $t < -t_{\alpha/2}$ for $H_a: \mu \neq \mu_0$ (Two-Tailed Test)
   or when p-value $< \alpha$.

Here the critical values of $t$, $t_\alpha$ and $t_{\alpha/2}$, are based on $n - 1$ degrees of freedom from the Student's t probability table.
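The rejection rule in step 4 can be sketched as a small Python function. This is only an illustration, not part of the original notes: the function name `reject_h0`, the `tail` labels, and the trial values of the test statistic are hypothetical, while the two critical values are the table values quoted in Examples 1 and 2.

```python
def reject_h0(t_stat, t_crit, tail):
    """Apply the rejection region of step 4.

    t_crit is the positive table value with n - 1 df:
    t_alpha for a one-tailed test, t_{alpha/2} for a two-tailed test.
    """
    if tail == "upper":
        return t_stat > t_crit
    if tail == "lower":
        return t_stat < -t_crit
    if tail == "two":
        return abs(t_stat) > t_crit
    raise ValueError("tail must be 'upper', 'lower', or 'two'")


# Table values quoted in Examples 1 and 2 (copied, not computed).
T_05_DF5 = 2.015   # area 0.05 to the right, 5 df
T_01_DF9 = 2.821   # area 0.01 to the right, 9 df

# By symmetry, the value with area 0.01 to its LEFT (9 df) is the negative.
lower_one_percent_point = -T_01_DF9

# Hypothetical test statistics, chosen only to exercise each branch.
print(reject_h0(2.5, T_05_DF5, "upper"))    # 2.5 > 2.015
print(reject_h0(-3.0, T_01_DF9, "lower"))   # -3.0 < -2.821
print(reject_h0(1.0, T_05_DF5, "two"))      # |1.0| does not exceed 2.015
```

Passing the positive table value and letting the function apply the sign keeps the lookup identical for all three alternatives, which mirrors how the symmetric t table is used.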
Assumptions: The sample of size $n$ is selected at random from a normally distributed population.

Small-Sample $(1-\alpha)100\%$ Confidence Interval for $\mu$

$$\bar{x} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}}$$

where $s/\sqrt{n}$ is the estimated standard error of $\bar{x}$, often referred to as the standard error of the mean.

Example 3. A new process for producing synthetic diamonds can be operated at a profitable level only if the average weight of the diamonds is greater than 0.5 karat. To evaluate the profitability of the process, six diamonds are generated, with recorded weights 0.46, 0.61, 0.52, 0.48, 0.57, and 0.54 karat. Do the six measurements present sufficient evidence to indicate that the average weight of the diamonds produced by the process is in excess of 0.5 karat?

Solution.

1. We want to test the null hypothesis $H_0: \mu = 0.5$.
2. An appropriate one-tailed alternative hypothesis is $H_a: \mu > 0.5$.
3. The test statistic is
   $$t = \frac{\bar{x}-\mu_0}{s/\sqrt{n}} = \frac{0.53 - 0.5}{0.0559/\sqrt{6}} = 1.32.$$
4. Rejection region: If we choose a 5% level of significance ($\alpha = 0.05$), the right-tailed rejection region is found using the critical value of $t$ from Table 4 of Appendix I. With df $= n - 1 = 5$, we can reject $H_0$ if $t > t_{0.05} = 2.015$, as shown in the figure below.

   [Figure: t density $f(t)$ with the observed value 1.32 and the rejection region $t > 2.015$]

5. Since $1.32 < 2.015$, we cannot reject $H_0$. The data do not present sufficient evidence to indicate that the mean diamond weight exceeds 0.5 karat.
6. From Table 4 of Appendix I with df $= 5$, we read $t_{0.10} = 1.476$; that is, $P\{t > 1.476\} = 0.10$. Since $1.32 < 1.476$, we have p-value $= P\{t > 1.32\} > P\{t > 1.476\} = 0.10$. The results are not significant.

Example 4. Construct a 90% confidence interval for $\mu$ using the data in Example 3.

Solution. From Table 4 of Appendix I with df $= 5$, we read $t_{0.05} = 2.015$. A 90% confidence interval for $\mu$ is
$$\bar{x} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}} = 0.53 \pm 2.015\,\frac{0.0559}{\sqrt{6}} = 0.53 \pm 0.046,$$
or $0.484 < \mu < 0.576$.

Example 5. Labels on 1-gallon cans of paint usually indicate the drying time and the area that can be covered in one coat.
Most brands of paint indicate that, in one coat, a gallon will cover between 250 and 500 square feet, depending on the texture of the surface to be painted. One manufacturer, however, claims that a gallon of its paint will cover 400 square feet of surface area. To test this claim, a random sample of ten 1-gallon cans of white paint was used to paint ten identical areas using the same kind of equipment. The actual areas (in square feet) covered by these 10 gallons of paint are given here:

310, 311, 412, 368, 447, 376, 303, 410, 365, 350

Do the data present sufficient evidence to indicate that the average coverage differs from 400 square feet? Find the p-value for the test, and use it to evaluate the statistical significance of the results.

Solution.

1. We want to test the null hypothesis $H_0: \mu = 400$.
2. An appropriate two-tailed alternative hypothesis is $H_a: \mu \neq 400$.
3. The test statistic is
   $$t = \frac{\bar{x}-\mu_0}{s/\sqrt{n}} = \frac{365.2 - 400}{48.417/\sqrt{10}} = -2.27.$$
4. Rejection region: We will reject $H_0$ if the test statistic $t$ is either greater than $t_{\alpha/2}$ or smaller than $-t_{\alpha/2}$. The p-value is
   $$\text{p-value} = P\{t > 2.27\} + P\{t < -2.27\} = 2P\{t > 2.27\}.$$
   Thus, $\tfrac{1}{2}(\text{p-value}) = P\{t > 2.27\}$. From Table 4 of Appendix I with df $= n - 1 = 9$, we read $t_{0.025} = 2.262$ and $t_{0.01} = 2.821$; that is, $P\{t > 2.262\} = 0.025$ and $P\{t > 2.821\} = 0.01$. Since $2.262 < 2.27 < 2.821$, we have $0.01 < P\{t > 2.27\} < 0.025$. Hence, $0.01 < \tfrac{1}{2}(\text{p-value}) < 0.025$, or $0.02 < \text{p-value} < 0.05$.

   [Figure: t density $f(t)$ with 9 df, showing the table values 1.383, 1.833, 2.262, 2.821, 3.250 and the observed value 2.27]

5. Because the p-value is below 0.05, the data present sufficient evidence at the 5% level to indicate that the average coverage differs from 400 square feet.

Example 6. Construct a 95% confidence interval for $\mu$ using the data in Example 5.

Solution. From Table 4 of Appendix I with df $= 9$, we read $t_{0.025} = 2.262$. A 95% confidence interval for $\mu$ is
$$\bar{x} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}} = 365.2 \pm 2.262\,\frac{48.417}{\sqrt{10}} = 365.2 \pm 34.63,$$
or $330.6 < \mu < 399.8$. Notice that the entire confidence interval is to the left of 400. This reconfirms that we must reject $H_0: \mu = 400$ in Example 5.
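The calculations in Examples 5 and 6 can be checked with a short Python sketch. It uses only the standard library; the variable names are illustrative, and the critical value 2.262 is copied from the t table rather than computed.

```python
import math
import statistics

# Coverage data (square feet) from Example 5.
areas = [310, 311, 412, 368, 447, 376, 303, 410, 365, 350]
mu0 = 400                      # hypothesized mean under H0

n = len(areas)
xbar = statistics.mean(areas)  # sample mean
s = statistics.stdev(areas)    # sample standard deviation (divisor n - 1)
se = s / math.sqrt(n)          # estimated standard error of the mean
t = (xbar - mu0) / se          # test statistic with n - 1 = 9 df

# Critical value copied from the t table (df = 9), not computed here.
T_025_DF9 = 2.262

reject_h0 = abs(t) > T_025_DF9           # two-tailed test at alpha = 0.05
half_width = T_025_DF9 * se              # margin of the 95% interval
ci = (xbar - half_width, xbar + half_width)

print(f"xbar = {xbar:.1f}, s = {s:.3f}, t = {t:.2f}")
print(f"reject H0 at the 5% level: {reject_h0}")
print(f"95% CI: ({ci[0]:.1f}, {ci[1]:.1f})")
```

The same interval half-width serves both purposes: the two-tailed test at level 0.05 rejects exactly when 400 falls outside the 95% interval.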
HOMEWORK: pp. 397-399: 10.1, 10.2, 10.3, 10.5, 10.7, 10.13

10.4 Small-Sample Inferences for the Difference Between Two Population Means: Independent Random Samples

The mean and standard error of $\bar{x}_1 - \bar{x}_2$ are
$$\mu_1 - \mu_2 \quad\text{and}\quad \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}},$$
respectively. In practice, the standard error of $\bar{x}_1 - \bar{x}_2$ is estimated by
$$\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}.$$
Unfortunately, when the sample sizes are small, the statistic (the standardization of $\bar{x}_1 - \bar{x}_2$)
$$\frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
does not have an approximately normal distribution, nor does it have a Student's t distribution. In order to form a statistic with a sampling distribution that can be derived theoretically, we must make one more assumption.

Assume that $\sigma_1^2 = \sigma_2^2 = \sigma^2$. Then the standard error of the difference in the two sample means is
$$\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = \sqrt{\sigma^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)},$$
where $\sigma^2$ is estimated by the pooled sample variance
$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}.$$
The resulting test statistic
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$
has a Student's t distribution with $(n_1 - 1) + (n_2 - 1) = n_1 + n_2 - 2$ degrees of freedom.

Test of Hypothesis Concerning the Difference $\mu_1 - \mu_2$ Between Two Means: Independent Random Samples

1. Null hypothesis: $H_0: \mu_1 - \mu_2 = D_0$, where $D_0$ is some specified difference that we wish to test. (Often, we will hypothesize that there is no difference between $\mu_1$ and $\mu_2$; that is, $D_0 = 0$.)
2. Alternative hypothesis:
   $H_a: \mu_1 - \mu_2 > D_0$ (Upper-Tailed Test)
   $H_a: \mu_1 - \mu_2 < D_0$ (Lower-Tailed Test)
   $H_a: \mu_1 - \mu_2 \neq D_0$ (Two-Tailed Test)
3. Test statistic:
   $$t = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{s^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}, \quad\text{where}\quad s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}.$$
4. Rejection region: Reject $H_0$ when
   $t > t_\alpha$ for $H_a: \mu_1 - \mu_2 > D_0$ (Upper-Tailed Test)
   $t < -t_\alpha$ for $H_a: \mu_1 - \mu_2 < D_0$ (Lower-Tailed Test)
   $t > t_{\alpha/2}$ or $t < -t_{\alpha/2}$ for $H_a: \mu_1 - \mu_2 \neq D_0$ (Two-Tailed Test)
   or when p-value $< \alpha$.

Here the critical values of $t$, $t_\alpha$ and $t_{\alpha/2}$, are based on $n_1 + n_2 - 2$ degrees of freedom from the Student's t probability table.

[Figure: t density with area $\alpha/2$ in each tail]
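A short worked note on the pooled sample variance used in the test statistic, added here as a sketch rather than as part of the original notes: it is a weighted average of the two sample variances, with weights proportional to their degrees of freedom. The weights $w_1$ and $w_2$ are notation introduced only for this illustration.

```latex
s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}
    = w_1 s_1^2 + w_2 s_2^2,
\qquad
w_i = \frac{n_i - 1}{n_1 + n_2 - 2},
\qquad
w_1 + w_2 = 1.

% Special case of equal sample sizes, n_1 = n_2 = n:
s^2 = \frac{s_1^2 + s_2^2}{2},
\qquad
\sqrt{s^2\left(\frac{1}{n} + \frac{1}{n}\right)}
  = \sqrt{\frac{s_1^2 + s_2^2}{n}}.
```

So with equal sample sizes the pooled standard error coincides with the unpooled estimate $\sqrt{s_1^2/n_1 + s_2^2/n_2}$; the two procedures then differ only in the degrees of freedom used for the critical value.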
Assumptions:

1. The samples are randomly and independently selected from normally distributed populations.
2. The variances of the populations $\sigma_1^2$ and $\sigma_2^2$ are equal; that is, $\sigma_1^2 = \sigma_2^2 = \sigma^2$.

Small-Sample $(1-\alpha)100\%$ Confidence Interval for $\mu_1 - \mu_2$ Based on Independent Random Samples

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$

where $s^2$ is the pooled estimate of $\sigma^2$; that is,
$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}.$$

Example 7. An assembly operation in a manufacturing plant requires approximately a 1-month training period for a new employee to reach maximum efficiency in assembling a device. A new method of training is suggested, and a test was conducted to compare the new method with the standard procedure. Two groups of nine new employees were trained for a period of 3 weeks, one group using the new method and the other following the standard training procedure. The length of time (in minutes) required for each employee to assemble the device was recorded at the end of the 3-week period; the measurements appear in the following table. Do the data present sufficient evidence to indicate that the mean time to assemble at the end of a 3-week training period is less for the new training procedure?

Standard Procedure    New Procedure
        32                 35
        37                 31
        35                 29
        28                 25
        41                 34
        44                 40
        35                 27
        31                 32
        34                 31

Solution.

1. We want to test the null hypothesis $H_0: \mu_1 = \mu_2$, or $H_0: \mu_1 - \mu_2 = 0$.
2. An appropriate right-tailed alternative hypothesis is $H_a: \mu_1 > \mu_2$, or $H_a: \mu_1 - \mu_2 > 0$.
3. The pooled sample variance is
   $$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = 22.2361,$$
   and the test statistic is
   $$t = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{s^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{35.22 - 31.56}{\sqrt{22.2361\left(\dfrac{1}{9} + \dfrac{1}{9}\right)}} = 1.65,$$
   which has a Student's t distribution with $n_1 + n_2 - 2 = 16$ degrees of freedom.
4. Rejection region: We will reject $H_0$ if the test statistic $t$ is greater than $t_\alpha$.
5. The critical value approach: If we choose $\alpha = 0.05$, we read from Table 4 of Appendix I with df $= 16$ that $t_{0.05} = 1.746$. The null hypothesis $H_0$ will be rejected if $t > 1.746$.
   Since $1.65 < 1.746$, we cannot reject $H_0$. Hence, there is insufficient evidence to indicate that the new training procedure is superior at the 5% level of significance.
6. The p-value approach: The p-value is $P\{t > 1.65\}$. From Table 4 of Appendix I with df $= 16$, we read $t_{0.05} = 1.746$ and $t_{0.10} = 1.337$; that is, $P\{t > 1.746\} = 0.05$ and $P\{t > 1.337\} = 0.10$. Thus, $0.05 < P\{t > 1.65\} < 0.10$. Hence, $0.05 < \text{p-value} < 0.10$. Because the p-value is greater than 0.05, the results are not significant.

Example 8. Construct a 95% confidence interval for $\mu_1 - \mu_2$ in Example 7. Using the confidence interval, can we conclude that there is a difference in the population means for the two groups of employees?

Solution. The point estimate of $\mu_1 - \mu_2$ is
$$\bar{x}_1 - \bar{x}_2 = 35.22 - 31.56 = 3.66,$$
and the standard error is
$$\sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{22.2361\left(\frac{1}{9} + \frac{1}{9}\right)} = 2.223.$$
Since $1 - \alpha = 0.95$, or $\alpha = 0.05$, we have $t_{\alpha/2} = t_{0.025} = 2.120$ with 16 degrees of freedom. Thus, the 95% confidence interval is
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = 3.66 \pm (2.120)(2.223) = 3.66 \pm 4.71,$$
or $-1.05 < \mu_1 - \mu_2 < 8.37$. This interval gives us a range of possible values of the difference in the population means. Since the hypothesized difference, $\mu_1 - \mu_2 = 0$, is contained in the confidence interval, we should not reject $H_0$.

Unequal Population Variances

If the population variances are far from equal, there is an alternative procedure for estimation and testing that has an approximate t distribution in repeated sampling. As a rule of thumb, we should use this procedure if the ratio of the two sample variances satisfies
$$\frac{\text{Larger } s^2}{\text{Smaller } s^2} > 3.$$
For example, the ratio of the two sample variances in Example 7 is
$$\frac{(4.94)^2}{(4.48)^2} = 1.22 < 3,$$
which makes the pooled method appropriate. When the population variances are not equal, the pooled estimator $s^2$ is no longer appropriate, and each population variance must be estimated by its corresponding sample variance.
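The variance-ratio rule of thumb and the pooled calculations of Examples 7 and 8 can be sketched in Python as follows. This is an illustrative check using only the standard library; the variable names are hypothetical, and the critical values 1.746 and 2.120 are copied from the t table for 16 df rather than computed.

```python
import math
import statistics

# Assembly times (minutes) from Example 7.
standard = [32, 37, 35, 28, 41, 44, 35, 31, 34]
new = [35, 31, 29, 25, 34, 40, 27, 32, 31]

n1, n2 = len(standard), len(new)
var1 = statistics.variance(standard)   # s1^2 (divisor n1 - 1)
var2 = statistics.variance(new)        # s2^2 (divisor n2 - 1)

# Rule of thumb: pooling is reasonable unless larger/smaller variance > 3.
ratio = max(var1, var2) / min(var1, var2)
pooling_ok = ratio <= 3

df = n1 + n2 - 2
pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
diff = statistics.mean(standard) - statistics.mean(new)
t = diff / se                          # D0 = 0 under H0

# Table values for df = 16, copied from the text.
T_05_DF16 = 1.746    # upper-tailed test at alpha = 0.05
T_025_DF16 = 2.120   # 95% confidence interval

reject_h0 = t > T_05_DF16
ci = (diff - T_025_DF16 * se, diff + T_025_DF16 * se)

print(f"variance ratio = {ratio:.2f}, pooling ok: {pooling_ok}")
print(f"pooled variance = {pooled_var:.4f}, se = {se:.3f}, t = {t:.2f}")
print(f"reject H0 at the 5% level: {reject_h0}")
print(f"95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
```

Because the sketch carries unrounded means through the calculation, the printed interval endpoints can differ from the hand calculation in the last decimal place, where 3.66 and 4.71 were rounded before subtracting and adding.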
The resulting test statistic is
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}.$$
When the sample sizes are small, critical values for this statistic are found using degrees of freedom approximated by the formula
$$\text{df} \approx \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}.$$
The degrees of freedom are taken to be the integer part of this result.

Example 9. A study was conducted to estimate the difference in the amount of the chemical orthophosphorus measured at two different stations on a major river. Orthophosphorus is measured in milligrams per liter. Fifteen samples were collected from station 1 and 12 samples were obtained from station 2. The 15 samples from station 1 had a mean orthophosphorus content of 3.84 milligrams per liter and a standard deviation of 3.07 milligrams per liter, while the 12 samples from station 2 had a mean orthophosphorus content of 1.49 milligrams per liter and a standard deviation of 0.80 milligrams per liter. Test the null hypothesis of equal population means against the alternative that the mean of the first population is greater than that of the second population; that is, $H_0: \mu_1 = \mu_2$ versus $H_a: \mu_1 > \mu_2$.

Solution. Based on the sample information, we have
$$n_1 = 15,\quad \bar{x}_1 = 3.84,\quad s_1 = 3.07,\qquad n_2 = 12,\quad \bar{x}_2 = 1.49,\quad s_2 = 0.80.$$
Since the ratio of the two sample variances is
$$\frac{(3.07)^2}{(0.80)^2} = 14.726 > 3,$$
the pooled method is not appropriate. To use the alternative method, we need the degrees of freedom approximated by
$$\text{df} \approx \frac{\left(\dfrac{(3.07)^2}{15} + \dfrac{(0.80)^2}{12}\right)^2}{\dfrac{\left[(3.07)^2/15\right]^2}{15 - 1} + \dfrac{\left[(0.80)^2/12\right]^2}{12 - 1}} = 16.3 \approx 16.$$
The test statistic is
$$t = \frac{3.84 - 1.49}{\sqrt{\dfrac{(3.07)^2}{15} + \dfrac{(0.80)^2}{12}}} = 2.846.$$
Since we reject $H_0$ if $t > t_\alpha$, the p-value is $P\{t > 2.846\}$. From Table 4 of Appendix I with df $= 16$, we read $t_{0.01} = 2.583$ and $t_{0.005} = 2.921$; that is, $P\{t > 2.583\} = 0.01$ and $P\{t > 2.921\} = 0.005$. Thus, $0.005 < P\{t > 2.846\} < 0.01$, or $0.005 < \text{p-value} < 0.01$.
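The unequal-variance calculation in Example 9 can be sketched from the summary statistics alone. The function name `unpooled_t` and its parameter names are hypothetical; the formulas are the test statistic and the approximate degrees of freedom stated above, and the table values are copied from the text.

```python
import math


def unpooled_t(n1, xbar1, s1, n2, xbar2, s2, d0=0.0):
    """Return (t, df) for the unequal-variance two-sample procedure.

    df is the unrounded approximation; the notes take its integer part.
    """
    v1 = s1 ** 2 / n1                  # s1^2 / n1
    v2 = s2 ** 2 / n2                  # s2^2 / n2
    t = ((xbar1 - xbar2) - d0) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Summary statistics from Example 9 (station 1, then station 2).
t, df = unpooled_t(15, 3.84, 3.07, 12, 1.49, 0.80)
df_used = int(df)                      # integer part, as in the notes

# Table values for df = 16, copied from the text.
T_01_DF16 = 2.583
T_005_DF16 = 2.921

# t falls between the two table values, so 0.005 < p-value < 0.01.
p_between_005_and_01 = T_01_DF16 < t < T_005_DF16

print(f"t = {t:.3f}, df = {df:.1f}, df used = {df_used}")
print(f"0.005 < p-value < 0.01: {p_between_005_and_01}")
```

Working from summary statistics keeps the sketch faithful to the example, which reports only the sample sizes, means, and standard deviations rather than the raw measurements.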
Since the p-value in Example 9 is less than 0.01, the results are highly significant, and we reject $H_0$ in favor of the alternative hypothesis.

HOMEWORK: pp. 406-409: 10.19, 10.21, 10.23, 10.25, 10.27, 10.29

10.5 Small-Sample Inferences for the Difference Between Two Population Means: A Paired-Difference Test

In many situations, the observations in the two samples come in pairs, so that the two observations are related. For instance, if we run a test on a new diet using 15 individuals, the weights before and after completion of the test form our two samples. Observations in the two samples made on the same individual are related and hence form a pair. To determine whether the diet is effective, we must consider the differences, $d_1, d_2, \ldots, d_n$, of paired observations. The sample mean of the differences $d_1, d_2, \ldots, d_n$ is calculated as
$$\bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i.$$
Let $\mu_d = \mu_1 - \mu_2$.

Paired-Difference Test of Hypothesis for $\mu_d = \mu_1 - \mu_2$: Dependent Samples

1. Null hypothesis: $H_0: \mu_d = 0$ ...