Note that the answers are slightly different because in R the function states: “this method does not use the concept of adding 2 successes and 2 failures, but rather uses the formulas explicitly described in [the paper]”. Hence we recommend and encourage the use of software. However, the software does not compute one-sided intervals, so those have to be done manually.

Example 3.8 A map and GPS application for a smartphone was tested for accuracy. The experiment yielded 26 errors out of 74 trials. Find the 90% C.I. for the proportion of errors.

Since $n = 74$ and $x = 26$, then $\tilde{n} = 74 + 4 = 78$ and $\tilde{p} = (26 + 2)/78 = 0.359$. Hence the 90% C.I. for $p$ is

\[
0.359 \mp \underbrace{z_{1-0.05}}_{1.645} \sqrt{\frac{0.359(1 - 0.359)}{78}} \;\rightarrow\; (0.2696337,\ 0.4483151)
\]

or in R:

    > library(binom) # may need to first install package
    > binom.confint(26, 74, 0.90, methods="ac")
             method  x  n      mean     lower     upper
    1 agresti-coull 26 74 0.3513514 0.2666357 0.4465532

3.2.2 Large sample hypothesis test

Let $X$ be the number of successes in $n$ Bernoulli trials with probability of success $p$; then $X \sim \mathrm{Bin}(n, p)$. We know by the C.L.T. that, under certain regularity conditions,

\[
\hat{p} \;\overset{\cdot}{\sim}\; N\!\left(p,\ \frac{p(1 - p)}{n}\right).
\]

To test

(i) $H_0: p \le p_0$ vs $H_a: p > p_0$
(ii) $H_0: p \ge p_0$ vs $H_a: p < p_0$
(iii) $H_0: p = p_0$ vs $H_a: p \ne p_0$

the test statistic equivalent to the Agresti-Coull method is

\[
T.S. = \frac{\tilde{p} - p_0}{\sqrt{\dfrac{\tilde{p}(1 - \tilde{p})}{\tilde{n}}}} \;\overset{H_0}{\sim}\; N(0, 1).
\]

Reject the null if

(i) p-value $= P(Z \ge T.S.) < \alpha$
(ii) p-value $= P(Z \le T.S.) < \alpha$
(iii) p-value $= P(|Z| \ge |T.S.|) < \alpha$

Example 3.9 In Example 3.8, suppose we wish to test whether errors occur less than half the time, that is, $H_a: p < 0.5$. Then

\[
T.S. = \frac{28/78 - 0.5}{\sqrt{\dfrac{(28/78)(1 - 28/78)}{78}}} = -2.596426
\]

with p-value $= P(Z \le -2.596426) = 0.00470996 < \alpha = 0.10$, so we reject the null. This is consistent with the previous C.I., whose upper limit of 0.4483 is less than 0.5.
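The hand calculations in Examples 3.8 and 3.9 can be cross-checked in base R. The following is a minimal sketch, assuming the "add 2 successes and 2 failures" formulas used in these notes (not the variant implemented in the binom package). The variable names (n_tilde, p_tilde, se, ci, ts, p_value) are chosen here for illustration and do not come from the notes.

```r
# Data from Example 3.8: 26 errors in 74 trials, 90% confidence level.
x <- 26
n <- 74
conf <- 0.90
alpha <- 1 - conf

# Adjusted sample size and proportion (add 2 successes and 2 failures).
n_tilde <- n + 4
p_tilde <- (x + 2) / n_tilde

# Standard error based on the adjusted quantities.
se <- sqrt(p_tilde * (1 - p_tilde) / n_tilde)

# Two-sided interval: p_tilde -/+ z_{1 - alpha/2} * se.
z <- qnorm(1 - alpha / 2)
ci <- p_tilde + c(-1, 1) * z * se
print(ci)       # agrees with (0.2696337, 0.4483151) from Example 3.8

# Example 3.9: lower-tailed test of H0: p >= 0.5 vs Ha: p < 0.5.
p0 <- 0.5
ts <- (p_tilde - p0) / se
p_value <- pnorm(ts)   # P(Z <= T.S.) for the lower-tailed alternative
print(ts)       # agrees with -2.596426 from Example 3.9
print(p_value)  # agrees with 0.00470996 from Example 3.9
```

For an upper-tailed alternative the p-value would instead be `pnorm(ts, lower.tail = FALSE)`, and for a two-sided alternative `2 * pnorm(abs(ts), lower.tail = FALSE)`.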
R code 3.3 Multiple methods and (software) functions exist for performing such inference, such as prop.test, but this one does not perform the Agresti-Coull method.

3.3 Inference for Population Variance

The sample statistic $s^2$ is widely used as the point estimate for the population variance $\sigma^2$. Like the sample mean, it varies from sample to sample and has a sampling distribution.

Let $X_1, \ldots, X_n$ be i.i.d. r.v.'s. We already have some tools that help us determine the distribution of $\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$. It is a function of the r.v.'s, and hence $\bar{X}$ is a r.v. itself; once a sample is collected, a realization $\bar{X} = \bar{x}$ is observed. Similarly, let

\[
S^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (X_i - \bar{X})^2
\]

be a function of the r.v.'s $X_1, \ldots, X_n$, and hence a r.v. itself. A realization of this r.v. is the sample variance $s^2$. From Lemma 2.9, if $X_1, \ldots, X_n$ are i.i.d. $N(\mu, \sigma)$ then

\[
\frac{(n - 1) S^2}{\sigma^2} \sim \chi^2_{n-1}.
\]

3.3.1 Confidence interval

[Figure 3.4: $\chi^2$ distribution and critical value. The critical values $\chi^2_{\alpha/2}$ and $\chi^2_{1-\alpha/2}$ cut off an area of $\alpha/2$ in each tail, leaving $1 - \alpha$ between them.]

Consequently,

\[
\begin{aligned}
1 - \alpha &= P\!\left( \chi^2_{(\alpha/2;\, n-1)} < \frac{(n - 1) S^2}{\sigma^2} < \chi^2_{(1-\alpha/2;\, n-1)} \right) \\
&= P\!\left( \frac{(n - 1) S^2}{\chi^2_{(1-\alpha/2;\, n-1)}} < \sigma^2 < \frac{(n - 1) S^2}{\chi^2_{(\alpha/2;\, n-1)}} \right),
\end{aligned}
\]

which implies that in the long run this interval will contain the true population variance $100(1 - \alpha)\%$ of the time. Thus, the $100(1 - \alpha)\%$ C.I. for $\sigma^2$ is

\[
\left( \frac{(n - 1) s^2}{\chi^2_{(1-\alpha/2;\, n-1)}},\ \frac{(n - 1) s^2}{\chi^2_{(\alpha/2;\, n-1)}} \right).
\]
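The interval for $\sigma^2$ can be computed in base R with `qchisq`. The following is a minimal sketch, assuming that $\chi^2_{(q;\, n-1)}$ denotes the quantile with probability $q$ to its left (which is what `qchisq` returns by default) and that the data are i.i.d. normal. The helper name `var_ci` and the sample values are hypothetical, made up purely for illustration.

```r
# Hypothetical helper: 100(1 - alpha)% C.I. for the population variance,
# assuming an i.i.d. normal sample.
var_ci <- function(x, conf = 0.95) {
  alpha <- 1 - conf
  n <- length(x)
  s2 <- var(x)  # sample variance with divisor n - 1

  # Lower limit uses the upper quantile, upper limit uses the lower quantile.
  lower <- (n - 1) * s2 / qchisq(1 - alpha / 2, df = n - 1)
  upper <- (n - 1) * s2 / qchisq(alpha / 2, df = n - 1)

  c(s2 = s2, lower = lower, upper = upper)
}

# Hypothetical sample (not from the notes).
x <- c(4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.6, 4.4)
res <- var_ci(x, conf = 0.90)
print(res)
```

Taking square roots of the two limits gives the corresponding interval for $\sigma$, since the square root is an increasing function.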