

Let $X_t$ be the sum of the observed values of $t_i$. If the null hypothesis is true, the $n_t$ observed values of $t_i$ are like a random sample from a 0-1 box of $N$ tickets of which $G$ are labeled 1. Thus $X_t$ has a hypergeometric distribution with parameters $N$, $G$, and $n_t$. Fisher's exact test uses $X_t$ as the test statistic, and uses this hypergeometric distribution to select the rejection region. If the alternative hypothesis is that $p_t > p_c$, then when the alternative hypothesis is true $X_t$ tends to be larger than it would be if the null hypothesis were true, so the hypothesis test should be of the form {Reject if $X_t > x_0$}, with $x_0$ chosen so that the test has the desired significance level.

If the sample sizes are large, it can be difficult to calculate the rejection region for Fisher's exact test; then the normal approximation to the hypergeometric distribution can be used to construct a test with approximately the correct significance level. In the normal approximation to Fisher's exact test, the rejection region for approximate significance level $a$ uses the threshold for rejection

$$x_0 = n_t \times \frac{G}{N} + z_{1-a} \times f \times n_t^{1/2} \times \left(\frac{G}{N} \times \left(1 - \frac{G}{N}\right)\right)^{1/2},$$

where $f = (N - n_t)^{1/2}/(N - 1)^{1/2}$ is the finite population correction and $z_{1-a}$ is the $1-a$ quantile of the normal curve. The $a$ quantile of the normal curve, $z_a$, is the number for which the area under the normal curve from minus infinity to $z_a$ equals $a$. For example, $z_{0.05} = -1.645$ and $z_{0.95} = 1.645$.

A Z-statistic is a test statistic whose probability histogram can be approximated well by a normal curve if the null hypothesis is true. The observed value of a Z-statistic is called the z-score. In Fisher's exact test,

$$Z = \frac{X_t - n_t \times G/N}{f \times n_t^{1/2} \times \left(\frac{G}{N} \times \left(1 - \frac{G}{N}\right)\right)^{1/2}}$$

is a Z-statistic.

Suppose one wants to test the null hypothesis that two population percentages are equal, $p_t = p_c$, on the basis of independent random samples with replacement from the two populations. This is the population model for comparing two population percentages.
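Returning to Fisher's exact test against the one-sided alternative $p_t > p_c$: the following is a minimal sketch of how the rejection threshold $x_0$ might be computed, both exactly from the hypergeometric distribution and from the normal approximation. The function names (`hypergeom_pmf`, `exact_threshold`, `normal_threshold`, `z_statistic`) and the numbers in the usage note are illustrative assumptions, not part of the original text. Because $X_t$ is discrete, the exact test generally has a level somewhat below the nominal one.

```python
from math import comb, sqrt
from statistics import NormalDist


def hypergeom_pmf(x, N, G, n):
    """P(X = x) when X is hypergeometric with parameters N, G, n."""
    if x < max(0, n - (N - G)) or x > min(n, G):
        return 0.0
    return comb(G, x) * comb(N - G, n - x) / comb(N, n)


def exact_threshold(N, G, n_t, alpha):
    """Smallest x0 with P(X_t > x0) <= alpha under the null hypothesis.

    Returns (x0, attained level), where the attained level is P(X_t > x0).
    """
    lo, hi = max(0, n_t - (N - G)), min(n_t, G)
    x0, tail = hi, 0.0  # P(X_t > hi) is 0
    while x0 >= lo:
        p = hypergeom_pmf(x0, N, G, n_t)
        if tail + p > alpha:  # lowering x0 further would exceed alpha
            break
        tail += p
        x0 -= 1
    return x0, tail


def normal_threshold(N, G, n_t, alpha):
    """Approximate threshold from the formula in the text."""
    p = G / N
    f = sqrt((N - n_t) / (N - 1))  # finite population correction
    z = NormalDist().inv_cdf(1 - alpha)  # the 1 - alpha quantile
    return n_t * p + z * f * sqrt(n_t) * sqrt(p * (1 - p))


def z_statistic(x_t, N, G, n_t):
    """Z-statistic for Fisher's exact test (assumes 0 < G < N, n_t < N)."""
    p = G / N
    f = sqrt((N - n_t) / (N - 1))
    return (x_t - n_t * p) / (f * sqrt(n_t) * sqrt(p * (1 - p)))
```

For instance, with the hypothetical values $N = 10$, $G = 5$, $n_t = 5$ and level 0.05, `exact_threshold(10, 5, 5, 0.05)` gives $x_0 = 4$, so the exact test rejects only when $X_t = 5$; `normal_threshold(10, 5, 5, 0.05)` gives a threshold a little below 4, which leads to the same rejection region for integer $X_t$. With samples this small the normal approximation is rough and is shown only to illustrate the formula.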
Let $n_t$ denote the size of the random sample from the first population; let $n_c$ be the size of the sample from the second population; and let $N = n_t + n_c$ be the total sample size. Let $X_t$ denote the sample sum of the first sample; let $X_c$ denote the sample sum of the second sample; and let $G = X_t + X_c$ denote the sum of the two samples. Conditional on $G$, the probability distribution of $X_t$ is hypergeometric with parameters $N$, $G$, and $n_t$, so Fisher's exact test can be used to test the null hypothesis.

There is a different approximate approach based on the normal approximation to the probability distribution of the sample percentages. Let $\phi_t$ denote the sample percentage of the sample from the first population; let $\phi_c$ denote the sample percentage of the sample from the second population; and let $\phi$ denote the overall sample percentage of the two samples pooled together,

$$\phi = \frac{\text{total number of "1"s in the two samples}}{\text{total sample size}} = \frac{G}{N}.$$

Then, if the null hypothesis is true, $E(\phi_t - \phi_c) = 0$. If in addition $n_t$ and $n_c$ are large, $SE(\phi_t - \phi_c)$ is approximately $s^* \times (1/n_t + 1/n_c)^{1/2}$, where $s^* = (\phi \times (1 - \phi))^{1/2}$ is the pooled bootstrap estimate of the population standard deviation.
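The pooled quantities above can be sketched in a few lines. The text stops at the approximate standard error; dividing $\phi_t - \phi_c$ by it to form a z-score, and converting that to a one-sided P-value against $p_t > p_c$, is an assumption added here for illustration. The function name `pooled_z_test` and the sample counts in the usage note are likewise hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def pooled_z_test(x_t, n_t, x_c, n_c):
    """Approximate z-score and one-sided P-value for p_t > p_c.

    x_t, x_c are the sample sums (numbers of 1s); n_t, n_c the sample sizes.
    Assumes the pooled percentage is strictly between 0 and 1.
    """
    phi_t = x_t / n_t                    # sample percentage, first sample
    phi_c = x_c / n_c                    # sample percentage, second sample
    phi = (x_t + x_c) / (n_t + n_c)      # pooled sample percentage G/N
    s_star = sqrt(phi * (1 - phi))       # pooled bootstrap estimate of the SD
    se = s_star * sqrt(1 / n_t + 1 / n_c)  # approximate SE of phi_t - phi_c
    z = (phi_t - phi_c) / se             # assumed z-score, not stated in the text
    p_value = 1 - NormalDist().cdf(z)    # area under the normal curve above z
    return z, p_value
```

For example, with 60 ones in a first sample of 100 and 40 ones in a second sample of 100, the pooled percentage is 0.5, $s^*$ is 0.5, the approximate SE is about 0.0707, and `pooled_z_test(60, 100, 40, 100)` returns a z-score of about 2.83.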
