# Now we need to look at table of the χ² distribution on

Now we need to look at the table of the $\chi^2$ distribution on page 594 with df = 5 and try to find 5.22 on that line. It is not on that line. However, the table does give $P(\chi^2_{(5)} \ge 6.63) = 0.25$. Since 5.22 < 6.63, a simple graph tells us that p-value $= P(\chi^2_{(5)} \ge 5.22) > 0.25$.
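The table lookup above can also be checked numerically. The following is a minimal sketch, not part of the original notes: the helper name `chi2_sf` is hypothetical, and it relies on the standard finite-sum closed forms for the upper tail of a chi-square distribution with integer degrees of freedom rather than on the printed table.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(chi-square with `df` degrees of freedom >= x).

    Hypothetical helper for illustration; valid for positive integer df only.
    """
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= (x / 2) / k
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc(sqrt(x/2))
    #         + sqrt(2/pi) * exp(-x/2) * sum_{k=1}^{(df-1)/2} x^(k - 1/2) / (1*3*...*(2k-1))
    term = math.sqrt(x)
    total = 0.0
    for k in range(1, (df - 1) // 2 + 1):
        if k > 1:
            term *= x / (2 * k - 1)
        total += term
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 / math.pi) * math.exp(-x / 2) * total

# Tabled value quoted in the notes: P(chi2_(5) >= 6.63) is about 0.25.
print(round(chi2_sf(6.63, 5), 3))
# p-value for the observed statistic 5.22 with df = 5; it exceeds 0.25.
print(round(chi2_sf(5.22, 5), 3))
```

The second printed value is the p-value that the table can only bound from below; it comes out at roughly 0.39, which agrees with the conclusion that the p-value is greater than 0.25.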
STA 6126 Chap 8, Page 3 of 20

5. Decision: Do not reject Ho, since the p-value > any reasonable α.

6. Conclusion: The observed data give no evidence that the die is loaded.

B) Test of Homogeneity

Observe that in Section 7.2 we had two populations, two random samples from these populations, and a categorical random variable with only two categories.

| Gender | Belief in Afterlife: Yes | Belief in Afterlife: No or Undecided | Total |
|--------|--------------------------|--------------------------------------|-------|
| Female | 435 | 147 | 582 |
| Male | 375 | 134 | 509 |
| Total | 810 | 281 | 1091 |

We decided there that there is no significant difference between males and females in their belief in afterlife. Hence we say that the two populations are homogeneous with respect to their belief in afterlife. Such a test is known as the test of homogeneity.

In this section we will extend the above ideas to the case where the categorical variable has two or more categories (say c ≥ 2) and the number of populations is two or more (say r ≥ 2). We summarize the sample data in an r by c (denoted r×c) contingency table, i.e., a table with r rows (one per sample) and c columns (one per category).

| Samples | Category 1 | Category 2 | … | Category c | Total |
|---------|------------|------------|---|------------|-------|
| 1 | $O_{11}$ | $O_{12}$ | … | $O_{1c}$ | $n_{1.}$ |
| 2 | $O_{21}$ | $O_{22}$ | … | $O_{2c}$ | $n_{2.}$ |
| ⋮ | ⋮ | ⋮ | | ⋮ | ⋮ |
| r | $O_{r1}$ | $O_{r2}$ | … | $O_{rc}$ | $n_{r.}$ |
| Total | $n_{.1}$ | $n_{.2}$ | … | $n_{.c}$ | $n_{..}$ |

We test the hypothesis that the populations are homogeneous with respect to the (categorical) variable of interest.
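As a quick check on the gender-by-belief table above, the margins and the per-gender sample proportions can be recomputed from the four cell counts. This is only an illustrative sketch; the variable names are assumptions, and the counts are the ones quoted in the notes.

```python
# Observed counts from the 2x2 table: rows are genders,
# columns are the belief categories (Yes, No or Undecided).
observed = {
    "Female": [435, 147],
    "Male": [375, 134],
}

# Row totals (one per gender), column totals (one per category), grand total.
row_totals = {gender: sum(counts) for gender, counts in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]
grand_total = sum(row_totals.values())

# Sample proportion answering "Yes" within each gender, and pooled over both.
prop_yes = {gender: counts[0] / row_totals[gender] for gender, counts in observed.items()}
pooled_yes = col_totals[0] / grand_total

print(row_totals, col_totals, grand_total)
for gender, p in prop_yes.items():
    print(f"{gender}: {p:.3f}")
print(f"Pooled: {pooled_yes:.3f}")
```

The two per-gender proportions are close to each other and to the pooled proportion, which is what homogeneity of the two populations would look like in a sample.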
The basic idea of obtaining a pooled sample proportion in the two-population, two-category problem (data summarized in a 2×2 contingency table as above) carries over to the general case of an r-population, c-category problem (data summarized in an r×c contingency table).

If the assumption of homogeneity (Ho) is true, then $\pi_{ij} = \pi_j$ for every population i, so we need to estimate only one parameter ($\pi_j$) for the proportion in each category j, and it applies to all of the populations. The parameter $\pi_j$ is estimated by dividing the total for category j in the sample by the total sample size:

$$\hat{\pi}_j = \frac{n_{.j}}{n_{..}}$$

Then, based on these estimates, we calculate the expected number of observations in each category of each sample (i.e., for each cell in the table):

$$E_{ij} = n_{i.}\,\hat{\pi}_j = n_{i.}\,\frac{n_{.j}}{n_{..}} = \frac{n_{i.}\,n_{.j}}{n_{..}} = \frac{\text{Row total} \times \text{Column total}}{\text{Grand total}}$$

Next, we compare the observed values ($O_{ij}$) with the expected values ($E_{ij}$) in each cell of the r×c contingency table using the test statistic

$$\chi^2 = \sum_{\text{all cells}} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} \sim \chi^2_{(df)}$$

If the hypothesis of homogeneity is true, we expect the calculated value of the test statistic ($\chi^2_{cal}$) to be small. Large values of $\chi^2_{cal}$ lead to the rejection of Ho. How large depends on the degrees of freedom and α: we reject Ho when $P(\chi^2_{(df)} \ge \chi^2_{cal})$ = p-value ≤ α.

In such problems the variable of interest is called the response (or dependent) variable, and the code for the populations is called the predictor (or independent) variable.
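The expected-count and test-statistic formulas can be sketched in a few lines, here applied to the gender-by-belief counts quoted earlier in these notes. The function names are hypothetical. The degrees of freedom, (rows − 1) × (columns − 1), are the standard value for a contingency table and are an assumption here, since this excerpt does not state them. The p-value line is valid for df = 1 only.

```python
import math

def expected_counts(observed):
    """E_ij = (row total * column total) / grand total for each cell."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def chi2_statistic(observed):
    """Sum over all cells of (O_ij - E_ij)^2 / E_ij."""
    expected = expected_counts(observed)
    return sum(
        (o - e) ** 2 / e
        for o_row, e_row in zip(observed, expected)
        for o, e in zip(o_row, e_row)
    )

# Rows: Female, Male. Columns: Yes, No or Undecided.
observed = [[435, 147], [375, 134]]

expected = expected_counts(observed)
chi2_cal = chi2_statistic(observed)

# Assumed standard degrees of freedom for an r x c table: (r - 1)(c - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

# For df = 1 only, P(chi2_(1) >= x) equals erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2_cal / 2))

print([[round(e, 1) for e in row] for row in expected])
print(round(chi2_cal, 3), df, round(p_value, 2))
```

The calculated statistic is small and the p-value is large, so Ho is not rejected. This matches the earlier decision that males and females do not differ significantly in their belief in afterlife.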