Chapter 9
Analysis of Two-Way Tables
9.1 Inference for Two-Way Tables
9.2 Goodness of Fit
9.1 Inference for Two-Way Tables
Two-way tables
Expected cell counts
The chi-square statistic
The chi-square distributions
The chi-square test
Computations
Computing conditional distributions
Models for two-way tables
Two-Way Tables

In Chapter 2, we discussed how two-way tables can be used to describe the joint distribution of two categorical variables. Such tables can also be used to describe the relationship between the two variables. A relationship exists when the distribution of one variable depends on the value of the other variable.

When the data are obtained from random sampling, two-way tables of counts can be used to formally test the hypothesis that the two categorical variables are independent in the population from which the data were obtained.
Two-way tables

We call education the row variable and age group the column variable.

Each combination of values for these two variables is called a cell.

For each cell, we can compute a proportion by dividing the cell entry by the total sample size. The collection of these proportions is the joint distribution of the two variables.
The Hypothesis: No Association

Counts, by gender:

Ongoing Fright Symptoms    Male    Female
Yes                           7        29
No                           31        50
Total                        38        79

Percents within each gender:

Ongoing Fright Symptoms    Male    Female
Yes                         18%       37%
No                          82%       63%
Total                      100%      100%

If the null hypothesis of no relation between gender and ongoing fright is true, we expect the overall percent with ongoing fright symptoms (36 of 117, about 31%) to apply to both men and women.
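The percents in the second table can be reproduced from the counts in the first. The following is a minimal sketch in plain Python; the variable names (`counts`, `column_percents`, `overall_yes_percent`) are illustrative choices, not part of the original slides.

```python
# Observed counts from the slide's table.
# Outer keys: ongoing fright symptoms; inner keys: gender.
counts = {
    "Yes": {"Male": 7, "Female": 29},
    "No": {"Male": 31, "Female": 50},
}
genders = ["Male", "Female"]

# Column totals: the number of men and of women in the sample.
column_totals = {g: sum(counts[s][g] for s in counts) for g in genders}

# Conditional distribution of symptoms within each gender (column percents).
column_percents = {
    s: {g: 100 * counts[s][g] / column_totals[g] for g in genders}
    for s in counts
}

# Overall percent with symptoms, ignoring gender.
n = sum(column_totals.values())
overall_yes_percent = 100 * sum(counts["Yes"].values()) / n

for s in counts:
    print(s, {g: round(column_percents[s][g]) for g in genders})
print("Overall Yes:", round(overall_yes_percent, 1))
```

Under the null hypothesis, the "Yes" percent for men and for women would both be close to the overall percent; the observed 18% and 37% are what the test compares against that expectation.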
Expected Cell Counts

The rows of a two-way table are the values of one categorical variable and the columns are the values of the other variable. The count in any particular cell of the table equals the number of subjects who fall into that cell. We want to test the hypothesis (H0) that there is no relationship between the two categorical variables.

To test this hypothesis, we compare actual counts from the sample data with expected counts, where the latter are the counts expected when there is no relationship between the two variables.

The expected count in any cell of a two-way table when H0 is true is:

$$\text{expected count} = \frac{\text{row total} \times \text{column total}}{n}$$

where n is the total number of observations in the table.
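As an illustration, the expected counts under H0 can be computed with the standard rule (row total times column total, divided by the table total). The sketch below applies it to the gender and ongoing fright counts from this chapter; the list layout and variable names are assumptions made for the example.

```python
# Observed counts: rows = symptoms (Yes, No), columns = gender (Male, Female).
observed = [
    [7, 29],
    [31, 50],
]

row_totals = [sum(row) for row in observed]        # totals for Yes and No
col_totals = [sum(col) for col in zip(*observed)]  # totals for Male and Female
n = sum(row_totals)                                # total sample size

# Expected count for each cell when there is no relationship:
# (row total * column total) / n
expected = [[r * c / n for c in col_totals] for r in row_totals]

for row in expected:
    print([round(value, 2) for value in row])
```

The expected counts keep the same row and column totals as the observed table; only the way those totals are spread across the cells changes.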
The Chi-Square Statistic

To see if the data give convincing evidence against the null hypothesis, we compare the observed counts from our sample with the expected counts assuming H0 is true.

Assume there are r rows in the two-way table and c columns, which means there are rc cells.

The test statistic that makes the comparison is the chi-square statistic. The chi-square statistic is a measure of how far the observed counts are from the expected counts. The formula for the statistic is:

$$\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$$

where "observed" represents an observed cell count, "expected" represents the expected count for the same cell, and the sum is over all rc cells in the table.
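The statistic can be sketched in a few lines of plain Python. The function name `chi_square_statistic` is hypothetical, and the example input is the gender and ongoing fright table from this chapter; expected counts are taken as (row total times column total) divided by the table total.

```python
def chi_square_statistic(observed):
    """Sum (observed - expected)^2 / expected over all cells of a two-way table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count for this cell when there is no relationship.
            exp = row_totals[i] * col_totals[j] / n
            statistic += (obs - exp) ** 2 / exp
    return statistic


# Rows = symptoms (Yes, No), columns = gender (Male, Female).
fright_table = [[7, 29], [31, 50]]
chi2 = chi_square_statistic(fright_table)
print(round(chi2, 2))
```

A table whose observed counts match the expected counts exactly gives a statistic of 0; the further the cells drift from expectation, the larger the value.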
The Chi-Square Distributions

When the observed counts are very different from the expected counts, a large value of $\chi^2$ will result, providing evidence against the null hypothesis.
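To judge how large is "large", the statistic is compared with a chi-square distribution. The sketch below rests on two assumptions that are not stated on this slide: that the degrees of freedom for an r by c table are (r - 1)(c - 1), which is 1 for a 2 by 2 table, and that for 1 degree of freedom the upper-tail probability equals `erfc(sqrt(x / 2))`. The shortcut is valid only for 1 degree of freedom, and the helper name `upper_tail_df1` is hypothetical.

```python
import math


def upper_tail_df1(x):
    """P(chi-square with 1 degree of freedom > x); valid for df = 1 only."""
    return math.erfc(math.sqrt(x / 2))


# Gender by ongoing fright symptoms: rows = (Yes, No), columns = (Male, Female).
observed = [[7, 29], [31, 50]]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Chi-square statistic: sum of (observed - expected)^2 / expected over all cells.
chi2 = sum(
    (observed[i][j] - row_totals[i] * col_totals[j] / n) ** 2
    / (row_totals[i] * col_totals[j] / n)
    for i in range(len(row_totals))
    for j in range(len(col_totals))
)

p_value = upper_tail_df1(chi2)
print(round(chi2, 2), round(p_value, 3))
```

A small upper-tail probability means that a statistic this large would rarely occur if H0 were true. For tables larger than 2 by 2, a statistics library would be needed in place of this shortcut.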