Associations Between Categorical Variables
- Case where both the explanatory (independent) variable and the response (dependent) variable are qualitative (Chapter 7 includes the case where both are binary, i.e. have 2 levels)
- Association: The distributions of responses differ among the levels of the explanatory variable (e.g. party affiliation by gender)

Contingency Tables
- Cross-tabulations of frequency counts where the rows (typically) represent the levels of the explanatory variable and the columns represent the levels of the response variable
- Numbers within the table represent the numbers of individuals falling in the corresponding combination of levels of the two variables
- Row and column totals are called the marginal distributions for the two variables

Example - Cyclones Near Antarctica
- Period of Study: September 1973 to May 1975
- Explanatory Variable: Region (40-49, 50-59, 60-79 degrees South latitude)
- Response: Season (Autumn (4), Winter (5), Spring (4), Summer (8); number of months in parentheses)
- Units: Cyclones in the study area
- The observed cyclones are treated as a “random sample” of all cyclones that could have occurred
- Source: Howarth (1983), “An Analysis of the Variability of Cyclones around Antarctica and Their Relation to Sea-Ice Extent”, Annals of the Association of American Geographers, Vol. 73, pp. 519-537

Example - Cyclones Near Antarctica

Region\Season   Autumn  Winter  Spring  Summer   Total
40°-49° S          370     452     273     422    1517
50°-59° S          526     624     513    1059    2722
60°-79° S          980    1200     995    1751    4926
Total             1876    2276    1781    3232    9165

For each region (row) we can compute the percentage of storms occurring during each season, i.e. the conditional distribution of season given region. Of the 1517 cyclones in the 40-49 band, 370 occurred in Autumn, a proportion of 370/1517 = .244, or 24.4% as a percentage.

Region\Season   Autumn  Winter  Spring  Summer  Total % (n)
40°-49° S         24.4    29.8    18.0    27.8  100.0 (1517)
50°-59° S         19.3    22.9    18.9    38.9  100.0 (2722)
60°-79° S         19.9    24.4    20.2    35.5  100.0 (4926)

Example - Cyclones Near Antarctica
[Figure: Graphical Conditional Distributions for Regions. Bar chart with season (Autumn, Winter, Spring, Summer) on the horizontal axis, region percentage (axis ticks from 10 to 40) on the vertical axis, and one bar per region (40-49S, 50-59S, 60-79S).]
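The marginal totals and the row-wise conditional distributions for the cyclone counts can be reproduced with a few lines of Python. This is only a minimal sketch: the names `counts`, `row_percentages` and `column_totals` are made up for illustration and do not come from the slides. Printed values are rounded to one decimal and may differ from the table on the slide in the last digit.

```python
# Observed cyclone counts from the example (rows: region, columns: season).
seasons = ["Autumn", "Winter", "Spring", "Summer"]
counts = {
    "40-49 S": [370, 452, 273, 422],
    "50-59 S": [526, 624, 513, 1059],
    "60-79 S": [980, 1200, 995, 1751],
}


def row_percentages(table):
    """Return {row label: (list of row percents, row total)}."""
    result = {}
    for label, row in table.items():
        total = sum(row)  # marginal (row) total for this region
        result[label] = ([100.0 * c / total for c in row], total)
    return result


def column_totals(table):
    """Marginal (column) totals, one per season."""
    return [sum(col) for col in zip(*table.values())]


cond = row_percentages(counts)
print("Region     " + "  ".join(f"{s:>6}" for s in seasons) + "  (n)")
for label, (pcts, n) in cond.items():
    cells = "  ".join(f"{p:6.1f}" for p in pcts)
    print(f"{label:<10} {cells}  ({n})")
print("Column totals:", column_totals(counts))
```

Dividing by the row total rather than the grand total is what makes each row a conditional distribution: every row of percentages sums to 100 regardless of how many cyclones were observed in that region.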

Guidelines for Contingency Tables
- Compute percentages for the response (column) variable within the categories of the explanatory (row) variable. Note that in journal articles, rows and columns may be interchanged.
- Divide the cell counts by the row (explanatory category) total and multiply by 100 to obtain a percent; the row percents will add to 100.
- Give a title and clearly define the variables and categories.
- Include the row (explanatory) total sample sizes.

Independence & Dependence
- Statistically Independent: Population conditional distributions of one variable are the same across all levels of the other variable
- Statistically Dependent: Population conditional distributions are not all equal
- When testing, researchers typically wish to demonstrate dependence (alternative hypothesis) and to refute independence (null hypothesis)
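To make the idea of independence concrete, the sketch below uses the cyclone counts from the example. If season were independent of region, every region would share one seasonal distribution, which the sample estimates by pooling all regions (column total / grand total). The count expected in a cell under independence is then (row total x column total) / grand total, the standard construction. The function and variable names are illustrative assumptions, not taken from the slides.

```python
# Observed cyclone counts (rows: 40-49 S, 50-59 S, 60-79 S;
# columns: Autumn, Winter, Spring, Summer).
observed = [
    [370, 452, 273, 422],
    [526, 624, 513, 1059],
    [980, 1200, 995, 1751],
]


def pooled_distribution(table):
    """Seasonal proportions with all regions combined (column total / n)."""
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(col_totals)
    return [ct / n for ct in col_totals]


def expected_under_independence(table):
    """Expected cell counts if every row shared the pooled distribution."""
    row_totals = [sum(row) for row in table]
    pooled = pooled_distribution(table)
    return [[rt * p for p in pooled] for rt in row_totals]


pooled = pooled_distribution(observed)
expected = expected_under_independence(observed)

print("Pooled seasonal %:", [round(100 * p, 1) for p in pooled])
for obs_row, exp_row in zip(observed, expected):
    # Large gaps between observed and expected counts point toward dependence.
    print(obs_row, [round(e, 1) for e in exp_row])
```

The expected counts keep the same row and column totals as the observed table; only the way the counts are spread across the cells changes.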

Pearson’s Chi-Square Test Can be used for nominal or ordinal explanatory and response variables Variables can have any number of distinct levels Tests whether the distribution of the response
