# lect07 - Categorical Data Analysis Lei Sun 1 CHL 5210...

Categorical Data Analysis - Lei Sun

CHL 5210 - Statistical Analysis of Qualitative Data

Topic: Stratified Samples and Matched Samples

## Outline

- Odds ratio inference from stratified samples.
- Models for matched pairs.

## Odds ratio inference for stratified samples

- One of the common problems in applied statistics is that of comparing two proportions from stratified samples (i.e. sets of 2 × 2 tables: K × 2 × 2 tables).
- Such data may be obtained, for example,
  - in cross-sectional studies where subjects are sampled separately by region of a city (i.e. stratified sampling),
  - in randomized trials when subjects are assigned to treatment groups separately by institution (stratified random assignment),
  - when comparing a binary exposure (e.g. smoking status) across samples created by modelling confounders (e.g. race) using dummy variables, and
  - in matched-pair case-control studies.
- Summary estimates of association can be constructed using odds ratios, risk ratios or risk differences.
- We will limit attention to inferences about odds ratios.
- Inferences need to distinguish between two asymptotic cases:
  - Fixed-number-of-strata (e.g. the effect of smoking on the risk of having a low weight infant stratifying by mother’s race).
  - Increasing-number-of-strata (e.g. data from matched-pair case-control studies).
- We will begin by considering methods of analysis applicable when there are a fixed number of strata.

Is mother’s smoking status associated with her risk of having a low weight baby after stratifying by mother’s race?

Conditional independence: when X (SMOKE) and Y are independent at every level of Z (RACE). (Conditional independence does not imply marginal independence.)

Is the odds ratio of having a low weight baby for smokers relative to non-smokers the same regardless of a mother’s race?
Homogeneous association: when the effect of X on Y is the same at each level of Z. (Conditional independence is a special case of homogeneous association.)

## race = 1

| smoke | low = 0 | low = 1 | Total |
|-------|---------|---------|-------|
| 0     | 40      | 33      | 73    |
| 1     | 4       | 19      | 23    |
| Total | 44      | 52      | 96    |

| Type of Study             | Value  | 95% Lower Limit | 95% Upper Limit |
|---------------------------|--------|-----------------|-----------------|
| Case-Control (Odds Ratio) | 5.7576 | 1.7823          | 18.5992         |

## race = 2

| smoke | low = 0 | low = 1 | Total |
|-------|---------|---------|-------|
| 0     | 11      | 4       | 15    |
| 1     | 5       | 6       | 11    |
| Total | 16      | 10      | 26    |

| Type of Study             | Value  | 95% Lower Limit | 95% Upper Limit |
|---------------------------|--------|-----------------|-----------------|
| Case-Control (Odds Ratio) | 3.3000 | 0.6346          | 17.1602         |

## race = 3

| smoke | low = 0 | low = 1 | Total |
|-------|---------|---------|-------|
| 0     | 35      | 7       | 42    |
| 1     | 20      | 5       | 25    |
| Total | 55      | 12      | 67    |

| Type of Study             | Value  | 95% Lower Limit | 95% Upper Limit |
|---------------------------|--------|-----------------|-----------------|
| Case-Control (Odds Ratio) | 1.2500 | 0.3502          | 4.4616          |
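
As a rough check on the stratum-specific output, the odds ratios can be recomputed from the cell counts. The sketch below assumes the reported 95% limits are Wald intervals on the log odds ratio scale (an assumption; the slides do not state the method). The names `tables` and `odds_ratio_ci` are illustrative and do not come from the lecture.

```python
import math

# Cell counts per race stratum, taken from the 2 x 2 tables in the output:
# (a, b, c, d) = (smoke 0 / low 0, smoke 0 / low 1, smoke 1 / low 0, smoke 1 / low 1)
tables = {
    1: (40, 33, 4, 19),
    2: (11, 4, 5, 6),
    3: (35, 7, 20, 5),
}


def odds_ratio_ci(a, b, c, d, z=1.959964):
    """Sample odds ratio with a Wald interval built on the log scale.

    Assumption: this is the interval behind the reported limits.
    """
    or_hat = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_hat) - z * se_log)
    upper = math.exp(math.log(or_hat) + z * se_log)
    return or_hat, lower, upper


results = {race: odds_ratio_ci(*cells) for race, cells in tables.items()}
for race, (or_hat, lower, upper) in results.items():
    print(f"race = {race}: OR = {or_hat:.4f}, 95% CI ({lower:.4f}, {upper:.4f})")
```

Under that assumption, the recomputed estimates should agree closely with the reported values. The three point estimates differ noticeably across strata, but their intervals are wide and overlap, which is why the question of homogeneous association needs a formal assessment rather than a visual comparison.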
