Categorical Data Analysis, Lei Sun
CHL 5210: Statistical Analysis of Qualitative Data
Topic: Logistic Regression ctd.

Outline

Logistic regression.
Single predictor ctd.
Multiple predictors.
* Confounding.
* Interaction.
* Model building (model fit and selection).

Logistic regression with single covariate.

A brief review of last lecture.

          |                 ftv
      low |    0     1     2     3     4     6 | Total
    ------+------------------------------------+------
        0 |   64    36    23     3     3     1 |   130
        1 |   36    11     7     4     1     0 |    59
    ------+------------------------------------+------
    Total |  100    47    30     7     4     1 |   189

Modeling the number of physician visits as a categorical covariate.

$\operatorname{logit}(\pi(X_i)) = \alpha + \beta_1 X_{i1} + \beta_2 X_{i2} + \beta_3 X_{i3} + \beta_4 X_{i4}$,

where $X_i = (X_{i1}, X_{i2}, X_{i3}, X_{i4})$, and $X_{ij} = 1$ if the $i$th woman had $j$ visits, $j = 1, 2, 3, 4$, and $X_{ij} = 0$ otherwise ($j = 0$ visits is the reference category).

* Order of the categories is not important: nominal covariate.
* Inference on $\beta_1$ needs only the first and second columns, inference on $\beta_2$ needs only the first and third columns, etc.
* Inference is essentially the inference for contingency tables.
* Grouping the last two categories: avoid zero and small counts.
* Allows for different parameters (log-odds ratios) for comparing different categories.
  $\beta_j$: log-odds ratio of having low weight babies comparing women who had $j$ visits with women who had 0 visits, or
  $\beta_j - \beta_{j-1}$: log-odds ratio of having low weight babies comparing women who had $j$ visits with women who had $j-1$ visits.

Modeling the number of physician visits as a quantitative covariate.

$\operatorname{logit}(\pi(\mathrm{FTV}_i)) = \alpha + \beta\,\mathrm{FTV}_i$.

* Inference on $\beta$ now takes into account all the data.
* It is more parsimonious than the previous model: it reduces the number of parameters needed, and inference is more efficient.
* However, there is a strong restriction on the pattern of odds ratios for comparing different categories:
  $\beta$: log-odds ratio of having low weight babies comparing women who had $a$ visits with women who had $a-1$ visits, i.e. the log-odds ratio is the same for comparing 1 visit to 0 visits, 2 visits to 1 visit, etc.
* Problem of model fit.

Effect of units of measurement.

Quantitative variables can be reported using different units, e.g. drinks per day, drinks per week. Changes in the units affect the estimated regression coefficients, the estimated odds ratio and the associated 95% CI.

The estimated odds ratio for having a low weight baby is .874 for a one-unit difference in the number of physician visits, with an associated 95% Wald-based CI: (.64, 1.19)....
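
To make the effect of units concrete, here is a minimal sketch of how an odds ratio and its Wald interval change when the covariate is rescaled. It relies only on the fact that the odds ratio for a c-unit difference is exp(c * beta) = exp(beta)^c, and that a Wald interval is built on the log-odds scale, so its endpoints are raised to the same power. The function name `rescale_odds_ratio` is hypothetical, and the inputs are the rounded values .874 and (.64, 1.19) quoted above, so the results are approximate.

```python
import math


def rescale_odds_ratio(or_per_unit, ci_low, ci_high, c):
    """Return (OR, CI low, CI high) for a c-unit difference in the covariate.

    Inputs are the odds ratio and Wald CI for a one-unit difference.
    """
    # Work on the log-odds scale, where the Wald interval is symmetric.
    beta = math.log(or_per_unit)
    log_low, log_high = math.log(ci_low), math.log(ci_high)
    # Multiplying by a negative c flips the order of the endpoints.
    scaled_low, scaled_high = sorted((c * log_low, c * log_high))
    return math.exp(c * beta), math.exp(scaled_low), math.exp(scaled_high)


# Rounded one-visit values taken from the text above.
or_1, low_1, high_1 = 0.874, 0.64, 1.19

# Odds ratio and CI for a two-visit difference.
or_2, low_2, high_2 = rescale_odds_ratio(or_1, low_1, high_1, c=2)
print(f"2-visit OR: {or_2:.3f}, 95% CI: ({low_2:.2f}, {high_2:.2f})")
```

Note that the interval for the two-visit difference still contains 1 exactly when the one-visit interval does: rescaling changes the reported numbers, not the conclusion of the Wald test.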
This note was uploaded on 02/23/2012 for the course CHL 5210H taught by Professor Lei Sun during the Fall '11 term at the University of Toronto.