Lecture 9: GLM, Quasilikelihood and GEE for categorical response data

Laurent Briollais ([email protected])
Samuel Lunenfeld Research Institute, Mount Sinai Hospital
Dalla Lana School of Public Health, UofT
November 21, 2011

Outline
• Review of previous lecture
• GLM and Quasilikelihood estimation for binary/categorical data
• GEE for repeated binary/categorical responses
• Examples in R and SAS

CHL 5210 Fall 2009 Categorical data analysis Lecture #9

Previous lecture
• There are many ways to specify the logit model for ordinal/nominal responses. The choice depends on
  – The nature of the response (nominal/ordinal)
  – The research question
  – Preference for effects that refer to individual response categories (adjacent logits) or to the entire scale of responses (cumulative logits)
• GLMM generalizes fixed effects logit models by adding a random component (intercept, time effect, ...)
• Alternatively we can use GEE or MLE with constraints
  – i.e. marginal vs. conditional estimation
  – note it is also possible to get marginal estimates with the random effects approach by integrating over the random effect
• Transitional modeling can help to understand the transition from one state to another
• Once a model has been chosen, we need to check its fit to the data and its sensitivity to the specification of the correlation structure
• Most of the approaches studied are robust against misspecification of the correlation structure
• Sometimes the research interest is in the correlation structure parameters themselves (as in genetic studies)

1 GLM: Generalized Linear Models

1.1 Overview
A GLM generalizes ordinary regression by
• Permitting $Y$ to have a distribution other than normal
• Permitting modeling of $g(\mu)$ rather than $\mu$

The same ML (maximum likelihood) estimation method applies to all GLMs. This is the basis of software such as PROC GENMOD in SAS and glm in R.
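To make the point that one ML routine serves every GLM more concrete, here is a minimal sketch in Python of Fisher scoring, written as iteratively reweighted least squares, for a model with an intercept and a single predictor. It is an illustration under stated assumptions, not necessarily the algorithm derived later in the lecture. The function name `fit_glm`, the two family dictionaries, and the toy data are all hypothetical. The same loop fits a logit model for binary data and a log-linear model for counts; only the inverse link, its derivative, and the variance function change.

```python
import math


def fit_glm(x, y, inv_link, dmu_deta, variance, n_iter=50, tol=1e-10):
    """Fisher scoring (IRLS) for a GLM with intercept alpha and slope beta.

    inv_link : eta -> mu            (inverse of the link g)
    dmu_deta : eta -> d mu / d eta
    variance : mu  -> variance function of the response
    """
    alpha, beta = 0.0, 0.0
    for _ in range(n_iter):
        # Accumulate the weighted normal equations (X'WX) b = X'Wz.
        s_w = s_wx = s_wxx = s_wz = s_wxz = 0.0
        for xi, yi in zip(x, y):
            eta = alpha + beta * xi
            mu = inv_link(eta)
            d = dmu_deta(eta)
            w = d * d / variance(mu)      # working weight
            z = eta + (yi - mu) / d       # working response
            s_w += w
            s_wx += w * xi
            s_wxx += w * xi * xi
            s_wz += w * z
            s_wxz += w * xi * z
        # Solve the 2x2 system by Cramer's rule.
        det = s_w * s_wxx - s_wx * s_wx
        new_alpha = (s_wxx * s_wz - s_wx * s_wxz) / det
        new_beta = (s_w * s_wxz - s_wx * s_wz) / det
        converged = abs(new_alpha - alpha) + abs(new_beta - beta) < tol
        alpha, beta = new_alpha, new_beta
        if converged:
            break
    return alpha, beta


def expit(eta):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-eta))


# Binary response with logit link (hypothetical family description).
logit_family = dict(
    inv_link=expit,
    dmu_deta=lambda eta: expit(eta) * (1.0 - expit(eta)),
    variance=lambda mu: mu * (1.0 - mu),
)

# Count response with log link (hypothetical family description).
log_family = dict(
    inv_link=math.exp,
    dmu_deta=math.exp,
    variance=lambda mu: mu,
)

# Toy data invented for illustration only.
x_bin = [0, 1, 2, 3, 4, 5, 6, 7]
y_bin = [0, 0, 1, 0, 1, 0, 1, 1]
x_cnt = [0, 1, 2, 3, 4]
y_cnt = [2, 3, 3, 4, 6]

a_bin, b_bin = fit_glm(x_bin, y_bin, **logit_family)
a_cnt, b_cnt = fit_glm(x_cnt, y_cnt, **log_family)
print("logit model:     ", a_bin, b_bin)
print("log-linear model:", a_cnt, b_cnt)
```

Both links used here are canonical, so at convergence the fitted means satisfy the likelihood equations $\sum_i (y_i - \hat\mu_i) = 0$ and $\sum_i x_i (y_i - \hat\mu_i) = 0$, which gives a simple check on the output.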
A GLM can be characterized by three components:

• Random component
  Identifies the response variable $Y$ with independent observations $(y_1, \ldots, y_N)$ from a distribution in the natural exponential family, such as the Poisson or binomial. This family has a probability density or mass function of the form

  $$f(y_i; \theta_i) = a(\theta_i)\, b(y_i) \exp[\, y_i\, Q(\theta_i) \,] \qquad (1)$$

  The value of the parameter $\theta_i$ may vary for $i = 1, \ldots, N$, depending on the values of the explanatory variables. The term $Q(\theta)$ is called the natural parameter. A dispersion parameter is also sometimes added to this model.

• Systematic component
  Defines the set of explanatory variables $x_1, \ldots, x_k$ for the linear predictor

  $$\eta_i = \alpha + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_k x_{ik} = \alpha + \sum_{j=1}^{k} \beta_j x_{ij}, \qquad i = 1, \ldots, N \qquad (2)$$

• Link function
  Connects the random and systematic components: $\eta_i = g(\mu_i)$ ...
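As a worked illustration of form (1), the two distributions named above can be rewritten so that $a$, $b$ and $Q$ can be read off directly. This is a standard textbook factorization added here as an aid, with $\mu$ denoting the Poisson mean and $\pi$ the success probability of a single binary trial; neither symbol choice comes from the slides.

```latex
\begin{align*}
\text{Poisson:}\quad
f(y;\mu) &= \frac{e^{-\mu}\mu^{y}}{y!}
          = \underbrace{e^{-\mu}}_{a(\mu)}\;
            \underbrace{\frac{1}{y!}}_{b(y)}\;
            \exp\!\big[\, y \log\mu \,\big],
 & Q(\mu) &= \log\mu, \\[6pt]
\text{Binary:}\quad
f(y;\pi) &= \pi^{y}(1-\pi)^{1-y}
          = \underbrace{(1-\pi)}_{a(\pi)}\;
            \underbrace{1}_{b(y)}\;
            \exp\!\Big[\, y \log\frac{\pi}{1-\pi} \,\Big],
 & Q(\pi) &= \log\frac{\pi}{1-\pi}.
\end{align*}
```

Choosing the link $g$ equal to the natural parameter $Q$ gives the canonical link: the log link for Poisson counts and the logit link for binary responses.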
