PUBH 7430 Lecture 10
J. Wolfson
Division of Biostatistics, University of Minnesota School of Public Health
October 6, 2011

Fitting GLMs: applied example

Time trial data

Original data
  Outcomes: 5 reaction time trials for 113 individuals
  Predictors: Age and gender

Time trial: modified data
  Outcomes:
    Derived continuous variable TotalTime, equal to the sum of timings across the five trials
    Derived count variable FastTrials, equal to the number of trials with time < 0.1 s
    Derived binary variable Improved, equal to 1 if a regression line fitted to the five trials has a negative slope
  Predictors: Age and gender

Constructing a GLM

Scientific question: How do age and gender affect the outcome(s)?
  1. Investigate available data, decide whether the mean is the parameter of interest
  2. Construct the linear predictor
  3. Select the distribution in the exponential family which is most similar to the hypothesized data generating process (this fixes the variance function)
  4. Choose an appropriate link function $g$

Investigate the data
  Eliminate the extreme outlier (observation 35, a 2-year-old) with TotalTime more than twice as large as any other observation
  [Figure: density histograms of TotalTime (continuous), FastTrials (count), and Improved (binary)]

Construct linear predictor

We will consider the following models:
  Main effects model: $g(\mu) = \beta_0 + \beta_1 \mathrm{Sex} + \beta_2 \mathrm{Age}$
  Interaction model: $g(\mu) = \beta_0 + \beta_1 \mathrm{Sex} + \beta_2 \mathrm{Age} + \beta_3 \mathrm{Sex} \times \mathrm{Age}$
  Quadratic age model: $g(\mu) = \beta_0 + \beta_1 \mathrm{Sex} + \beta_2 \mathrm{Age} + \beta_3 \mathrm{Age}^2$
where Sex (gender) is a two-level factor and Age is treated as continuous.
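The three derived outcomes described above (TotalTime, FastTrials, Improved) can be sketched for a single individual as follows. This is only an illustrative sketch in Python, not the code used in the lecture: the function name derive_outcomes, the fast_cutoff parameter, and the example times are assumptions, and the trial index 1 to 5 is assumed to be the x variable for the fitted regression line.

```python
def derive_outcomes(times, fast_cutoff=0.1):
    """Derive the three outcomes from one person's trial times (in seconds).

    Hypothetical helper: name, signature, and default cutoff are assumptions.
    """
    n = len(times)
    trial = list(range(1, n + 1))  # trial index 1..n, used as the x variable

    total_time = sum(times)  # TotalTime: sum of timings across trials
    fast_trials = sum(1 for t in times if t < fast_cutoff)  # FastTrials: count below cutoff

    # Least-squares slope of time on trial index: Sxy / Sxx
    x_bar = sum(trial) / n
    y_bar = sum(times) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(trial, times))
    sxx = sum((x - x_bar) ** 2 for x in trial)
    slope = sxy / sxx

    improved = 1 if slope < 0 else 0  # Improved: 1 when the fitted line slopes downward

    return {"TotalTime": total_time, "FastTrials": fast_trials, "Improved": improved}


# Hypothetical reaction times for one individual (not from the lecture data)
example = derive_outcomes([0.25, 0.20, 0.15, 0.09, 0.08])
print(example)
```

Applying such a function to each of the 113 individuals would give one row per person, to which the predictors Age and Sex can then be attached.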
Continuous outcome: TotalTime

Select modeling distribution(s)
  TotalTime is both positive and continuous
  Two common distributions from the exponential family are close:
    Normal distribution
    Gamma distribution
  [Figure: density histograms of TotalTime, one overlaid with a Normal density and one with a Gamma density]

Gaussian (Normal) GLM
  Link function
    Options include identity (canonical), log, inverse
    When in doubt (as here), best to start with the canonical link, but try others
  Other assumptions
    Observations are independent
    $V(\mu) = 1$

Fitting the Gaussian GLM
  In R:
    glm(total ~ Sex + Age, family = gaussian(link = identity))
  In SAS:
    PROC GENMOD DATA=TIMETRIAL;
      CLASS SEX;
      MODEL TOTAL = SEX AGE / DIST=NORMAL LINK=IDENTITY;
    RUN;

Gaussian GLM output
  > summary(glm(total ~ Sex + Age, family = gaussian))
  Coefficients:
               Estimate Std. Error t value Pr(>|t|)
  (Intercept)  0.865968   0.073019  11.859  < 2e-16 ***
  SexM        -0.128187   0.071955  -1.781 0.077620 ....
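For the Gaussian family with the identity link, the maximum likelihood coefficient estimates coincide with ordinary least squares, so the main effects fit can be sketched without any GLM library. The Python sketch below solves the normal equations directly; the function names and the small noise-free data set are hypothetical and are not the lecture's time trial data, and it returns coefficients only (no standard errors or t values).

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix (copy)
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_gaussian_identity(sex_male, age, y):
    """Main effects fit: mean = b0 + b1 * SexM + b2 * Age (hypothetical helper)."""
    rows = [[1.0, float(s), float(a)] for s, a in zip(sex_male, age)]
    p = 3
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Hypothetical noise-free data generated from b0 = 1.0, b1 = -0.1, b2 = 0.01
sex_male = [0, 1, 0, 1, 1, 0]
age = [20, 25, 30, 35, 40, 45]
total = [1.0 - 0.1 * s + 0.01 * a for s, a in zip(sex_male, age)]

coef = fit_gaussian_identity(sex_male, age, total)
print([round(b, 6) for b in coef])
```

With a log or inverse link this shortcut no longer applies, and the fit would need an iterative procedure such as the one glm or PROC GENMOD runs internally.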
This note was uploaded on 11/21/2011 for the course PUBH 7430 taught by Prof. Eberly during the Fall '04 term at Minnesota.