PUBH 7430 Lecture 7 handout — J. Wolfson, Division of Biostatistics, University of Minnesota


PUBH 7430 Lecture 7
J. Wolfson
Division of Biostatistics, University of Minnesota School of Public Health
September 27, 2011

Accounting for covariate effects

• Sometimes, we want to summarize correlations after accounting/adjusting for covariate effects.
• e.g.,
  • Do individuals’ beta-carotene trajectories differ after accounting for amount of beta-carotene supplementation?
  • How does FEV vary over time after adjusting for height?

This leads to another kind of residual:

r(Y_i) = Y_i − X_i β̂

• Y_i is the vector of observations for subject i
• β̂ is the coefficient estimate derived from the linear model Y = Xβ + ε

Standard summary plots/statistics can be computed using these “covariate-adjusted” residuals.
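As a minimal sketch (not from the lecture — the data and variable names here are illustrative), the covariate-adjusted residuals r(Y) = Y − Xβ̂ can be computed in NumPy by fitting the linear model via least squares:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative data: n observations of log(FEV), one covariate (height) plus intercept
n = 200
height = rng.uniform(1.1, 1.8, size=n)
X = np.column_stack([np.ones(n), height])        # design matrix with intercept
beta_true = np.array([0.5, 1.2])
Y = X @ beta_true + rng.normal(0, 0.1, size=n)   # simulated outcomes

# Fit Y = X beta + error by least squares to get beta_hat
beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Covariate-adjusted residuals: r(Y) = Y - X beta_hat
resid = Y - X @ beta_hat

# By construction, the residuals average to zero (up to floating point)
# and are orthogonal to the columns of X
print(np.allclose(resid.mean(), 0.0))
print(np.allclose(X.T @ resid, 0.0, atol=1e-6))
```

These residuals can then be plotted against other variables (e.g., age), exactly as in the scatterplots on the slide.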
[Figure: two scatterplots of covariate-adjusted residuals — “Height-adjusted FEV vs. Age” (height-adjusted log(FEV) residuals against age, 6–18 years) and “Age-adjusted FEV vs. Height” (age-adjusted log(FEV) residuals against height, 1.1–1.8 m).]

Accounting for covariate effects: Notes

• Within-time residuals arise from the linear model with observation time as the sole covariate.
• Within-person residuals arise from the linear model with subject ID as the sole (categorical) covariate.
• Residuals can be similarly defined for
  • more than one covariate (use an X matrix with more columns and a correspondingly longer β), and
  • models other than the typical linear model.
• Residuals can serve as useful diagnostic tools for models.

A review of (or introduction to!) Generalized Linear Models

Models for independent data

For now, back to the simpler world of independent observations Y = [Y_1, Y_2, ..., Y_n] with corresponding covariate matrix

    X = | x_11  x_12  ...  x_1p |
        |  ...   ...  ...   ... |
        | x_n1  x_n2  ...  x_np |

Regression models: Two goals

When fitting regression models, we generally have one of two goals in mind: outcome prediction or estimation of covariate effects.

Outcome prediction
• Predict future Y values given covariates X.
• Often, the best prediction is the conditional mean E(Y | X) = μ(X).
• The focus is on obtaining accurate predictions:
  • model diagnostics are of great interest
  • interpretation of regression coefficients is less important

Estimation of covariate effects
• Focus on estimating the effects of covariates on μ(X) by estimating entries of the coefficient vector β.
• This is the more common task in practice, and in this class.
• Quantify uncertainty and perform inference on the effect estimates β̂:
  • confidence intervals
  • p-values
• Interpretation of β̂ is fundamental.
• Model construction should be driven primarily by the scientific question, NOT by model fit.

Basic linear model

To accomplish these goals with continuous independent outcomes, we often use the simple linear model

Y_i = x_i β + ε_i,   ε_i ~ N(0, σ²),   for i = 1, ..., n

or, equivalently,

Y ~ MVN(Xβ, σ²I)

Linear model: Assumptions

Assumption 1. The mean of Y_i is a linear combination of the covariates:

E(Y_i | x_i) = x_i β

This yields the standard interpretation for the coefficient β_j of covariate x_j: “a one-unit increase/decrease in x_j is associated with a β_j-unit change in the mean of Y.”

Assumption 2. The variance of Y_i is constant for all i, and does not depend on the mean (nor on the covariates):

Var(Y_i | x_i) = σ² for all i

Assumption 3. Y_i has a Normal distribution:

Y_i ~ N(x_i β, σ²)

Do we need all three assumptions?
• For predicting outcomes? YES
• For estimating covariate effects?
NO.

Linear model: Assumptions

Suppose we want to estimate covariate effects via β̂, and that
1. the mean of Y is a linear combination of the covariates, and
2. the variance of Y is constant.

Then we can:
• get good (unbiased) estimates of covariate effects, and
• get good estimates of standard errors (and hence p-values and confidence intervals) in “reasonable” samples (in the linear model, n ≈ 30 suffices).

Crucial observation: We did NOT assume that Y_i is Normally distributed.

Take-home message: We can make approximately correct inferences about covariate effects when only the mean and variance of the outcome are correctly specified (and the sample size is large enough).

Beyond the linear model

Recall the two key assumptions of the linear model:
1. the mean is a linear combination of the predictors, and
2. the variance is a constant not depending on the mean.

Limitation: Not all types of outcomes satisfy these assumptions.
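The take-home message above can be checked by simulation. The sketch below (an illustration, not from the lecture) draws errors from a shifted exponential distribution — clearly non-Normal, but with mean 0 and constant variance — and verifies that the nominal 95% confidence interval for a slope still covers the truth at roughly the nominal rate:

```python
import numpy as np

rng = np.random.default_rng(1)

# Errors are exponential shifted to mean 0: non-Normal, but they satisfy
# the mean and constant-variance assumptions of the linear model.
n, n_sims, beta1 = 100, 2000, 2.0
covered = 0
for _ in range(n_sims):
    x = rng.uniform(0, 1, size=n)
    eps = rng.exponential(1.0, size=n) - 1.0     # mean 0, variance 1
    y = beta1 * x + eps
    X = np.column_stack([np.ones(n), x])
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y                        # least-squares estimate
    resid = y - X @ b
    sigma2 = resid @ resid / (n - 2)             # residual variance estimate
    se = np.sqrt(sigma2 * XtX_inv[1, 1])         # standard error of the slope
    # Does the nominal 95% confidence interval cover the true slope?
    if abs(b[1] - beta1) < 1.96 * se:
        covered += 1

print(covered / n_sims)   # close to 0.95 despite the non-Normal errors
```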
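As a concrete instance of the limitation: a binary outcome cannot have constant variance, because if Y ~ Bernoulli(μ) then Var(Y) = μ(1 − μ) depends on the mean. A quick check (illustrative, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(2)

# Bernoulli outcomes at three different mean levels: the empirical variance
# tracks mu * (1 - mu) rather than staying at some constant sigma^2.
for mu in (0.1, 0.5, 0.9):
    y = rng.binomial(1, mu, size=100_000)
    print(f"mu={mu}: empirical var = {y.var():.3f}, theory mu*(1-mu) = {mu * (1 - mu):.3f}")
```

Handling outcomes like this, where the variance is tied to the mean, is exactly what generalized linear models are designed for.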

This note was uploaded on 11/21/2011 for the course PUBH 7430, taught by Professor Eberly during the Fall '04 term, at Minnesota.
