Multiple Regression Diagnostics and Remedial Measures
KNNL Chapter 10.1, 10.5

Overview of today’s lecture
  Checking for information contained in additional variables
  Identifying the symptoms of collinearity
  Weighted least squares

Model Adequacy
  Are added variables needed?
  Added-variable plots
    Marginal importance of a variable
    Nature of the variable’s relationship with the response

Consider Liver Surgery Data
  Assume Prognostic.Index and Enzyme.Test are already in the model.
  We now wish to examine what, if anything, Liver.Test will contribute:
    Importance
    Relationship
[Figure: scatterplot matrix of BloodClot.Score, Prognostic.Index, Enzyme.Test, Liver.Test and LogSurv.Time]

[Figure: Log(Survival Time) versus Liver Test, with the regression line $Y_i = \beta_0 + \beta_1 X_{i1}$]

Residuals
  Residuals are what remains after accounting for the two main predictor variables, Prognostic.Index and Enzyme.Test:
  $e(Y_i \mid X_{i2}, X_{i3}) = Y_i - \hat{Y}_i(X_{i2}, X_{i3}) = Y_i - (\hat{\beta}_0 + \hat{\beta}_2 X_{i2} + \hat{\beta}_3 X_{i3})$
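The residual calculation above can be sketched in R as follows. This is only an illustration under stated assumptions: the data frame Surgical.sim is simulated with arbitrary coefficients and is not the Surgical data used in the lecture; only the column names are borrowed from it.

```r
# Simulated stand-in for the lecture's data (hypothetical values, NOT the real Surgical data).
set.seed(1)
n <- 54
Surgical.sim <- data.frame(
  Prognostic.Index = rnorm(n, mean = 60, sd = 15),
  Enzyme.Test      = rnorm(n, mean = 80, sd = 20)
)
Surgical.sim$Liver.Test <- 0.5 + 0.02 * Surgical.sim$Prognostic.Index +
  0.01 * Surgical.sim$Enzyme.Test + rnorm(n, sd = 0.8)
# Arbitrary coefficients chosen only to generate a response for the sketch.
Surgical.sim$Survival.Time <- 100 * Surgical.sim$Liver.Test +
  5 * Surgical.sim$Prognostic.Index + 5 * Surgical.sim$Enzyme.Test +
  rnorm(n, sd = 150)

# Reduced model: only the two predictors already in the model.
reduced.lm <- lm(Survival.Time ~ Prognostic.Index + Enzyme.Test,
                 data = Surgical.sim)

# e(Y | X2, X3): observed response minus the reduced model's fitted value.
e.Y <- Surgical.sim$Survival.Time - fitted(reduced.lm)

# Plot the residuals against the candidate predictor and overlay a fitted line.
plot(Surgical.sim$Liver.Test, e.Y, xlab = "Liver.Test", ylab = "Residuals")
abline(lm(e.Y ~ Surgical.sim$Liver.Test))
```

Computing the residuals by hand gives the same values as resid(reduced.lm); the explicit subtraction is used here only to mirror the formula.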
[Figure: residuals $e(Y_i \mid X_{i2}, X_{i3})$ versus Liver Test, with the regression line $e(Y_i \mid X_{i2}, X_{i3}) = \beta_0 + \beta_1 X_{i1}$]

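Added-Variable Plots
  Kutner et al. suggest that this is not enough: we need to account for the effect of the predictors already in the model on both the response and the new predictor.
  A second regression, with starred coefficients, regresses the new predictor on the predictors already in the model:
  $e(X_{i1} \mid X_{i2}, X_{i3}) = X_{i1} - \hat{X}_{i1} = X_{i1} - (\hat{\beta}^*_0 + \hat{\beta}^*_2 X_{i2} + \hat{\beta}^*_3 X_{i3})$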

[Figure: added-variable plot of Survival.Time residuals versus Liver Test residuals, with the regression line $e(Y_i \mid X_{i2}, X_{i3}) = \beta_0 + \beta_1 e(X_{i1} \mid X_{i2}, X_{i3})$]
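A minimal sketch of how an added-variable plot can be built from the two sets of residuals. The variables X1, X2, X3 and Y are simulated here purely for illustration (they are assumptions, not the lecture's data). The sketch also checks a standard property of this construction: the slope in the added-variable plot equals the coefficient of X1 in the full three-predictor regression.

```r
# Simulated data (hypothetical): X1 is correlated with X2 and X3.
set.seed(2)
n  <- 54
X2 <- rnorm(n)
X3 <- rnorm(n)
X1 <- 0.6 * X2 - 0.4 * X3 + rnorm(n)
Y  <- 1 + 2 * X1 + 3 * X2 - 1 * X3 + rnorm(n)

# Residuals of the response after the predictors already in the model.
e.Y  <- resid(lm(Y ~ X2 + X3))
# Residuals of the new predictor after the same predictors (the "starred" regression).
e.X1 <- resid(lm(X1 ~ X2 + X3))

# Added-variable plot: response residuals against new-predictor residuals.
av.lm <- lm(e.Y ~ e.X1)
plot(e.X1, e.Y, xlab = "e(X1 | X2, X3)", ylab = "e(Y | X2, X3)")
abline(av.lm)

# Compare with the coefficient of X1 in the full model.
full.lm    <- lm(Y ~ X1 + X2 + X3)
slope.av   <- unname(coef(av.lm)["e.X1"])
slope.full <- unname(coef(full.lm)["X1"])
print(c(added.variable = slope.av, full.model = slope.full))
```

A clear linear trend in the plot suggests the new variable adds information as a linear term; curvature suggests a transformed term may be needed, and a flat band suggests little marginal contribution.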

Three predictors included

    Surg4c.lm = lm(Survival.Time ~ Liver.Test + Prognostic.Index + Enzyme.Test,
                   data = Surgical)
    summary(Surg4c.lm)

    Coefficients:
                         Value Std. Error  t value Pr(>|t|)
         (Intercept) -630.3873   103.1037  -6.1141   0.0000
          Liver.Test  120.6833    24.5072   4.9244   0.0000
    Prognostic.Index    6.9037     1.2815   5.3871   0.0000
         Enzyme.Test    7.4328     1.1084   6.7057   0.0000
AIC while dropping one term

    library(MASS)
    stepAIC(Surg4c.lm)

    Start:  AIC= 1153.89
     Survival.Time ~ Liver.Test + Prognostic.Index + Enzyme.Test

                       Df Sum of Sq     RSS      AIC
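A hedged sketch of the drop-one comparison on simulated data (the variables below are hypothetical, not the Surgical data). It uses base R's drop1 and extractAIC rather than stepAIC. For a linear model, extractAIC reports n log(RSS/n) + 2p, where p is the number of estimated coefficients; this is understood to be the scale on which stepAIC prints its AIC column, which is why the values differ from those returned by AIC().

```r
# Simulated data (hypothetical), same structure as a three-predictor model.
set.seed(3)
n  <- 54
X1 <- rnorm(n)
X2 <- rnorm(n)
X3 <- rnorm(n)
Y  <- 1 + 2 * X1 + 3 * X2 - 1 * X3 + rnorm(n)

full.lm <- lm(Y ~ X1 + X2 + X3)

# AIC of the full model computed by hand: n * log(RSS / n) + 2 * p.
RSS        <- sum(resid(full.lm)^2)
p          <- length(coef(full.lm))
aic.manual <- n * log(RSS / n) + 2 * p

# Drop-one table: one row per term, giving the RSS and AIC after removing that term.
tab <- drop1(full.lm)
print(tab)

# AIC of the model without X1, fitted directly for comparison.
aic.no.X1 <- extractAIC(lm(Y ~ X2 + X3))[2]
```

A term is a candidate for removal when the AIC in its row is lower than the AIC in the <none> row; otherwise the full model is preferred on this criterion.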

