Lecture 21 slides(Influence)

# Lecture 21 slides (Influence) - Influence vs. Deviation...


## Influence vs. Deviation

- The methods of Chapter 5 examine potentially extreme residuals for evidence of model misspecification, non-normality, or other model violations.
- They do not deal explicitly with the notion of influence.
- Outliers are not necessarily influential points in the regression!
- It is very important to separate the idea of extreme outlying residuals from that of influential points.

Lecture 19 – p. 1/32

## Influence via residuals

- Previously, we looked at the externally studentized residual, i.e.

  $$t_i = \frac{e_i}{s_{-i}\sqrt{1 - h_{ii}}}$$

- Although there is an adjustment for potential influence through the value $h_{ii}$, all we know here is that a large error in fit has occurred.
- We don't know the extent of the point's influence or which coefficients it may have affected.

## Influence vs. distance

- Another natural criterion to look at, then, is the $i$-th diagonal element of the hat matrix itself, $h_{ii}$.
- However, $h_{ii}$, as discussed before, is a measure of distance from the center of the observed covariate space.
- We can think of a point with a large value of $h_{ii}$ as having high leverage.
- A point that is far from the center of the covariate space, BUT closely follows the same trend as the other points, will not necessarily be influential in the model (i.e. if we remove the observation, it will not change our estimates or model fit).

## Influence vs. Deviation

We must therefore remember 3 things about diagnostics for influence and model violations:

- Not all outliers are necessarily influential.
- Not all high leverage points are necessarily influential.
- Influential observations are not necessarily outliers.
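The distinction between leverage and influence can be illustrated with a small sketch. It is not taken from the slides: it assumes simple linear regression (so $p = 2$), the data are made up, and the helper names `fit_line`, `leverages`, and `ext_studentized` are hypothetical. The same far-out covariate value is used twice, once with a response that follows the trend of the other points and once with a response that does not.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1

def leverages(xs):
    """Hat diagonals h_ii for simple linear regression (intercept + slope)."""
    n = len(xs)
    xbar = sum(xs) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    return [1.0 / n + (x - xbar) ** 2 / sxx for x in xs]

def ext_studentized(xs, ys):
    """Externally studentized residuals t_i = e_i / (s_{-i} * sqrt(1 - h_ii))."""
    n, p = len(xs), 2
    b0, b1 = fit_line(xs, ys)
    e = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
    h = leverages(xs)
    sse = sum(r * r for r in e)
    out = []
    for ei, hi in zip(e, h):
        # Leave-one-out variance from the standard deletion identity,
        # which avoids refitting the model n times.
        s2_loo = (sse - ei * ei / (1.0 - hi)) / (n - p - 1)
        out.append(ei / math.sqrt(s2_loo * (1.0 - hi)))
    return out

# Made-up data: six points near y = 1 + 2x, plus one point far out at x = 20.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 20.0]
ys_on = [3.1, 4.9, 7.2, 8.8, 11.1, 12.9, 41.0]   # last point follows the trend
ys_off = [3.1, 4.9, 7.2, 8.8, 11.1, 12.9, 20.0]  # last point breaks the trend

h = leverages(xs)                       # identical for both data sets
slope_without = fit_line(xs[:-1], ys_on[:-1])[1]
shift_on = abs(fit_line(xs, ys_on)[1] - slope_without)
shift_off = abs(fit_line(xs, ys_off)[1] - slope_without)
t_on = ext_studentized(xs, ys_on)
t_off = ext_studentized(xs, ys_off)

print("leverage of last point:", h[-1])
print("slope change, on trend: ", shift_on, " t:", t_on[-1])
print("slope change, off trend:", shift_off, " t:", t_off[-1])
```

Because $h_{ii}$ depends only on the covariates, the last point has the same high leverage in both data sets; only in the second one does deleting it change the fitted slope appreciably.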

## Initial criteria for influence

We already know two coarse ways to detect potentially influential points:

- Standardized PRESS (or studentized) residuals
- Magnitude of the hat diagonal

What values of these criteria should make one suspicious?

## Initial criteria for influence

- Outliers in fit: we have already discussed tests for $t_i$ and how to adjust for multiple comparisons.
- High leverage points: it is difficult to judge what "reasonable" values of $h_{ii}$ should be.
- However, remember that $\sum_{i=1}^{n} h_{ii} = p$ (from Chapter 3).
- Therefore, we can think of $p/n$ as a norm, or an average amount of leverage that we should expect.
- Rule of thumb: if $h_{ii} > \frac{2p}{n}$, one should be worried that this is a high leverage point (and then possibly highly influential).
- Be careful for large $p$.

## Influencing the fitted value

- A reasonable diagnostic for influence is the difference between the fitted value estimated with all $n$ observations and the fitted value obtained by leaving out the $i$-th observation.
- The DFFITS measure for the $i$-th observation is then

  $$(\mathrm{DFFITS})_i = \frac{\hat{y}_i - \hat{y}_{i,-i}}{s_{-i}\sqrt{h_{ii}}}$$

- The denominator standardizes this difference using the jackknifed estimate of its variance, which puts its magnitude on an interpretable scale.
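As a rough illustration of the two criteria above, the following sketch flags high leverage points with the $2p/n$ rule of thumb and computes DFFITS directly from its definition by refitting the model without each observation in turn. It is not taken from the slides: it assumes simple linear regression ($p = 2$), the data are made up, and the helper names `fit_line`, `hat_diagonals`, and `dffits` are hypothetical.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1

def hat_diagonals(xs):
    """Hat diagonals h_ii for simple linear regression (intercept + slope)."""
    n = len(xs)
    xbar = sum(xs) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    return [1.0 / n + (x - xbar) ** 2 / sxx for x in xs]

def dffits(xs, ys):
    """DFFITS_i = (yhat_i - yhat_{i,-i}) / (s_{-i} * sqrt(h_ii)), by refitting."""
    n, p = len(xs), 2
    b0, b1 = fit_line(xs, ys)
    h = hat_diagonals(xs)
    out = []
    for i in range(n):
        # Drop observation i and refit on the remaining n - 1 points.
        xs_loo = xs[:i] + xs[i + 1:]
        ys_loo = ys[:i] + ys[i + 1:]
        a0, a1 = fit_line(xs_loo, ys_loo)
        sse_loo = sum((y - (a0 + a1 * x)) ** 2 for x, y in zip(xs_loo, ys_loo))
        s_loo = math.sqrt(sse_loo / (n - 1 - p))  # jackknifed scale estimate
        yhat_full = b0 + b1 * xs[i]               # fit using all n points
        yhat_loo = a0 + a1 * xs[i]                # fit without point i
        out.append((yhat_full - yhat_loo) / (s_loo * math.sqrt(h[i])))
    return out

# Made-up data: six points near y = 1 + 2x, plus one far-out, off-trend point.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 20.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.1, 12.9, 20.0]
n, p = len(xs), 2

h = hat_diagonals(xs)
high_leverage = [i for i, hi in enumerate(h) if hi > 2 * p / n]
d = dffits(xs, ys)

print("sum of hat diagonals:", sum(h))
print("indices with h_ii > 2p/n:", high_leverage)
for i, (hi, di) in enumerate(zip(h, d)):
    print(i, round(hi, 3), round(di, 3))
```

Refitting $n$ times is the most literal reading of the definition and is fine for a small example; the hat diagonals summing to $p$ gives a quick sanity check on the leverage computation.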

## This note was uploaded on 01/15/2010 for the course MATH 423 taught by Professor Steele during the Spring '06 term at McGill.
