5 Diagnostics and Influence

Statistics 191: Introduction to Applied Statistics
Diagnostics & Influence
Jonathan Taylor, Department of Statistics, Stanford University
January 26, 2010

Diagnostics in multiple linear model

Outline
- Diagnostics – again
- Different residuals
- Influence
- Outlier detection
- Residual plots: partial regression (added variable) plot, partial residual (residual plus component) plot

Scottish hill races data

Description

Variable   Description
Time       Record time to complete course
Distance   Distance in the course
Climb      Vertical climb in the course

Scottish hill races data: R code

Diagnostics

What can go wrong?
- Regression function can be wrong: maybe regression function should be quadratic (see R code).
- Model for the errors may be incorrect:
  - may not be normally distributed.
  - may not be independent.
  - may not have the same variance.
- Detecting problems is more art than science, i.e. we cannot test for all possible problems in a regression model.
- Basic idea of diagnostic measures: if model is correct then residuals $e_i = Y_i - \widehat{Y}_i$, $1 \le i \le n$, should look like a sample of (not quite independent) $N(0, \sigma^2)$ random variables.
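The residuals e_i = Y_i - Ŷ_i can be computed directly, which also shows one reason they are "not quite independent". The following is a minimal Python sketch, not the course's R code: it fits a simple least-squares line to hypothetical toy data (not the hill races data) and checks that the residuals sum to zero and are orthogonal to the predictor. The function names `fit_simple_ols` and `residuals` are illustrative assumptions.

```python
# Hypothetical toy data for illustration only (not the Scottish hill races data).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]


def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y ~ b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def residuals(x, y):
    """Return the residuals e_i = Y_i - Yhat_i from the fitted line."""
    b0, b1 = fit_simple_ols(x, y)
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]


e = residuals(x, y)

# With an intercept in the model, least squares forces these two sums to zero
# (up to rounding). The residuals therefore satisfy linear constraints and
# cannot be fully independent of one another.
sum_e = sum(e)
sum_xe = sum(xi * ei for xi, ei in zip(x, e))
print("sum of residuals:", sum_e)
print("sum of x * residuals:", sum_xe)
```

Because the n residuals satisfy two exact linear constraints here, knowing n - 2 of them determines the remaining two, so they behave like a nearly, but not exactly, independent sample.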
Standard diagnostic plots: R code

Problems with the errors

Possible problems & diagnostic checks
- Errors may not be normally distributed or may not have the same variance – qqnorm can help with this. This may not be too important in large samples....
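To make the idea behind a normal quantile-quantile check concrete, here is a minimal Python sketch of the points such a plot displays. It is not R's `qqnorm` itself: the plotting positions (i - 0.5) / n are one common convention and an assumption here (implementations differ, especially for small samples), and the residual values and the function name `normal_qq_points` are hypothetical.

```python
from statistics import NormalDist


def normal_qq_points(values):
    """Pair each sorted value with a standard normal quantile.

    Uses plotting positions (i - 0.5) / n, which is an assumed convention;
    other software may use slightly different positions.
    """
    n = len(values)
    ordered = sorted(values)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


# Hypothetical residuals for illustration only.
resid = [-1.2, 0.3, 0.8, -0.4, 2.5, -0.1, 0.2, -0.9]

for q, r in normal_qq_points(resid):
    print(f"theoretical {q:6.2f}   sample {r:6.2f}")
```

If the errors are roughly normal, the (theoretical, sample) pairs fall close to a straight line; systematic curvature or a few points far from the line suggest non-normality or outliers.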

This document was uploaded on 03/16/2010.
