Chapter 10: More diagnostics
Timothy Hanson
Department of Statistics, University of South Carolina
Stat 704: Data Analysis I

$\mathrm{PRESS}_p$ criterion

$$\mathrm{PRESS}_p = \sum_{i=1}^{n} \left( Y_i - \hat{Y}_{i(i)} \right)^2 = \sum_{i=1}^{n} \left( \frac{e_i}{1 - h_{ii}} \right)^2,$$

where $\hat{Y}_{i(i)}$ is the fitted value at $x_i$ with the case $(x_i, Y_i)$ omitted from the fit, $e_i$ is the $i$th ordinary residual, and $h_{ii}$ is the $i$th leverage.
This is leave-one-out prediction error. The smaller, the better.
Having $\mathrm{PRESS}_p \approx \mathrm{SSE}_p$ supports the validity of the model with $p$ predictors (p. 374).
Note that always $\mathrm{PRESS}_p > \mathrm{SSE}_p$, but when they're (reasonably) close, that means that there are not just a handful of points driving all inference.

9.5 Caveats for automated procedures

proc reg can give you the, say, three best subsets according to $C_p$ containing one variable, two variables, etc. You need to define interactions and quadratic terms by hand, and it cannot select them hierarchically. It is best used when the number of predictors is small to moderate.
proc glmselect does a great job with stepwise procedures but cannot do best subsets. It is good to use when there are lots of predictors.
There is no "best" way to search for good models.
There may be several "good" models.
If you use the same data to estimate the model and choose the model, the regression effects are biased!
This leads to the idea of data splitting; one portion of the data is the training data and the other portion is the validation set (Section 9.6, p. 372). $\mathrm{PRESS}_p$ can also be used.

Diagnostics we have already discussed

Residuals $e_i$ vs. each of $x_1, \ldots, x_k$, and $e_i$ vs. $\hat{Y}_i$.
Normal probability plot of $e_1, \ldots, e_n$.
$Y_i$ vs. $\hat{Y}_i$. What to look for?
$\mathrm{VIF}_j$ for $j = 1, \ldots, k$.
Now we'll discuss added variable plots, leverages, dffits, and Cook's distance.

10.1 Added variable plots

Residuals $e_i$ versus predictors can show whether a predictor may need to be transformed or whether we should add a quadratic term.
We can omit the predictor from the model and plot the residuals $e_i$ versus the predictor to see if the predictor explains residual variability. Your book suggests doing this for interactions.
An added variable plot refines this idea.
It answers the question: does $x_j$ explain any residual variability once the rest of the predictors are in the model?

10.1 Added variable plots

Consider a pool of predictors $x_1, \ldots, x_k$. Let's consider predictor $x_j$, where $j = 1, \ldots, k$.
Regress $Y_i$ vs. all predictors except $x_j$; call the residuals $e_i(Y \mid x_{-j})$.
Regress $x_j$ vs. all predictors except $x_j$; call the residuals $e_i(x_j \mid x_{-j})$.
The added variable plot for $x_j$ is $e_i(Y \mid x_{-j})$ vs. $e_i(x_j \mid x_{-j})$.
The least squares estimate $b_j$ obtained from fitting a line (through the origin) to the plot is the same as one would get from fitting the full model $Y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik} + \epsilon_i$ (Christensen, 1996)....
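
The PRESS identity stated earlier, that the leave-one-out prediction errors equal $e_i / (1 - h_{ii})$, can be checked numerically. The sketch below is only an illustration: it uses Python with NumPy rather than the SAS procedures the slides mention, the data are simulated, and the function names `press_shortcut` and `press_brute_force` are hypothetical.

```python
import numpy as np


def press_shortcut(X, y):
    """PRESS from a single fit, using e_i / (1 - h_ii).

    X is assumed to already contain an intercept column and to have
    full column rank.
    """
    # Leverages h_ii are the diagonal of H = Q Q', with Q from a reduced QR of X.
    Q, _ = np.linalg.qr(X)
    h = np.sum(Q ** 2, axis=1)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta  # ordinary residuals
    return np.sum((e / (1.0 - h)) ** 2)


def press_brute_force(X, y):
    """PRESS by literally refitting the model n times, leaving out one case each time."""
    n = len(y)
    total = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        beta_i, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        total += (y[i] - X[i] @ beta_i) ** 2
    return total


# Simulated data (an assumption for illustration, not from the slides).
rng = np.random.default_rng(704)
n = 40
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(scale=0.5, size=n)

press = press_shortcut(X, y)
press_loo = press_brute_force(X, y)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
sse = np.sum((y - X @ beta_hat) ** 2)

print("PRESS (shortcut):   ", press)
print("PRESS (refit n times):", press_loo)
print("SSE:                ", sse)
```

The two PRESS values should agree up to floating-point error, and PRESS should exceed SSE, since every leverage lies between 0 and 1 and so dividing by $1 - h_{ii}$ can only inflate a residual.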
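
The claim that the through-the-origin slope in the added variable plot equals $b_j$ from the full model can also be checked directly. This is a minimal sketch under stated assumptions: simulated data, Python with NumPy in place of SAS, and a hypothetical helper `ols_residuals`. It computes the two sets of residuals from the steps above and compares the resulting slope with the full-model coefficient.

```python
import numpy as np


def ols_residuals(X, y):
    """Residuals from a least squares regression of y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta


# Simulated data (an assumption for illustration, not from the slides).
rng = np.random.default_rng(10)
n = 60
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)  # deliberately correlated with x1
x3 = rng.normal(size=n)
y = 2.0 + 1.5 * x1 - 0.8 * x2 + 0.3 * x3 + rng.normal(size=n)

# Full model: intercept plus all three predictors.
X_full = np.column_stack([np.ones(n), x1, x2, x3])
b_full, *_ = np.linalg.lstsq(X_full, y, rcond=None)

j = 2  # column index of x2 in X_full
X_rest = np.delete(X_full, j, axis=1)  # intercept and all predictors except x_j

e_y = ols_residuals(X_rest, y)             # e_i(Y | x_{-j})
e_x = ols_residuals(X_rest, X_full[:, j])  # e_i(x_j | x_{-j})

# Least squares slope of a line through the origin fitted to (e_x, e_y).
slope = (e_x @ e_y) / (e_x @ e_x)

print("slope from added variable plot:", slope)
print("b_j from full model:           ", b_full[j])
```

Plotting `e_y` against `e_x` gives the added variable plot itself; a clear linear trend suggests $x_j$ explains variability left over after the other predictors are in the model.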
 Fall '11
