Chapter 10: More diagnostics
Timothy Hanson
Department of Statistics, University of South Carolina
Stat 704: Data Analysis I
The PRESS_p criterion

$$\mathrm{PRESS}_p = \sum_{i=1}^{n} \left( Y_i - \hat{Y}_{i(i)} \right)^2 = \sum_{i=1}^{n} \left( \frac{e_i}{1 - h_{ii}} \right)^2,$$

where $\hat{Y}_{i(i)}$ is the fitted value at $x_i$ with $(x_i, Y_i)$ omitted. This is leave-one-out prediction error: the smaller, the better. Having $\mathrm{PRESS}_p \approx \mathrm{SSE}_p$ supports the validity of the model with $p$ predictors (p. 374). Note that $\mathrm{PRESS}_p > \mathrm{SSE}_p$ always, but when the two are reasonably close it means that inference is not being driven by just a handful of points.
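A minimal numpy sketch (not part of the slides; the simulated design matrix and response are made up for illustration) verifying that the leverage shortcut above reproduces brute-force leave-one-out prediction error, and that PRESS exceeds SSE:

```python
import numpy as np

# Made-up data for illustration.
rng = np.random.default_rng(0)
n, p = 50, 3                      # n observations, p columns incl. intercept
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)

# Hat matrix H = X (X'X)^{-1} X'; its diagonal holds the leverages h_ii.
H = X @ np.linalg.solve(X.T @ X, X.T)
h = np.diag(H)
e = y - H @ y                     # ordinary residuals

# PRESS via the leverage shortcut: sum of (e_i / (1 - h_ii))^2.
press_shortcut = np.sum((e / (1 - h)) ** 2)

# PRESS by brute-force leave-one-out refitting.
press_loo = 0.0
for i in range(n):
    keep = np.arange(n) != i
    b = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0]
    press_loo += (y[i] - X[i] @ b) ** 2

sse = np.sum(e ** 2)
print(press_shortcut, press_loo, sse)   # the two PRESS values agree; PRESS > SSE
```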
9.5 Caveats for automated procedures

proc reg can give you, say, the three best subsets according to $C_p$ containing one variable, two variables, etc. You need to define interactions and quadratic terms by hand; it cannot do this hierarchically. It is best used when the number of predictors is small to moderate.

proc glmselect does a great job with stepwise procedures but cannot do best subsets. It is good to use when there are many predictors.

There is no "best" way to search for good models, and there may be several "good" models.

If you use the same data to estimate the model and to choose the model, the regression effects are biased! This leads to the idea of data splitting: one portion of the data is the training set and the other portion is the validation set (Section 9.6, p. 372). $\mathrm{PRESS}_p$ can also be used.
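A small Python sketch of the data-splitting idea (illustrative only, not the course's SAS workflow; the simulated data and candidate subsets are made up): fit each candidate model on a training half and compare prediction error on the held-out validation half.

```python
import numpy as np

# Made-up data for illustration: y depends on x1 but not on x2.
rng = np.random.default_rng(1)
n = 100
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1 + 2 * x1 + rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])   # column 0 = intercept

def validation_sse(cols, train, valid):
    """Fit OLS on the training rows using the listed columns of X,
    then return the squared prediction error on the validation rows."""
    b = np.linalg.lstsq(X[np.ix_(train, cols)], y[train], rcond=None)[0]
    pred = X[np.ix_(valid, cols)] @ b
    return np.sum((y[valid] - pred) ** 2)

# Randomly split the rows into a training half and a validation half.
perm = rng.permutation(n)
train, valid = perm[: n // 2], perm[n // 2:]

print(validation_sse([0, 1], train, valid))      # candidate model {x1}
print(validation_sse([0, 1, 2], train, valid))   # candidate model {x1, x2}
```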
Diagnostics we have already discussed

Residuals $e_i$ vs. each of $x_1, \dots, x_k$, and $e_i$ vs. $\hat{Y}_i$.
Normal probability plot of $e_1, \dots, e_n$.
$Y_i$ vs. $\hat{Y}_i$. What to look for?
$\mathrm{VIF}_j$ for $j = 1, \dots, k$.

Now we'll discuss added variable plots, leverages, DFFITS, and Cook's distance.
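As a reminder of how the VIFs are obtained, recall $\mathrm{VIF}_j = 1/(1 - R_j^2)$, where $R_j^2$ comes from regressing $x_j$ on the remaining predictors. A minimal numpy sketch (not from the slides; the simulated predictors are made up) computing them directly:

```python
import numpy as np

def vifs(X):
    """Variance inflation factors: VIF_j = 1 / (1 - R_j^2), where R_j^2 is the
    R-squared from regressing column j on the remaining predictor columns.
    X holds the predictors only (no intercept column)."""
    n, k = X.shape
    out = []
    for j in range(k):
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        fitted = others @ np.linalg.lstsq(others, X[:, j], rcond=None)[0]
        r2 = 1 - np.sum((X[:, j] - fitted) ** 2) / np.sum((X[:, j] - X[:, j].mean()) ** 2)
        out.append(1 / (1 - r2))
    return np.array(out)

# Made-up predictors: x2 is nearly collinear with x1, so both get large VIFs.
rng = np.random.default_rng(2)
x1 = rng.normal(size=60)
x2 = x1 + 0.1 * rng.normal(size=60)
x3 = rng.normal(size=60)
print(vifs(np.column_stack([x1, x2, x3])))
```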
10.1 Added variable plots

Plotting the residuals $e_i$ versus a predictor can show whether that predictor needs to be transformed or whether we should add a quadratic term. We can also omit a predictor from the model and plot the residuals $e_i$ versus that predictor to see whether it explains residual variability; your book suggests doing this for interactions.

An added variable plot refines this idea. It answers the question: does $x_j$ explain any residual variability once the rest of the predictors are in the model?
10.1 Added variable plots

Consider a pool of predictors $x_1, \dots, x_k$, and a particular predictor $x_j$, $j = 1, \dots, k$.

Regress $Y_i$ on all predictors except $x_j$; call the residuals $e_i(Y \mid x_{-j})$. Regress $x_j$ on all predictors except $x_j$; call the residuals $e_i(x_j \mid x_{-j})$. The added variable plot for $x_j$ is $e_i(Y \mid x_{-j})$ vs. $e_i(x_j \mid x_{-j})$.

The least squares estimate $b_j$ obtained from fitting a line (through the origin) to this plot is the same as the one obtained from fitting the full model
$$Y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik} + \epsilon_i$$
(Christensen, 1996).

The plot gives an idea of the functional form of $x_j$: a transformation of $x_j$ should mimic the pattern seen in the plot; the methods of Section 3.9 apply.
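A minimal numpy sketch (not from the slides; the simulated predictors and coefficients are made up) of constructing the added variable plot coordinates for one predictor and checking that the through-the-origin slope equals $b_j$ from the full fit:

```python
import numpy as np

# Made-up data: three predictors, with x3 the one we examine.
rng = np.random.default_rng(3)
n = 80
x1, x2, x3 = rng.normal(size=(3, n))
y = 1 + 0.5 * x1 - 1.0 * x2 + 2.0 * x3 + rng.normal(size=n)

def resid(y_, X_):
    """Residuals from an OLS fit of y_ on the columns of X_ (intercept added)."""
    X_ = np.column_stack([np.ones(len(y_)), X_])
    return y_ - X_ @ np.linalg.lstsq(X_, y_, rcond=None)[0]

# Added variable plot coordinates for x3: residualize Y and x3 on the others.
e_y  = resid(y,  np.column_stack([x1, x2]))   # e_i(Y | x_{-3})
e_x3 = resid(x3, np.column_stack([x1, x2]))   # e_i(x_3 | x_{-3})

# Slope of the line through the origin fit to the plot ...
slope = np.sum(e_x3 * e_y) / np.sum(e_x3 ** 2)

# ... equals b_3 from the full model Y ~ x1 + x2 + x3.
Xfull = np.column_stack([np.ones(n), x1, x2, x3])
b_full = np.linalg.lstsq(Xfull, y, rcond=None)[0]
print(slope, b_full[3])   # these agree

# Plotting e_x3 against e_y (e.g. with matplotlib) gives the added variable plot.
```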