# STAT 512: Applied Regression Analysis, Topic 5 (Spring 2008)

## Topic Overview (Ch. 9 & 10)

- Model selection
- Diagnostics and remedial measures
- Influential observations and outliers

## Chapter 9: Variable Selection and Model Building

We usually want to choose a model that includes a subset of the available explanatory variables. This raises two separate but related questions:

- How many explanatory variables should we use (i.e., what subset size)? Smaller sets are more convenient, but larger sets may explain more of the variance (SS) in the response.
- Given the subset size, which variables should we choose?

### Criteria for Model Selection

Several different criteria are available for determining an appropriate subset of the predictor variables. We will go through them one at a time, noting their benefits and drawbacks. They include $R^2$, adjusted $R^2$, Mallows' $C_p$, MSE, PRESS, AIC, and SBC. SAS will provide these statistics, so you should pay more attention to what they are good for than to how they are computed. To obtain them from SAS, place `/ selection = maxr adjrsq cp` after the model statement. Note that the different criteria may not lead to the same model in every case.

### $R^2$ and Adjusted $R^2$ (or MSE) Criterion

- The text uses $R^2_p = R^2 = 1 - SSE/SST$. The subscript $p$ is just the number of variables in the associated model.
- The goal in model selection is to maximize this criterion. One MAJOR drawback to $R^2$ is that the addition of any variable to the model (significant or not) will increase $R^2$, perhaps not enough to notice, depending on the variable. At some point, added variables just get in the way!
- The adjusted $R^2$ criterion penalizes the $R^2$ value based on the number of variables in the model. Hence it eventually starts decreasing as unnecessary variables are added:
  $$R^2_a = 1 - \frac{n-1}{n-p}\,\frac{SSE}{SST}$$
  (we end up subtracting off more as $p$ is increased).
- Maximizing the adjusted $R^2$ criterion is one way to select a model.
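The behavior of $R^2$ versus adjusted $R^2$ can be illustrated numerically. Below is a minimal Python/NumPy sketch (not the SAS workflow the notes use) on simulated data, where `x2` is a deliberately irrelevant predictor: $R^2$ can only go up when it is added, while the adjusted version is penalized for the extra coefficient.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50

# Simulated data: y depends on x1 only; x2 is pure noise.
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)  # irrelevant predictor
y = 2.0 + 1.5 * x1 + rng.normal(size=n)

def fit_stats(X, y):
    """Least-squares fit; return (R^2, adjusted R^2).

    p counts regression coefficients including the intercept,
    matching the convention in the notes.
    """
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    sse = np.sum((y - X @ beta) ** 2)
    sst = np.sum((y - y.mean()) ** 2)
    r2 = 1 - sse / sst
    adj_r2 = 1 - (n - 1) / (n - p) * sse / sst
    return r2, adj_r2

X_small = np.column_stack([np.ones(n), x1])       # intercept + x1
X_big = np.column_stack([np.ones(n), x1, x2])     # adds the noise variable

r2_s, adj_s = fit_stats(X_small, y)
r2_b, adj_b = fit_stats(X_big, y)

# R^2 never decreases when a variable is added; adjusted R^2 may.
print(f"small model: R2={r2_s:.4f}  adjR2={adj_s:.4f}")
print(f"big model:   R2={r2_b:.4f}  adjR2={adj_b:.4f}")
```

Note that the adjusted value is always at most the raw $R^2$, since the multiplier $(n-1)/(n-p)$ is at least 1.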
As the text points out, this is equivalent to minimizing the MSE, since
$$R^2_a = 1 - \frac{n-1}{n-p}\,\frac{SSE}{SST} = 1 - \frac{MSE}{SST/(n-1)} = 1 - \frac{MSE}{\text{constant}}.$$

### Mallows' $C_p$ Criterion

- The basic idea is to compare subset models with the full model.
- The full model is good at prediction, but if there is multicollinearity our interpretations of the parameter estimates may not make sense. A subset model is good if there is no substantial bias in the predicted values (relative to the full model).
- The $C_p$ criterion looks at the ratio of the error SS for the subset model to the MSE of the full model, then adds a penalty for the number of variables:
  $$C_p = \frac{SSE_p}{MSE(\text{Full})} - (n - 2p)$$
- $SSE_p$ is based on a specific choice of $p - 1$ variables ($p$ is the number of regression coefficients, including the intercept), while $MSE$ is based on the full set of variables. ...
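A useful sanity check on the $C_p$ formula is that, for the full model itself, $SSE_p = SSE(\text{Full}) = (n - P)\,MSE(\text{Full})$, so $C_p = (n - P) - (n - 2P) = P$ exactly. The hedged Python/NumPy sketch below (simulated data, not from the notes) verifies this identity numerically:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 60

# Full design: intercept plus three predictors; only the first matters.
X_full = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
beta_true = np.array([1.0, 2.0, 0.0, 0.0])
y = X_full @ beta_true + rng.normal(size=n)

def sse(X, y):
    """Error sum of squares from a least-squares fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.sum((y - X @ beta) ** 2)

P = X_full.shape[1]                  # coefficients in the full model
mse_full = sse(X_full, y) / (n - P)  # MSE(Full)

def mallows_cp(X_sub, y):
    """C_p = SSE_p / MSE(Full) - (n - 2p), p = columns of X_sub."""
    p = X_sub.shape[1]
    return sse(X_sub, y) / mse_full - (n - 2 * p)

# For the full model, C_p equals P by construction.
cp_full = mallows_cp(X_full, y)

# For a subset model: intercept + the one relevant predictor (p = 2).
cp_sub = mallows_cp(X_full[:, :2], y)
print(f"C_p(full) = {cp_full:.4f}, C_p(subset) = {cp_sub:.4f}")
```

For an unbiased subset model, $C_p$ should land near $p$; values far above $p$ signal substantial bias relative to the full model.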