Lecture8

# Lecture 8. Model Selection


Model building can be thought of as a multi-step process:

1. Data collection and preparation.
2. Model estimation.
3. Model refinement and selection.
4. Model validation.

We have already discussed a few techniques for choosing important predictors, e.g. $t$- and $F$-statistics and avoiding multicollinearity. Here is a list of the most popular model selection criteria. Throughout, $p$ is the number of predictors in the candidate model, so the model has $p + 1$ coefficients including the intercept.

R-squared

$$R^2_p = 1 - \frac{SSE_p}{SST}$$

Adjusted R-squared

$$R^2_{adj,p} = 1 - \frac{SSE_p / (n - p - 1)}{SST / (n - 1)} = 1 - \frac{n - 1}{n - p - 1}\,(1 - R^2_p)$$

The adjusted $R^2_p$ carries a penalty term for each regressor and does not necessarily increase when a new regressor is added. Hence, the adjusted $R^2_p$ is preferred over $R^2_p$.

Mallows' $C_p$ Criterion

$$C_p = \frac{SSE_p}{s^2} - [\,n - 2(p + 1)\,]$$

Here $s^2$ is an estimate of $\sigma^2$, usually the mean squared error of the full model. If the candidate model is adequate, $SSE_p$ is an estimate of $(n - p - 1)\sigma^2$. Hence, $C_p \approx p + 1$ in this case. If the model is inadequate, then $SSE_p$ tends to exceed $(n - p - 1)\sigma^2$ and $C_p > p + 1$. Hence, we search for models whose $C_p$ value is small and satisfies $C_p \approx p + 1$. When $C_p$ is small, the mean squared error is small. Also, when $C_p \approx p + 1$, the bias of the regression model is small.

Akaike Information Criterion (AIC)

$$AIC_p = n \log\left(\frac{SSE_p}{n}\right) + 2(p + 1)$$

Rule of thumb: smaller AIC is better.

Bayesian Information Criterion (BIC)

$$BIC_p = n \log\left(\frac{SSE_p}{n}\right) + (p + 1)\log n$$

Rule of thumb: smaller BIC is better.
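The criteria above can be computed directly from the residual sum of squares of a candidate model. The following is a minimal sketch, not part of the original notes: the function names `fit_sse` and `selection_criteria` are hypothetical, the data are synthetic, $s^2$ is taken to be the mean squared error of the full model, and BIC uses the standard penalty $(p + 1)\log n$.

```python
import numpy as np

def fit_sse(X, y):
    """Least-squares fit with an intercept; returns the residual sum of squares."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])          # add the intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)

def selection_criteria(X, y, s2):
    """Criteria for a candidate model with p = X.shape[1] predictors.

    s2 is an estimate of sigma^2 (assumed here: MSE of the full model).
    """
    n, p = X.shape
    sse = fit_sse(X, y)
    sst = float(np.sum((y - y.mean()) ** 2))
    r2 = 1 - sse / sst
    return {
        "R2": r2,
        "R2_adj": 1 - (n - 1) / (n - p - 1) * (1 - r2),
        "Cp": sse / s2 - (n - 2 * (p + 1)),
        "AIC": n * np.log(sse / n) + 2 * (p + 1),
        "BIC": n * np.log(sse / n) + np.log(n) * (p + 1),
    }

# Synthetic illustration: three predictors, the third one is irrelevant.
rng = np.random.default_rng(0)
n = 50
X_full = rng.normal(size=(n, 3))
y = 1.0 + 2.0 * X_full[:, 0] - 1.0 * X_full[:, 1] + rng.normal(size=n)

s2 = fit_sse(X_full, y) / (n - 3 - 1)             # MSE of the full model
full = selection_criteria(X_full, y, s2)
sub = selection_criteria(X_full[:, :2], y, s2)    # candidate without the third predictor

for name, crit in [("full", full), ("sub", sub)]:
    print(name, {k: round(float(v), 3) for k, v in crit.items()})
```

With this choice of $s^2$, the full model always has $C_p = p + 1$ exactly, so $C_p$ is only informative when comparing proper subsets against the full model.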

Prediction Sum of Squares (PRESS)

The prediction sum of squares is a measure of how well the model can predict the observed responses $y_i$. Recall that the deleted residual is $e_{(i)} = y_i - y^*_{(i)}$, where $y^*_{(i)} = x_i' \hat{\beta}_{(i)}$ is the prediction of $y_i$ from the model fitted without the $i$-th observation. Then

$$PRESS_p = \sum_{i=1}^{n} e_{(i)}^2 = \sum_{i=1}^{n} \left( \frac{e_i}{1 - h_{ii}} \right)^2,$$

where $e_i$ is the ordinary residual and $h_{ii}$ is the $i$-th diagonal element of the hat matrix.

Rule of thumb: smaller PRESS is better.

There exist a number of automatic procedures for model selection.

Forward Stepwise (FS) Regression

This procedure adds variables one step at a time until the best model under the chosen search criterion is reached. At each step, an $F$-test, AIC or BIC is used to determine whether the candidate variable should enter the model. In particular:

1. Begin with the SLR model with the single predictor that has the highest sample correlation with the response $Y$.
2. Add to the model the predictor that meets three equivalent criteria:
   (a) it has the highest sample partial correlation, in absolute value, with the response, adjusting for the predictors already in the equation;
   (b) adding it increases $R^2$ more than adding any other single variable;
   (c) it would have the largest $t$- or $F$-statistic of all the variables that are not already in the model.
3. Continue until a stopping rule is met, where possible rules are:
   (a) stop with a subset of a predetermined size $p^*$;
   (b) stop if the absolute value of the $t$-statistic of the best remaining candidate (or, alternatively, its $F$-statistic) is less than some predetermined number $\kappa$ (or $\kappa^2$ for the $F$-statistic);
   (c) stop when multicollinearity occurs.

Note: you can use AIC, BIC or $C_p$ instead of $F$.

Backward Stepwise (BS) Regression.
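Returning to PRESS and the forward stepwise procedure described above, here is a minimal sketch of both, not taken from the original notes. The helper names (`design`, `sse`, `press`, `forward_select`) are hypothetical, the data are synthetic, and the entry threshold `f_in = 4.0` (that is, $\kappa = 2$) is an arbitrary illustrative choice. The sketch uses the partial $F$-statistic as the entry criterion and stopping rule 3(b).

```python
import numpy as np

def design(X, cols):
    """Design matrix: an intercept column plus the chosen predictor columns."""
    return np.column_stack([np.ones(X.shape[0])] + [X[:, j] for j in cols])

def sse(X, y, cols):
    """Residual sum of squares of the model using predictors `cols`."""
    A = design(X, cols)
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    r = y - A @ beta
    return float(r @ r)

def press(X, y, cols):
    """PRESS via the shortcut sum((e_i / (1 - h_ii))^2), with no refitting."""
    A = design(X, cols)
    H = A @ np.linalg.pinv(A)                 # hat matrix
    e = y - H @ y                             # ordinary residuals
    return float(np.sum((e / (1 - np.diag(H))) ** 2))

def forward_select(X, y, f_in=4.0):
    """Forward selection: add the predictor with the largest partial F
    until the best candidate's F falls below f_in (assumed threshold)."""
    n, k = X.shape
    chosen, remaining = [], list(range(k))
    while remaining:
        sse_cur = sse(X, y, chosen)
        best_j, best_f = None, -np.inf
        for j in remaining:
            sse_new = sse(X, y, chosen + [j])
            df = n - len(chosen) - 2          # residual df of the enlarged model
            f = (sse_cur - sse_new) / (sse_new / df)
            if f > best_f:
                best_j, best_f = j, f
        if best_f < f_in:                     # stopping rule 3(b)
            break
        chosen.append(best_j)
        remaining.remove(best_j)
    return chosen

# Synthetic illustration: only predictors 0 and 2 affect the response.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 4))
y = 3.0 + 3.0 * X[:, 0] - 1.0 * X[:, 2] + 0.5 * rng.normal(size=100)

chosen = forward_select(X, y)
press_chosen = press(X, y, chosen)
print("selected predictors:", chosen)
print("PRESS of selected model:", round(press_chosen, 3))
```

Starting from the intercept-only model makes the first step pick the predictor with the highest absolute correlation with $Y$, which matches step 1 of the procedure. Swapping the partial $F$ for AIC, BIC or $C_p$ only changes the quantity compared inside the loop.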

## This note was uploaded on 01/12/2012 for the course STAT 331 taught by Professor Yuliagel during the Spring '08 term at Waterloo.
