ISYE6414 Summer 2010 Lecture 8
Multiple Regression: Even More Features
Dr. Kobi Abayomi
June 24, 2010

1 Introduction

2 Model Selection

Often a statistician will have data on a large number of predictors and wish to build a regression model involving only a subset of them. A reduced subset yields a parsimonious model that is easier to interpret and understand. The fundamental concerns in subset selection are the selection criteria and the model search procedure.

2.1 Testing Procedures For Variable Selection

Remember that we used k as an index for the number of predictors in the linear model. For a fixed value of k, it is reasonable to identify the best model as the one having the minimum $SSE_k$. The more difficult issue concerns comparison of $SSE_k$'s across different values of k. Three common criteria are:

2.1.1 Stepwise Regression

When the number of predictors is too large to allow explicit examination of all possible subsets, we look to alternative selection procedures. The simplest is backward elimination.
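For contrast, here is a minimal sketch (not from the lecture) of the exhaustive search that these procedures avoid: for each subset size k it fits every candidate model and records the one with the smallest $SSE_k$. The function name best_sse and the data frame dat (response in the first column, predictors in the rest) are illustrative placeholders.

### Exhaustive best-subsets search by SSE_k (illustrative sketch only)
best_sse <- function(dat){
  y     <- names(dat)[1]
  preds <- names(dat)[-1]
  out   <- data.frame(k = seq_along(preds), sse = NA, model = NA)
  for(k in seq_along(preds)){
    subsets <- combn(preds, k, simplify = FALSE)       # all subsets of size k
    sse <- sapply(subsets, function(s){
      f <- reformulate(s, response = y)                # e.g. y ~ x1 + x3
      sum(resid(lm(f, data = dat))^2)                  # SSE_k for this subset
    })
    out$sse[k]   <- min(sse)
    out$model[k] <- paste(subsets[[which.min(sse)]], collapse = " + ")
  }
  out
}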
2.1.2 Backward Elimination

We start with the largest model, containing all m predictors, and use each t statistic as the criterion for exclusion from the model. If the t statistic with the smallest absolute value is less than a pre-specified constant (set by our α level), then its corresponding predictor is deleted from the model. That is, when

$$\min_i \left| \frac{\hat{\beta}_i}{\mathrm{s.e.}(\hat{\beta}_i)} \right| < t^*$$

we delete that predictor, fit the new model, and repeat. Often, $t^*$ is set to 2 (Why?).

2.1.3 Forward Selection

Forward selection starts with no predictors in the model and enters predictors x_j into the model one at a time. For example, suppose x_1 enters the model at iteration 1. Then the models (x_1, x_2), ..., (x_1, x_m) are fit, and

$$\max_i \left| \frac{\hat{\beta}_i}{\mathrm{s.e.}(\hat{\beta}_i)} \right| > t^*$$

is used to enter the next predictor into the model.

2.1.4 Forward Backward

A combination of backward elimination and forward selection. The procedure starts by adding variables to the model, but after each addition examines all variables for elimination. The rule is

$$\max_i \left| \frac{\hat{\beta}_i}{\mathrm{s.e.}(\hat{\beta}_i)} \right| > t^{**}$$

to add a predictor and then

$$\min_{j \neq i} \left| \frac{\hat{\beta}_j}{\mathrm{s.e.}(\hat{\beta}_j)} \right| < t^*$$

to remove one. Here, $t^*$ and $t^{**}$ are not necessarily equal.
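The backward-elimination rule in 2.1.2 is simple to automate. The sketch below (not in the lecture) refits repeatedly, dropping the predictor with the smallest |t| until every remaining |t| exceeds t*; the names backward_eliminate, full_formula, and dat are placeholders, and factor predictors would need term-level handling beyond this sketch.

### Backward elimination by the t-statistic rule, with t* = 2 (illustrative sketch only)
backward_eliminate <- function(full_formula, dat, t_star = 2){
  g <- lm(full_formula, data = dat)
  repeat{
    tvals <- abs(coef(summary(g))[, "t value"])[-1]   # |t| for each predictor; intercept dropped
    if(length(tvals) == 0 || min(tvals) >= t_star) break
    worst <- names(which.min(tvals))                  # predictor with the smallest |t|
    g <- update(g, as.formula(paste(". ~ . -", worst)))
  }
  g
}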
The idea behind Forward Backward (FB) selection is that a single variable may be more strongly related to the response y than any of two or more other variables individually, but the combination of those variables may make the single variable subsequently redundant.

#### census data from the 1970s
statedata <- read.csv(file="statedata.csv")
names(statedata)

g <- lm(Life.Exp ~ ., data=statedata)
summary(g)
### why isn't income significant?

### let's illustrate backward elimination
?update
g <- lm(Life.Exp ~ ., data=statedata)
summary(g)
g <- update(g, . ~ . - Illiteracy)
summary(g)
g <- update(g, . ~ . - Income)
summary(g)
g <- update(g, . ~ . - Population)
summary(g)
### R^2 only went down to .713 in final model
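Not part of the lecture's code: R's built-in step() automates this kind of search, adding and dropping terms much like the Forward Backward procedure, though it ranks candidate models by AIC rather than by the t-statistic rule above.

### Stepwise search on the same data using step() (alternative, AIC-based)
g  <- lm(Life.Exp ~ ., data = statedata)
gs <- step(g, direction = "both")   # "both" = forward-backward stepwise
summary(gs)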