# Chapter09 - Variable Selection and Model Building STAT 563...

Variable Selection and Model Building STAT 563 Spring 2007

Selection of final equation

Two opposing views:

• To make the equation useful for prediction purposes, we want to include as many predictors (original, transformed, etc.) as possible.
• Because of the costs involved in obtaining information on a large number of transformed variables and subsequently monitoring them, we would like the equation to include as few predictors as possible.

There is no unique statistical procedure for doing this. The compromise is "selecting the best regression equation".

Dataset: Supervisor Performance Data

• Y: overall rating of job being done by supervisor
• X1: handles employee complaints
• X2: does not allow special privileges
• X3: opportunity to learn new things
• X4: raises based on performance
• X5: too critical of poor performance
• X6: rate of advancing to better jobs

Criteria for evaluating equations

Mean square error (MSE)

• Between two equations, the one with the smaller MSE is usually preferred, especially when the objective is forecasting.
• Recall that MSE is related to $R^2$. We can use the original $R^2$ or the adjusted $R^2$ to judge the adequacy of a fit.
• Recall that

$$MSE_p = \frac{SSE_p}{n - p}, \qquad p = k + 1$$

R-square

Recall

$$R_p^2 = 1 - \frac{(n - p)\,MSE_p}{SST}$$

$$R_{adj}^2 = 1 - \frac{(n - 1)\,MSE_p}{SST} = 1 - \frac{n - 1}{n - p}\,(1 - R_p^2)$$

Note that the adjusted $R^2$ is more appropriate when comparing models with different numbers of predictors, because it adjusts (penalizes) for the number of predictors in the model.
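
The three criteria above can be computed directly from $SSE_p$, $SST$, $n$ and $p$. The following is a minimal sketch, not part of the original notes; the function names and the numbers in the example are made up purely for illustration.

```python
def mse(sse_p, n, p):
    """MSE_p = SSE_p / (n - p), where p counts the intercept (p = k + 1)."""
    return sse_p / (n - p)


def r_squared(sse_p, sst):
    """R_p^2 = 1 - SSE_p / SST, equivalently 1 - (n - p) * MSE_p / SST."""
    return 1.0 - sse_p / sst


def adjusted_r_squared(sse_p, sst, n, p):
    """Adjusted R^2 = 1 - (n - 1) * MSE_p / SST."""
    return 1.0 - (n - 1) * mse(sse_p, n, p) / sst


# Illustrative (made-up) values: 30 observations, 2 predictors plus intercept.
n, p = 30, 3
sse_p, sst = 40.0, 100.0

r2 = r_squared(sse_p, sst)
r2_adj = adjusted_r_squared(sse_p, sst, n, p)

# The second form of the adjusted R^2 from the slide.
r2_adj_alt = 1.0 - (n - 1) / (n - p) * (1.0 - r2)

print("R^2 =", r2, " adjusted R^2 =", r2_adj)
```

Both expressions for the adjusted $R^2$ agree, and the adjusted value is below $R_p^2$ whenever $p > 1$ and the fit is not perfect.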

Mallows $C_p$

Recall from the earlier chapter that the mean square error of a fitted value is

$$E[\hat{y}_i - E(y_i)]^2 = [E(y_i) - E(\hat{y}_i)]^2 + \mathrm{Var}(\hat{y}_i)$$

where the first term is the squared bias and the second is the variance component.

Define the total squared bias for a p-term equation as

$$SS_B(p) = \sum_{i=1}^{n} [E(y_i) - E(\hat{y}_i)]^2$$

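
The decomposition of the mean square error into squared bias plus variance can be verified in one step. The sketch below adds and subtracts $E(\hat{y}_i)$; it is a standard argument supplied here for convenience, not taken from the slides.

```latex
\begin{aligned}
E[\hat{y}_i - E(y_i)]^2
  &= E\big[\{\hat{y}_i - E(\hat{y}_i)\} + \{E(\hat{y}_i) - E(y_i)\}\big]^2 \\
  &= E[\hat{y}_i - E(\hat{y}_i)]^2
     + 2\,\{E(\hat{y}_i) - E(y_i)\}\,E[\hat{y}_i - E(\hat{y}_i)]
     + \{E(\hat{y}_i) - E(y_i)\}^2 \\
  &= \mathrm{Var}(\hat{y}_i) + [E(y_i) - E(\hat{y}_i)]^2 ,
\end{aligned}
```

The cross term vanishes because $E[\hat{y}_i - E(\hat{y}_i)] = 0$ and $E(\hat{y}_i) - E(y_i)$ is a constant.
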
Mallows $C_p$

Define the standardized total mean square error as

$$\Gamma_p = \frac{1}{\sigma^2}\left\{\sum_{i=1}^{n} [E(y_i) - E(\hat{y}_i)]^2 + \sum_{i=1}^{n} \mathrm{Var}(\hat{y}_i)\right\} = \frac{SS_B(p)}{\sigma^2} + \frac{1}{\sigma^2}\sum_{i=1}^{n} \mathrm{Var}(\hat{y}_i)$$

Recall that we have shown

$$\sum_{i=1}^{n} \mathrm{Var}(\hat{y}_i) = p\,\sigma^2, \qquad E[SSE_p] = SS_B(p) + (n - p)\,\sigma^2$$
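
The identity $\sum \mathrm{Var}(\hat{y}_i) = p\,\sigma^2$ can be checked numerically in the simplest case. The sketch below assumes a simple linear regression with an intercept ($p = 2$), where $\mathrm{Var}(\hat{y}_i) = \sigma^2 h_{ii}$ and the leverage is $h_{ii} = 1/n + (x_i - \bar{x})^2 / S_{xx}$. The `x` values and `sigma2` are arbitrary illustrative choices, not data from the notes.

```python
def leverages(x):
    """Diagonal of the hat matrix for simple linear regression with intercept."""
    n = len(x)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return [1.0 / n + (xi - xbar) ** 2 / sxx for xi in x]


# Arbitrary illustrative design points and error variance (assumptions).
x = [1.0, 2.0, 4.0, 7.0, 11.0]
sigma2 = 3.0

h = leverages(x)

# Var(y_hat_i) = sigma^2 * h_ii, so the total is sigma^2 * sum(h_ii).
total_variance = sigma2 * sum(h)

p = 2  # intercept + one slope
print("sum of leverages:", sum(h), " p:", p)
print("total variance:", total_variance, " p * sigma^2:", p * sigma2)
```

The leverages sum to $p$ regardless of the design points chosen, which is why the variance part of $\Gamma_p$ depends only on the number of terms in the model.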

Mallows $C_p$

Substituting, we get

$$\Gamma_p = \frac{1}{\sigma^2}\left\{E[SSE_p] - (n - p)\,\sigma^2 + p\,\sigma^2\right\} = \frac{E[SSE_p]}{\sigma^2} - n + 2p$$

Suppose $\hat{\sigma}^2$ is a good estimate of $\sigma^2$. Then replacing $E[SSE_p]$ by the observed value $SSE_p$ produces an estimate of $\Gamma_p$, say,

$$C_p = \frac{SSE_p}{\hat{\sigma}^2} - n + 2p$$

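
The $C_p$ formula is a one-line computation once $\hat{\sigma}^2$ has been chosen. The sketch below assumes $\hat{\sigma}^2$ is taken to be the MSE of the full model, which is a common choice but is not stated in the slides; the function name and all numbers are hypothetical.

```python
def mallows_cp(sse_p, sigma2_hat, n, p):
    """C_p = SSE_p / sigma2_hat - n + 2p for a p-term model (p includes the intercept)."""
    return sse_p / sigma2_hat - n + 2 * p


# Hypothetical numbers: n = 30 observations, full model with 4 terms.
n = 30
p_full = 4
sse_full = 52.0

# Assumption: estimate sigma^2 by the full-model MSE.
sigma2_hat = sse_full / (n - p_full)   # 52 / 26 = 2.0

# With this choice the full model always has C_p equal to its own p.
cp_full = mallows_cp(sse_full, sigma2_hat, n, p_full)

# A hypothetical 2-term submodel with a larger residual sum of squares.
cp_sub = mallows_cp(70.0, sigma2_hat, n, 2)

print("C_p (full model):", cp_full)
print("C_p (2-term submodel):", cp_sub)
```

Because the full model gets $C_p = p$ by construction under this choice of $\hat{\sigma}^2$, the criterion is only informative for comparing the submodels.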
Note

• If the p-term model has negligible bias, then $SS_B(p) = 0$. As a result,

$$E[SSE_p] = (n - p)\,\sigma^2 \quad \text{and} \quad E[C_p \mid \text{Bias} = 0] = \frac{(n - p)\,\sigma^2}{\sigma^2} - n + 2p = p$$

• Plot $C_p$ versus $p$ and look for models with $C_p$ values falling near the line $C_p = p$ (models with little bias).
• Generally, small values of $C_p$ are desirable.
• Sometimes, it might be desirable to accept some bias in …
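
The screening rule above (compute $C_p$ for each candidate model and compare it with $p$) can be sketched for all subsets of a small predictor set. Everything in the example is an assumption made for illustration: the data are synthetic, $\hat{\sigma}^2$ is taken from the full model, and the helper names `solve`, `sse` and `all_subsets_cp` are hypothetical. The least-squares fit uses the normal equations so that only the standard library is needed.

```python
import random
from itertools import combinations


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def sse(columns, y):
    """Residual sum of squares for OLS of y on an intercept plus the given columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    xtx = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(p)] for a in range(p)]
    xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(p)]
    beta = solve(xtx, xty)
    return sum((y[i] - sum(X[i][a] * beta[a] for a in range(p))) ** 2 for i in range(n))


def all_subsets_cp(predictors, y):
    """Map each predictor subset to (p, C_p); sigma^2 is estimated by the full-model MSE."""
    n = len(y)
    names = sorted(predictors)
    p_full = len(names) + 1
    sigma2_hat = sse([predictors[k] for k in names], y) / (n - p_full)
    table = {}
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            p = size + 1  # number of terms, including the intercept
            sse_p = sse([predictors[k] for k in subset], y)
            table[subset] = (p, sse_p / sigma2_hat - n + 2 * p)
    return table


# Synthetic data (an assumption): y depends on x1 and x2 only; x3 is pure noise.
rng = random.Random(0)
n = 30
data = {name: [rng.gauss(0, 1) for _ in range(n)] for name in ("x1", "x2", "x3")}
y = [2 + 1.5 * a - b + rng.gauss(0, 1) for a, b in zip(data["x1"], data["x2"])]

cp_table = all_subsets_cp(data, y)
for subset, (p, cp) in sorted(cp_table.items(), key=lambda kv: kv[1]):
    print("predictors:", subset, " p:", p, " C_p:", round(cp, 2))
```

In the printed table, models whose $C_p$ is far above $p$ are the ones suspected of substantial bias. The full model sits exactly on the line $C_p = p$ because of how $\hat{\sigma}^2$ was chosen, so it should not be read as evidence in its favor.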

Information Criteria