Variable Selection and Model Building
STAT 563, Spring 2007

Selection of the final equation

Two opposing views:
- To make the equation useful for prediction, we want to include as many predictors (original, transformed, etc.) as possible.
- Because of the cost of obtaining information on a large number of variables and of subsequently monitoring them, we would like the equation to include as few predictors as possible.

There is no unique statistical procedure for resolving this tension; the usual compromise is to select the "best" regression equation by some criterion.

Dataset: Supervisor Performance Data
Y: overall rating of the job being done by the supervisor
X1: handles employee complaints
X2: does not allow special privileges
X3: opportunity to learn new things
X4: raises based on performance
X5: too critical of poor performance
X6: rate of advancing to better jobs

Criteria for evaluating equations

Mean square error (MSE): between two equations, the one with the smaller MSE is usually preferred, especially when the objective is forecasting. Recall that MSE is related to R^2.
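These fit criteria can be sketched numerically. The Supervisor Performance values are not reproduced in this preview, so the sketch below uses synthetic data, and the helper name `fit_metrics` is illustrative, not from the notes. It computes SSE, MSE, R^2, and adjusted R^2 for two nested models, showing that adding a predictor never lowers R^2 but can lower adjusted R^2.

```python
import numpy as np

def fit_metrics(X, y):
    """Fit OLS by least squares; return SSE, MSE_p, R^2, adjusted R^2.

    X must already include a column of ones; p = X.shape[1] = k + 1.
    """
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sse = float(resid @ resid)
    sst = float(((y - y.mean()) ** 2).sum())
    mse = sse / (n - p)                        # MSE_p = SSE_p / (n - p)
    r2 = 1.0 - sse / sst
    r2_adj = 1.0 - (n - 1) / (n - p) * (1.0 - r2)
    return {"SSE": sse, "MSE": mse, "R2": r2, "R2_adj": r2_adj}

# Synthetic data (illustrative only): y depends on x1 but not x2.
rng = np.random.default_rng(0)
n = 30
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)                        # irrelevant predictor
y = 3.0 + 2.0 * x1 + rng.normal(scale=0.5, size=n)

X_small = np.column_stack([np.ones(n), x1])
X_big = np.column_stack([np.ones(n), x1, x2])
m_small = fit_metrics(X_small, y)
m_big = fit_metrics(X_big, y)
# R^2 never decreases when a predictor is added; adjusted R^2 may.
print(m_small)
print(m_big)
```

Because the ratio (n-1)/(n-p) exceeds 1, adjusted R^2 is always at most R^2, which is exactly the penalty for extra predictors that the notes describe.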
Criterion: R^2 and adjusted R^2

We can use the ordinary R^2 or the adjusted R^2 to judge the adequacy of a fit. For a p-parameter equation (p = k + 1, where k is the number of predictors), recall

$$MSE_p = \frac{SSE_p}{n - p}$$

$$R^2_p = 1 - \frac{SSE_p}{SST} = 1 - \frac{(n - p)\,MSE_p}{SST}$$

$$R^2_{adj,p} = 1 - \frac{n - 1}{n - p}\left(1 - R^2_p\right) = 1 - \frac{(n - 1)\,MSE_p}{SST}$$

Note that adjusted R^2 is more appropriate when comparing models with different numbers of predictors, because it adjusts (penalizes) for the number of predictors in the model.

Mallows C_p

Recall from the earlier chapter that the mean square error of a fitted value decomposes as

$$E[\hat{y}_i - E(y_i)]^2 = \underbrace{[E(\hat{y}_i) - E(y_i)]^2}_{\text{squared bias}} + \underbrace{Var(\hat{y}_i)}_{\text{variance}}$$

Define the total squared bias for a p-term equation as

$$SS_B(p) = \sum_{i=1}^{n} [E(\hat{y}_i) - E(y_i)]^2$$

Define the standardized total mean square error as

$$\Gamma_p = \frac{1}{\sigma^2} \left\{ \sum_{i=1}^{n} [E(\hat{y}_i) - E(y_i)]^2 + \sum_{i=1}^{n} Var(\hat{y}_i) \right\} = \frac{1}{\sigma^2} \left\{ SS_B(p) + \sum_{i=1}^{n} Var(\hat{y}_i) \right\}$$

Recall that we have shown

$$\sum_{i=1}^{n} Var(\hat{y}_i) = p\sigma^2, \qquad E[SSE_p] = SS_B(p) + (n - p)\sigma^2$$

Substituting, we get

$$\Gamma_p = \frac{1}{\sigma^2} \left\{ E[SSE_p] - (n - p)\sigma^2 + p\sigma^2 \right\} = \frac{E[SSE_p]}{\sigma^2} - n + 2p$$

Suppose $\hat{\sigma}^2$ is a good estimate of $\sigma^2$. Then replacing $E[SSE_p]$ by the observed value $SSE_p$ produces an estimate of $\Gamma_p$, namely

$$C_p = \frac{SSE_p}{\hat{\sigma}^2} - n + 2p$$

Note: if the p-term model has negligible bias, then $SS_B(p) = 0$, and consequently $E[C_p \mid \text{bias} = 0] = p$. Plot $C_p$ versus p and look for models with $C_p$ values falling near the line $C_p = p$ (models with little bias).
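The C_p computation above can be sketched as follows. The data are synthetic and the helper names (`sse_of`, `mallows_cp`) are illustrative; the sketch assumes the conventional choice of estimating sigma^2 by the MSE of the full model, which makes C_p of the full model equal exactly to its parameter count p.

```python
import numpy as np
from itertools import combinations

def sse_of(X, y):
    """Residual sum of squares from an OLS fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return float(r @ r)

def mallows_cp(X_full, y, cols):
    """C_p = SSE_p / sigma2_hat - n + 2p, with sigma2_hat = full-model MSE.

    cols selects predictor columns of X_full; column 0 (intercept) is
    always kept.
    """
    n, k_full = X_full.shape
    sigma2_hat = sse_of(X_full, y) / (n - k_full)
    Xp = X_full[:, [0] + list(cols)]
    p = Xp.shape[1]
    return sse_of(Xp, y) / sigma2_hat - n + 2 * p

# Synthetic data: columns 1 and 2 matter, column 3 does not.
rng = np.random.default_rng(1)
n = 40
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
y = 1.0 + 2.0 * X[:, 1] - 1.5 * X[:, 2] + rng.normal(scale=0.5, size=n)

# C_p for every subset of the 3 predictors. Good subsets fall near
# the line C_p = p; underfit (biased) subsets sit far above it.
for r in range(4):
    for cols in combinations(range(1, 4), r):
        print(cols, mallows_cp(X, y, cols))
```

Dropping a truly active predictor inflates SSE_p through the bias term SS_B(p), so those subsets land well above the C_p = p line, mirroring the plot the notes recommend.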
This note was uploaded on 03/08/2009 for the course 960 563 taught by Professor Unknown during the Spring '07 term at Rutgers.