Criterion-based procedures

1. Information criterion statistics: Akaike's Information Criterion (AIC), the Bayesian Information Criterion (BIC, also called Schwarz's Bayesian Criterion, SBC), and Amemiya's Prediction Criterion (APC):
\[ \mathrm{AIC} = n\ln(\mathrm{SSE}) - n\ln n + 2p \]
\[ \mathrm{BIC} = n\ln(\mathrm{SSE}) - n\ln n + p\ln(n) \]
\[ \mathrm{APC} = \frac{n+p}{n(n-p)}\,\mathrm{SSE} \]
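A minimal numerical sketch (not part of the original notes) of these three criteria, assuming SSE has already been computed from a fitted model and that p counts all regression coefficients including the intercept:

```python
import numpy as np

def information_criteria(sse, n, p):
    """Compute AIC, BIC, and APC from the error sum of squares (sse),
    the sample size n, and the number of regression coefficients p
    (including the intercept), using the formulas above."""
    aic = n * np.log(sse) - n * np.log(n) + 2 * p
    bic = n * np.log(sse) - n * np.log(n) + p * np.log(n)
    apc = (n + p) / (n * (n - p)) * sse
    return aic, bic, apc
```

For each criterion, lower values indicate the preferred model, so the same function can be used to rank several candidate fits.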
where n is the sample size and p is the number of regression coefficients in the model being evaluated, including the intercept. Notice that the only difference between AIC and BIC is the multiplier of p. Each of the information criteria is used in the same way when comparing two models: the model with the lower value is preferred. The BIC places a higher penalty on the number of parameters in the model, so it tends to reward more parsimonious (smaller) models. This addresses one criticism of AIC, namely that it tends to overfit.

2. PRESS (prediction residual sum of squares):
\[ \mathrm{PRESS} = \sum_{i=1}^{n} e_{(i)}^2 \]
where the \( e_{(i)} \) are the residuals calculated by omitting the i-th observation from the regression fit (see the sketch after item 3). The model with the lowest PRESS value is selected. PRESS tends to pick larger models, which may be desirable if prediction is the objective.

3. Mallows' \( C_p \) statistic: it estimates the size of the bias introduced into the predicted responses by an underspecified model. The bias in the predicted response is
\[ \mathrm{Bias}_i = E(\hat{Y}_i) - E(Y_i), \]
and the average mean squared error of prediction,
\[ \frac{1}{\sigma^2} \sum_{i=1}^{n} E\big[(\hat{Y}_i - E(Y_i))^2\big], \]
can be estimated by the \( C_p \) statistic:
\[ C_p = p + \frac{(\mathrm{MSE}_p - \mathrm{MSE}_{\mathrm{all}})(n - p)}{\mathrm{MSE}_{\mathrm{all}}} = \frac{\mathrm{SSE}_p}{\mathrm{MSE}_{\mathrm{all}}} + 2p - n, \]
where \( \mathrm{MSE}_p \) is the mean squared error from fitting the model containing the subset of p − 1 predictors (p parameters including the intercept), and \( \mathrm{MSE}_{\mathrm{all}} \) is the mean squared error from fitting the model containing all of the candidate predictors.
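A short sketch (added here, not from the notes) of how PRESS and \( C_p \) can be computed. It assumes the design matrix X already contains a column of ones for the intercept, and it uses the leave-one-out identity \( e_{(i)} = e_i/(1 - h_{ii}) \), where \( h_{ii} \) are the hat-matrix diagonals, rather than refitting the model n times:

```python
import numpy as np

def press_statistic(X, y):
    """PRESS from a single least-squares fit, using e_(i) = e_i / (1 - h_ii).
    X is assumed to already contain an intercept column of ones."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    H = X @ np.linalg.inv(X.T @ X) @ X.T      # hat matrix
    e_loo = residuals / (1 - np.diag(H))       # leave-one-out residuals
    return np.sum(e_loo ** 2)

def mallows_cp(sse_p, mse_all, n, p):
    """Mallows' C_p = SSE_p / MSE_all + 2p - n for a subset model with
    p coefficients, where MSE_all comes from the full candidate model."""
    return sse_p / mse_all + 2 * p - n
```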
Using \( C_p \) to identify 'best' models:
- Identify subsets of predictors for which the \( C_p \) value is near p (if possible). The full model always yields \( C_p = p \), so do not select the full model on the basis of \( C_p \).
- If all models except the full model yield a large \( C_p \) not near p, this suggests that some important predictor(s) are missing from the analysis.
- If a number of models have \( C_p \) near p, choose the model with the smallest \( C_p \) value, thereby ensuring that the combination of bias and variance is at a minimum.
- When more than one model has a small \( C_p \) value near p, in general choose the simpler model or the model that best meets your research needs.

Cross-validation

Cross-validation is a model validation technique in which the regression equation estimated from the original (training) dataset is used to make predictions for a new (validation) dataset. We can then calculate the prediction errors (the differences between the actual response values and the predictions) and summarize the predictive ability of the model by the mean squared prediction error (MSPE), as sketched below. This gives an indication of how well the model will predict in the future.
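A minimal sketch of the validation step described above, assuming the data have already been split into training and validation sets and that both design matrices include an intercept column:

```python
import numpy as np

def mspe(X_train, y_train, X_valid, y_valid):
    """Fit by least squares on the training data, predict the validation
    data, and return the mean squared prediction error (MSPE)."""
    beta, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
    pred = X_valid @ beta
    return np.mean((y_valid - pred) ** 2)
```

Among competing models, the one with the smallest MSPE on the held-out data is preferred, mirroring how the in-sample criteria above are used.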