…nodes have the same predicted value (No) and the split will be discarded, since the error (# of misclassifications) with and without the split is identical. In the regression context the two predicted values of .05 and .33 are different: the split has identified a nearly pure subgroup of significant size. This setup is known as odds regression, and may be a more sensible way to evaluate a split when the emphasis of the model is on understanding/explanation rather than on prediction error per se. Extension of this rule to the multiple class problem is appealing, but has not yet been implemented in rpart.

8 Poisson regression

8.1 Definition

The Poisson splitting method attempts to extend rpart models to event rate data. The model in this case is

    \lambda = f(x)

where \lambda is an event rate and x is some set of predictors. As an example consider hip fracture rates. For each county in the United States we can obtain

- the number of fractures in patients age 65 or greater (from Medicare files),
- the population of the county (US census data),
- potential predictors such as socio-economic indicators, number of days below freezing, ethnic mix, physicians/1000 population, etc.

Such data would usually be approached by using Poisson regression; can we find a tree-based analogue? In adding criteria for rates regression to this ensemble, the guiding principle was the following: the between-groups sum-of-squares is not a very robust measure, yet tree-based regression works very well. So do the simplest thing possible.

Let c_i be the observed event count for observation i, t_i be the observation time, and x_{ij}, j = 1, ..., p be the predictors. The y variable for the program will be a 2 column matrix.

Splitting criterion: The likelihood ratio test for two Poisson groups,

    D_{\text{parent}} - (D_{\text{left son}} + D_{\text{right son}})

Summary statistics: The observed event rate and the number of events,

    \hat\lambda = \frac{\#\ \text{events}}{\text{total time}} = \frac{\sum c_i}{\sum t_i}

Error of a node: The within-node deviance,

    D = \sum \left[ c_i \log\left( \frac{c_i}{\hat\lambda t_i} \right) - (c_i - \hat\lambda t_i) \right]

Prediction error: The deviance contribution for a new observation, using the \hat\lambda of the node as the predicted rate.

8.2 Improving the method

There is a problem with the criterion just proposed, however: cross-validation of a model often produces an infinite value for the deviance. The simplest case where this occurs is easy to understand. Assume that some terminal node of the tree has 20 subjects, but only 1 of the 20 has experienced any events. The cross-validated error (deviance) estimate for that node will have one subset, the one where the subject with an event is left out, which has \hat\lambda = 0. When we use the prediction for the 10% of subjects who were set aside, the deviance contribution of the subject with an event is

    \cdots + c_i \log(c_i / 0) + \cdots

which is infinite, since c_i > 0. The problem is that when \hat\lambda = 0 the occurrence of an event is infinitely improbable and, using the deviance measure, the corresponding model is then infinitely bad.

One might expect this phenomenon to be fairly rare, but unfortu...
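As a concrete illustration of the quantities defined in section 8.1, the following short R sketch computes the node rate \hat\lambda, the within-node deviance D, and the deviance reduction used as the splitting criterion. This is illustrative code only, not the rpart implementation; the function names and the small data vectors are made up for the example.

## Node summaries for Poisson splitting, following the formulas above.
node_rate <- function(counts, times) {
  ## lambda-hat = (# events) / (total time) = sum(c_i) / sum(t_i)
  sum(counts) / sum(times)
}

node_deviance <- function(counts, times, lambda = node_rate(counts, times)) {
  ## D = sum[ c_i log(c_i / (lambda t_i)) - (c_i - lambda t_i) ],
  ## using the usual convention 0 * log(0) = 0 for observations with no events.
  ## At the node's own rate the second term sums to zero; it matters when
  ## lambda is taken from another node, as in the prediction error above.
  expected <- lambda * times
  loglik   <- ifelse(counts == 0, 0, counts * log(counts / expected))
  sum(loglik - (counts - expected))
}

## Splitting criterion: D_parent - (D_left son + D_right son),
## shown for one made-up candidate split of six observations.
cnt  <- c(0, 1, 0, 2, 5, 3)                      # event counts c_i
tim  <- c(2, 3, 1, 4, 2, 3)                      # observation times t_i
left <- c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE) # proposed left/right split

gain <- node_deviance(cnt, tim) -
        (node_deviance(cnt[left],  tim[left]) +
         node_deviance(cnt[!left], tim[!left]))

In rpart itself this splitting method is requested with method = "poisson", the response being the 2 column matrix described above; a call might look like rpart(cbind(time, count) ~ x1 + x2, data = mydata, method = "poisson"), where the variable names are of course hypothetical and the assumed column order (observation time first, event count second) should be checked against the rpart documentation.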