Deviance_parent - (Deviance_left + Deviance_right), which is
the likelihood ratio test for comparing two Poisson samples.
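
For reference, the quantities in this criterion can be written out explicitly. The sketch below assumes the usual Poisson setup, with c_i the event count and t_i the exposure time of observation i in a node. These symbols, the convention 0 log 0 = 0, and the conventional factor of 2 are assumptions, since the definition of the node deviance lies outside this excerpt and may differ by a constant.

```latex
\hat\lambda = \frac{\sum_i c_i}{\sum_i t_i}, \qquad
D = 2 \sum_i \left[ c_i \log\frac{c_i}{\hat\lambda\, t_i} - \bigl(c_i - \hat\lambda\, t_i\bigr) \right], \qquad
\Delta = D_{\text{parent}} - \bigl(D_{\text{left}} + D_{\text{right}}\bigr)
```

Because the saturated-model terms are the same on both sides, they cancel in the difference, leaving twice the gain in log-likelihood from fitting separate rates to the two child nodes rather than one common rate to the parent.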
The cross-validated error has been found to be overly pessimistic when describing how much the error is improved by each split. This is likely a consequence
of the boundary effect mentioned earlier, but more research is needed.
The variation xstd is not as useful, given the bias of xerror.

    fit.prune <- prune(fit, cp=.15)
    plot(fit.prune)                # draw the tree before labelling it
    text(fit.prune, use.n=T)

The use.n=T option specifies that the number of events / total N should be listed
along with the predicted rate (number of events / person-years). The function prune
trims the tree fit to the cp value 0.15. The same tree could have been created by
specifying cp = .15 in the original call to rpart.

8.4 Example: Stage C prostate cancer (survival method)

One special case of the Poisson model is of particular interest for medical consulting
such as the authors do. Assume that we have survival data, i.e., each subject has
either 0 or 1 event. Further, assume that the time values have been pre-scaled so as
to fit an exponential model. That is, stretch the time axis so that a Kaplan-Meier
plot of the data will be a straight line when plotted on the logarithmic scale. An
approximate way to do this is

    temp <- coxph(Surv(time, status) ~ 1)
    newtime <- predict(temp, type='expected')

and then do the analysis using the newtime variable. This replaces each time value
by Λ(t), where Λ is the cumulative hazard function.
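
Why this rescaling yields exponential data can be seen in one line. The following is a standard result, stated here for a continuous survival time T whose cumulative hazard Λ is continuous and strictly increasing, so that the survival function is S(t) = exp(-Λ(t)). The symbols T and U are introduced only for this sketch.

```latex
U = \Lambda(T), \qquad
P(U > u) = P\bigl(T > \Lambda^{-1}(u)\bigr)
         = \exp\bigl(-\Lambda(\Lambda^{-1}(u))\bigr) = e^{-u}, \qquad
\log P(U > u) = -u
```

The rescaled times therefore follow a unit exponential distribution, and the log of their survival curve is a straight line. In practice Λ is replaced by its estimate from the fitted null model, so the relation holds only approximately.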
A slightly more sophisticated version of this, which we will call exponential scaling,
gives a straight line curve for log(survival) under a parametric exponential model.
The only difference from the approximate scaling above is that a subject who is
censored between observed death times will receive "credit" for the intervening
interval, i.e., we assume the baseline hazard to be linear between observed deaths. If
the data is pre-scaled in this way, then the Poisson model above is equivalent to
the local full likelihood tree model of LeBlanc and Crowley [3]. They show that this
model is more efficient than the earlier suggestion of Therneau et al. [6] to use the
martingale residuals from a Cox model as input to a regression tree (anova method).
Exponential scaling or method='exp' is the default if y is a Surv object.
Let us again return to the stage C cancer example. Besides the variables explained previously, we will use pgtime, which is the time to tumor progression.

    fit <- rpart(Surv(pgtime, pgstat) ~ age + eet + g2 + grade +
                 gleason + ploidy, data=stagec)
    print(fit)                     # list the fitted tree, one node per line

    node), split, n, deviance, yval
          * denotes terminal node

     1) root 146 195.30 1.0000
       2) grade 2.5 61 44.98 0.3617
         4) g2 11.36 33 9.13 0.1220 *
         5) g2 11.36 28 27.70 0.7341 *
       3) grade 2.5 85 125.10 1.6230
         6) age 56.5 75 104.00 1.4320
          12) gleason 7.5 50 66.49 1.1490
            24) g2 13.475 25 29.10 0.8817 *
            25) g2 13.475 25 36.05 1.4080
              50) g2 17.915 14 18.72 0.8795 *
              51) g2 17.915 11 13.70 2.1830 *
          13) gleason 7.5 25 34.13 2....
This document was uploaded on 09/26/2013.