(0 missing)
Surrogate splits:
    gleason < 5.5   to the left,  agree=0.8630, (0 split)
    ploidy  splits as LRR,        agree=0.6438, (0 split)
    g2      < 9.945 to the left,  agree=0.6301, (0 split)
    age     < 66.5  to the right, agree=0.5890, (0 split)
Node number 2: 61 observations
predicted class= No expected loss= 0.1475
class counts: 52 9
probabilities: 0.8525 0.1475
Node number 3: 85 observations, complexity param=0.1049
predicted class= Prog expected loss= 0.4706
class counts: 40 45
probabilities: 0.4706 0.5294
left son=6 (40 obs) right son=7 (45 obs)
Primary splits:
    g2      < 13.2 to the left,  improve=2.1780, (6 missing)
    ploidy  splits as LRR,       improve=1.9830, (0 missing)
    age     < 56.5 to the right, improve=1.6600, (0 missing)
    gleason < 8.5  to the left,  improve=1.6390, (0 missing)
    eet     < 1.5  to the right, improve=0.1086, (1 missing)
Surrogate splits:
    ploidy  splits as LRL,       agree=0.9620, (6 split)
    age     < 68.5 to the right, agree=0.6076, (0 split)
    gleason < 6.5  to the left,  agree=0.5823, (0 split)
.
.
.

There are 54 progressions (class 1) and 92 nonprogressions, so the first node
has an expected loss of 54/146 ≈ 0.37. (The computation is this simple only
for the default priors and losses.)
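
The arithmetic behind the root node's expected loss can be sketched in a few lines of R. This is only an illustration under the default priors and 0/1 losses mentioned above; the variable names are made up for the example and are not part of rpart.

```r
# Class counts at the root node, taken from the text:
# 92 nonprogressions ("No") and 54 progressions ("Prog").
counts <- c(No = 92, Prog = 54)

# Under default priors and 0/1 losses the node predicts the majority class,
# so the expected loss is the fraction of observations in the other class.
predicted <- names(counts)[which.max(counts)]
expected_loss <- 1 - max(counts) / sum(counts)

print(predicted)
print(round(expected_loss, 4))
```

With non-default priors or a loss matrix, the class counts would first have to be reweighted, so this shortcut no longer applies.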
Grades 1 and 2 go to the left, grades 3 and 4 to the right. The tree is arranged
so that the "more severe" nodes go to the right.
The improvement is n times the change in impurity index. In this instance,
the largest improvement is for the variable grade, with an improvement of
10.36. The next best choice is Gleason score, with an improvement of 8.4.
The actual values of the improvement are not so important, but their relative
size gives an indication of the comparative utility of the variables.
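
As a rough check of the "n times the change in impurity" rule, the improvement of 10.36 for the grade split can be recomputed from the class counts printed for nodes 1, 2 and 3. The sketch below assumes the Gini index as the impurity measure and default priors; the helper gini() is hypothetical and written only for this example.

```r
# Gini impurity of a node from its class counts: 1 - sum(p^2).
gini <- function(counts) {
  p <- counts / sum(counts)
  1 - sum(p^2)
}

parent <- c(No = 92, Prog = 54)  # node 1 (root)
left   <- c(No = 52, Prog = 9)   # node 2
right  <- c(No = 40, Prog = 45)  # node 3

# n times the drop in impurity, written as
# n_parent * I(parent) - n_left * I(left) - n_right * I(right).
improve <- sum(parent) * gini(parent) -
  sum(left) * gini(left) -
  sum(right) * gini(right)

print(round(improve, 2))
```

The result agrees with the reported improvement to two decimals, which supports the assumption that the Gini index was used here.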
Ploidy is a categorical variable, with values of diploid, tetraploid, and aneuploid, in that order. (To check the order, type table(stagec$ploidy).) All
three possible splits were attempted: aneuploid+diploid vs. tetraploid, aneuploid+tetraploid vs. diploid, and aneuploid vs. diploid+tetraploid. The best
split sends diploid to the right and the others to the left (node 6, see figure
3).
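
The three candidate splits of a three-level factor can be listed mechanically. The following base R sketch is not how rpart enumerates splits internally; it only illustrates that k levels give 2^(k-1) - 1 distinct two-way splits once mirror images are excluded.

```r
lev <- c("diploid", "tetraploid", "aneuploid")  # level order given in the text
k <- length(lev)

# Keep the first level on the left so that each split is counted once;
# the binary digits of i decide which remaining levels go to the right.
splits <- lapply(seq_len(2^(k - 1) - 1), function(i) {
  right <- c(FALSE, (i %/% 2^(0:(k - 2))) %% 2 == 1)
  list(left = lev[!right], right = lev[right])
})

for (s in splits) {
  cat(paste(s$left, collapse = "+"), "vs.", paste(s$right, collapse = "+"), "\n")
}
```

The number of splits grows exponentially with the number of levels, which is why exhaustive enumeration over a factor with many levels can be slow.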
For node 3, the primary split variable is missing on 6 subjects. All 6 are split
based on the first surrogate, ploidy. Diploid and aneuploid tumors are sent to
the left, tetraploid to the right.
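
The agreement of 0.9620 reported for this surrogate can be reproduced from the cross-classification of the g2 split against ploidy tabulated in this section. The sketch below simply re-enters those counts by hand; the object names are illustrative.

```r
# Counts from the text: rows are the surrogate's direction,
# columns are the primary g2 split (NA = g2 missing).
tab <- matrix(c(33,  2, 5,
                 1, 43, 1),
              nrow = 2, byrow = TRUE,
              dimnames = list(surrogate = c("diploid/aneuploid", "tetraploid"),
                              g2 = c("< 13.2", "> 13.2", "NA")))

# Agreement is measured only on subjects where g2 is observed.
observed <- tab[, c("< 13.2", "> 13.2")]
agree <- sum(diag(observed)) / sum(observed)

# Subjects with g2 missing are the ones actually sent down the tree by the surrogate.
n_split_by_surrogate <- sum(tab[, "NA"])

print(round(agree, 4))
print(n_split_by_surrogate)
```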
                     g2 < 13.2   g2 > 13.2   NA
Diploid/aneuploid        33           2       5
Tetraploid                1          43       1

6 Further options
6.1 Program options

The central fitting function is rpart, whose main arguments are

formula: the model formula, as in lm and other S model fitting functions. The
right hand side may contain both continuous and categorical (factor) terms.
If the outcome y has more than two levels, then categorical predictors must
be fit by exhaustive enumeration, which can take a very long time.

data, weights, subset: as for other S models. Weights are not yet supported,
and will be ignored if present.

method: the type of splitting rule to use. Options at this point are classification,
anova, Poisson, and exponential.

parms: a list of method specific optional parameters. For classification, the list...
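
A minimal call using the arguments described so far might look like the sketch below. It assumes a current installation of the rpart package, which ships the stagec data set, and it may behave differently from the version this text describes (for example in its handling of weights). The names dat and progstat are chosen only for this example.

```r
library(rpart)  # assumes rpart and its stagec data set are installed

# Work on a copy and recode the 0/1 progression indicator as a factor.
dat <- stagec
dat$progstat <- factor(dat$pgstat, levels = 0:1, labels = c("No", "Prog"))

# formula, data and method as described above; method = "class" requests
# the classification splitting rule.
fit <- rpart(progstat ~ age + eet + g2 + grade + gleason + ploidy,
             data = dat, method = "class")

print(fit)
```

Calling summary(fit) on the result prints a node-by-node listing of primary and surrogate splits in the same layout as the output discussed earlier.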