Rpart_TechReport61

# Form $L(i,j) = L_i$ for $i \neq j$ and $0$ for $i = j$, in which case ...


For an arbitrary loss matrix of dimension $C > 2$, rpart uses the above formula with $L_i = \sum_j L(i,j)$.

A second justification for altered priors is this. An impurity index $I(A) = \sum_i f(p_i)$ has its maximum at $p_1 = p_2 = \ldots = p_C = 1/C$. If a problem had, for instance, a misclassification loss for class 1 which was twice the loss for a class 2 or 3 observation, one would wish $I(A)$ to have its maximum at $p_1 = 1/5$, $p_2 = p_3 = 2/5$, since this is the worst possible set of proportions on which to decide a node's class. The altered priors technique does exactly this, by shifting the $p_i$.

Two final notes:

• When altered priors are used, they affect only the choice of split. The ordinary losses and priors are used to compute the risk of the node. The altered priors simply help the impurity rule choose splits that are likely to be "good" in terms of the risk.

• The argument for altered priors is valid for both the gini and information splitting rules.

## 3.3 Example: Stage C prostate cancer (class method)

This first example is based on a data set of 146 stage C prostate cancer patients [4]. The main clinical endpoint of interest is whether the disease recurs after initial surgical removal of the prostate, and the time interval to that progression, if any. The endpoint in this example is status, which takes on the value 1 if the disease has progressed and 0 if not. Later we'll analyze the data using the exponential (exp) method, which will take into account time to progression. A short description of each of the variables is listed below.

The main predictor variable of interest in this study was DNA ploidy, as determined by flow cytometry. For diploid and tetraploid tumors, the flow cytometric method was also able to estimate the percent of tumor cells in a G2 growth stage of their cell cycle; G2% is systematically missing for most aneuploid tumors.
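Returning briefly to the altered-priors argument above, the three-class example can be checked by hand. The sketch below assumes the altered proportions take the form $\tilde p_i = p_i L_i / \sum_j p_j L_j$; this is one reading of "the above formula", which is not reproduced in this excerpt. The losses $L_1 = 2$, $L_2 = L_3 = 1$ are likewise an assumed encoding of the "twice the loss" case.

```latex
\begin{align*}
% Assumed losses: class 1 costs twice as much to misclassify as class 2 or 3.
L_1 &= 2, \qquad L_2 = L_3 = 1 \\
% Proportions the text names as the worst case for deciding a node's class.
(p_1, p_2, p_3) &= \left(\tfrac{1}{5}, \tfrac{2}{5}, \tfrac{2}{5}\right) \\
% Loss-weighted mass of each class: all three are equal at this point.
p_1 L_1 = \tfrac{2}{5}, \quad p_2 L_2 &= \tfrac{2}{5}, \quad p_3 L_3 = \tfrac{2}{5},
\qquad \sum_j p_j L_j = \tfrac{6}{5} \\
% Assumed altered proportions: uniform, so an index of the form sum f(p_i) peaks here.
\tilde p_i = \frac{p_i L_i}{\sum_j p_j L_j} &= \frac{2/5}{6/5} = \frac{1}{3}
\quad \text{for } i = 1, 2, 3
\end{align*}
```

Under that assumption, the shifted proportions are uniform exactly at the point the text calls worst: whichever class the node is assigned to, the loss-weighted mass of the other two classes is the same, $4/5$.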
The variables in the data set are

| Variable | Description |
|----------|-------------|
| pgtime | time to progression, or last follow-up free of progression |
| pgstat | status at last follow-up (1=progressed, 0=censored) |
| age | age at diagnosis |
| eet | early endocrine therapy (1=no, 0=yes) |
| ploidy | diploid/tetraploid/aneuploid DNA pattern |
| g2 | % of cells in G2 phase |
| grade | tumor grade (1-4) |
| gleason | Gleason grade (3-10) |

[Figure 3: Classification tree for the Stage C data. Splits shown: grade<2.5, g2<13.2, ploidy:ab, g2>11.845, g2<11.005, g2>17.91, age>62.5; leaves labeled No or Prog.]

The model is fit by using the rpart function. The first argument of the function is a model formula, with the ~ symbol standing for "is modeled as". The print function gives an abbreviated output, as for other S models. The plot and text commands plot the tree and then label the plot; the result is shown in figure 3.

    > progstat <- factor(stagec$pgstat, levels=0:1, labels=c("No", "Prog"))
    > cfit <- rpart(progstat ~ age + eet + g2 + grade + gleason + ploidy,
                    data=stagec, method='class')
    > print(cfit)
    node), split, n, loss, yval, (yprob)
          * denotes terminal node

     1) root 146 54 No ( 0.6301 0.3699 )
       2) grade<2.5 61 9 No ( 0.8525 0.1475 ) *
       3) grade...
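The columns of the two complete lines of the printout can be cross-checked by hand. The short derivation below assumes that loss counts the observations that do not belong to the node's predicted class (that is, unit losses) and that yprob lists the class proportions in the order No, Prog. Both are readings of the printout rather than statements from the text.

```latex
\begin{align*}
\text{root:} \quad & n = 146,\ \text{loss} = 54
  \;\Rightarrow\; \hat p_{\text{Prog}} = \tfrac{54}{146} \approx 0.3699, \quad
  \hat p_{\text{No}} = \tfrac{146 - 54}{146} = \tfrac{92}{146} \approx 0.6301 \\
\text{node 2:} \quad & n = 61,\ \text{loss} = 9
  \;\Rightarrow\; \hat p_{\text{Prog}} = \tfrac{9}{61} \approx 0.1475, \quad
  \hat p_{\text{No}} = \tfrac{61 - 9}{61} = \tfrac{52}{61} \approx 0.8525
\end{align*}
```

Both pairs agree with the printed yprob values, which supports reading each line as: node number, split, number of observations, number misclassified, predicted class, and class proportions.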