Classification/Decision Trees (I)
Jia Li
http://www.stat.psu.edu/jiali

Denote the prior probability of class j by π_j. The priors π_j can be estimated from the data by N_j/N. Sometimes priors are given beforehand.

The estimated probability of a sample in class j going to node t is p(t | j) = N_j(t)/N_j. Note that p(t_L | j) + p(t_R | j) = p(t | j). For a full tree, the sum of p(t | j) over all t's at the same level is 1.

The joint probability of a sample being in class j and going to node t is thus

    p(j, t) = π_j p(t | j) = π_j N_j(t)/N_j .

The probability of any sample going to node t is

    p(t) = Σ_{j=1}^K p(j, t) = Σ_{j=1}^K π_j N_j(t)/N_j .

Note that p(t_L) + p(t_R) = p(t).

The probability of a sample being in class j given that it goes to node t is

    p(j | t) = p(j, t)/p(t) .

For any t, Σ_{j=1}^K p(j | t) = 1.

When π_j = N_j/N, we have the following simplifications:

    p(j | t) = N_j(t)/N(t),
    p(t) = N(t)/N,
    p(j, t) = N_j(t)/N.

Stopping Criteria

A simple criterion: stop splitting a node t when

    max_{s ∈ S} ΔI(s, t) < β ,

where β is a chosen threshold.

This stopping criterion is unsatisfactory: a node with a small decrease of impurity after one step of splitting may have a large decrease after multiple levels of splits.

Class Assignment Rule

A class assignment rule assigns a class j ∈ {1, ..., K} to every terminal node t ∈ T̃. The class as...
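The node probabilities above can be sketched numerically. This is a minimal illustration with two classes; all counts are made-up example numbers, not data from the slides.

```python
# Sketch of the node-probability formulas, assuming two classes and
# illustrative counts (all numbers here are invented for demonstration).

N_j = {1: 60, 2: 40}                 # class totals in the training set
N = sum(N_j.values())                # total sample size
N_j_t = {1: 15, 2: 25}               # class counts reaching node t

# Priors estimated from the data: pi_j = N_j / N
pi = {j: N_j[j] / N for j in N_j}

# p(t | j) = N_j(t) / N_j
p_t_given_j = {j: N_j_t[j] / N_j[j] for j in N_j}

# p(j, t) = pi_j * p(t | j)
p_joint = {j: pi[j] * p_t_given_j[j] for j in N_j}

# p(t) = sum_j p(j, t)
p_t = sum(p_joint.values())

# p(j | t) = p(j, t) / p(t); the posteriors at any node sum to 1
p_j_given_t = {j: p_joint[j] / p_t for j in N_j}
assert abs(sum(p_j_given_t.values()) - 1.0) < 1e-9

# Because pi_j = N_j / N here, the simplification p(j | t) = N_j(t) / N(t) holds:
N_t = sum(N_j_t.values())
for j in N_j:
    assert abs(p_j_given_t[j] - N_j_t[j] / N_t) < 1e-9
```

With these counts, p(t) = 0.4 and the posteriors at t are 0.375 and 0.625, matching N_j(t)/N(t) exactly as the simplification predicts.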