# Classification/Decision Trees (II)

Jia Li, Department of Statistics, The Pennsylvania State University
http://www.stat.psu.edu/~jiali

## Right Sized Trees

Let the expected misclassification rate of a tree $T$ be $R^*(T)$. Recall that the resubstitution estimate for $R^*(T)$ is

$$R(T) = \sum_{t \in \tilde{T}} r(t)\,p(t) = \sum_{t \in \tilde{T}} R(t),$$

where $\tilde{T}$ is the set of terminal nodes of $T$.

- $R(T)$ is biased downward.
- Splitting a node never increases the estimate: $R(t) \ge R(t_L) + R(t_R)$, where $t_L$ and $t_R$ are the left and right children of $t$.

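
The inequality for a node and its two children can be checked on a small example. The sketch below is illustrative only: the class counts are made up, the helper name `node_cost` is hypothetical, and it assumes unit misclassification cost, so that r(t) is the fraction of points in node t outside the majority class.

```python
from collections import Counter

def node_cost(labels, n_total):
    """Resubstitution cost R(t) = r(t) * p(t) of a node.

    r(t) = (points not in the majority class) / n_t   (assumed unit cost)
    p(t) = n_t / n_total
    so the product simplifies to misclassified / n_total.
    """
    counts = Counter(labels)
    misclassified = len(labels) - max(counts.values())
    return misclassified / n_total

# Hypothetical node with 10 training points, split into two children.
parent = ["a"] * 6 + ["b"] * 4
left = ["a"] * 5 + ["b"] * 1
right = ["a"] * 1 + ["b"] * 3
n = len(parent)

r_parent = node_cost(parent, n)                          # 4 of 10 misclassified
r_children = node_cost(left, n) + node_cost(right, n)    # 1 of 10 in each child
print(r_parent, r_children)
```

Because every split can only keep or lower the sum, the resubstitution estimate keeps falling as the tree grows, which is why it favors overly large trees.
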
## Digit recognition example

| No. of terminal nodes | $R(T)$ | $R^{ts}(T)$ |
|---|---|---|
| 71 | .00 | .42 |
| 63 | .00 | .40 |
| 58 | .03 | .39 |
| 40 | .10 | .32 |
| 34 | .12 | .32 |
| 19 | .29 | .31 |
| 10 | .29 | .30 |
| 9 | .32 | .34 |
| 7 | .41 | .47 |
| 6 | .46 | .54 |
| 5 | .53 | .61 |
| 2 | .75 | .82 |
| 1 | .86 | .91 |

The resubstitution estimate $R(T)$ becomes increasingly less accurate as the trees grow larger. The test sample estimate $R^{ts}(T)$ first decreases as the tree grows, reaches its minimum at the tree with 10 terminal nodes, and then increases as the tree grows further.
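
To make the comparison concrete, the following sketch picks the tree size that each estimate would select, using only the values listed above. The variable names are illustrative.

```python
# (terminal nodes, R(T), R_ts(T)) copied from the values listed above
rows = [
    (71, .00, .42), (63, .00, .40), (58, .03, .39), (40, .10, .32),
    (34, .12, .32), (19, .29, .31), (10, .29, .30), (9, .32, .34),
    (7, .41, .47), (6, .46, .54), (5, .53, .61), (2, .75, .82), (1, .86, .91),
]

# Tree chosen by minimizing each estimate (first row wins on ties).
best_by_resub = min(rows, key=lambda row: row[1])
best_by_test = min(rows, key=lambda row: row[2])

# Optimism of the resubstitution estimate: R_ts(T) - R(T) for each tree size.
gaps = [(n_leaves, round(r_ts - r, 2)) for n_leaves, r, r_ts in rows]

print("resubstitution picks:", best_by_resub[0], "terminal nodes")
print("test sample picks:   ", best_by_test[0], "terminal nodes")
print("gap for largest tree:", gaps[0][1])
```

Minimizing the resubstitution estimate selects the largest tree, while the test sample estimate selects the tree with 10 terminal nodes.
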

## Preliminaries for Pruning

Grow a very large tree $T_{\max}$:

1. Keep splitting until all terminal nodes are pure (contain only one class) or contain only identical measurement vectors.
2. Alternatively, stop when the number of data points in each terminal node is no greater than a certain threshold, say 5, or even 1.
3. As long as the tree is sufficiently large, the size of the initial tree is not critical.

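
The stopping conditions above can be written as a single check applied to each node while growing the initial tree. This is a minimal sketch, not the procedure from the slides: the function name `is_terminal` and the parameter `min_size` are hypothetical, and combining the conditions with a logical OR is an assumption made here for illustration.

```python
def is_terminal(vectors, labels, min_size=5):
    """Return True if a node should not be split further while growing T_max.

    vectors:  measurement vectors of the data points in the node
    labels:   their class labels
    min_size: node-size threshold (assumed parameter; the text suggests 5 or even 1)
    """
    pure = len(set(labels)) <= 1                       # only one class present
    identical = len({tuple(v) for v in vectors}) <= 1  # no split can separate the points
    small = len(labels) <= min_size                    # node is at or below the threshold
    return pure or identical or small

# Mixed classes, distinct vectors, 3 points: splittable when the threshold is 2.
print(is_terminal([(0, 1), (1, 2), (2, 3)], ["a", "b", "a"], min_size=2))
# Same node with the threshold at 5: treated as terminal.
print(is_terminal([(0, 1), (1, 2), (2, 3)], ["a", "b", "a"], min_size=5))
```
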
1. Descendant: a node $t'$ is a descendant of node $t$ if there is a connected path down the tree leading from $t$ to $t'$.
2. Ancestor: $t$ is an ancestor of $t'$ if $t'$ is its descendant.
3. A branch $T_t$ of $T$ with root node $t \in T$ consists of the node $t$ and all descendants of $t$ in $T$.
4. Pruning a branch $T_t$ from a tree $T$ consists of deleting from $T$ all descendants of $t$, that is, cutting off all of $T_t$ except its root node. The tree pruned this way is denoted by $T - T_t$.
5. If $T'$ is obtained from $T$ by successively pruning off branches, then $T'$ is called a pruned subtree of $T$, denoted by $T' \prec T$.
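
The definitions above map directly onto a small binary-tree structure. The sketch below is illustrative: the `Node` class, the helper names `branch` and `prune`, and the five-node tree are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    name: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def branch(t):
    """Names of the nodes in the branch T_t: t itself plus all of its descendants."""
    if t is None:
        return []
    return [t.name] + branch(t.left) + branch(t.right)

def prune(t):
    """Prune the branch T_t in place: delete every descendant of t, keep t as a terminal node."""
    t.left = None
    t.right = None

# Hypothetical tree T: t1 has children t2 and t3; t2 has children t4 and t5.
t4, t5, t3 = Node("t4"), Node("t5"), Node("t3")
t2 = Node("t2", t4, t5)
t1 = Node("t1", t2, t3)

before = branch(t1)   # nodes of T
prune(t2)             # form T - T_{t2}
after = branch(t1)    # nodes of the pruned subtree
print(before, after)
```

After pruning at `t2`, the nodes `t4` and `t5` are gone while `t2` itself remains as a terminal node, matching the definition of $T - T_t$.
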

## Subtrees

Even for a moderately sized $T_{\max}$, there is an enormously large number of subtrees, and an even larger number of ways to prune the initial tree to reach them.
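
A rough sense of this growth can be had by counting pruned subtrees for one specific shape. The sketch below assumes a complete binary tree of a given depth, which is an illustrative choice and not something stated in the slides. At each node there are two options: prune it to a terminal node (one way), or keep both children and choose a pruned subtree of each branch independently.

```python
def count_pruned_subtrees(depth):
    """Number of pruned subtrees (all sharing the root) of a complete binary tree.

    depth 0 is a single node. The count includes the full tree itself
    and the tree consisting of the root alone.
    """
    if depth == 0:
        return 1
    per_child = count_pruned_subtrees(depth - 1)
    # Either prune at this node (1 way) or combine any choice for the two children.
    return per_child * per_child + 1

counts = [count_pruned_subtrees(d) for d in range(6)]
print(counts)
```

Under this assumption the count roughly squares with each extra level, so an exhaustive search over all pruned subtrees quickly becomes impractical.
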