dtree-4up: Decision Tree Learning (Machine Learning CS6375, Fall 2010 lecture slides)

Slide 1: Decision Tree Learning
Machine Learning CS6375, Fall 2010
Reading: Sections 18.2-18.3, R&N; Sections 3.1-3.4, Mitchell

Slide 2: Decision Tree Example
• Three variables:
  – Attribute 1: Hair = {blond, dark}
  – Attribute 2: Height = {tall, short}
  – Class: Country = {Gromland, Polvia}

Slide 3: [Figure: example decision tree]
The class of a new input is determined by following the tree all the way down to a leaf and reporting that leaf's output. For example, (B,T) is classified as … and (D,S) is classified as … (see the code sketch after slide 4).

Slide 4: Decision Trees
• Decision trees are classifiers for instances represented as feature vectors.
• Nodes are (equality and inequality) tests on feature values.
• There is one branch for each value of the feature.
• Leaves specify the categories (labels).
• A tree can categorize instances into multiple disjoint categories.
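To make slide 3's leaf-following procedure concrete, here is a minimal Python sketch. The split order and leaf labels below are hypothetical placeholders, since the actual tree is in the slide's figure; only the tree-walking logic is the point.

```python
# Minimal sketch: classify an instance by walking a decision tree.
# Internal nodes test one attribute and branch on its value; leaves
# hold a class label. The structure and labels here are hypothetical
# stand-ins for the tree shown in the slide's figure.

tree = {
    "attribute": "Hair",
    "branches": {
        "blond": {
            "attribute": "Height",
            "branches": {
                "tall": {"label": "Gromland"},   # hypothetical leaf
                "short": {"label": "Polvia"},    # hypothetical leaf
            },
        },
        "dark": {"label": "Polvia"},             # hypothetical leaf
    },
}

def classify(node, instance):
    """Follow the branch matching the instance until a leaf is reached."""
    while "label" not in node:
        value = instance[node["attribute"]]
        node = node["branches"][value]
    return node["label"]

print(classify(tree, {"Hair": "blond", "Height": "tall"}))   # (B,T)
print(classify(tree, {"Hair": "dark", "Height": "short"}))   # (D,S)
```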
Slide 5: General Case (Discrete Attributes)
• We have R observations of training data:
  – Each observation has M attributes X_1, ..., X_M.
  – Each X_i can take one of N distinct discrete values.
  – Each observation has a class attribute Y with C distinct (discrete) values.
• Problem: construct a sequence of tests on the attributes such that, given a new input (x'_1, ..., x'_M), the class attribute y is correctly predicted (see the sketch after slide 8).
• X = attributes of the training data (R x M); Y = classes of the training data (R).

Slide 6: General Decision Tree (Discrete Attributes)
[Figure]

Slide 7: Decision Tree Example
[Figure]

Slide 8: [Figure: example decision tree]
The class of a new input is determined by following the tree all the way down to a leaf and reporting that leaf's output. For example, (0.2, 0.8) is classified as … and (0.8, 0.2) is classified as …
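To make slide 5's R x M setup concrete, here is a sketch using scikit-learn's off-the-shelf tree learner as a stand-in for the construction algorithm the lecture develops; the toy data and labels are invented for illustration, and the ordinal encoding is needed only because scikit-learn trees expect numeric inputs.

```python
# Sketch: the R x M training setup from slide 5, fit with scikit-learn's
# tree learner (an off-the-shelf stand-in, not this lecture's algorithm).
import numpy as np
from sklearn.preprocessing import OrdinalEncoder
from sklearn.tree import DecisionTreeClassifier

# R = 4 observations, M = 2 discrete attributes (toy data, invented).
X_raw = np.array([["blond", "tall"],
                  ["blond", "short"],
                  ["dark",  "tall"],
                  ["dark",  "short"]])
Y = np.array(["Gromland", "Polvia", "Polvia", "Polvia"])  # invented labels

# Encode each discrete value as an integer so the tree can test on it.
encoder = OrdinalEncoder()
X = encoder.fit_transform(X_raw)          # shape (R, M)

clf = DecisionTreeClassifier().fit(X, Y)  # learns a sequence of tests
x_new = encoder.transform([["blond", "tall"]])
print(clf.predict(x_new))                 # predicted class y for the new input
```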
Slide 9: General Case (Continuous Attributes)
• We have R observations of training data:
  – Each observation has M attributes X_1, ..., X_M.
  – Each X_i can take continuous values.
  – Each observation has a class attribute Y with C distinct (discrete) values.
• Problem: construct a sequence of tests of the form "X_i < t_i?" on the attributes such that, given a new input (x'_1, ..., x'_M), the class attribute y is correctly predicted (see the threshold sketch after slide 12).
• X = attributes of the training data (R x M); Y = classes of the training data (R).

Slide 10: General Decision Tree (Continuous Attributes)
[Figure]

Slide 11: Basic Questions
• How do we choose the attribute/value to split on at each level of the tree?
• When do we stop splitting? When should a node be declared a leaf?
• If a leaf node is impure, how should the class label be assigned?

Slide 12: How to Choose the Attribute/Value to Split on at Each Level of the Tree
• Two classes (red circles / green crosses).
• Two attributes: X_1 and X_2.
• 11 points in the training data.
• Goal: construct a decision tree such that the leaf nodes correctly predict the class of all the training examples.
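Slide 9's tests have the form "X_i < t_i?", and slides 11-12 ask how to pick them. A common convention, assumed here rather than stated on the slides, is to take candidate thresholds at the midpoints between consecutive distinct sorted values of an attribute. A minimal sketch:

```python
# Sketch: candidate thresholds t for a test "X_i < t?" on one continuous
# attribute. Midpoints between consecutive distinct values is a common
# convention (assumed here, not given on the slide).

def candidate_thresholds(values):
    vs = sorted(set(values))
    return [(a + b) / 2.0 for a, b in zip(vs, vs[1:])]

x1 = [0.2, 0.8, 0.5, 0.5, 0.9]          # toy attribute values
print(candidate_thresholds(x1))          # approximately [0.35, 0.65, 0.85]
```

Each candidate threshold induces a two-way split of the data into {x < t} and {x >= t}; the remaining question from slide 11 is which (attribute, threshold) pair to prefer, which the following slides approach through node purity.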
Slide 13: How to Choose the Attribute/Value to Split on at Each Level of the Tree
[Figure]

Slide 14:
• This node is "pure" because only one class is left → no ambiguity in the class label.
• This node is almost "pure" → little ambiguity in the class label.
• These nodes contain a mixture of classes → they do not disambiguate between the classes.

Slide 15:
• This node is "pure" because only one class is left → no ambiguity in the class label.
• This node is almost "pure" → little ambiguity in the class label (see the entropy sketch below).
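Slides 14-15 describe node purity informally. A standard way to quantify it is the entropy of the class labels at a node: 0 for a pure node, maximal for an even mixture. Naming entropy here is an assumption about where the lecture is headed, since these slides don't yet name a measure.

```python
# Sketch: quantifying node "purity" with entropy (a standard measure;
# the slides describe purity informally, so using entropy is an
# assumption, not something these slides state).
from collections import Counter
from math import log2

def entropy(labels):
    """0.0 for a pure node; largest when classes are evenly mixed."""
    n = len(labels)
    return sum(-(c / n) * log2(c / n) for c in Counter(labels).values())

print(entropy(["red"] * 6))                    # pure node        -> 0.0
print(entropy(["red"] * 5 + ["green"]))        # almost pure      -> ~0.65
print(entropy(["red"] * 3 + ["green"] * 3))    # even mixture     -> 1.0
```

Lower entropy at the children of a split means less ambiguity in the class label, matching the informal ranking on slides 14-15.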