# 15.Module6_decisiontree-updated_14.pptx - Decision Tree 1...


Decision Tree
Introduction: Decision tree learning is a classification scheme that generates a tree and a set of rules from a given data set. The set of records available for developing classification methods is divided into two disjoint subsets: a training set and a test set. The attributes of the records are categorized into two types. Attributes whose domain is numerical are called numerical attributes. Attributes whose domain is not numerical are called categorical attributes.
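
The two ideas on this slide, the disjoint training/test split and the numerical/categorical distinction, can be sketched in a few lines of Python. This is only an illustration: the function names `split_records` and `attribute_type`, the default 30% test fraction, and the fixed seed are assumptions made for the example and do not come from the slides.

```python
import random

def split_records(records, test_fraction=0.3, seed=0):
    """Shuffle a copy of the records and cut it into two disjoint subsets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)   # seeded so the split is repeatable
    n_test = int(len(shuffled) * test_fraction)
    # Each record lands in exactly one of the two slices.
    return shuffled[n_test:], shuffled[:n_test]   # (training set, test set)

def attribute_type(values):
    """Return 'numerical' if every value is a number, otherwise 'categorical'."""
    is_number = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    return "numerical" if is_number else "categorical"

# Hypothetical usage: record ids 0..9 stand in for real records.
train, test = split_records(list(range(10)))
print(len(train), len(test))
print(attribute_type([70, 85.5]), attribute_type(["sunny", "rainy"]))
```

The tree is built from the training set only; the test set is held back to check how well the resulting rules classify records the tree has not seen.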
Introduction: A decision tree is a tree with the following properties. An inner node represents an attribute. An edge represents a test on the attribute of the parent node. A leaf represents one of the classes. A decision tree is constructed from the training data using a top-down strategy.
Decision Tree Example: The data set has five attributes. One of them is special: the attribute class is the class label. The attributes temp (temperature) and humidity are numerical. The other attributes are categorical, that is, their values cannot be ordered. Based on the training data set, we want to find a set of rules that tell us which values of outlook, temperature, humidity, and wind determine whether or not to play golf.
Decision Tree Example: We have five leaf nodes. In a decision tree, each leaf node represents a rule. The tree in the figure corresponds to the following rules.

RULE 1: If it is sunny and the humidity is not above 75%, then play.
RULE 2: If it is sunny and the humidity is above 75%, then do not play.
RULE 3: If it is overcast, then play.
RULE 4: If it is rainy and not windy, then play.
RULE 5: If it is rainy and windy, then do not play.
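
The five rules can be written directly as a small function. This is a minimal sketch: the function name `play_golf`, the lowercase string values for outlook, and the boolean `windy` flag are assumptions chosen for the example; only the rule logic and the 75% threshold come from the slide.

```python
def play_golf(outlook, humidity, windy):
    """Apply the five rules; return True for 'play', False for 'do not play'."""
    if outlook == "sunny":
        return humidity <= 75      # Rule 1 (play) and Rule 2 (do not play)
    if outlook == "overcast":
        return True                # Rule 3
    if outlook == "rainy":
        return not windy           # Rule 4 (play) and Rule 5 (do not play)
    raise ValueError(f"unknown outlook: {outlook!r}")

print(play_golf("sunny", 70, False))   # Rule 1 applies
print(play_golf("rainy", 80, True))    # Rule 5 applies
```

Temperature does not appear in any rule, so the function does not take it as an argument.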
Classification: An unknown input vector is classified by traversing the tree from the root node to a leaf node. A record enters the tree at the root node. At the root, a test is applied to determine which child node the record visits next. This process is repeated until the record arrives at a leaf node. All the records that end up at a given leaf of the tree are classified in the same way.
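
The traversal described above can be sketched as a loop that follows one edge per step until it reaches a leaf. The `Node` class, its field names, and the edge names ("normal", "high", "calm", "windy") are hypothetical choices for this illustration; the tree itself encodes the play-golf rules from the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

@dataclass
class Node:
    label: Optional[str] = None                    # class label, set only on leaves
    test: Optional[Callable[[dict], str]] = None   # maps a record to an edge name
    children: Dict[str, "Node"] = field(default_factory=dict)

def classify(node, record):
    """Walk from the given root to a leaf and return the leaf's class label."""
    while node.label is None:                      # inner node: apply its test
        node = node.children[node.test(record)]
    return node.label

# The play-golf tree: outlook at the root, then humidity or wind.
tree = Node(
    test=lambda r: r["outlook"],
    children={
        "sunny": Node(
            test=lambda r: "high" if r["humidity"] > 75 else "normal",
            children={"normal": Node(label="play"),
                      "high": Node(label="do not play")},
        ),
        "overcast": Node(label="play"),
        "rainy": Node(
            test=lambda r: "windy" if r["windy"] else "calm",
            children={"calm": Node(label="play"),
                      "windy": Node(label="do not play")},
        ),
    },
)

print(classify(tree, {"outlook": "sunny", "humidity": 80, "windy": False}))
```

Every record that follows the same path reaches the same leaf and therefore receives the same label, which is why each leaf corresponds to exactly one rule.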