# Homework 3


1. The following table summarizes a data set with three attributes A, B, C and two class labels +, -. Build a two-level decision tree.

   | A | B | C | Number of instances (-) | Number of instances (+) |
   |---|---|---|---|---|
   | T | T | T | 5 | 0 |
   | F | T | T | 0 | 10 |
   | T | F | T | 10 | 0 |
   | F | F | T | 0 | 5 |
   | T | T | F | 0 | 10 |
   | F | T | F | 25 | 0 |
   | T | F | F | 10 | 0 |
   | F | F | F | 0 | 25 |

   a. According to the classification error rate, which attribute would be chosen as the first splitting attribute? For each attribute, show the contingency table and the gain in classification error rate.

   b. Repeat for the two children of the root node.

   c. How many instances are misclassified by the resulting decision tree?

   d. Repeat parts (a), (b), and (c) using C as the splitting attribute.

   e. Use the results in parts (c) and (d) to draw a conclusion about the greedy nature of the decision tree induction algorithm.

Answer:
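Contingency table for attribute C:

|   | C = T | C = F |
|---|---|---|
| + | 15 | 35 |
| - | 15 | 35 |

$$E_{C=T} = 1 - \max\left(\frac{15}{30}, \frac{15}{30}\right) = \frac{15}{30}$$

$$E_{C=F} = 1 - \max\left(\frac{35}{70}, \frac{35}{70}\right) = \frac{35}{70}$$

The data set contains 50 instances of each class, so the error rate before splitting is $E_{orig} = 50/100$, and the gain from splitting on C is

$$\Delta_C = E_{orig} - \frac{30}{100} E_{C=T} - \frac{70}{100} E_{C=F} = \frac{50}{100} - \frac{15}{100} - \frac{35}{100} = \frac{0}{100} = 0$$

Splitting on C yields no gain, so the algorithm chooses A, because it has the highest gain.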
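As a cross-check on part (a), the gains for all three attributes can be recomputed directly from the table. The following is a minimal Python sketch, not part of the original answer. The names `ROWS`, `ATTRS`, `misclassified`, and `error_gain` are illustrative, and the class-count columns are assumed to be in the order "-, +" as in the table header (the error rate is symmetric in the two classes, so the gains do not depend on that order).

```python
from fractions import Fraction

# Rows of the table: (A, B, C, count of "-", count of "+").
# Assumption: the two count columns follow the header order "- +".
ROWS = [
    ("T", "T", "T", 5, 0),
    ("F", "T", "T", 0, 10),
    ("T", "F", "T", 10, 0),
    ("F", "F", "T", 0, 5),
    ("T", "T", "F", 0, 10),
    ("F", "T", "F", 25, 0),
    ("T", "F", "F", 10, 0),
    ("F", "F", "F", 0, 25),
]
ATTRS = {"A": 0, "B": 1, "C": 2}  # attribute name -> column index in a row


def misclassified(rows):
    """Instances a majority-class leaf gets wrong: the smaller class count."""
    neg = sum(r[3] for r in rows)
    pos = sum(r[4] for r in rows)
    return min(neg, pos)


def error_gain(rows, attr):
    """Drop in classification error rate when `rows` is split on `attr`."""
    total = sum(r[3] + r[4] for r in rows)
    idx = ATTRS[attr]
    after = sum(
        misclassified([r for r in rows if r[idx] == value])
        for value in ("T", "F")
    )
    return Fraction(misclassified(rows) - after, total)


gains = {attr: error_gain(ROWS, attr) for attr in ATTRS}
for attr, gain in gains.items():
    print(f"gain({attr}) = {float(gain):.2f}")

best = max(gains, key=gains.get)
print("first splitting attribute:", best)
```

Run as written, the sketch reports gains of 0.15 for A, 0.10 for B, and 0.00 for C, in line with the choice of A as the first splitting attribute. Exact fractions are used so that the zero gain for C is not blurred by floating-point rounding.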