Question
Consider the following training set, where X = {0,1}^3 and Y = {0,1}:
((1,1,1), 1), ((1,0,0), 1), ((1,1,0), 0), ((0,0,1), 0)
i) (6 Points) Suppose we wish to use this training set to build a decision tree of depth 2 (i.e., for each input we are allowed to ask two questions of the form (x_i = 0?) before deciding on the label). Construct the decision tree using the greedy algorithm discussed in class. Does it matter whether you use empirical error, entropy, or Gini impurity as your performance metric in this case? (A criterion-comparison sketch follows part ii.)
ii) (4 Points) Find a decision tree of depth 2 that attains zero training error.
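For checking the arithmetic in part i), here is a minimal Python sketch (not part of the original question) that computes the weighted impurity of each candidate root question (x_i = 0?) on the given training set under the three criteria. The function names and the 0-based feature indexing are my own choices for illustration.

```python
from collections import Counter
from math import log2

# Training set from the question: ((x1, x2, x3), y)
S = [((1, 1, 1), 1), ((1, 0, 0), 1), ((1, 1, 0), 0), ((0, 0, 1), 0)]

def empirical_error(labels):
    # Fraction of labels that a majority-vote leaf gets wrong.
    if not labels:
        return 0.0
    majority_count = Counter(labels).most_common(1)[0][1]
    return 1.0 - majority_count / len(labels)

def entropy(labels):
    # Binary entropy of the label distribution in a leaf.
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def gini(labels):
    # Gini impurity 2p(1-p) for binary labels.
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def split_cost(sample, feature, impurity):
    # Weighted impurity of the two children produced by asking (x_feature = 0?).
    left = [y for (x, y) in sample if x[feature] == 0]
    right = [y for (x, y) in sample if x[feature] == 1]
    n = len(sample)
    return (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)

for name, impurity in [("error", empirical_error), ("entropy", entropy), ("gini", gini)]:
    costs = {i + 1: round(split_cost(S, i, impurity), 4) for i in range(3)}
    print(f"{name:>8}: {costs}")
```

A smaller weighted impurity means a better root question under the corresponding criterion, so comparing the three printed rows shows whether the criteria agree on the first split. The same helper functions could also drive a brute-force check over depth-2 trees when working on part ii).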