CS 6375 Machine Learning
2009 Spring
Homework 1
Due: 01/28/2009 (tentative), 2:30pm
Part I: Written questions.
30 points.
1. [15 points] (Based on an exercise from Terran Lane.) The following is the training data for a
binary classification task.
Attr 1   Attr 2   Attr 3   Attr 4   class
a        1        c        -1       1
b        0        c        -1       1
a        0        c        1        1
b        1        c        1        1
b        0        c        1        2
a        0        d        -1       2
a        1        d        -1       2
b        1        c        -1       2
Construct a complete (unpruned) decision tree for this data using information gain as your
splitting criterion. Show your work for entropy calculations.
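A short script can help check hand-computed entropy values. The following is a minimal sketch, not part of the assignment: the function names `entropy` and `information_gain` and the tuple layout of `DATA` are assumptions made for illustration, and the written answer should still show each calculation step.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    # p * log2(1/p) summed over the classes that actually occur
    return sum((n / total) * log2(total / n) for n in Counter(labels).values())


def information_gain(rows, attr_index, label_index=-1):
    """Label entropy minus the weighted label entropy after splitting on one attribute."""
    labels = [row[label_index] for row in rows]
    groups = {}
    for row in rows:
        groups.setdefault(row[attr_index], []).append(row[label_index])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Training data from the table above: (Attr 1, Attr 2, Attr 3, Attr 4, class)
DATA = [
    ("a", 1, "c", -1, 1),
    ("b", 0, "c", -1, 1),
    ("a", 0, "c", 1, 1),
    ("b", 1, "c", 1, 1),
    ("b", 0, "c", 1, 2),
    ("a", 0, "d", -1, 2),
    ("a", 1, "d", -1, 2),
    ("b", 1, "c", -1, 2),
]

# Gain of each attribute at the root; repeat on each subset to grow the tree.
for i in range(4):
    print(f"Gain(Attr {i + 1}) = {information_gain(DATA, i):.4f}")
```

To evaluate a split below the root, pass only the rows that reach that node, for example `[r for r in DATA if r[2] == "c"]`.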
2. [5 points] Problem 3.1 (b) and (c) in T. Mitchell's book.
Give decision trees to represent the following Boolean functions:
(b) A ∨ [B ∧ C]
(c) A XOR B
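One way to check a candidate tree is to compare it with the target function on every row of the truth table. The sketch below illustrates the idea on a different function, A AND (NOT B), so it does not answer (b) or (c). The nested-tuple encoding and the names `TREE`, `classify`, and `matches` are hypothetical choices made for this example only.

```python
from itertools import product


def target(a, b):
    """Boolean function to represent: A AND (NOT B)."""
    return a and not b


# Hypothetical encoding: an internal node is
# (attribute_name, subtree_if_false, subtree_if_true); a leaf is a bool.
TREE = ("A", False, ("B", True, False))


def classify(tree, assignment):
    """Walk the tree, following the branch selected by each tested attribute."""
    while not isinstance(tree, bool):
        attr, if_false, if_true = tree
        tree = if_true if assignment[attr] else if_false
    return tree


def matches(tree, func, names):
    """Return True if the tree agrees with func on every truth-table row."""
    return all(
        classify(tree, dict(zip(names, values))) == func(*values)
        for values in product([False, True], repeat=len(names))
    )


print(matches(TREE, target, ["A", "B"]))
```

The same `matches` check works for functions of three variables by passing a three-argument function and `["A", "B", "C"]`.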
3. [10 points] Paper critique.
Read the paper: A comparative analysis of methods for pruning decision trees, F. Esposito, D.
Malerba, and G. Semeraro, in IEEE Transactions on Pattern Analysis and Machine Intelligence,
1997.
A critique should not merely summarize the paper (in fact, I expect your summary of the paper to
be short). Instead, you should think about issues such as: What are the strengths and weaknesses
of the approach in the paper? Have the authors conducted appropriate experiments to evaluate it?
Do you have other thoughts about the proposed methods (e.g., is the underlying assumption sound?
Does the method scale? Can it be used in real applications or other research problems?) Do you
have suggestions for further study? What about the writing of the paper? Are the methods and
experiments clearly described in the paper? ……
You don’t need to answer all of these questions. These are only meant to help you do some
thinking while reading the paper.
