CS446: Machine Learning
Fall 2012
Problem Set 2
Handed Out: September 17, 2014
Due: September 25, 2014
• Feel free to talk to other members of the class in doing the homework. I am more concerned that you learn how to solve the problem than that you demonstrate that you solved it entirely on your own. You should, however, write down your solution yourself. Please try to keep the solution brief and clear.
• Please use Piazza first if you have questions about the homework. Also feel free to send us e-mails and come to office hours.
• Please, no handwritten solutions. You will submit your solution manuscript as a single pdf file.
• A large portion of this assignment deals with programming decision trees to see them in action. While we do provide several pieces of code, you are required to try and test several decision tree algorithms by writing your own code. While we encourage discussion within and outside the class, cheating and copying code is strictly prohibited. Copied code will result in the entire assignment being discarded at the very least.
• The homework is due at 11:59 PM on the due date. We will be using Compass for collecting the homework assignments. Please submit your solution manuscript as a pdf file via Compass. Please do NOT hand in a hard copy of your write-up. Contact the TAs if you are having technical difficulties in submitting the assignment.
1. Learning Decision Trees – 20 points

For this question, you will manually induce a decision tree from a small data set. Table 1 shows the Balloons data set from the UCI Machine Learning repository that was first used for an experiment in cognitive psychology [1]. The data consists of four attributes (Color, Size, Act, and Age) and a binary label (Inflated). You will represent this data as decision trees using two splitting heuristics.

(a) [7 points] Use the ID3 heuristic to represent the data as a decision tree. You can report the decision tree as a series of if-then statements in the text, or graphically; it's your choice.

Example:

    if feature_0 = x :
        if feature_1 = y :
            class = T
        else :
            class = F
    else :
        if feature ......
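As a reminder of what the ID3 heuristic measures, the following is a minimal sketch of entropy and information gain in Python. The function names `entropy` and `information_gain` and the toy attributes (`shape`, `weight`, `floats`) are hypothetical and chosen only for illustration. The toy rows are not the Balloons data, and the assignment's provided code may organize this differently.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, label):
    """Label entropy minus the weighted label entropy after splitting on `attribute`."""
    base = entropy([row[label] for row in rows])
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row[label] for row in rows if row[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


# Hypothetical toy examples (NOT the Balloons table).
toy = [
    {"shape": "round", "weight": "light", "floats": "T"},
    {"shape": "round", "weight": "heavy", "floats": "T"},
    {"shape": "square", "weight": "light", "floats": "F"},
    {"shape": "square", "weight": "heavy", "floats": "F"},
]

# ID3 splits on the attribute with the largest information gain.
gains = {a: information_gain(toy, a, "floats") for a in ("shape", "weight")}
best = max(gains, key=gains.get)
```

In this toy case `shape` separates the labels perfectly while `weight` tells us nothing, so ID3 would place `shape` at the root. Inducing the tree by hand follows the same steps: compute the gain of each attribute, split on the best one, and repeat on each branch that still contains mixed labels.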
[1] You can learn more about this data set at

Table 1: The Balloons data set

Color   Size   Act      Age    Inflated
------  -----  -------  -----  --------
Yellow  Small  Stretch  Adult  T
Yellow  Small  Stretch  Child  T
Yellow  Small  Dip      Adult  T
Yellow  Small  Dip      Child  T
Yellow  Large  Stretch  Adult  T
Yellow  Large  Stretch  Child  F
Yellow  Large  Dip      Adult  F
Yellow  Large  Dip      Child  F
Purple  Small  Stretch  Adult  T
Purple  Small  Stretch  Child  F
Purple  Small  Dip      Adult  F
Purple  Small  Dip      Child  F
Purple  Large  Stretch  Adult  T
Purple  Large  Stretch  Child  F
Purple  Large  Dip      Adult  F
Purple  Large  Dip      Child  F
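For checking hand calculations, it can help to have the table in machine-readable form. The sketch below transcribes the 16 examples into Python and counts the labels. The names `COLUMNS`, `ROWS`, `balloons`, and `label_counts` are hypothetical choices, not part of any code distributed with the assignment.

```python
from collections import Counter

COLUMNS = ("Color", "Size", "Act", "Age", "Inflated")

# Rows transcribed from the Balloons table, in the order they appear.
ROWS = [
    ("Yellow", "Small", "Stretch", "Adult", "T"),
    ("Yellow", "Small", "Stretch", "Child", "T"),
    ("Yellow", "Small", "Dip", "Adult", "T"),
    ("Yellow", "Small", "Dip", "Child", "T"),
    ("Yellow", "Large", "Stretch", "Adult", "T"),
    ("Yellow", "Large", "Stretch", "Child", "F"),
    ("Yellow", "Large", "Dip", "Adult", "F"),
    ("Yellow", "Large", "Dip", "Child", "F"),
    ("Purple", "Small", "Stretch", "Adult", "T"),
    ("Purple", "Small", "Stretch", "Child", "F"),
    ("Purple", "Small", "Dip", "Adult", "F"),
    ("Purple", "Small", "Dip", "Child", "F"),
    ("Purple", "Large", "Stretch", "Adult", "T"),
    ("Purple", "Large", "Stretch", "Child", "F"),
    ("Purple", "Large", "Dip", "Adult", "F"),
    ("Purple", "Large", "Dip", "Child", "F"),
]

# One dictionary per example, keyed by column name.
balloons = [dict(zip(COLUMNS, row)) for row in ROWS]

# Class distribution of the Inflated label: 7 examples are T and 9 are F.
label_counts = Counter(example["Inflated"] for example in balloons)
```

Storing each example as a dictionary keeps attribute lookups readable (`example["Size"]`) when filtering the data for a candidate split.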
