hw1 - CS 6375 Machine Learning Fall 2010 Assignment 1:...

CS 6375 Machine Learning
Fall 2010
Assignment 1: Decision Tree Induction

Part I: Due by Thursday, September 9, 11:59 p.m.
Part II: Due by Tuesday, September 21, 11:59 p.m.

For Part I (the written problems), you may either slip a written (hard-copy) solution under the TA’s office door (do not leave it in the rack outside the TA’s office) or submit your solution electronically via eLearning. Regardless of the submission method you use, any submission received after September 9 will be considered late.

For Part II (the programming part), only electronic submissions via eLearning will be accepted. You will receive no credit if you only submit a hard-copy solution.

Part I: Written Problems (25 points)

1. Representing Boolean Functions (10 points)

Give decision trees to represent the following concepts:

(a) ( ¬ A B ) ∧ ¬ ( C A ). Your decision tree must contain as few nodes as possible.
(b) ( A B ) C

2. Decision Trees (15 points)

Spam has become an increasingly annoying problem for e-mail users. In this problem we are interested in using the ID3 decision tree induction algorithm to automatically determine whether or not an e-mail is a spam based on whether the words “nigeria”, “viagra”, and “learning” appear in the e-mail. Below are the instances from which our decision tree will be learned. Note that a word has the value 1 if and only if it is present in the corresponding e-mail.

No.  nigeria  viagra  learning  Class
1    1        0       0         1
2    0        1       0         1
3    0        0       0         0
4    1        0       1         0
5    0        0       0         0
6    1        1       0         1
7    0        1       1         0
8    1        0       0         1
9    0        0       0         0
10   1        0       0         1

Using these descriptions as training instances, show the decision tree created by the ID3 decision tree learning algorithm. Show the information gain calculations that you computed to create the tree. Be sure to indicate the class value to associate with each leaf of the tree and the set of instances that are associated with each leaf.
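The information gain calculations requested in Problem 2 can be checked with a short script. The following is a minimal sketch, not part of the original assignment: it assumes the standard definitions of entropy in bits and of information gain as the reduction in entropy after splitting on one attribute. The names `DATA`, `ATTRIBUTES`, `entropy`, and `information_gain` are hypothetical and chosen only for illustration.

```python
import math

# Training instances copied from the table in Problem 2.
# Each row is (nigeria, viagra, learning, class).
DATA = [
    (1, 0, 0, 1),
    (0, 1, 0, 1),
    (0, 0, 0, 0),
    (1, 0, 1, 0),
    (0, 0, 0, 0),
    (1, 1, 0, 1),
    (0, 1, 1, 0),
    (1, 0, 0, 1),
    (0, 0, 0, 0),
    (1, 0, 0, 1),
]
ATTRIBUTES = ["nigeria", "viagra", "learning"]


def entropy(labels):
    """Shannon entropy, in bits, of a list of 0/1 class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    h = 0.0
    for value in (0, 1):
        p = labels.count(value) / n
        if p > 0:  # treat 0 * log2(0) as 0
            h -= p * math.log2(p)
    return h


def information_gain(rows, attr_index):
    """Entropy of the class labels minus the weighted entropy after the split."""
    labels = [row[-1] for row in rows]
    remainder = 0.0
    for value in (0, 1):
        subset = [row[-1] for row in rows if row[attr_index] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(labels) - remainder


# Gain of each attribute at the root, computed over all ten instances.
gains = {name: information_gain(DATA, i) for i, name in enumerate(ATTRIBUTES)}
for name, g in gains.items():
    print(f"Gain({name}) = {g:.4f}")
```

ID3 places the attribute with the largest gain at the root, then repeats the same calculation on the subset of instances that reaches each branch.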
Part II: Programming (75 points)

Implement the ID3 decision tree learning algorithm that we discussed in class. To simplify the implementation, your system only needs to handle binary classification tasks (i.e., each instance will have a class value of 0 or 1). In addition, you may assume that all attributes are binary-valued
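
Under the two simplifications stated above (binary class values and binary attributes), the core of ID3 fits in a few dozen lines. The following is a rough sketch under assumptions the assignment text shown here does not specify: instances are dictionaries mapping attribute names to 0 or 1, an internal node is a dictionary, a leaf is a bare class value, ties in the majority vote go to class 0, and an empty branch inherits the majority class of its parent. All function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain(rows, labels, attr):
    """Information gain from splitting on a binary attribute."""
    remainder = 0.0
    for value in (0, 1):
        subset = [y for x, y in zip(rows, labels) if x[attr] == value]
        remainder += len(subset) / len(labels) * entropy(subset)
    return entropy(labels) - remainder


def majority(labels):
    """Most common class; ties go to class 0 (an arbitrary assumption)."""
    return int(sum(labels) * 2 > len(labels))


def id3(rows, labels, attrs, default=0):
    """Return a nested-dict tree; leaves are bare class values (0 or 1)."""
    if not labels:               # no instances reach this branch
        return default
    if len(set(labels)) == 1:    # pure node
        return labels[0]
    if not attrs:                # attributes exhausted
        return majority(labels)
    best = max(attrs, key=lambda a: gain(rows, labels, a))
    remaining = [a for a in attrs if a != best]
    node = {"attr": best}
    for value in (0, 1):
        idx = [i for i, x in enumerate(rows) if x[best] == value]
        node[value] = id3(
            [rows[i] for i in idx],
            [labels[i] for i in idx],
            remaining,
            majority(labels),
        )
    return node


def classify(tree, instance):
    """Follow branches until a leaf (a bare class value) is reached."""
    while isinstance(tree, dict):
        tree = tree[instance[tree["attr"]]]
    return tree


# Toy usage with made-up data: the class simply copies attribute "a".
toy_rows = [{"a": 0, "b": 0}, {"a": 0, "b": 1}, {"a": 1, "b": 0}, {"a": 1, "b": 1}]
toy_labels = [0, 0, 1, 1]
toy_tree = id3(toy_rows, toy_labels, ["a", "b"])
print(toy_tree)
```

The nested-dictionary representation is only one option; a small node class would work equally well and may make printing the tree easier.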

This note was uploaded on 11/03/2010 for the course COMPUTER S CS6375 taught by Professor Vincent Ng during the Fall '10 term at University of Texas at Dallas, Richardson.
