CSE 572-DM-Huan.Liu-S10-HW3-Sol


Data Mining (CSE 572), ASU, Spring 2010, Huan Liu
http://www.public.asu.edu/~huanliu/DM10S/cse572.html

School of Computing, Informatics, and Decision Systems Engineering

Data Mining (CSE 572), Spring 2010

Instructor: Dr. Huan Liu

TA: Mohammad Ali Abbasi

Homework #3, Classification

Deadline: Feb 24, 2010

HW3-1: Consider the training examples shown in Table 1 and the test set shown in Table 2, for a binary classification of mammals. (45 points)

Table 1. Training set.

Table 2. Test set.

a- What is the entropy of the training set?

The training set contains 4 examples of class Yes and 6 of class No. The entropy of a node t is

$$\text{Entropy}(t) = -\sum_{j} p(j \mid t) \log_2 p(j \mid t)$$

Entropy = -(4/10 * log2(4/10) + 6/10 * log2(6/10)) ≈ 0.971

b- What are the information gains of “body temperature”, “Give Birth”, “Hibernates”, and “four legged”?

For a parent node p with n examples that is split into k partitions, where partition i holds n_i examples:

$$\text{GAIN}_{\text{split}} = \text{Entropy}(p) - \sum_{i=1}^{k} \frac{n_i}{n}\,\text{Entropy}(i)$$

“body temperature”: 0.971 - 0.361 = 0.610

“Give Birth”: 0.610

“Hibernates”: ≈ 0.020

“four legged”: ≈ 0.020

c- Based on the results from part b, what is the best split (according to the information gain)?

“Give Birth” and “body temperature” tie for the highest information gain (about 0.610), so either one is the best split. “Hibernates” and “four legged” gain almost nothing (about 0.020).
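The arithmetic in parts a and b can be checked with a short script. This is a minimal sketch, not part of the original solution: the function names `entropy` and `information_gain` are illustrative, and the per-branch counts for “Give Birth” (4 Yes / 1 No when Give Birth = Yes, 0 Yes / 5 No when Give Birth = No) are taken from the contingency table given in part d. Counts for the other attributes are not available in this excerpt, so they are not reproduced.

```python
import math


def entropy(counts):
    """Base-2 entropy of a node, given its per-class example counts."""
    total = sum(counts)
    # Classes with zero examples contribute nothing (0 * log 0 is taken as 0).
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(parent_counts, child_counts):
    """Parent entropy minus the size-weighted entropy of the child nodes."""
    n = sum(parent_counts)
    weighted = sum(sum(child) / n * entropy(child) for child in child_counts)
    return entropy(parent_counts) - weighted


# Training set: 4 examples of class Yes, 6 of class No (from part a).
parent = [4, 6]

# Give Birth split, [Yes, No] class counts per branch (from the table in part d).
give_birth = [[4, 1], [0, 5]]

print(f"Entropy of training set: {entropy(parent):.3f}")
print(f"Gain for Give Birth:     {information_gain(parent, give_birth):.3f}")
```

Run as written, the two printed values should agree with the figures above to three decimal places (0.971 and 0.610).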
d- Between “Give Birth” and “four legged”, which is the better split according to the classification error rate?

The classification error at node t is

$$\text{Error}(t) = 1 - \max_{i} P(i \mid t)$$

Give Birth:

| Class | Give Birth = Yes | Give Birth = No |
|-------|------------------|-----------------|
| Yes   | 4                | 0               |
| No    | 1                | 5               |

Error(Give Birth = Yes) = 1 - max(4/5, 1/5) = 0.2

Error(Give Birth = No) = 1 - max(0/5, 5/5) = 0

Weighted classification error = 5/10 * 0.2 + 5/10 * 0 = 0.1
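The weighted classification error for “Give Birth” can be reproduced in a few lines. This is a sketch under the same assumptions as the counts shown in the Give Birth table; the helper names `classification_error` and `weighted_error` are illustrative and not taken from the course material. The “four legged” counts are not visible in this excerpt, so only “Give Birth” is evaluated.

```python
def classification_error(counts):
    """Error(t) = 1 - max_i P(i | t), from per-class counts at node t."""
    total = sum(counts)
    return 1 - max(counts) / total


def weighted_error(child_counts):
    """Size-weighted average of the classification error over all branches."""
    n = sum(sum(child) for child in child_counts)
    return sum(sum(child) / n * classification_error(child) for child in child_counts)


# [Yes, No] class counts for the Give Birth = Yes and Give Birth = No branches.
give_birth = [[4, 1], [0, 5]]

for branch, counts in zip(("Yes", "No"), give_birth):
    print(f"Error(Give Birth = {branch}): {classification_error(counts):.2f}")
print(f"Weighted classification error: {weighted_error(give_birth):.2f}")
```

The same two helpers could be applied to any other attribute once its per-branch class counts are known, which makes the comparison asked for in part d a matter of choosing the smaller weighted error.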


