# Lect22 - Learning decision trees


[Figure: an example decision tree. Internal nodes are tests (Test 1 through Test 6), branches are test outcomes (V11, V12, V13, V21, V22), and leaves are labels (Label 1, Label 2).]
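The tree in the figure can be pictured as a small recursive data structure. The sketch below is only an illustration: the class names `Node` and `Leaf`, the function `classify`, and the particular arrangement of tests and values are hypothetical, since the slide's exact layout is not recoverable from the text.

```python
from dataclasses import dataclass, field

@dataclass
class Leaf:
    label: str                      # e.g. "Label 1"

@dataclass
class Node:
    test: str                       # name of the test applied at this node
    children: dict = field(default_factory=dict)  # outcome value -> subtree

def classify(tree, example):
    """Follow the branch matching each test outcome until a leaf is reached."""
    while isinstance(tree, Node):
        tree = tree.children[example[tree.test]]
    return tree.label

# Hypothetical arrangement, reusing the names that appear in the figure.
tree = Node("Test 1", {
    "V11": Leaf("Label 1"),
    "V12": Node("Test 2", {"V21": Leaf("Label 1"), "V22": Leaf("Label 2")}),
    "V13": Leaf("Label 2"),
})

print(classify(tree, {"Test 1": "V12", "Test 2": "V22"}))
```

Each example is a mapping from test name to outcome, so classification is one dictionary lookup per level of the tree.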

[Figure: Training Data. Two sets of examples labeled + and -. In the first, the + and - labels are thoroughly mixed; in the second, the + examples and the - examples are grouped together.]

- Highly disorganized: high entropy, much information required.
- Highly organized: low entropy, little information required.
## Measuring Information

S = A A A A A A A A A A A A A A A A B B B B B B B B C C D D E E F F

| Symbol | Count | Code |
|--------|-------|------|
| A      | 16    | 1    |
| B      | 8     | 01   |
| C      | 2     | 0000 |
| D      | 2     | 0001 |
| E      | 2     | 0010 |
| F      | 2     | 0011 |

What is the expected number of bits?

- 16/32 use 1 bit
- 8/32 use 2 bits
- 4 × 2/32 use 4 bits

$$0.5(1) + 0.25(2) + 0.0625(4) + 0.0625(4) + 0.0625(4) + 0.0625(4) = 0.5 + 0.5 + 0.25 + 0.25 + 0.25 + 0.25 = 2$$

$$H(S) = -\sum_{v \in \text{Labels}} \Pr(v) \log_2 \Pr(v)$$
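As a quick check of the arithmetic on this slide, the following sketch computes both the average code length and the entropy of the 32-symbol set. The function names `entropy` and `expected_code_length` are illustrative choices, not part of the lecture.

```python
from math import log2
from collections import Counter

# The 32-symbol set from the slide: 16 A, 8 B, and 2 each of C, D, E, F.
S = "A" * 16 + "B" * 8 + "CC" + "DD" + "EE" + "FF"

# The code words listed on the slide.
code = {"A": "1", "B": "01", "C": "0000", "D": "0001", "E": "0010", "F": "0011"}

def entropy(items):
    """H(S) = -sum over v of Pr(v) * log2(Pr(v)), with Pr estimated from counts."""
    n = len(items)
    return -sum((c / n) * log2(c / n) for c in Counter(items).values())

def expected_code_length(items, code):
    """Average number of bits per symbol when each symbol is sent as its code word."""
    return sum(len(code[x]) for x in items) / len(items)

print(expected_code_length(S, code))  # 2.0
print(entropy(S))                     # 2.0
```

The two numbers agree because every probability here is a power of 1/2, so each code word can have length exactly -log2 Pr(v).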

## Information Gain

Subtract the information required after the split from the information required before it.

[Figure: a set $S_b$ of + and - examples with entropy $H(S_b)$, split into three subsets $S_{a1}$, $S_{a2}$, $S_{a3}$ with entropies $H(S_{a1})$, $H(S_{a2})$, $H(S_{a3})$.]

Information required:

- Before: $H(S_b)$
- After: $\Pr(S_{a1}) H(S_{a1}) + \Pr(S_{a2}) H(S_{a2}) + \Pr(S_{a3}) H(S_{a3})$

Estimate probabilities using sample counts.

$$\text{Information Gain} = H(S_b) - \left[ \Pr(S_{a1}) H(S_{a1}) + \Pr(S_{a2}) H(S_{a2}) + \Pr(S_{a3}) H(S_{a3}) \right]$$
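The before-minus-after calculation can be sketched in a few lines. The function name `information_gain` and the eight-example split below are hypothetical and chosen only to make the arithmetic easy to follow; they are not the data in the figure.

```python
from math import log2
from collections import Counter

def entropy(labels):
    """Entropy in bits of a list of labels, with probabilities from sample counts."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(before, subsets):
    """H(before) minus the size-weighted entropy of the subsets after the split."""
    n = len(before)
    after = sum(len(s) / n * entropy(s) for s in subsets if s)
    return entropy(before) - after

# Hypothetical data: 4 "+" and 4 "-" examples split three ways.
S_b = list("++++----")
S_a1, S_a2, S_a3 = list("+++"), list("---"), list("+-")

print(information_gain(S_b, [S_a1, S_a2, S_a3]))  # 0.75
```

Here the set before the split needs 1 bit per example, two of the three subsets are pure (0 bits), and the mixed subset holds a quarter of the examples at 1 bit each, so the gain is 1 - 0.25 = 0.75.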
## Choosing the Most Useful Test

- Estimate the information gain for each test.
- Choose the test with the highest information gain.

$$\text{Information Gain} = H(S_b) - \sum_i \Pr(S_{ai}) H(S_{ai})$$
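A minimal sketch of this selection step, assuming each example is a pair of a feature dictionary and a label. The names `gain` and `most_useful_test`, and the four toy examples, are hypothetical.

```python
from math import log2
from collections import Counter, defaultdict

def entropy(labels):
    """Entropy in bits of a list of labels, with probabilities from sample counts."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gain(examples, test):
    """Information gain from splitting the examples on the outcome of one test."""
    groups = defaultdict(list)
    for features, label in examples:
        groups[features[test]].append(label)
    labels = [label for _, label in examples]
    after = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - after

def most_useful_test(examples, tests):
    """Return the test with the highest estimated information gain."""
    return max(tests, key=lambda t: gain(examples, t))

# Hypothetical data: "Test 1" separates the labels perfectly, "Test 2" not at all.
examples = [
    ({"Test 1": "a", "Test 2": "x"}, "+"),
    ({"Test 1": "a", "Test 2": "y"}, "+"),
    ({"Test 1": "b", "Test 2": "x"}, "-"),
    ({"Test 1": "b", "Test 2": "y"}, "-"),
]

print(most_useful_test(examples, ["Test 1", "Test 2"]))  # Test 1
```

In a full tree learner this step would be repeated on each subset produced by the chosen test; the sketch covers only a single choice.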

## Example

- Restaurant example in text
- Tennis example
## Will I Play Tennis?

| Feature  | Values              |
|----------|---------------------|
| Outlook  | Sun, Overcast, Rain |
| Temp.    | Hot, Mild, Cool     |
| Humidity | High, Normal, Low   |
| Wind     | Strong, Weak        |

Label: +, -

- Features are evaluated in the morning.
- Tennis is played in the afternoon.

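## Training Set

Each value is abbreviated to its first letter, in the order Outlook, Temp., Humidity, Wind (for example, S H H W is Sun, Hot, High, Weak).

| #  | Outlook | Temp. | Humidity | Wind | Label |
|----|---------|-------|----------|------|-------|
| 1  | S       | H     | H        | W    | -     |
| 2  | S       | H     | H        | S    | -     |
| 3  | O       | H     | H        | W    | +     |
| 4  | R       | M     | H        | W    | +     |
| 5  | R       | C     | N        | W    | +     |
| 6  | R       | C     | N        | S    | -     |
| 7  | O       | C     | N        | S    | +     |
| 8  | S       | M     | H        | W    | -     |
| 9  | S       | C     | N        | W    | +     |
| 10 | …       | …     | …        | …    | …     |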
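To see how the information-gain calculation plays out on this data, the sketch below scores each feature using only the nine complete rows listed above. Because the rest of the training set is not available here, the resulting gains describe this partial set only. The column order (Outlook, Temp., Humidity, Wind, Label) is an assumption based on the order in which the features were introduced, and the helper names are illustrative.

```python
from math import log2
from collections import Counter, defaultdict

# Assumed column order for the single-letter codes in each row.
FEATURES = ["Outlook", "Temp", "Humidity", "Wind"]

# The nine complete rows listed above; the last field is the label.
ROWS = [
    "S H H W -",
    "S H H S -",
    "O H H W +",
    "R M H W +",
    "R C N W +",
    "R C N S -",
    "O C N S +",
    "S M H W -",
    "S C N W +",
]

def entropy(labels):
    """Entropy in bits of a list of labels, with probabilities from sample counts."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gain(rows, column):
    """Information gain from splitting the rows on the feature in the given column."""
    groups = defaultdict(list)
    for r in rows:
        groups[r[column]].append(r[-1])
    labels = [r[-1] for r in rows]
    after = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy(labels) - after

rows = [line.split() for line in ROWS]
gains = {name: gain(rows, i) for i, name in enumerate(FEATURES)}
best = max(gains, key=gains.get)

for name, g in gains.items():
    print(f"{name}: {g:.3f}")
print("Highest gain on these nine rows:", best)
```

On these nine rows Outlook scores highest, largely because both Overcast examples are labeled +, which leaves that branch with zero entropy.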