LT-6 - Overcast 83 78 FALSE Play Overcast 64 65 TRUE Play...

Splitting criteria

In C4.5, given a node t, the splitting criterion is built from the following quantities. The information (entropy) at the node is

\[ \mathrm{Info}(t) = -\sum_{j=1}^{k} \frac{f(C_j, S)}{|S|} \log_2 \frac{f(C_j, S)}{|S|} \]

where f(C_j, S) stands for the number of samples in S that belong to class C_j (out of k possible classes) and |S| denotes the total number of samples in the set S. The gain ratio is

\[ \mathrm{Gain\ ratio}(t) = \frac{\mathrm{Gain}(t)}{\mathrm{Split\ info}(t)} \]

where

\[ \mathrm{Info}_x(t) = \sum_{i=1}^{n} \frac{|T_i|}{|T|} \, \mathrm{Info}(T_i) \]

\[ \mathrm{Gain}(t) = \mathrm{Info}(t) - \mathrm{Info}_x(t) \]

\[ \mathrm{Split\ info}(t) = -\sum_{i=1}^{n} \frac{|T_i|}{|T|} \log_2 \frac{|T_i|}{|T|} \]

and where x refers to the splitting attribute currently being tested, T is the set of samples at node t (the set written S above), and T_1, ..., T_n are the n subsets of T produced by splitting on x.

Let us look at how these quantities are calculated for the Outlook data set, shown in the table prepared by Quinlan. We choose the 'outlook' attribute as the splitting attribute to be tested (x_1).
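The definitions above can be sketched in Python, working directly on class counts. This is only an illustration under stated assumptions: the function names `info` and `gain_ratio` are hypothetical, terms with a zero count are skipped (that is, 0 log2 0 is taken as 0), and a split that produces a single subset is given a gain ratio of 0 to avoid dividing by zero.

```python
from math import log2


def info(class_counts):
    """Info of a set, given the number of samples in each class."""
    total = sum(class_counts)
    # Zero counts are skipped: 0 * log2(0) is treated as 0 (assumption).
    return -sum(c / total * log2(c / total) for c in class_counts if c > 0)


def gain_ratio(parent_counts, subset_counts):
    """Gain ratio of a split.

    parent_counts: class counts at the node, e.g. [9, 5]
    subset_counts: one list of class counts per subset T_i
    """
    total = sum(parent_counts)
    sizes = [sum(counts) for counts in subset_counts]

    # Info_x: size-weighted average of the Info of each subset.
    info_x = sum(size / total * info(counts)
                 for size, counts in zip(sizes, subset_counts))
    gain = info(parent_counts) - info_x

    # Split info has the same form as Info, applied to the subset sizes.
    split_info = info(sizes)
    if split_info == 0:
        return 0.0  # only one non-empty subset (assumed convention)
    return gain / split_info
```

For example, a node with two samples of each class that is split into two pure subsets, `gain_ratio([2, 2], [[2, 0], [0, 2]])`, has a gain of 1 bit and a split info of 1 bit.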
Outlook    Temp  Humidity  Windy  Class
Overcast   72    90        TRUE   Play
Overcast   83    78        FALSE  Play
Overcast   64    65        TRUE   Play
Overcast   81    75        FALSE  Play
Rain       71    80        TRUE   Dont play
Rain       65    70        TRUE   Dont play
Rain       75    80        FALSE  Play
Rain       68    80        FALSE  Play
Rain       70    96        FALSE  Play
Sunny      75    70        TRUE   Play
Sunny      80    90        TRUE   Dont play
Sunny      85    85        FALSE  Dont play
Sunny      72    95        FALSE  Dont play
Sunny      69    70        FALSE  Play

Of the 14 samples, 9 are 'Play' and 5 are 'Dont play', so

\[ \mathrm{Info}(T) = -\frac{9}{14}\log_2\frac{9}{14} - \frac{5}{14}\log_2\frac{5}{14} = 0.940 \]

Splitting on outlook gives three subsets: Overcast (4 Play, 0 Dont play), Rain (3 Play, 2 Dont play) and Sunny (2 Play, 3 Dont play). Taking 0 log_2 0 as 0,

\[ \mathrm{Info}_{x_1}(T) = \frac{4}{14}\left(-\frac{4}{4}\log_2\frac{4}{4} - \frac{0}{4}\log_2\frac{0}{4}\right) + \frac{5}{14}\left(-\frac{3}{5}\log_2\frac{3}{5} - \frac{2}{5}\log_2\frac{2}{5}\right) + \frac{5}{14}\left(-\frac{2}{5}\log_2\frac{2}{5} - \frac{3}{5}\log_2\frac{3}{5}\right) = 0.694 \]

\[ \mathrm{Split\ info}_{x_1}(T) = -\frac{4}{14}\log_2\frac{4}{14} - \frac{5}{14}\log_2\frac{5}{14} - \frac{5}{14}\log_2\frac{5}{14} = 1.577 \]

\[ \mathrm{Gain}(x_1) = 0.940 - 0.694 = 0.246 \]

\[ \mathrm{Gain\ ratio}(x_1) = \frac{0.246}{1.577} = 0.156 \]

The split selected for a node is the one that provides the maximum information gain ratio....
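The worked figures above can be checked with a short Python sketch. It is an illustration only: the rows are transcribed from the table, the names `ROWS`, `CATEGORICAL`, `entropy` and `gain_ratio` are hypothetical, and only the two categorical attributes (outlook and windy) are compared, because the text above does not say how the numeric attributes are split.

```python
from collections import Counter, defaultdict
from math import log2

# Rows transcribed from the table: (outlook, temp, humidity, windy, class).
ROWS = [
    ("Overcast", 72, 90, True, "Play"),
    ("Overcast", 83, 78, False, "Play"),
    ("Overcast", 64, 65, True, "Play"),
    ("Overcast", 81, 75, False, "Play"),
    ("Rain", 71, 80, True, "Dont play"),
    ("Rain", 65, 70, True, "Dont play"),
    ("Rain", 75, 80, False, "Play"),
    ("Rain", 68, 80, False, "Play"),
    ("Rain", 70, 96, False, "Play"),
    ("Sunny", 75, 70, True, "Play"),
    ("Sunny", 80, 90, True, "Dont play"),
    ("Sunny", 85, 85, False, "Dont play"),
    ("Sunny", 72, 95, False, "Dont play"),
    ("Sunny", 69, 70, False, "Play"),
]

# Column index of each categorical attribute (illustrative names).
CATEGORICAL = {"outlook": 0, "windy": 3}


def entropy(counts):
    """-sum(p * log2(p)) over the non-zero counts (0 log2 0 taken as 0)."""
    counts = list(counts)
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)


def gain_ratio(rows, col):
    """Gain ratio obtained by splitting rows on the attribute in column col."""
    # Group the class labels by the value of the tested attribute.
    groups = defaultdict(list)
    for row in rows:
        groups[row[col]].append(row[-1])

    n = len(rows)
    info_t = entropy(Counter(row[-1] for row in rows).values())
    info_x = sum(len(g) / n * entropy(Counter(g).values())
                 for g in groups.values())
    split_info = entropy(len(g) for g in groups.values())
    return (info_t - info_x) / split_info


ratios = {name: gain_ratio(ROWS, col) for name, col in CATEGORICAL.items()}
best = max(ratios, key=ratios.get)

for name, ratio in ratios.items():
    print(f"{name}: gain ratio = {ratio:.3f}")
print("attribute with maximum gain ratio:", best)
```

Under these assumptions the gain ratio for outlook rounds to 0.156, matching the hand calculation, and outlook is preferred over windy.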