Only one class left → no ambiguity in the class label.
This node is almost "pure" → little ambiguity in the class label.
These nodes contain a mixture of classes → we cannot disambiguate between the classes.

• Pi = probability of occurrence of class i
• Entropy of the class distribution at a node: H = − Σi Pi log2 Pi
  – High entropy → all the classes are (nearly) equally likely
  – Low entropy → a few classes are likely; most of the classes are rarely observed
  – By convention, 0 log2 0 = 0
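The entropy definition above can be checked with a few lines of Python (a sketch, not from the slides; the function name is my own):

```python
import math

def entropy(counts):
    """Shannon entropy (in bits) of a node, given per-class example counts.

    Uses the convention 0 * log2(0) = 0, i.e. empty classes are skipped.
    """
    total = sum(counts)
    h = 0.0
    for c in counts:
        if c > 0:                 # 0 log2 0 is taken to be 0
            p = c / total         # Pi = probability of class i
            h -= p * math.log2(p)
    return h

print(entropy([7, 7]))   # all classes equally likely -> 1.0 (high entropy)
print(entropy([14, 0]))  # only one class left -> 0.0 (pure node)
```

Passing class counts rather than probabilities keeps the 0 log2 0 convention automatic: an absent class simply contributes nothing to the sum.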
Entropy
The entropy captures the degree of "purity" of the distribution.

Example Entropy Calculation

Conditional Entropy
Information Gain

Entropy before splitting: H. After splitting, a fraction PL of the data goes to the left node, which has entropy HL, and a fraction PR goes to the right node, which has entropy HR. The average entropy (or "conditional entropy") after splitting is:

  PL HL + PR HR

We want nodes as pure as possible → we want to reduce the entropy as much as possible → we want to maximize the difference between the entropy of the parent node and the expected entropy of the children.

Maximize: H − (PL HL + PR HR)
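A minimal sketch of the quantity being maximized (helper names are my own, not from the slides):

```python
import math

def entropy(counts):
    """Entropy in bits of a class-count list; empty classes contribute 0."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def info_gain(parent, left, right):
    """H(parent) - [P_L * H_L + P_R * H_R] for a binary split.

    Each argument is a per-class count list; left + right must partition parent.
    """
    n = sum(parent)
    p_l, p_r = sum(left) / n, sum(right) / n
    return entropy(parent) - (p_l * entropy(left) + p_r * entropy(right))

# a 9-vs-5 parent split into (6, 1) and (3, 4) children:
print(round(info_gain([9, 5], [6, 1], [3, 4]), 2))  # 0.15
```

A perfect split (each child pure) would recover the full parent entropy as gain; a split that leaves the class mixture unchanged in both children gains nothing.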
Notations
• Entropy: H(Y) = entropy of the distribution of classes at a node
• Conditional entropy:
  – Discrete: H(Y|Xj) = entropy after splitting with respect to variable j
  – Continuous: H(Y|Xj, t) = entropy after splitting with respect to variable j with threshold t
• Information gain:
  – Discrete: IG(Y|Xj) = H(Y) − H(Y|Xj)
  – Continuous: IG(Y|Xj, t) = H(Y) − H(Y|Xj, t)
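In the continuous case, H(Y|Xj, t) is computed by thresholding feature j at t, and a good t can be found by scanning the midpoints between consecutive sorted feature values. A sketch with invented toy data (the function and variable names are my own):

```python
import math

def entropy(labels):
    """Entropy in bits of a list of class labels."""
    n = len(labels)
    return -sum((labels.count(y) / n) * math.log2(labels.count(y) / n)
                for y in set(labels))

def cond_entropy_threshold(xs, ys, t):
    """H(Y | Xj, t): weighted entropy after splitting feature values xs at t."""
    left = [y for x, y in zip(xs, ys) if x <= t]
    right = [y for x, y in zip(xs, ys) if x > t]
    n = len(ys)
    h = 0.0
    if left:
        h += len(left) / n * entropy(left)
    if right:
        h += len(right) / n * entropy(right)
    return h

# hypothetical 1-D data: low values are class 0, high values are class 1
xs = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
ys = [0, 0, 0, 1, 1, 1]
# candidate thresholds: midpoints between consecutive sorted values
cands = sorted(set((a + b) / 2 for a, b in zip(sorted(xs), sorted(xs)[1:])))
best_t = min(cands, key=lambda t: cond_entropy_threshold(xs, ys, t))
print(best_t, round(entropy(ys) - cond_entropy_threshold(xs, ys, best_t), 3))
```

Here the threshold t = 4.5 separates the classes perfectly, so the gain equals the full parent entropy of 1 bit.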
Another Illustrative Example
Day  Outlook   Temperature  Humidity  Wind    PlayTennis
D1   Sunny     Hot          High      Weak    No
D2   Sunny     Hot          High      Strong  No
D3   Overcast  Hot          High      Weak    Yes
D4   Rain      Mild         High      Weak    Yes
D5   Rain      Cool         Normal    Weak    Yes
D6   Rain      Cool         Normal    Strong  No
D7   Overcast  Cool         Normal    Strong  Yes
D8   Sunny     Mild         High      Weak    No
D9   Sunny     Cool         Normal    Weak    Yes
D10  Rain      Mild         Normal    Weak    Yes
D11  Sunny     Mild         Normal    Strong  Yes
D12  Overcast  Mild         High      Strong  Yes
D13  Overcast  Hot          Normal    Weak    Yes
D14  Rain      Mild         High      Strong  No
Another Illustrative Example
Humidity  Wind    PlayTennis
High      Weak    No
High      Strong  No
High      Weak    Yes
High      Weak    Yes
Normal    Weak    Yes
Normal    Strong  No
Normal    Strong  Yes
High      Weak    No
Normal    Weak    Yes
Normal    Weak    Yes
Normal    Strong  Yes
High      Strong  Yes
Normal    Weak    Yes
High      Strong  No

S contains 9 Yes and 5 No examples, so

  Entropy(S) = −(9/14) log2(9/14) − (5/14) log2(5/14) ≈ 0.94

Another Illustrative Example
Splitting S on Humidity:

  Humidity = High (7 examples: 3 Yes, 4 No):   Entropy ≈ 0.985
  Humidity = Normal (7 examples: 6 Yes, 1 No): Entropy ≈ 0.592

  Gain(S, Humidity) = 0.94 − (7/14)(0.985) − (7/14)(0.592) ≈ 0.151

Another Illustrative Example
Splitting S on Wind:

  Wind = Weak (8 examples: 6 Yes, 2 No):   Entropy ≈ 0.811
  Wind = Strong (6 examples: 3 Yes, 3 No): Entropy = 1.0

  Gain(S, Wind) = 0.94 − (8/14)(0.811) − (6/14)(1.0) ≈ 0.048

Splitting on Humidity yields the larger information gain, so Humidity is the better attribute to split on.
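Both gains can be reproduced from the 14-example table with a short script (a sketch; helper names are my own, and full-precision arithmetic is used, so values match the slides' rounded figures to two decimals):

```python
import math

def entropy(labels):
    """Entropy in bits of a list of class labels."""
    n = len(labels)
    return -sum((labels.count(y) / n) * math.log2(labels.count(y) / n)
                for y in set(labels))

def info_gain(rows, labels, attr):
    """H(Y) minus the weighted entropy over the values of attribute attr."""
    n = len(labels)
    gain = entropy(labels)
    for v in set(r[attr] for r in rows):
        subset = [y for r, y in zip(rows, labels) if r[attr] == v]
        gain -= len(subset) / n * entropy(subset)
    return gain

# Humidity and Wind columns of the PlayTennis table, in row order
data = [("High", "Weak"), ("High", "Strong"), ("High", "Weak"), ("High", "Weak"),
        ("Normal", "Weak"), ("Normal", "Strong"), ("Normal", "Strong"),
        ("High", "Weak"), ("Normal", "Weak"), ("Normal", "Weak"),
        ("Normal", "Strong"), ("High", "Strong"), ("Normal", "Weak"),
        ("High", "Strong")]
rows = [{"Humidity": h, "Wind": w} for h, w in data]
play = ["No", "No", "Yes", "Yes", "Yes", "No", "Yes",
        "No", "Yes", "Yes", "Yes", "Yes", "Yes", "No"]

for attr in ("Humidity", "Wind"):
    print(attr, round(info_gain(rows, play, attr), 2))  # Humidity 0.15, Wind 0.05
```

Humidity wins, matching the conclusion above.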
This note was uploaded on 11/03/2010 for the course CS6375, taught by Professor Vincent Ng during the Fall 2010 term at the University of Texas at Dallas, Richardson.