# LT-7 - Process of finding the best split by using GI, CART...

To explain the process of finding the best split by using the Gini index (GI), let us consider again the example of the outlook data set:

| Outlook  | Temp | Humidity | Windy | Class |
|----------|------|----------|-------|-------|
| Overcast | 72   | 90       | TRUE  | Play  |
| Overcast | 83   | 78       | FALSE | Play  |
| Overcast | 64   | 65       | TRUE  | Play  |
| Overcast | 81   | 75       | FALSE | Play  |
| Rain     | 71   | 80       | TRUE  | Don't |
| Rain     | 65   | 70       | TRUE  | Don't |
| Rain     | 75   | 80       | FALSE | Play  |
| Rain     | 68   | 80       | FALSE | Play  |
| Rain     | 70   | 96       | FALSE | Play  |
| Sunny    | 75   | 70       | TRUE  | Play  |
| Sunny    | 80   | 90       | TRUE  | Don't |
| Sunny    | 85   | 85       | FALSE | Don't |
| Sunny    | 72   | 95       | FALSE | Don't |
| Sunny    | 69   | 70       | FALSE | Play  |

The impurity measure is as follows:

$$i(t) = \sum_{i \neq j} p(i \mid t)\, p(j \mid t) \tag{C.1}$$

$$\mathrm{Gini}(t) = 1 - \sum_{i} P_i^{2} \tag{C.2}$$

The decrease in impurity is as follows:

$$\varphi(s,t) = \max \Delta i(s,t) = i(t) - p_R\, i(t_R) - p_L\, i(t_L) \tag{C.3}$$

Here $p_L$ and $p_R$ are the proportions of the cases at node $t$ that split $s$ sends to the left child $t_L$ and the right child $t_R$.

We now calculate the value for the root node by substituting the values from the table above into equation (C.1):

$$i(t) = \sum_{i \neq j} p(i \mid t)\, p(j \mid t) = \dots$$
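The root-node calculation and equation (C.3) can be checked with a short Python sketch. It is only an illustration under stated assumptions: the names `DATA`, `gini` and `impurity_decrease` are hypothetical helpers, not part of the notes, and the candidate split "Outlook = Overcast versus the rest" is chosen purely as an example, not as the split the notes go on to select.

```python
from collections import Counter

# Rows copied from the outlook data set: (outlook, temp, humidity, windy, class)
DATA = [
    ("Overcast", 72, 90, True, "Play"),
    ("Overcast", 83, 78, False, "Play"),
    ("Overcast", 64, 65, True, "Play"),
    ("Overcast", 81, 75, False, "Play"),
    ("Rain", 71, 80, True, "Don't"),
    ("Rain", 65, 70, True, "Don't"),
    ("Rain", 75, 80, False, "Play"),
    ("Rain", 68, 80, False, "Play"),
    ("Rain", 70, 96, False, "Play"),
    ("Sunny", 75, 70, True, "Play"),
    ("Sunny", 80, 90, True, "Don't"),
    ("Sunny", 85, 85, False, "Don't"),
    ("Sunny", 72, 95, False, "Don't"),
    ("Sunny", 69, 70, False, "Play"),
]


def gini(labels):
    """Gini impurity 1 - sum_i P_i^2, as in equation (C.2)."""
    n = len(labels)
    if n == 0:
        return 0.0  # convention: an empty node contributes no impurity
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def impurity_decrease(rows, goes_left):
    """Decrease i(t) - p_R i(t_R) - p_L i(t_L) for one binary split, as in (C.3).

    goes_left is a predicate that sends a row to the left child when True.
    """
    parent = [r[-1] for r in rows]
    left = [r[-1] for r in rows if goes_left(r)]
    right = [r[-1] for r in rows if not goes_left(r)]
    n = len(rows)
    p_left, p_right = len(left) / n, len(right) / n
    return gini(parent) - p_right * gini(right) - p_left * gini(left)


# Root node: 9 "Play" and 5 "Don't" out of 14 cases.
root_impurity = gini([r[-1] for r in DATA])

# Example candidate split (an assumption for illustration): Overcast vs. the rest.
delta = impurity_decrease(DATA, lambda r: r[0] == "Overcast")

print(f"root impurity = {root_impurity:.4f}")
print(f"decrease for Outlook == Overcast = {delta:.4f}")
```

With 9 "Play" and 5 "Don't" cases at the root, the impurity is 1 - (9/14)^2 - (5/14)^2 = 90/196 ≈ 0.4592, which is the same value as 2 · (9/14)(5/14) from equation (C.1). For the example split, the Overcast child is pure and the remaining 10 cases are evenly divided, so the decrease is 90/196 - (10/14)(0.5) = 20/196 ≈ 0.1020. Finding the best split means repeating this for every candidate split and keeping the one with the largest decrease.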

This note was uploaded on 12/06/2011 for the course DM 301

