# PROBLEM SET 6


1) Using rpart Analysis: The classification tree, which depends on attributes such as education, marital status, capital gain and capital loss, predicts whether a person's income is less than or equal to (<=) 50K or greater than (>) 50K.

   1) If a person is married, and
      1.1) Education num > 0.9458, then income is greater than (>) 50K.
      1.2) Education num < 0.9458, and
         a) Capital gain is greater than or equal to (>=) 0.5352, then income is greater than (>) 50K.
         b) Capital gain is less than (<) 0.5352, and
            i) Capital loss is greater than or equal to (>=) 4.254, then income is greater than (>) 50K.
            ii) Capital loss is less than (<) 4.254, then income is less than or equal to (<=) 50K.
   2) If a person's marital status is anything other than married (divorced, never married, separated or widowed), and
      2.1) Capital gain is greater than or equal to (>=) 0.8082, then income is greater than (>) 50K.
      2.2) Capital gain is less than (<) 0.8082, then income is less than or equal to (<=) 50K.

2) Using Rattle Analysis: The Rattle decision tree has 17 nodes, which indicate that the capital gain, relationship, education, capital loss and occupation attributes play an important role in deciding whether the person's income is > 50K or <= 50K. The classification tree from part 1, on the other hand, focuses on the marital status, education, capital gain and capital loss attributes. Unlike the cfit tree, the Rattle tree reports probabilities for each node.
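
The rpart rules listed in part 1 can be written as a small function to make the branching order explicit. This is only a Python transcription of the written rules, not the output of rpart itself. The function name `predict_income` and its parameter names are hypothetical. Two assumptions are made: the thresholds are applied to the same (apparently scaled) feature values the tree was fitted on, and an education num exactly equal to 0.9458, which the rules leave unspecified, falls into the lower branch.

```python
def predict_income(married: bool,
                   education_num: float,
                   capital_gain: float,
                   capital_loss: float) -> str:
    """Return '>50K' or '<=50K' by following the rpart rules from part 1.

    Inputs are assumed to be on the same scale as the tree's thresholds.
    """
    if married:
        # Rule 1.1: high education num alone is enough.
        if education_num > 0.9458:
            return ">50K"
        # Rule 1.2 a): lower education num, but large capital gain.
        # (education_num == 0.9458 lands here by assumption.)
        if capital_gain >= 0.5352:
            return ">50K"
        # Rule 1.2 b) i): small capital gain, but large capital loss.
        if capital_loss >= 4.254:
            return ">50K"
        # Rule 1.2 b) ii): none of the above.
        return "<=50K"

    # Rule 2: any marital status other than married;
    # only capital gain is consulted on this side of the tree.
    if capital_gain >= 0.8082:
        return ">50K"
    return "<=50K"


if __name__ == "__main__":
    # A married person with low education num, no gain, no loss.
    print(predict_income(True, 0.0, 0.0, 0.0))
    # An unmarried person with a large capital gain.
    print(predict_income(False, 0.0, 0.9, 0.0))
```

Written this way, it is easy to see that education num and capital loss are only consulted on the married side of the tree, while the other side depends on capital gain alone.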