ch5_classification

ch5_classification - Chapter 5 Classification...


Chapter 5 Classification

Classification: Definition

- Given a collection of records (training set)
    - Each record contains a set of attributes, one of the attributes is the class.
- Find a model for class attribute as a function of the values of other attributes.
- Goal: previously unseen records should be assigned a class as accurately as possible.
    - A test set is used to determine the accuracy of the model. Usually, the given data set is divided into training and test sets, with training set used to build the model and test set used to validate it.

Example records:

Tid  Refund  Marital Status  Taxable Income  Cheat
1    Yes     Single          125K            No
2    No      Married         100K            No
3    No      Single          70K             No
4    Yes     Married         120K            No
5    No      Divorced        95K             Yes
6    No      Married         60K             No
7    Yes     Divorced        220K            No
8    No      Single          85K             Yes
9    No      Married         75K             No
10   No      Single          90K             Yes

Illustrating Classification Task

Training set (Learn Model):

Tid  Attrib1  Attrib2  Attrib3  Class
1    Yes      Large    125K     No
2    No       Medium   100K     No
3    No       Small    70K      No
4    Yes      Medium   120K     No
5    No       Large    95K      Yes
6    No       Medium   60K      No
7    Yes      Large    220K     No
8    No       Small    85K      Yes
9    No       Medium   75K      No
10   No       Small    90K      Yes

Test set (Apply Model):

Tid  Attrib1  Attrib2  Attrib3  Class
11   No       Small    55K      ?
12   Yes      Medium   80K      ?
13   Yes      Large    110K     ?
14   No       Small    95K      ?
15   No       Large    67K      ?
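The learn-then-validate workflow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`split`, `learn_majority_model`, `accuracy`) are hypothetical, the 7/3 split is arbitrary, and the "model" is a trivial majority-class baseline rather than any technique from the slides. It exists only to show how a held-out test set yields an accuracy figure.

```python
from collections import Counter

# The ten labeled records from the example table:
# (refund, marital_status, taxable_income_in_K, cheat)
RECORDS = [
    ("Yes", "Single", 125, "No"),
    ("No", "Married", 100, "No"),
    ("No", "Single", 70, "No"),
    ("Yes", "Married", 120, "No"),
    ("No", "Divorced", 95, "Yes"),
    ("No", "Married", 60, "No"),
    ("Yes", "Divorced", 220, "No"),
    ("No", "Single", 85, "Yes"),
    ("No", "Married", 75, "No"),
    ("No", "Single", 90, "Yes"),
]


def split(records, n_train):
    """Hypothetical helper: the first n_train records train, the rest test."""
    return records[:n_train], records[n_train:]


def learn_majority_model(train):
    """Learn a trivial baseline: always predict the most common training class."""
    majority = Counter(record[-1] for record in train).most_common(1)[0][0]
    return lambda attributes: majority


def accuracy(model, test):
    """Fraction of test records whose predicted class matches the true class."""
    correct = sum(1 for *attrs, label in test if model(attrs) == label)
    return correct / len(test)


train, test = split(RECORDS, 7)       # assumed 7/3 split, for illustration only
model = learn_majority_model(train)   # "Learn Model" step
print(accuracy(model, test))          # "Apply Model" step, scored on held-out data
```

The baseline ignores every attribute, so its accuracy on the held-out records is a floor that any real classifier should beat.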
Examples of Classification Task

- Predicting tumor cells as benign or malignant
- Classifying credit card transactions as legitimate or fraudulent
- Classifying secondary structures of protein as alpha-helix, beta-sheet, or random coil
- Categorizing news stories as finance, weather, entertainment, sports, etc.

Classification Techniques

- Decision Tree based Methods
- Rule-based Methods
- Memory based reasoning
- Neural Networks
- Naïve Bayes and Bayesian Belief Networks
- Support Vector Machines

Example of a Decision Tree

Training Data (attribute types: Refund categorical, Marital Status categorical, Taxable Income continuous, Cheat class):

Tid  Refund  Marital Status  Taxable Income  Cheat
1    Yes     Single          125K            No
2    No      Married         100K            No
3    No      Single          70K             No
4    Yes     Married         120K            No
5    No      Divorced        95K             Yes
6    No      Married         60K             No
7    Yes     Divorced        220K            No
8    No      Single          85K             Yes
9    No      Married         75K             No
10   No      Single          90K             Yes

Model: Decision Tree (splitting attributes: Refund, MarSt, TaxInc)

Refund
    Yes: NO
    No: MarSt
        Married: NO
        Single, Divorced: TaxInc
            < 80K: NO
            > 80K: YES

Another Example of Decision Tree

Training Data: the same ten records and attribute types as in the previous example.

MarSt
    Married: NO
    Single, Divorced: Refund
        Yes: NO
        No: TaxInc
            < 80K: NO
            > 80K: YES

There could be more than one tree that fits the same data!

Decision Tree Classification Task

Training set (Learn Model):

Tid  Attrib1  Attrib2  Attrib3  Class
1    Yes      Large    125K     No
2    No       Medium   100K...
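The two decision trees in this section can be written as plain Python functions, which makes it easy to check the claim that more than one tree fits the same data. The sketch below is illustrative only: the function names are hypothetical, incomes are given in thousands, and since the slides label the income branches only as "< 80K" and "> 80K", sending an income of exactly 80K to the NO leaf is an assumption.

```python
# The ten training records: (refund, marital_status, taxable_income_in_K, cheat)
RECORDS = [
    ("Yes", "Single", 125, "No"),
    ("No", "Married", 100, "No"),
    ("No", "Single", 70, "No"),
    ("Yes", "Married", 120, "No"),
    ("No", "Divorced", 95, "Yes"),
    ("No", "Married", 60, "No"),
    ("Yes", "Divorced", 220, "No"),
    ("No", "Single", 85, "Yes"),
    ("No", "Married", 75, "No"),
    ("No", "Single", 90, "Yes"),
]


def tree_refund_first(refund, marital_status, income_k):
    """First tree: split on Refund, then MarSt, then TaxInc."""
    if refund == "Yes":
        return "No"
    if marital_status == "Married":
        return "No"
    # Single or Divorced: decide on taxable income.
    # Assumption: exactly 80K is treated like the "< 80K" branch.
    return "Yes" if income_k > 80 else "No"


def tree_marst_first(refund, marital_status, income_k):
    """Second tree: split on MarSt, then Refund, then TaxInc."""
    if marital_status == "Married":
        return "No"
    if refund == "Yes":
        return "No"
    # Same assumption about the 80K boundary as above.
    return "Yes" if income_k > 80 else "No"


def fits_training_data(tree):
    """True if the tree predicts the recorded class for every training record."""
    return all(tree(r, m, inc) == cheat for r, m, inc, cheat in RECORDS)


print(fits_training_data(tree_refund_first))
print(fits_training_data(tree_marst_first))
```

Both functions test the same three conditions in a different order, which is why they agree on all ten records even though the trees have different shapes.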

This note was uploaded on 06/16/2011 for the course CS 5141 taught by Professor Chenenhong during the Spring '10 term at USTC.
