Chap6_ThreeSimpleClassificationMethods-1

Chapter 6 – Three Simple Classification Methods
Data Mining for Business Intelligence
Shmueli, Patel & Bruce
© Galit Shmueli and Peter Bruce 2008

The three methods:
- Naïve rule
- Naïve Bayes
- K-nearest-neighbor

Common characteristics:
- Data-driven, not model-driven
- Make no assumptions about the data

Naïve Rule
- Classify all records as the majority class
- Not a “real” method
- Introduced to serve as a benchmark against which to measure other results
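
The naïve rule can be sketched in a few lines of Python. This is only an illustration: the function names `naive_rule` and `classify_all` and the labels are made up here and do not come from the slides.

```python
from collections import Counter

def naive_rule(train_labels):
    """Return the majority class among the training labels."""
    if not train_labels:
        raise ValueError("need at least one training label")
    majority_class, _ = Counter(train_labels).most_common(1)[0]
    return majority_class

def classify_all(train_labels, new_records):
    """Assign the majority class to every new record, ignoring its predictors."""
    majority = naive_rule(train_labels)
    return [majority for _ in new_records]

# Hypothetical labels, for illustration only
labels = ["no fraud", "no fraud", "fraud", "no fraud"]
predictions = classify_all(labels, new_records=[{"size": "small"}, {"size": "large"}])

# Share of training records the naive rule gets right: the benchmark to beat
baseline_accuracy = labels.count(naive_rule(labels)) / len(labels)
```

Any other classifier should do better than `baseline_accuracy` on the same data to be worth using.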

Naïve Bayes
Naïve Bayes: The Basic Idea
- For a given new record to be classified, find other records like it (i.e., with the same values for the predictors)
- What is the prevalent class among those records?
- Assign that class to your new record

Usage
- Requires categorical variables
- Numerical variables must be binned and converted to categorical
- Can be used with very large data sets
- Example: spell check. The computer attempts to assign your misspelled word to an established “class” (i.e., a correctly spelled word)
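
Binning a numerical variable might look like the sketch below. The helper `bin_value`, the cut points, and the category names are assumptions chosen for illustration; the slides do not prescribe a particular binning scheme.

```python
import bisect

def bin_value(value, edges, labels):
    """Map a numeric value to a category label.

    `edges` must be sorted cut points and `labels` must have one more
    entry than `edges`. A value equal to a cut point falls in the higher bin.
    """
    if len(labels) != len(edges) + 1:
        raise ValueError("labels must have exactly len(edges) + 1 entries")
    return labels[bisect.bisect_right(edges, value)]

# Hypothetical cut points for a numeric "firm size" variable (e.g., employees)
size_edges = [100, 1000]
size_labels = ["small", "medium", "large"]

categories = [bin_value(v, size_edges, size_labels) for v in (50, 100, 5000)]
```
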

Exact Bayes Classifier
- Relies on finding other records that share the same predictor values as the record to be classified
- We want to find the “probability of belonging to class C, given specified values of predictors”
- Even with large data sets, it may be hard to find other records that exactly match your record in terms of predictor values
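
A minimal sketch of the exact Bayes idea follows. The function name `exact_bayes` and the records are hypothetical (they are not the data from the book's example); the predictor names mirror the fraud example only for readability.

```python
from collections import Counter

def exact_bayes(records, labels, new_record):
    """Estimate P(class | predictors) from records that match new_record exactly.

    Returns a dict mapping class -> proportion among the matching records,
    or None when no training record shares all predictor values.
    """
    matches = [lab for rec, lab in zip(records, labels) if rec == new_record]
    if not matches:
        return None
    counts = Counter(matches)
    return {cls: n / len(matches) for cls, n in counts.items()}

# Hypothetical training data, for illustration only
records = [
    {"charges": "yes", "size": "small"},
    {"charges": "yes", "size": "small"},
    {"charges": "yes", "size": "small"},
    {"charges": "no", "size": "large"},
    {"charges": "no", "size": "small"},
]
labels = ["fraud", "fraud", "no fraud", "no fraud", "no fraud"]

matched = exact_bayes(records, labels, {"charges": "yes", "size": "small"})
unmatched = exact_bayes(records, labels, {"charges": "yes", "size": "large"})
```

Here `unmatched` is `None` because no training record has both `charges == "yes"` and `size == "large"`, which is exactly the difficulty the slide describes.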

Solution – Naïve Bayes
- Assume independence of predictor variables (within each class)
- Use the multiplication rule
- Estimate the same probability (that the record belongs to class C, given its predictor values) without limiting the calculation to records that share all of those values
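
The multiplication rule can be sketched as follows: for each class, multiply the class proportion by the within-class proportion of each predictor value, then rescale so the results sum to 1. The function names and the seven records are made up for illustration and are not the book's data; no smoothing is applied, so a predictor value never seen in a class gives that class a probability of zero.

```python
from collections import Counter, defaultdict

def fit_naive_bayes(records, labels):
    """Count classes and, within each class, each predictor value."""
    class_counts = Counter(labels)
    # value_counts[cls][predictor][value] = number of class-cls records with that value
    value_counts = defaultdict(lambda: defaultdict(Counter))
    for rec, cls in zip(records, labels):
        for predictor, value in rec.items():
            value_counts[cls][predictor][value] += 1
    return class_counts, value_counts

def predict_proba(model, new_record):
    """Return class -> estimated P(class | predictors), or None if all scores are 0."""
    class_counts, value_counts = model
    total = sum(class_counts.values())
    scores = {}
    for cls, n_cls in class_counts.items():
        score = n_cls / total  # P(class)
        for predictor, value in new_record.items():
            # P(predictor = value | class), estimated one predictor at a time
            score *= value_counts[cls][predictor][value] / n_cls
        scores[cls] = score
    norm = sum(scores.values())
    if norm == 0:
        return None
    return {cls: s / norm for cls, s in scores.items()}

# Hypothetical training data, for illustration only
records = [
    {"charges": "yes", "size": "small"},
    {"charges": "no", "size": "large"},
    {"charges": "yes", "size": "small"},
    {"charges": "no", "size": "small"},
    {"charges": "no", "size": "large"},
    {"charges": "no", "size": "large"},
    {"charges": "yes", "size": "small"},
]
labels = ["fraud", "fraud", "fraud", "no fraud", "no fraud", "no fraud", "no fraud"]

model = fit_naive_bayes(records, labels)
# No training record has charges == "yes" and size == "large",
# yet naive Bayes still produces an estimate.
probs = predict_proba(model, {"charges": "yes", "size": "large"})
```

With these made-up records the estimate for "fraud" works out to 4/7, even though an exact match on both predictors does not exist in the training data.
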

Example: Financial Fraud
- Target variable: audit finds fraud / no fraud
- Predictors:
  - Prior pending legal charges (yes/no)
  - Size of firm (small/large)

Charges? Size

This note was uploaded on 11/09/2011 for the course MAR 08 taught by Professor Staff during the Spring '08 term at Youngstown State University.
