{[ promptMessage ]}

Bookmark it

{[ promptMessage ]}

Chap10_DiscriminantAnalysis

# Chap10_DiscriminantAnalysis - Chapter 10 Discriminant...

This preview shows pages 1–11. Sign up to view the full content.

Chapter 10 – Discriminant Analysis © Galit Shmueli and Peter Bruce 2008 Data Mining for Business Intelligence Shmueli, Patel & Bruce

This preview has intentionally blurred sections. Sign up to view the full version.

View Full Document
Discriminant Analysis: Background A classical statistical technique Used for classification long before data mining Classifying organisms into species Classifying skulls Fingerprint analysis And also used for business data mining ( loans, customer types, etc.) Can also be used to highlight aspects that distinguish classes ( profiling )
Small Example: Riding Mowers Goal : classify purchase behavior (buy/no-buy) of riding mowers based on income and lot size Outcome : owner or non-owner (0/1) Predictors : lot size, income

This preview has intentionally blurred sections. Sign up to view the full version.

View Full Document
Can we manually draw a line that separates owners from non-owners?
Example: Loan Acceptance In the prior small example, separation is clear In data mining applications, there will be more records, more predictors, and less clear separation Consider Universal Bank example with only 2 predictors: Outcome: accept/don’t accept loan Predictors: Annual income (Income) Avg. monthly credit card spending (CCAvg)

This preview has intentionally blurred sections. Sign up to view the full version.

View Full Document
Sample of 200 customers
5000 customers

This preview has intentionally blurred sections. Sign up to view the full version.

View Full Document
Algorithm for Discriminant Analysis
The Idea To classify a new record, measure its distance from the center of each class Then, classify the record to the closest class

This preview has intentionally blurred sections. Sign up to view the full version.

View Full Document
Step 1: Measuring Distance Need to measure each record’s distance from the center of each class The center of a class is called a centroid The centroid is simply a vector (list) of the means of each of the predictors. This mean is computed from all the records that belong to that class.
This is the end of the preview. Sign up to access the rest of the document.

{[ snackBarMessage ]}

### Page1 / 30

Chap10_DiscriminantAnalysis - Chapter 10 Discriminant...

This preview shows document pages 1 - 11. Sign up to view the full document.

View Full Document
Ask a homework question - tutors are online