# Lecture11 - Data Mining: Principles and Algorithms Jianyong...


Data Mining: Principles and Algorithms (12/17/2009)

Jianyong Wang, Database Lab, Institute of Software, Department of Computer Science and Technology, Tsinghua University

## Chapter 6. Classification and Prediction

- What is classification? What is prediction?
- Issues regarding classification and prediction
- Classification by decision tree induction
- Bayesian classification
- Rule-based classification
- Artificial Neural Networks
- Support Vector Machines (SVM)
- Associative classification
- Lazy learners (or learning from your neighbors)
- Other classification methods
- Ensemble methods
- Prediction
- Accuracy and error measures
- Summary
## Associative Classification

- Associative classification
  - Association rules are generated and analyzed for use in classification
  - Search for strong associations between frequent patterns (conjunctions of attribute-value pairs) and class labels
  - Classification: based on evaluating a set of rules of the form $p_1 \wedge p_2 \wedge \dots \wedge p_n \Rightarrow A_{class} = C$ (support, confidence)
- Why effective?
  - It adopts an existing association rule mining algorithm to mine the complete set of rules, then selects the best rules for classifier construction
  - In many studies, associative classification has been found to be more accurate than some traditional classification methods, such as C4.5
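To make the (support, confidence) annotation on a class rule concrete, here is a minimal Python sketch. The function name `rule_stats`, the attribute names, and the four toy rows are all hypothetical and are not taken from the lecture. Support is counted as the fraction of all tuples that match both the condition and the class, and confidence as the fraction of condition-matching tuples that carry the class.

```python
from typing import Dict, List, Tuple

Row = Dict[str, str]


def rule_stats(rows: List[Row], condset: Dict[str, str],
               class_attr: str, class_label: str) -> Tuple[float, float]:
    """Return (support, confidence) of the rule condset => class_attr = class_label."""
    # Tuples whose attribute values satisfy every pair in the condition set.
    covered = [r for r in rows if all(r.get(a) == v for a, v in condset.items())]
    # Covered tuples that also carry the predicted class label.
    correct = [r for r in covered if r.get(class_attr) == class_label]
    support = len(correct) / len(rows) if rows else 0.0
    confidence = len(correct) / len(covered) if covered else 0.0
    return support, confidence


# Hypothetical toy data, used only for illustration.
rows = [
    {"age": "youth", "credit": "ok", "buys": "yes"},
    {"age": "youth", "credit": "ok", "buys": "yes"},
    {"age": "youth", "credit": "fair", "buys": "no"},
    {"age": "senior", "credit": "ok", "buys": "no"},
]

sup, conf = rule_stats(rows, {"age": "youth", "credit": "ok"}, "buys", "yes")
print(f"support={sup:.2f}, confidence={conf:.2f}")
```

A shorter condition set such as `{"age": "youth"}` covers more tuples, so its confidence can drop even when its support stays the same.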

## Typical Associative Classification Methods

- CBA (98)
  - Mines association rules of the form cond-set (a set of attribute-value pairs) → class label
    - E.g., X=5, 5<Y≤6 → Class=Yes
    - For comparison, a traditional association rule: Cheese, Milk → Bread (sup=5%, conf=70%)
  - Rule generator + classifier builder (including rule selection)
- CMAR (Classification based on Multiple Association Rules: Li, Han, Pei, ICDM'01)
  - Classification: statistical analysis on multiple rules
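The slide names a classifier builder without describing it. The Python sketch below shows one simple way a rule-based classifier of this kind can work. It rests on assumptions: rules are ranked by confidence, then support, then generation order, the first matching rule decides the class, and a default class covers unmatched tuples. The `Rule` type, the three example rules, and the default class are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Rule:
    condset: Tuple[Tuple[str, str], ...]  # (attribute, value) pairs
    label: str
    support: float
    confidence: float


def rank(rules: List[Rule]) -> List[Rule]:
    """Order rules by confidence, then support (both descending).

    sorted() is stable, so remaining ties keep their generation order.
    """
    return sorted(rules, key=lambda r: (-r.confidence, -r.support))


def classify(row: Dict[str, str], ranked: List[Rule], default: str) -> str:
    """Return the label of the first ranked rule whose condset matches row."""
    for rule in ranked:
        if all(row.get(attr) == value for attr, value in rule.condset):
            return rule.label
    return default  # no rule fired


# Hypothetical rules, for illustration only.
rules = [
    Rule((("A", "e"),), "y", 0.3, 0.75),
    Rule((("B", "w"),), "n", 0.2, 1.0),
    Rule((("A", "e"), ("B", "p")), "y", 0.2, 0.67),
]
ranked = rank(rules)

print(classify({"A": "e", "B": "w"}, ranked, default="n"))
print(classify({"A": "f", "B": "q"}, ranked, default="n"))
```

A tuple with A=e and B=w matches two rules that disagree; the ranking decides which one wins, which is why rule ordering matters as much as rule mining in this style of classifier.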
## CBA: Integrating Classification and Association Rule Mining

- Rule generator
  - Apriori-like level-wise association rule mining
    - Input: a dataset in table form (transformation needed) or in transaction form
    - Output: the complete set of class association rules (CARs)
  - Rule pruning during rule mining
    - Pessimistic error-rate-based pruning (also used in C4.5)
    - A rule r is pruned if removing a single item from it reduces the pessimistic error rate
    - Probability of error (apparent error rate), where N = # tuples covered by r and n_C = # tuples in the majority class covered by r
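The error-rate formula itself did not survive in this copy of the slide. The Python sketch below is therefore an assumption, not the lecture's definition: it takes the apparent error rate to be (N − n_C)/N, and it takes the pessimistic estimate to be the upper confidence bound from the normal approximation to the binomial that is commonly given when C4.5 pruning is described, with z = 0.69 for the usual 25% confidence level. Function names are hypothetical.

```python
import math


def apparent_error(n_covered: int, n_majority: int) -> float:
    """Fraction of covered tuples outside the majority class: (N - n_C) / N."""
    return (n_covered - n_majority) / n_covered


def pessimistic_error(n_covered: int, n_majority: int, z: float = 0.69) -> float:
    """Upper confidence bound on the true error rate (assumed C4.5-style form)."""
    n = n_covered
    f = apparent_error(n_covered, n_majority)
    numerator = (f + z * z / (2 * n)
                 + z * math.sqrt(f / n - f * f / n + z * z / (4 * n * n)))
    return numerator / (1 + z * z / n)


def should_prune(specific: tuple, general: tuple) -> bool:
    """Prune the specific rule if dropping one item lowers the pessimistic error.

    Each argument is an (N, n_C) pair for the corresponding rule.
    """
    return pessimistic_error(*general) < pessimistic_error(*specific)


# Hypothetical counts: a two-item rule covering 3 tuples (2 in the majority
# class) versus its one-item generalization covering 4 tuples (3 in majority).
print(should_prune(specific=(3, 2), general=(4, 3)))
```

The pessimistic estimate always sits above the apparent error rate and shrinks toward it as N grows, so rules that cover few tuples are penalized more heavily than rules that cover many.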

## CBA: Integrating Classification and Association Rule Mining (running example)

| A | B | C |
|---|---|---|
| e | p | y |
| e | p | y |
| e | q | y |
| g | q | y |
| g | q | y |
| g | q | n |
| g | w | n |
| g | w | n |
| e | p | n |
| f | q | n |

CARs after pruning:
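The slide's own list of pruned CARs is not available here, and the sketch below does not try to reproduce it. As an illustration of what candidate CARs look like on the ten tuples of the running example, this Python snippet enumerates condition sets by size and keeps the rules that pass a support and a confidence threshold. Several things are assumptions: column C is treated as the class, the thresholds (0.2 and 0.6) are made up, and the enumeration is brute force by level rather than true Apriori candidate generation, with no error-based pruning.

```python
from collections import Counter
from itertools import combinations

# The ten tuples of the running example as (A, B, C); C is assumed to be the class.
ROWS = [
    ("e", "p", "y"), ("e", "p", "y"), ("e", "q", "y"), ("g", "q", "y"),
    ("g", "q", "y"), ("g", "q", "n"), ("g", "w", "n"), ("g", "w", "n"),
    ("e", "p", "n"), ("f", "q", "n"),
]
ATTRS = ("A", "B")  # condition attributes; the last column is the class


def mine_cars(rows, attrs, min_sup, min_conf):
    """Enumerate class association rules level by level (1 item, 2 items, ...)."""
    n = len(rows)
    cars = []
    for k in range(1, len(attrs) + 1):
        for idxs in combinations(range(len(attrs)), k):
            cond_counts = Counter()  # tuples matching each condset
            rule_counts = Counter()  # tuples matching condset and class
            for row in rows:
                cond = tuple((attrs[i], row[i]) for i in idxs)
                cond_counts[cond] += 1
                rule_counts[(cond, row[-1])] += 1
            for (cond, label), count in rule_counts.items():
                support = count / n
                confidence = count / cond_counts[cond]
                if support >= min_sup and confidence >= min_conf:
                    cars.append((cond, label, support, confidence))
    return cars


# Hypothetical thresholds, chosen only for this illustration.
cars = mine_cars(ROWS, ATTRS, min_sup=0.2, min_conf=0.6)
for cond, label, support, confidence in cars:
    lhs = ", ".join(f"{a}={v}" for a, v in cond)
    print(f"{lhs} -> C={label} (sup={support:.2f}, conf={confidence:.2f})")
```

Lowering either threshold admits more rules, which is where a pruning step such as the pessimistic-error test becomes useful for keeping the rule set small.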

## This note was uploaded on 06/02/2010 for the course COMPUTER DM2009F taught by Professor Wangwei during the Fall '09 term at Tsinghua University.
