Lecture11 - Data Mining: Principles and Algorithms Jianyong...

12/17/2009
Data Mining: Principles and Algorithms
Jianyong Wang
Database Lab, Institute of Software
Department of Computer Science and Technology
Tsinghua University
Chapter 6. Classification and Prediction
- What is classification? What is prediction?
- Issues regarding classification and prediction
- Classification by decision tree induction
- Bayesian classification
- Rule-based classification
- Artificial Neural Networks
- Support Vector Machines (SVM)
- Associative classification
- Lazy learners (or learning from your neighbors)
- Other classification methods
- Ensemble methods
- Prediction
- Accuracy and error measures
- Summary
Associative Classification
Associative classification
- Association rules are generated and analyzed for use in classification
- Search for strong associations between frequent patterns (conjunctions of attribute-value pairs) and class labels
- Classification: based on evaluating a set of rules of the form p_1 ∧ p_2 ∧ … ∧ p_n → "A_class = C" (support, confidence)
Why effective?
- It adopts an existing association rule mining algorithm to mine the complete set of rules, and selects the set of best rules for classifier construction
- In many studies, associative classification has been found to be more accurate than some traditional classification methods, such as C4.5
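The (support, confidence) pair attached to each rule can be made concrete with a small sketch. It assumes the usual definitions: support is the fraction of all tuples that match the rule's condition and carry its class label, and confidence is the fraction of tuples matching the condition that carry that label. The function name `rule_stats`, the attribute names, and the toy rows are hypothetical and do not come from the slides.

```python
from typing import Dict, List, Tuple

Row = Dict[str, str]


def rule_stats(rows: List[Row], cond: Dict[str, str],
               class_attr: str, class_value: str) -> Tuple[float, float]:
    """Return (support, confidence) of the rule: cond -> class_attr = class_value."""
    # Tuples whose attribute values satisfy every pair in the condition.
    covered = [r for r in rows if all(r.get(a) == v for a, v in cond.items())]
    # Covered tuples that also carry the predicted class label.
    correct = [r for r in covered if r[class_attr] == class_value]
    support = len(correct) / len(rows) if rows else 0.0
    confidence = len(correct) / len(covered) if covered else 0.0
    return support, confidence


# Hypothetical toy data, for illustration only.
toy_rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rain", "windy": "no", "play": "yes"},
]

specific = rule_stats(toy_rows, {"outlook": "sunny", "windy": "no"}, "play", "yes")
general = rule_stats(toy_rows, {"outlook": "sunny"}, "play", "yes")
print(specific, general)
```

In this toy data both rules have the same support, but the longer condition has higher confidence because it excludes the one sunny tuple labelled "no".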
Typical Associative Classification Methods
CBA (98)
- Mine association rules in the form of cond-set (a set of attribute-value pairs) → class label
  E.g., X=5, 5<Y≤6 → Class=Yes
  E.g., a traditional association rule: Cheese, Milk → Bread (sup=5%, conf=70%)
- Rule generator + classifier builder (including rule selection)
CMAR (Classification based on Multiple Association Rules: Li, Han, Pei, ICDM 01)
- Classification: statistical analysis on multiple rules
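A rough sketch of how a classifier builder might use mined rules follows. It assumes a CBA-style precedence (higher confidence first, then higher support, then earlier generation) and classifies a tuple with the first matching rule, falling back to a default class. The slide does not spell out these details, so the ordering, the `Rule` class, the rule values, and the default label are all assumptions made for illustration. The coverage-based rule selection step is omitted.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Rule:
    cond: Tuple[Tuple[str, str], ...]  # attribute-value pairs (the cond-set)
    label: str                         # predicted class label
    support: float
    confidence: float


def sort_rules(rules: List[Rule]) -> List[Rule]:
    # Assumed precedence: confidence, then support. Python's sort is stable,
    # so remaining ties keep their generation order.
    return sorted(rules, key=lambda r: (-r.confidence, -r.support))


def classify(ordered: List[Rule], default_label: str, row: Dict[str, str]) -> str:
    # The first rule whose whole cond-set matches the tuple decides the class.
    for rule in ordered:
        if all(row.get(attr) == value for attr, value in rule.cond):
            return rule.label
    return default_label


# Hypothetical rules over discretized attributes X and Y.
rules = [
    Rule((("X", "5"), ("Y", "(5,6]")), "Yes", support=0.10, confidence=0.80),
    Rule((("X", "5"),), "No", support=0.30, confidence=0.60),
    Rule((("Y", "(5,6]"),), "Yes", support=0.20, confidence=0.80),
]
ordered = sort_rules(rules)

print(classify(ordered, "Yes", {"X": "5", "Y": "(5,6]"}))
print(classify(ordered, "Yes", {"X": "5", "Y": "(6,7]"}))
```

CMAR differs at the classification step: instead of letting a single best rule decide, it analyzes the group of matching rules statistically, which this sketch does not attempt.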
CBA: Integrating Classification and Association Rule Mining
Rule generator
- Apriori-like level-wise association rule mining
  Input: a dataset in table form (transformation needed) or in transaction form
  Output: the complete set of CARs (class association rules)
- Rule pruning during rule mining
  Pessimistic error-rate-based pruning (also used in C4.5)
  A rule r is pruned if removing a single item from r results in a reduction of the pessimistic error rate
  Probability of error (apparent error rate), where
    N = # tuples covered by r
    n_C = # tuples in the majority class covered by r
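The pruning test can be sketched as follows. The slide's own formula is not visible in this text, so the code rests on two assumptions: the apparent error rate is taken as (N - n_C) / N, and the pessimistic error rate is taken as the upper confidence bound on that rate commonly described for C4.5, with z = 0.69 (a 25% confidence level). The function names and the example counts are hypothetical.

```python
import math
from typing import Tuple


def apparent_error(n_covered: int, n_majority: int) -> float:
    """Assumed apparent error rate: (N - n_C) / N."""
    if n_covered <= 0:
        raise ValueError("a rule must cover at least one tuple")
    return (n_covered - n_majority) / n_covered


def pessimistic_error(n_covered: int, n_majority: int, z: float = 0.69) -> float:
    """Assumed C4.5-style upper confidence bound on the apparent error rate."""
    f = apparent_error(n_covered, n_majority)
    n = n_covered
    numerator = f + z * z / (2 * n) + z * math.sqrt(
        f / n - f * f / n + z * z / (4 * n * n))
    return numerator / (1 + z * z / n)


def should_prune(rule: Tuple[int, int], shorter: Tuple[int, int]) -> bool:
    """Prune `rule` if dropping one item gives a lower pessimistic error rate.

    Each argument is (N, n_C) for the rule and for the rule with one item removed.
    """
    return pessimistic_error(*rule) > pessimistic_error(*shorter)


# Hypothetical counts: the longer rule covers 3 tuples (2 in the majority
# class), the shorter rule covers 10 tuples (9 in the majority class).
print(should_prune((3, 2), (10, 9)))
```

The bound penalizes rules that cover few tuples, so a long rule with a slightly better apparent error rate can still lose to its shorter generalization.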
CBA: Integrating Classification and Association Rule Mining
Running example

A  B  C
e  p  y
e  p  y
e  q  y
g  q  y
g  q  y
g  q  n
g  w  n
g  w  n
e  p  n
f  q  n

CARs after pruning:
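The rules listed on the slide are not available in this text. As a rough illustration only, the sketch below runs an Apriori-like level-wise search over the running example, treating C as the class attribute. The thresholds (a support count of 2 and a confidence of 0.65) are hypothetical, no pruning is applied, and the output should not be read as the slide's rule set.

```python
from collections import Counter
from itertools import combinations

ATTRS = ("A", "B")  # condition attributes; the last column (C) is the class
ROWS = [
    ("e", "p", "y"), ("e", "p", "y"), ("e", "q", "y"), ("g", "q", "y"),
    ("g", "q", "y"), ("g", "q", "n"), ("g", "w", "n"), ("g", "w", "n"),
    ("e", "p", "n"), ("f", "q", "n"),
]


def mine_cars(rows, attrs, min_sup_count, min_conf):
    """Return CARs as (cond-set, class label, support count, confidence)."""
    cars = []
    # Level 1: every single (attribute index, value) pair seen in the data.
    candidates = sorted({((i, row[i]),) for row in rows for i in range(len(attrs))})
    while candidates:
        frequent = []
        for cond in candidates:
            # Class distribution of the tuples covered by this cond-set.
            counts = Counter(row[-1] for row in rows
                             if all(row[i] == v for i, v in cond))
            covered = sum(counts.values())
            is_frequent = False
            for label, count in sorted(counts.items()):
                if count >= min_sup_count:
                    is_frequent = True
                    conf = count / covered
                    if conf >= min_conf:
                        named = tuple((attrs[i], v) for i, v in cond)
                        cars.append((named, label, count, conf))
            if is_frequent:
                frequent.append(cond)
        # Join frequent cond-sets that share a prefix and end in different
        # attributes to build the next level's candidates.
        next_level = set()
        for a, b in combinations(frequent, 2):
            if a[:-1] == b[:-1] and a[-1][0] < b[-1][0]:
                next_level.add(a + (b[-1],))
        candidates = sorted(next_level)
    return cars


cars = mine_cars(ROWS, ATTRS, min_sup_count=2, min_conf=0.65)
for cond, label, sup, conf in cars:
    lhs = ", ".join(f"{a}={v}" for a, v in cond)
    print(f"{lhs} -> C={label} (sup={sup}/{len(ROWS)}, conf={conf:.2f})")
```

Only cond-sets that are frequent at one level are extended at the next, which is what keeps the search level-wise rather than exhaustive.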

This note was uploaded on 06/02/2010 for the course COMPUTER DM2009F taught by Professor Wangwei during the Fall '09 term at Tsinghua University.
