# Lecture05-2 - Data Mining Principles and Algorithms...

## Data Mining: Principles and Algorithms

Jianyong Wang, Database Lab, Institute of Software, Department of Computer Science and Technology, Tsinghua University ([email protected])

November 5, 2009

## Chapter 3: Mining Frequent Patterns, Association and Correlations

- Basic concepts and a road map
- Efficient and scalable frequent itemset mining methods
- Mining various kinds of association rules
- From association mining to correlation analysis
- Constraint-based association mining
- Sequential pattern mining
- Graph pattern mining
- Summary

### Mining Various Kinds of Association Rules

- Mining multi-level association
- Mining multi-dimensional association
- Mining quantitative association
- Mining interesting correlation patterns

### Mining Multiple-Level Association Rules

- Items often form a hierarchy.
- Flexible support settings: items at lower levels are expected to have lower support.
- Exploration of shared multi-level mining (Agrawal & [email protected] 95)

Example hierarchy: Milk [support = 10%] at level 1, with children 2% Milk [support = 6%] and Skim Milk [support = 4%] at level 2.

| Setting | Level 1 min_sup | Level 2 min_sup |
| --- | --- | --- |
| Uniform support | 5% | 5% |
| Reduced support | 5% | 3% |

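
The effect of the two support settings can be checked with a minimal sketch. The supports and thresholds below are the ones on the slide; the dictionary layout and the function name `frequent_items` are assumptions made for illustration only.

```python
# Item supports and hierarchy levels from the slide's milk example.
SUPPORT = {"Milk": 0.10, "2% Milk": 0.06, "Skim Milk": 0.04}
LEVEL = {"Milk": 1, "2% Milk": 2, "Skim Milk": 2}

# Per-level minimum support for the two settings shown on the slide.
UNIFORM = {1: 0.05, 2: 0.05}
REDUCED = {1: 0.05, 2: 0.03}


def frequent_items(min_sup_by_level):
    """Return the items whose support meets the threshold of their own level."""
    return sorted(
        item
        for item, sup in SUPPORT.items()
        if sup >= min_sup_by_level[LEVEL[item]]
    )


print(frequent_items(UNIFORM))
print(frequent_items(REDUCED))
```

With uniform support, Skim Milk (4%) falls below the 5% threshold and is dropped; with reduced support, the 3% level-2 threshold keeps it.
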
### Multi-level Association: Redundancy Filtering

- Some rules may be redundant due to ancestor relationships between items.
- Example:
  - milk ⇒ wheat bread [support = 8%, confidence = 70%]
  - 2% milk ⇒ wheat bread [support = 2%, confidence = 72%]
- We say the first rule is an ancestor of the second rule.
- A rule is redundant if its support is close to the expected value, based on the rule's ancestor.
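
The redundancy test can be sketched as follows. The rule supports come from the example above, but the slide does not say what share of milk transactions involve 2% milk, so `child_share = 0.25` is an assumed value, and `tolerance` is a hypothetical parameter standing in for "close to".

```python
def expected_support(ancestor_support, child_share):
    """Support expected for a descendant rule if the child item behaves like its ancestor."""
    return ancestor_support * child_share


def is_redundant(actual_support, ancestor_support, child_share, tolerance=0.005):
    """Flag the descendant rule when its support is within `tolerance` of the expected value."""
    expected = expected_support(ancestor_support, child_share)
    return abs(actual_support - expected) <= tolerance


# Supports from the slide; child_share = 0.25 is an ASSUMPTION for illustration.
print(is_redundant(actual_support=0.02, ancestor_support=0.08, child_share=0.25))
```

Under that assumed share, the expected support of the 2% milk rule is 8% × 0.25 = 2%, which matches its actual support, so the rule adds no information beyond its ancestor.
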

### Mining Multi-Dimensional Association

- Single-dimensional rules: buys(X, "milk") ⇒ buys(X, "bread")
- Multi-dimensional rules: two or more dimensions or predicates
  - Inter-dimension association rules (no repeated predicates): age(X, "19-25") ∧ occupation(X, "student") ⇒ buys(X, "coke")
  - Hybrid-dimension association rules (repeated predicates): age(X, "19-25") ∧ buys(X, "popcorn") ⇒ buys(X, "coke")

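
The three rule types differ only in which predicate names appear and whether any repeats. A minimal sketch of that distinction, assuming a rule is represented as two lists of `(predicate, value)` pairs (a representation chosen here for illustration, not taken from the slides):

```python
from collections import Counter


def classify_rule(antecedent, consequent):
    """Classify a rule by the predicate names used on both sides."""
    counts = Counter(pred for pred, _ in antecedent + consequent)
    if len(counts) == 1:
        return "single-dimensional"  # one predicate only, e.g. buys => buys
    if any(n > 1 for n in counts.values()):
        return "hybrid-dimension"  # several predicates, at least one repeated
    return "inter-dimension"  # several predicates, none repeated


print(classify_rule([("buys", "milk")], [("buys", "bread")]))
print(classify_rule([("age", "19-25"), ("occupation", "student")], [("buys", "coke")]))
print(classify_rule([("age", "19-25"), ("buys", "popcorn")], [("buys", "coke")]))
```
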
### Mining Quantitative Associations

Techniques can be categorized by how numerical attributes, such as age or salary, are treated:

1. Static discretization based on predefined concept hierarchies
2. Dynamic discretization based on data distribution (quantitative rules, e.g., Agrawal & [email protected] and Lent, Swami and Widom ICDE 97)
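
The contrast between the two approaches can be illustrated with a short sketch. Static discretization uses interval edges fixed in advance, while dynamic discretization derives them from the data. Equal-frequency binning is used here as one simple data-driven choice; it is an assumption for illustration, not necessarily the method of the cited papers. The ages, edges, and labels are hypothetical.

```python
def assign_bin(value, edges, labels):
    """Map a value to an interval label; edges[i] is the exclusive upper bound of labels[i]."""
    for upper, label in zip(edges, labels):
        if value < upper:
            return label
    return labels[-1]  # value is at or above the last edge


def equal_frequency_edges(values, n_bins):
    """Derive bin edges from the data so each bin holds roughly the same number of values."""
    ordered = sorted(values)
    return [ordered[(len(ordered) * k) // n_bins] for k in range(1, n_bins)]


ages = [19, 21, 22, 25, 34, 35, 41, 58, 63]  # hypothetical sample

# Static: edges come from a predefined concept hierarchy (assumed here).
static_edges = [20, 40, 60]
static_labels = ["<20", "20-39", "40-59", "60+"]
print(assign_bin(34, static_edges, static_labels))

# Dynamic: edges are computed from the data distribution.
dynamic_edges = equal_frequency_edges(ages, n_bins=3)
print(dynamic_edges)
print(assign_bin(34, dynamic_edges, ["low", "mid", "high"]))
```
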

### Quantitative Association Rules

age(X, "34-35") ∧ income(X, "30-50K") ⇒ buys(X, "high resolution TV")

Proposed by Lent, Swami and Widom ICDE

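
To make the rule concrete, the sketch below computes its support and confidence over a handful of records. The records are hypothetical, and treating both intervals as inclusive with income in thousands is an assumption about how the ranges should be read.

```python
# Hypothetical customer records: (age, income in thousands, bought high-resolution TV).
records = [
    (34, 42, True),
    (35, 31, True),
    (34, 48, False),
    (29, 40, True),
    (35, 75, False),
    (50, 35, False),
]


def rule_stats(rows):
    """Support and confidence of: age in [34, 35] AND income in [30, 50] => buys TV."""
    matches_lhs = [r for r in rows if 34 <= r[0] <= 35 and 30 <= r[1] <= 50]
    matches_both = [r for r in matches_lhs if r[2]]
    support = len(matches_both) / len(rows)
    confidence = len(matches_both) / len(matches_lhs) if matches_lhs else 0.0
    return support, confidence


support, confidence = rule_stats(records)
print(f"support = {support:.2f}, confidence = {confidence:.2f}")
```

Here three records satisfy the left-hand side and two of those also bought the TV, so support is 2/6 and confidence is 2/3.
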

## This note was uploaded on 06/02/2010 for the course COMPUTER DM2009F taught by Professor Wangwei during the Fall '09 term at Tsinghua University.
