Lecture05-1 - Data Mining Principles and Algorithms...

Data Mining: Principles and Algorithms
Jianyong Wang
Database Lab, Institute of Software
Department of Computer Science and Technology
Tsinghua University
[email protected]
October 29, 2009

Chapter 3: Mining Frequent Patterns, Association and Correlations
- Basic concepts and a road map
- Efficient and scalable frequent itemset mining methods
- Mining various kinds of association rules
- From association mining to correlation analysis
- Constraint-based association mining
- Sequential pattern mining
- Summary

Efficient Mining of Frequent Closed Itemsets
Closed pattern mining strategies
- The current status
The CHARM algorithm
- A vertical data format based closed itemset mining algorithm
The CLOSET+ algorithm
- The hybrid tree-projection method
- The item-skipping technique
- Efficient subset checking
Experimental results
- Comparison with OP, CHARM and CLOSET
- Scalability test
Conclusion

Various Closed Pattern Mining Strategies

Problem statement (revisited)
Mining frequent closed itemsets
- Itemset
  » A non-empty set of distinct items
- Frequent itemset X
  » Given a support threshold min_sup, an itemset X is frequent if $\mathrm{Sup}(X) \ge \mathit{min\_sup}$
- Closed itemset Y
  » There exists no itemset $Y'$ such that $Y \subset Y'$ and $\mathrm{Sup}(Y) = \mathrm{Sup}(Y')$ both hold
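
The definitions above can be checked mechanically on a small database. The following is a minimal sketch, not the lecture's algorithm: it uses the transactions of the lecture's running example (Table 1), and the function names `sup`, `is_frequent`, and `is_closed` are illustrative choices that do not appear in the slides.

```python
# Transactions from the lecture's running example (Table 1).
TDB = {
    10: {"a", "c", "f", "m", "p"},
    20: {"a", "c", "d", "f", "m", "p"},
    30: {"a", "b", "c", "f", "g", "m"},
    40: {"b", "f", "i"},
    50: {"b", "c", "n", "p"},
}


def sup(itemset, db=TDB):
    """Absolute support: number of transactions containing every item."""
    items = set(itemset)
    return sum(1 for t in db.values() if items <= t)


def is_frequent(itemset, min_sup, db=TDB):
    """Sup(X) >= min_sup."""
    return sup(itemset, db) >= min_sup


def is_closed(itemset, db=TDB):
    """True if no proper superset has the same support.

    Testing one-item extensions is enough: if some superset Y' had the
    same support as Y, then adding any single item of Y' \\ Y to Y would
    also keep the support unchanged.
    """
    items = set(itemset)
    base = sup(items, db)
    universe = set().union(*db.values())
    return all(sup(items | {x}, db) < base for x in universe - items)
```

For instance, {f, c} has support 3 but is not closed, because {f, c, a, m} also has support 3; {f, c, a, m} itself is closed.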

Typical Closed Itemset Mining Algorithms
Typical algorithms
- A-close
  » Breadth-first search based
- CLOSET / CLOSET+
  » FP-tree and depth-first search based
- MAFIA
  » Vertical bitmap representation
- CHARM
  » Vertical data representation and diffset technique
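
To make the "vertical data representation and diffset technique" concrete, here is a small sketch of the idea only, not of CHARM itself. It assumes the transactions of the lecture's running example (Table 1); the helper name `vertical` and the variable names are illustrative.

```python
# Transactions from the lecture's running example (Table 1).
TDB = {
    10: {"a", "c", "f", "m", "p"},
    20: {"a", "c", "d", "f", "m", "p"},
    30: {"a", "b", "c", "f", "g", "m"},
    40: {"b", "f", "i"},
    50: {"b", "c", "n", "p"},
}


def vertical(db):
    """Map each item to its tidset (ids of the transactions containing it)."""
    tidsets = {}
    for tid, items in db.items():
        for item in items:
            tidsets.setdefault(item, set()).add(tid)
    return tidsets


t = vertical(TDB)

# Vertical format: the support of {f, c} is the size of a tidset intersection.
t_fc = t["f"] & t["c"]
sup_fc = len(t_fc)

# Diffsets keep only the tids lost when the prefix {f} is extended.
d_fc = t["f"] - t["c"]  # tids containing f but not c
d_fa = t["f"] - t["a"]  # tids containing f but not a
sup_fc_from_diffset = len(t["f"]) - len(d_fc)

# Extending {f, c} with a: d(fca) = d(fa) - d(fc),
# and sup(fca) = sup(fc) - |d(fca)|.
d_fca = d_fa - d_fc
sup_fca = sup_fc - len(d_fca)
```

Diffsets pay off when tidsets are large and extensions lose few transactions, since the stored sets then stay small.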

Running example
min_sup = 2, f_list = <f:4, c:4, a:3, b:3, m:3, p:3>

Table 1: A transaction database TDB

tid   itemset              ordered frequent item list
10    a, c, f, m, p        f, c, a, m, p
20    a, c, d, f, m, p     f, c, a, m, p
30    a, b, c, f, g, m     f, c, a, b, m
40    b, f, i              f, b
50    b, c, n, p           c, b, p
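
The third column of Table 1 can be derived from the first two. The sketch below counts item supports, keeps the items meeting min_sup, and reorders each transaction along the f_list. The order among equally frequent items is copied from the slide rather than computed, and the names `F_LIST` and `ordered_frequent_items` are illustrative.

```python
from collections import Counter

# Transactions from Table 1.
TDB = {
    10: {"a", "c", "f", "m", "p"},
    20: {"a", "c", "d", "f", "m", "p"},
    30: {"a", "b", "c", "f", "g", "m"},
    40: {"b", "f", "i"},
    50: {"b", "c", "n", "p"},
}
MIN_SUP = 2

# Support count of every single item.
counts = Counter(item for items in TDB.values() for item in items)

# Items meeting min_sup; d, g, i, n occur once and are dropped.
frequent_items = {item for item, n in counts.items() if n >= MIN_SUP}

# f_list order as given on the slide (ties are not broken alphabetically).
F_LIST = ["f", "c", "a", "b", "m", "p"]


def ordered_frequent_items(items):
    """Keep only frequent items, listed in f_list order."""
    return [item for item in F_LIST if item in items]


ordered = {tid: ordered_frequent_items(items) for tid, items in TDB.items()}
```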

Running example (continued)

(a) The set of frequent closed itemsets (lattice rooted at Ø)
f:4, c:4, b:3, fb:2, cb:2, cp:3, fcam:3, fcamp:2

(b) FP-tree
root
  f:4
    c:3
      a:3
        m:2
          p:2
        b:1
          m:1
    b:1
  c:1
    b:1
      p:1
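
Both parts of this slide can be reproduced from the ordered frequent item lists of Table 1. The sketch below is a brute-force illustration, not CLOSET+ or CHARM: `closed_itemsets` enumerates every itemset (exponential, so suitable only for tiny inputs), and `build_fp_tree` inserts each ordered list as a path from the root while sharing common prefixes. The class and function names are illustrative.

```python
from itertools import combinations

# Ordered frequent item lists from Table 1 (min_sup = 2).
ORDERED = [
    ["f", "c", "a", "m", "p"],
    ["f", "c", "a", "m", "p"],
    ["f", "c", "a", "b", "m"],
    ["f", "b"],
    ["c", "b", "p"],
]
MIN_SUP = 2


def closed_itemsets(transactions, min_sup):
    """Frequent itemsets that have no proper superset of equal support."""
    tsets = [set(t) for t in transactions]
    items = sorted(set().union(*tsets))
    support = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            s = sum(1 for t in tsets if set(combo) <= t)
            if s >= min_sup:
                support[frozenset(combo)] = s
    return {
        x: s
        for x, s in support.items()
        if not any(x < y and s == sy for y, sy in support.items())
    }


class Node:
    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}  # item -> Node, in insertion order


def build_fp_tree(transactions):
    """Insert each ordered list as a root-to-leaf path, sharing prefixes."""
    root = Node(None)
    for t in transactions:
        node = root
        for item in t:
            if item not in node.children:
                node.children[item] = Node(item, node)
            node = node.children[item]
            node.count += 1
    return root


def render(node, depth=0):
    """Indented 'item:count' lines, one per tree node."""
    lines = []
    for child in node.children.values():
        lines.append("  " * depth + f"{child.item}:{child.count}")
        lines.extend(render(child, depth + 1))
    return lines


closed = closed_itemsets(ORDERED, MIN_SUP)
tree = build_fp_tree(ORDERED)
```

Calling `print("\n".join(render(tree)))` prints the tree one node per line, with children indented under their parent.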

Strategies for Frequent Closed Itemset Mining
Search order
- Breadth-first search vs. depth-first search
  » Depth-first search is more efficient than breadth-first search for mining long patterns
Data representation
- Horizontal vs. vertical data formats
  » Further performance study is needed to compare these two schemes in terms of scalability, runtime, and space usage

Strategies for Frequent Closed Itemset Mining (continued)
Data compression techniques
- FP-tree (CLOSET+) vs. diffset (CHARM)
Existing search space pruning
- Item merging
  » If every transaction containing itemset X also contains itemset Y but not any proper superset of Y, then X ∪ Y
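
The precondition of item merging can be illustrated on the running example. The sketch below only computes, for a given X, the largest Y that occurs in every transaction containing X. It assumes the transactions of Table 1, and the function name `merge_candidate` is hypothetical.

```python
# Transactions from the lecture's running example (Table 1).
TDB = {
    10: {"a", "c", "f", "m", "p"},
    20: {"a", "c", "d", "f", "m", "p"},
    30: {"a", "b", "c", "f", "g", "m"},
    40: {"b", "f", "i"},
    50: {"b", "c", "n", "p"},
}


def merge_candidate(x, db=TDB):
    """Largest Y, disjoint from X, present in every transaction containing X."""
    x = set(x)
    covering = [t for t in db.values() if x <= t]
    if not covering:
        return set()
    # Items shared by all covering transactions, minus X itself.
    return set.intersection(*covering) - x


y_for_a = merge_candidate({"a"})  # items that always accompany a
y_for_p = merge_candidate({"p"})  # items that always accompany p
y_for_f = merge_candidate({"f"})  # nothing always accompanies f
```

With X = {a}, the result is Y = {c, f, m}: every transaction containing a also contains c, f, and m, and X ∪ Y = {f, c, a, m} appears with support 3 in the slide's list of frequent closed itemsets. With X = {p}, Y = {c}, and X ∪ Y = {c, p} is the closed itemset cp:3.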