lect3

Recap: Mining association rules from large datasets

Recap
Task 1: Methods for finding all frequent itemsets efficiently
Task 2: Methods for finding association rules efficiently

Recap
Frequent itemsets (measure: support)
Apriori principle
Apriori algorithm for finding frequent itemsets
  Prunes really well in practice
  Makes multiple passes over the dataset
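The level-wise search recapped above can be sketched roughly as follows. This is a minimal illustration, not the lecture's own code: the function name `apriori`, the join strategy (unioning pairs of frequent itemsets), and the toy basket data are assumptions made for the example. It makes one pass over the dataset per level and prunes any candidate that has an infrequent subset (the Apriori principle).

```python
from itertools import combinations

def apriori(transactions, minsup):
    """Return {itemset: support} for every itemset with support >= minsup."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    # Level 1 candidates: every single item that occurs in the data.
    candidates = {frozenset([i]) for t in transactions for i in t}
    frequent = {}
    k = 1
    while candidates:
        # One pass over the dataset per level to count candidate support.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= minsup}
        frequent.update(level)

        # Join step: union pairs of frequent k-itemsets into (k+1)-itemsets.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step (Apriori principle): drop any candidate that has an
        # infrequent (k-1)-subset, since it cannot be frequent itself.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent

# Hypothetical toy data for illustration.
baskets = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]
frequent_itemsets = apriori(baskets, minsup=0.5)
```

On this toy data the prune step discards {Milk, Diaper, Beer} before counting, because its subset {Milk, Beer} is not frequent at minsup = 0.5.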

Making a single pass over the data: the AprioriTid algorithm
The database is not used for counting support after the 1st pass!
Instead, information in the data structure $C_k$ is used for counting support in every step
$C_k$ is generated from $C_{k-1}$
For small values of k, storage requirements for the data structures could be larger than the database!
For large values of k, storage requirements can be very small
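A simplified sketch of this idea follows, under stated assumptions: the function name `apriori_tid`, the dictionary-based encoding, and the toy database are hypothetical, and the containment test checks all (k-1)-subsets of a candidate rather than only its two generating subsets. The raw transactions are read once to build the first per-transaction structure; every later level is counted from the structure built at the previous level.

```python
from itertools import combinations

def apriori_tid(transactions, minsup_count):
    """transactions: {tid: iterable of items}. Returns {itemset: support count}."""
    # The only pass over the raw data: encode each transaction as the set
    # of 1-itemsets it contains.
    entries = {tid: {frozenset([i]) for i in items}
               for tid, items in transactions.items()}
    frequent = {}
    k = 1
    while entries:
        # Support is counted from the encoded entries, not from the database.
        counts = {}
        for cands in entries.values():
            for c in cands:
                counts[c] = counts.get(c, 0) + 1
        level = {c for c, cnt in counts.items() if cnt >= minsup_count}
        frequent.update({c: counts[c] for c in level})

        # Generate candidate (k+1)-itemsets from the frequent k-itemsets.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }

        # Build the next structure from the current one: a candidate is in a
        # transaction iff all of its (k-1)-subsets were in that transaction's
        # previous entry. Transactions with no candidates are dropped.
        next_entries = {}
        for tid, cands in entries.items():
            contained = {
                c for c in candidates
                if all(frozenset(s) in cands for s in combinations(c, k - 1))
            }
            if contained:
                next_entries[tid] = contained
        entries = next_entries
    return frequent

# Hypothetical toy database for illustration.
db = {2000: "ABC", 1000: "AC", 4000: "AD", 5000: "BEF"}
result = apriori_tid(db, minsup_count=2)
```

In this toy run the entry for transaction 2000 grows from three 1-itemsets to three 2-itemsets at k = 2 (more storage than the original transaction), while transactions 4000 and 5000 disappear from the structure entirely, which mirrors the storage remarks above.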
Lecture outline
Task 1: Methods for finding all frequent itemsets efficiently
Task 2: Methods for finding association rules efficiently

Definition: Association Rule
Let D be a database of transactions, e.g.:

Transaction ID | Items
2000           | A, B, C
1000           | A, C
4000           | A, D
5000           | B, E, F

Let I be the set of items that appear in the database, e.g., $I = \{A, B, C, D, E, F\}$
A rule is defined by $X \rightarrow Y$, where $X \subseteq I$, $Y \subseteq I$, and $X \cap Y = \emptyset$
e.g.: $\{B, C\} \rightarrow \{A\}$ is a rule

Definition: Association Rule
Association Rule
  An implication expression of the form $X \rightarrow Y$, where X and Y are non-overlapping itemsets
  Example: $\{\text{Milk}, \text{Diaper}\} \rightarrow \{\text{Beer}\}$
Rule Evaluation Metrics
  Support (s): fraction of transactions that contain both X and Y
  Confidence (c): measures how often items in Y appear in transactions that contain X

TID | Items
1   | Bread, Milk
2   | Bread, Diaper, Beer, Eggs
3   | Milk, Diaper, Beer, Coke
4   | Bread, Milk, Diaper, Beer
5   | Bread, Milk, Diaper, Coke

Example: $\{\text{Milk}, \text{Diaper}\} \rightarrow \{\text{Beer}\}$

$$s = \frac{\sigma(\text{Milk}, \text{Diaper}, \text{Beer})}{|T|} = \frac{2}{5} = 0.4$$

$$c = \frac{\sigma(\text{Milk}, \text{Diaper}, \text{Beer})}{\sigma(\text{Milk}, \text{Diaper})} = \frac{2}{3} \approx 0.67$$
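The two metrics can be computed directly from their definitions. The sketch below is illustrative only: the helper names `support_count` and `rule_metrics` are assumptions, and the data is the five-transaction market-basket table from the slide.

```python
def support_count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t))

def rule_metrics(X, Y, transactions):
    """Return (support, confidence) of the rule X -> Y.

    Assumes X occurs in at least one transaction; otherwise the
    confidence is undefined and this raises ZeroDivisionError.
    """
    both = support_count(set(X) | set(Y), transactions)
    s = both / len(transactions)
    c = both / support_count(X, transactions)
    return s, c

# The five transactions from the table (TID 1 to 5).
T = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]
s, c = rule_metrics({"Milk", "Diaper"}, {"Beer"}, T)
```

Here `s` is 2/5 and `c` is 2/3, matching the values worked out on the slide.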

Example

TID | date     | items_bought
100 | 10/10/99 | {F,A,D,B}
200 | 15/10/99 | {D,A,C,E,B}
300 | 19/10/99 | {C,A,B,E}
400 | 20/10/99 | {B,A,D}

What is the support and confidence of the rule $\{B, D\} \rightarrow \{A\}$?
Support: percentage of tuples that contain {A,B,D} = 75%
Confidence: $\frac{\text{number of tuples that contain } \{A,B,D\}}{\text{number of tuples that contain } \{B,D\}} = 100\%$

Association-rule mining task
Given a set of transactions D, the goal of association rule mining is to find all rules having
  support $\geq$ minsup threshold
  confidence $\geq$ minconf threshold

Brute-force algorithm for association-rule mining
List all possible association rules
Compute the support and confidence for each rule
Prune rules that fail the minsup and minconf thresholds
Computationally prohibitive!
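The brute-force steps above can be sketched as follows. This is only an illustration of why the approach is prohibitive, not a recommended method: the function name `brute_force_rules` and the toy database are assumptions. It enumerates every itemset of size 2 or more over the item universe, splits it in every possible way into a left-hand side X and right-hand side Y, and rescans the data to compute support and confidence.

```python
from itertools import combinations

def brute_force_rules(transactions, items, minsup, minconf):
    """Enumerate every rule X -> Y over `items`; return (examined, kept)."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted(items)
    n = len(transactions)

    def count(itemset):
        # Full scan of the data for every itemset: the expensive part.
        return sum(1 for t in transactions if itemset <= t)

    examined = 0
    kept = []
    for size in range(2, len(items) + 1):
        for z in combinations(items, size):
            z_set = frozenset(z)
            z_count = count(z_set)
            # Every non-empty proper subset of z is a possible left-hand side.
            for k in range(1, size):
                for x in combinations(z, k):
                    x_set = frozenset(x)
                    examined += 1
                    x_count = count(x_set)
                    if x_count == 0:
                        continue  # confidence undefined, rule cannot qualify
                    s = z_count / n
                    c = z_count / x_count
                    if s >= minsup and c >= minconf:
                        kept.append((x_set, z_set - x_set, s, c))
    return examined, kept

# Hypothetical toy database over the item universe I = {A, ..., F}.
D = [{"A", "B", "C"}, {"A", "C"}, {"A", "D"}, {"B", "E", "F"}]
examined, kept = brute_force_rules(D, "ABCDEF", minsup=0.5, minconf=0.6)
```

Even with only six items, this examines several hundred candidate rules to keep a handful, and each candidate costs a scan of the data.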
How many association rules are there?
Given d unique items in I:
Total number of itemsets = $2^d$
Total number of possible association rules:

$$R = \sum_{k=1}^{d-1} \left[ \binom{d}{k} \times \sum_{j=1}^{d-k} \binom{d-k}{j} \right] = 3^d - 2^{d+1} + 1$$

If d = 6, R = 602 rules
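The count can be checked numerically. The short sketch below (function names are illustrative) evaluates both the double sum, which chooses k items for the left-hand side and then a non-empty subset of the remaining d - k items for the right-hand side, and the closed form.

```python
from math import comb

def rule_count_by_sum(d):
    """Pick k items for X, then a non-empty subset of the other d - k for Y."""
    return sum(
        comb(d, k) * sum(comb(d - k, j) for j in range(1, d - k + 1))
        for k in range(1, d)
    )

def rule_count_closed_form(d):
    """Closed form 3^d - 2^(d+1) + 1."""
    return 3**d - 2**(d + 1) + 1

r6 = rule_count_by_sum(6)
```

For d = 6 both functions give 602, and the count grows roughly as $3^d$ when more items are added.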

Mining Association Rules
Two-step approach: