lect3


Recap: Mining association rules from large datasets

Recap
- Task 1: Methods for finding all frequent itemsets efficiently
- Task 2: Methods for finding association rules efficiently
Recap
- Frequent itemsets (measure: support)
- Apriori principle
- Apriori algorithm for finding frequent itemsets
  - Prunes really well in practice
  - Makes multiple passes over the dataset
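The level-wise idea behind Apriori can be sketched in Python as follows. This is a minimal illustration, not the lecture's own code: the function name `apriori`, the absolute threshold `minsup_count`, and the toy `baskets` list are assumptions made for the example.

```python
from itertools import combinations


def apriori(transactions, minsup_count):
    """Return {itemset (sorted tuple): support count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted({item for t in transactions for item in t})
    candidates = [(item,) for item in items]  # level 1: every single item
    frequent = {}
    k = 1
    while candidates:
        # One pass over the dataset per level to count candidate supports.
        counts = {c: sum(1 for t in transactions if t.issuperset(c))
                  for c in candidates}
        level = {c: n for c, n in counts.items() if n >= minsup_count}
        frequent.update(level)

        # Build level k+1 candidates from the frequent k-itemsets.
        k += 1
        prev = set(level)
        ordered = sorted(prev)
        candidates = []
        for a in ordered:
            for b in ordered:
                # Join step: same first k-2 items, different last item.
                if a[:-1] == b[:-1] and a[-1] < b[-1]:
                    c = a + (b[-1],)
                    # Prune step (Apriori principle): every (k-1)-subset
                    # of a frequent itemset must itself be frequent.
                    if all(s in prev for s in combinations(c, k - 1)):
                        candidates.append(c)
    return frequent


# Toy data, assumed for illustration only.
baskets = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]
frequent = apriori(baskets, minsup_count=3)
```

With these baskets and a threshold of 3 transactions, the triple (Bread, Diaper, Milk) is generated as a candidate because all of its pairs are frequent, but it is then rejected since only two baskets contain it.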

Making a single pass over the data: the AprioriTid algorithm
- The database is not used for counting support after the 1st pass!
- Instead, information in the data structure C_k is used for counting support in every step
  - C_k is generated from C_{k-1}
  - For small values of k, storage requirements for the data structures could be larger than the database!
  - For large values of k, storage requirements can be very small
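A rough Python sketch of this idea is given below. It is an illustration under stated assumptions, not the algorithm as presented in the lecture: the names `apriori_tid`, `generate_candidates`, and `encoded` are hypothetical, and `encoded` plays the role of the per-step data structure C_k by storing, for each transaction ID, the candidate itemsets that the transaction contains.

```python
from itertools import combinations


def generate_candidates(frequent_prev, k):
    """Join frequent (k-1)-itemsets and prune with the Apriori principle."""
    prev = set(frequent_prev)
    candidates = set()
    for a in prev:
        for b in prev:
            if a[:-1] == b[:-1] and a[-1] < b[-1]:
                c = a + (b[-1],)
                if all(s in prev for s in combinations(c, k - 1)):
                    candidates.add(c)
    return candidates


def apriori_tid(transactions, minsup_count):
    """transactions: dict mapping TID -> iterable of items."""
    # Pass 1: the only time the raw database is read.
    counts = {}
    for items in transactions.values():
        for item in set(items):
            counts[(item,)] = counts.get((item,), 0) + 1
    frequent = {c for c, n in counts.items() if n >= minsup_count}
    result = {c: counts[c] for c in frequent}

    # Encoded structure for k = 1: each transaction as a set of 1-itemsets.
    encoded = {tid: {(i,) for i in items} for tid, items in transactions.items()}

    k = 2
    while frequent:
        candidates = generate_candidates(frequent, k)
        counts = dict.fromkeys(candidates, 0)
        next_encoded = {}
        for tid, present in encoded.items():
            # A candidate is in the transaction iff both (k-1)-itemsets
            # that were joined to form it are recorded for the transaction.
            contained = {c for c in candidates
                         if c[:-1] in present and c[:-2] + c[-1:] in present}
            for c in contained:
                counts[c] += 1
            if contained:  # transactions with no candidates are dropped
                next_encoded[tid] = contained
        frequent = {c for c, n in counts.items() if n >= minsup_count}
        result.update({c: counts[c] for c in frequent})
        encoded = next_encoded
        k += 1
    return result


# Toy database, assumed for illustration only.
db = {2000: ["A", "B", "C"], 1000: ["A", "C"], 4000: ["A", "D"], 5000: ["B", "E", "F"]}
frequent_itemsets = apriori_tid(db, minsup_count=2)
```

In this toy run, two of the four transactions disappear from `encoded` after the second step because they contain no candidate pair, which mirrors the remark that the structure can become very small as k grows.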
Lecture outline
- Task 1: Methods for finding all frequent itemsets efficiently
- Task 2: Methods for finding association rules efficiently

Definition: Association Rule
- Let D be a database of transactions, e.g.:

  Transaction ID   Items
  2000             A, B, C
  1000             A, C
  4000             A, D
  5000             B, E, F

- Let I be the set of items that appear in the database, e.g., I = {A, B, C, D, E, F}
- A rule is defined by $X \rightarrow Y$, where $X \subset I$, $Y \subset I$, and $X \cap Y = \emptyset$
  - e.g.: $\{B, C\} \rightarrow \{A\}$ is a rule
Definition: Association Rule
- Association rule: an implication expression of the form $X \rightarrow Y$, where X and Y are non-overlapping itemsets
  - Example: $\{\text{Milk, Diaper}\} \rightarrow \{\text{Beer}\}$
- Rule evaluation metrics
  - Support (s): fraction of transactions that contain both X and Y
  - Confidence (c): measures how often items in Y appear in transactions that contain X

  TID   Items
  1     Bread, Milk
  2     Bread, Diaper, Beer, Eggs
  3     Milk, Diaper, Beer, Coke
  4     Bread, Milk, Diaper, Beer
  5     Bread, Milk, Diaper, Coke

Example: $\{\text{Milk, Diaper}\} \rightarrow \{\text{Beer}\}$, where $\sigma(\cdot)$ is the number of transactions containing the itemset and $|T|$ is the number of transactions:

  $$ s = \frac{\sigma(\text{Milk, Diaper, Beer})}{|T|} = \frac{2}{5} = 0.4 $$

  $$ c = \frac{\sigma(\text{Milk, Diaper, Beer})}{\sigma(\text{Milk, Diaper})} = \frac{2}{3} \approx 0.67 $$
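The two metrics can be checked on the five market-basket transactions with a few lines of Python. This is a small sketch: the helper names `support_count`, `support`, and `confidence` are chosen here for illustration and do not come from the slides.

```python
def support_count(itemset, transactions):
    """sigma(itemset): number of transactions containing every item of itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t))


def support(X, Y, transactions):
    """Fraction of transactions that contain both X and Y."""
    return support_count(set(X) | set(Y), transactions) / len(transactions)


def confidence(X, Y, transactions):
    """How often Y appears in the transactions that contain X."""
    return (support_count(set(X) | set(Y), transactions)
            / support_count(X, transactions))


T = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]

s = support({"Milk", "Diaper"}, {"Beer"}, T)     # 2 of 5 transactions
c = confidence({"Milk", "Diaper"}, {"Beer"}, T)  # 2 of the 3 with Milk and Diaper
```

Note that `confidence` divides by the support count of X, so it is undefined (division by zero) for an antecedent that never occurs in the data.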

Example

  TID   date       items_bought
  100   10/10/99   {F, A, D, B}
  200   15/10/99   {D, A, C, E, B}
  300   19/10/99   {C, A, B, E}
  400   20/10/99   {B, A, D}

What is the support and confidence of the rule $\{B, D\} \rightarrow \{A\}$?
- Support: percentage of tuples that contain {A, B, D} = 3/4 = 75%
- Confidence: (number of tuples that contain {A, B, D}) / (number of tuples that contain {B, D}) = 3/3 = 100%
Association-rule mining task
- Given a set of transactions D, the goal of association rule mining is to find all rules having
  - support $\geq$ minsup threshold
  - confidence $\geq$ minconf threshold

Brute-force algorithm for association-rule mining
- List all possible association rules
- Compute the support and confidence for each rule
- Prune rules that fail the minsup and minconf thresholds
- Computationally prohibitive!
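The brute-force steps above can be sketched directly in Python. The code is only meant to make the enumeration concrete: the function names, the example thresholds (minsup = 0.75, minconf = 1.0), and the choice to skip antecedents that never occur are assumptions made here. The transactions are the four tuples from the {B, D} → {A} example.

```python
from itertools import combinations


def nonempty_subsets(items):
    """Yield every non-empty subset of items as a frozenset."""
    items = sorted(items)
    for r in range(1, len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def brute_force_rules(transactions, minsup, minconf):
    """Enumerate every rule X -> Y and keep those meeting both thresholds."""
    transactions = [frozenset(t) for t in transactions]
    all_items = frozenset().union(*transactions)
    n = len(transactions)

    def count(itemset):
        return sum(1 for t in transactions if itemset <= t)

    rules = []
    for X in nonempty_subsets(all_items):            # every possible antecedent
        count_x = count(X)
        if count_x == 0:
            continue                                 # confidence is undefined
        for Y in nonempty_subsets(all_items - X):    # every disjoint consequent
            count_xy = count(X | Y)
            s = count_xy / n
            c = count_xy / count_x
            if s >= minsup and c >= minconf:
                rules.append((X, Y, s, c))
    return rules


tuples = [
    {"F", "A", "D", "B"},
    {"D", "A", "C", "E", "B"},
    {"C", "A", "B", "E"},
    {"B", "A", "D"},
]
rules = brute_force_rules(tuples, minsup=0.75, minconf=1.0)
for X, Y, s, c in rules:
    print(sorted(X), "->", sorted(Y), f"support={s:.2f}", f"confidence={c:.2f}")
```

Even for these six items the two nested loops visit every antecedent/consequent pair, and the number of pairs grows exponentially with the number of items, which is why the approach does not scale.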
How many association rules are there?
- Given d unique items in I:
  - Total number of itemsets = $2^d$
  - Total number of possible association rules:

    $$ R = \sum_{k=1}^{d-1} \left[ \binom{d}{k} \times \sum_{j=1}^{d-k} \binom{d-k}{j} \right] = 3^d - 2^{d+1} + 1 $$

- If d = 6, R = 602 rules
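As a quick sanity check, the count can be computed three ways in Python: from the double sum, from the closed form, and by direct enumeration. The function names are chosen here for illustration. The enumeration assigns each item to the antecedent, the consequent, or neither, and keeps the assignments in which both sides are non-empty.

```python
from itertools import product
from math import comb


def rules_by_sum(d):
    """Choose k items for the antecedent, then j of the remaining d-k for the consequent."""
    return sum(comb(d, k) * sum(comb(d - k, j) for j in range(1, d - k + 1))
               for k in range(1, d))


def rules_closed_form(d):
    """Closed form 3^d - 2^(d+1) + 1."""
    return 3 ** d - 2 ** (d + 1) + 1


def rules_by_enumeration(d):
    """Label each item 0 (antecedent), 1 (consequent), or 2 (unused)."""
    return sum(1 for labels in product(range(3), repeat=d)
               if 0 in labels and 1 in labels)


print(rules_by_sum(6), rules_closed_form(6), rules_by_enumeration(6))  # 602 602 602
```

The enumeration view also explains the closed form: there are 3^d labelings in total, 2^d of them leave the antecedent empty, 2^d leave the consequent empty, and the single all-unused labeling is subtracted twice and so is added back once.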

Mining Association Rules
- Two-step approach: