Association Rule Mining

Given a set of transactions, find rules that will predict the occurrence of an item based on the occurrences of other items in the transaction.

Market-basket transactions:

TID  Items
1    Bread, Milk
2    Bread, Diaper, Beer, Eggs
3    Milk, Diaper, Beer, Coke
4    Bread, Milk, Diaper, Beer
5    Bread, Milk, Diaper, Coke

Example of association rules:

- {Diaper} → {Beer}
- {Milk, Bread} → {Eggs, Coke}
- {Beer, Bread} → {Milk}

Implication means co-occurrence, not causality!

Definition: Frequent Itemset

- Itemset
  - A collection of one or more items
  - Example: {Milk, Bread, Diaper}
- k-itemset
  - An itemset that contains k items
- Support count (σ)
  - Frequency of occurrence of an itemset
  - E.g. σ({Milk, Bread, Diaper}) = 2
- Support (s)
  - Fraction of transactions that contain an itemset
  - E.g. s({Milk, Bread, Diaper}) = 2/5
- Frequent itemset
  - An itemset whose support is greater than or equal to a minsup threshold

The counts refer to the five market-basket transactions listed above.
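
The definitions above can be checked against the five market-basket transactions with a minimal Python sketch. The function names (`support_count`, `support`, `is_frequent`) and the threshold `minsup=0.4` are illustrative assumptions, not part of the slides.

```python
# Market-basket transactions from the slides, one set of items per TID.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]


def support_count(itemset, transactions):
    """sigma(itemset): number of transactions that contain every item."""
    return sum(1 for t in transactions if itemset <= t)


def support(itemset, transactions):
    """s(itemset): fraction of transactions that contain the itemset."""
    return support_count(itemset, transactions) / len(transactions)


def is_frequent(itemset, transactions, minsup):
    """An itemset is frequent when its support is at least minsup."""
    return support(itemset, transactions) >= minsup


itemset = {"Milk", "Bread", "Diaper"}
print(support_count(itemset, transactions))            # 2
print(support(itemset, transactions))                  # 0.4
# minsup=0.4 is a hypothetical threshold chosen for this example.
print(is_frequent(itemset, transactions, minsup=0.4))  # True
```

Representing each transaction as a set keeps the containment test a single subset comparison (`itemset <= t`).
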

Definition: Association Rule

- Association rule
  - An implication expression of the form X → Y, where X and Y are itemsets
  - Example: {Milk, Diaper} → {Beer}
- Rule evaluation metrics
  - Support (s): fraction of transactions that contain both X and Y
  - Confidence (c): measures how often items in Y appear in transactions that contain X

Example: {Milk, Diaper} → {Beer}

s = σ({Milk, Diaper, Beer}) / |T| = 2/5 = 0.4
c = σ({Milk, Diaper, Beer}) / σ({Milk, Diaper}) = 2/3 ≈ 0.67
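
As a rough illustration, the two rule metrics can be computed directly from support counts. This is a sketch under stated assumptions: `rule_metrics` is a hypothetical helper name, and returning a confidence of 0 when X never occurs is a convention chosen here, not something the slides specify.

```python
# Market-basket transactions from the slides, one set of items per TID.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]


def support_count(itemset, transactions):
    """sigma(itemset): number of transactions that contain every item."""
    return sum(1 for t in transactions if itemset <= t)


def rule_metrics(lhs, rhs, transactions):
    """Return (support, confidence) of the rule lhs -> rhs."""
    sigma_xy = support_count(lhs | rhs, transactions)  # sigma(X u Y)
    sigma_x = support_count(lhs, transactions)         # sigma(X)
    s = sigma_xy / len(transactions)
    # Assumption: report confidence 0 when X appears in no transaction.
    c = sigma_xy / sigma_x if sigma_x else 0.0
    return s, c


s, c = rule_metrics({"Milk", "Diaper"}, {"Beer"}, transactions)
print(f"s = {s:.2f}, c = {c:.2f}")  # s = 0.40, c = 0.67
```
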

Association Rule Mining Task

- Given a set of transactions T, the goal of association rule mining is to find all rules having
  - support ≥ minsup threshold
  - confidence ≥ minconf threshold
- Brute-force approach:
  - List all possible association rules
  - Compute the support and confidence for each rule
  - Prune rules that fail the minsup and minconf thresholds

Computationally prohibitive!
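
The brute-force steps above can be sketched in Python as follows. This is only an illustration on the five-transaction example: the function name `brute_force_rules` and the thresholds `minsup=0.4` and `minconf=0.6` are assumptions made for the demo. Even with only 6 distinct items the enumeration visits 602 candidate rules, which hints at why the approach does not scale.

```python
from itertools import combinations

# Market-basket transactions from the slides, one set of items per TID.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]


def support_count(itemset, transactions):
    """sigma(itemset): number of transactions that contain every item."""
    return sum(1 for t in transactions if itemset <= t)


def brute_force_rules(transactions, minsup, minconf):
    """List every rule X -> Y, score it, and prune by minsup and minconf."""
    items = sorted(set().union(*transactions))
    n = len(transactions)
    candidates = 0
    kept = []
    for k in range(2, len(items) + 1):
        for itemset in combinations(items, k):
            full = frozenset(itemset)
            # Every non-empty proper subset of the itemset is a possible X.
            for r in range(1, k):
                for lhs in combinations(itemset, r):
                    x = frozenset(lhs)
                    y = full - x
                    candidates += 1
                    sigma_xy = support_count(full, transactions)
                    sigma_x = support_count(x, transactions)
                    if sigma_x == 0:
                        continue  # confidence is undefined; rule cannot pass
                    s = sigma_xy / n
                    c = sigma_xy / sigma_x
                    if s >= minsup and c >= minconf:
                        kept.append((x, y, s, c))
    return candidates, kept


# Hypothetical thresholds chosen for this example.
candidates, rules = brute_force_rules(transactions, minsup=0.4, minconf=0.6)
print(candidates)  # 602 candidate rules for these 6 items
for x, y, s, c in rules:
    print(sorted(x), "->", sorted(y), f"s={s:.2f}", f"c={c:.2f}")
```

The sketch recomputes every support count from scratch for each rule, which is exactly the redundant work that makes the brute-force approach expensive.
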

Mining Association Rules

Example of rules:

- {Milk, Diaper} → {Beer} (s=0.4, c=0.67)
- {Milk, Beer} → {Diaper} (s=0.4, c=1.0)
- {Diaper, Beer} → {Milk} (s=0.4, c=0.67)
- {Beer} → {Milk, Diaper} (s=0.4, c=0.67)
- {Diaper} → {Milk, Beer} (s=0.4, c=0.5)
- {Milk} → {Diaper, Beer} (s=0.4, c=0.5)

TID Items 1 Bread, Milk 2 Bread, Diaper, Beer, Eggs 3 Milk, Diaper, Beer, Coke 4

