Mining Association Rules in Large Databases

Association rules

Given a set of transactions D, find rules that predict the occurrence of an item (or a set of items) based on the occurrences of other items in the transaction.

Market-basket transactions

TID  Items
1    Bread, Milk
2    Bread, Diaper, Beer, Eggs
3    Milk, Diaper, Beer, Coke
4    Bread, Milk, Diaper, Beer
5    Bread, Milk, Diaper, Coke

Examples of association rules

{Diaper} → {Beer}
{Milk, Bread} → {Diaper, Coke}
{Beer, Bread} → {Milk}

An even simpler concept: frequent itemsets

Given a set of transactions D, find combinations of items that occur frequently.

Examples of frequent itemsets (over the same market-basket transactions as above)

{Diaper, Beer}
{Milk, Bread}
{Beer, Bread, Milk}

Lecture outline

Task 1: Methods for finding all frequent itemsets efficiently
Task 2: Methods for finding association rules efficiently

Definition: Frequent Itemset

Itemset
  A set of one or more items.
  E.g.: {Milk, Bread, Diaper}

k-itemset
  An itemset that contains k items.

Support count (σ)
  Frequency of occurrence of an itemset (the number of transactions in which it appears).
  E.g.: σ({Milk, Bread, Diaper}) = 2

Support (s)
  Fraction of the transactions in which an itemset appears.
  E.g.: s({Milk, Bread, Diaper}) = 2/5

Frequent itemset
  An itemset whose support is greater than or equal to a minsup threshold.

TID  Items
1    Bread, Milk
2    Bread, Diaper, Beer, Eggs
3    Milk, Diaper, Beer, Coke
4    Bread, Milk, Diaper, Beer
5    Bread, Milk, Diaper, Coke
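
The definitions above can be checked directly against the five example transactions. The following is a minimal sketch, not part of the original slides; the function names support_count, support and is_frequent are illustrative choices, and minsup is assumed to be given as a fraction of the transactions.

```python
from fractions import Fraction

# The five market-basket transactions from the slide, keyed by TID.
transactions = {
    1: {"Bread", "Milk"},
    2: {"Bread", "Diaper", "Beer", "Eggs"},
    3: {"Milk", "Diaper", "Beer", "Coke"},
    4: {"Bread", "Milk", "Diaper", "Beer"},
    5: {"Bread", "Milk", "Diaper", "Coke"},
}

def support_count(itemset, transactions):
    """sigma(itemset): number of transactions that contain every item."""
    itemset = set(itemset)
    return sum(1 for items in transactions.values() if itemset <= items)

def support(itemset, transactions):
    """s(itemset): fraction of transactions that contain the itemset."""
    return Fraction(support_count(itemset, transactions), len(transactions))

def is_frequent(itemset, transactions, minsup):
    """True if the support is greater than or equal to minsup.

    Pass minsup as a Fraction (e.g. Fraction(2, 5)); a float such as 0.4
    is not exactly 2/5, so the boundary case could compare incorrectly.
    """
    return support(itemset, transactions) >= minsup

example = {"Milk", "Bread", "Diaper"}
print(support_count(example, transactions))  # 2
print(support(example, transactions))        # 2/5
```

Exact fractions are used so that the "greater than or equal to" test in the definition behaves as expected when the support equals minsup.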

Why do we want to find frequent itemsets?

- Find all combinations of items that occur together.
- They might be interesting (e.g., for the placement of items in a store).
- Frequent itemsets are only positive combinations (we do not report combinations that do not occur frequently together).
- Frequent itemsets aim to provide a summary of the data.

Finding frequent sets

Task: Given a transaction database D and a minsup threshold, find all frequent itemsets and the frequency of each set in this collection.

Stated differently: Count the number of times each combination of attributes occurs in the data. If the count of a combination is at least the minsup threshold, report it.

Recall: The input is a transaction database D where every transaction consists of a subset of items from some universe I.
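
Taken literally, the task statement above suggests a brute-force procedure: enumerate every combination of items, count it, and report it if the count reaches the threshold. The sketch below illustrates that naive reading only; it is not the efficient method the lecture goes on to develop. The function name frequent_itemsets_brute_force and the min_count parameter (the threshold expressed as an absolute count) are assumptions made for illustration.

```python
from itertools import combinations

# The five market-basket transactions from the slides.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]

def frequent_itemsets_brute_force(transactions, min_count):
    """Return {itemset: support count} for itemsets with count >= min_count.

    Enumerates every non-empty subset of the item universe I, so the
    running time grows exponentially with the number of distinct items.
    """
    universe = sorted(set().union(*transactions))
    result = {}
    for k in range(1, len(universe) + 1):
        for candidate in combinations(universe, k):
            wanted = set(candidate)
            count = sum(1 for t in transactions if wanted <= t)
            if count >= min_count:
                result[frozenset(candidate)] = count
    return result

# Threshold of 3 transactions, i.e. minsup = 3/5 on this database.
frequent = frequent_itemsets_brute_force(transactions, min_count=3)
for itemset, count in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

On this database the threshold of 3 leaves four single items and four pairs; no 3-itemset appears in three or more transactions.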

How many itemsets are there?

Itemset lattice over the items A, B, C, D, E:

null
A  B  C  D  E
AB  AC  AD  AE  BC  BD  BE  CD  CE  DE
ABC  ABD  ABE  ACD  ACE  ADE  BCD  BCE  BDE  CDE
ABCD  ABCE  ABDE  ACDE  BCDE
ABCDE

Given d items, there are 2^d possible itemsets.
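
The 2^d count can be confirmed by generating the lattice level by level, since level k holds the "d choose k" itemsets of size k. This is a small sketch for illustration; the helper name all_itemsets is hypothetical.

```python
from itertools import combinations

def all_itemsets(items):
    """Yield every subset of items, from the empty set (null) up to the full set."""
    items = list(items)
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield combo

lattice = list(all_itemsets("ABCDE"))

# One line per lattice level: itemset size and number of itemsets of that size.
for k in range(6):
    level = [s for s in lattice if len(s) == k]
    print(k, len(level))

print(len(lattice))  # 32, which equals 2**5
```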

When is the task sensible and feasible?

- If minsup = 0, then all subsets of I are frequent, so the collection of frequent itemsets is very large.
- Such a summary is very large (possibly larger than the original input) and therefore not interesting.
- The task of finding all frequent sets is typically interesting only for relatively large values of minsup.
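
To make the effect of the threshold concrete, the sketch below counts how many non-empty itemsets qualify as frequent on the five example transactions for each possible threshold. It is an illustration only: the threshold is expressed as an absolute count from 0 to 5, and the helper name count_frequent is an assumption.

```python
from itertools import combinations

# The five market-basket transactions from the slides (6 distinct items).
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]

def count_frequent(transactions, min_count):
    """Number of non-empty itemsets that occur in at least min_count transactions."""
    universe = sorted(set().union(*transactions))
    total = 0
    for k in range(1, len(universe) + 1):
        for candidate in combinations(universe, k):
            wanted = set(candidate)
            if sum(1 for t in transactions if wanted <= t) >= min_count:
                total += 1
    return total

# counts[c] = number of frequent itemsets when the threshold is c transactions.
counts = {c: count_frequent(transactions, c) for c in range(len(transactions) + 1)}
for c, n in counts.items():
    print(f"threshold {c}/5: {n} frequent itemsets")
```

With a threshold of 0, all 63 non-empty subsets of the 6 items are reported, which is far more than the 5 transactions in the input; the count can only shrink as the threshold rises.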