1 “Association Rules”
- Market Baskets
- Frequent Itemsets
- A-priori Algorithm

2 The Market-Basket Model
- A large set of items, e.g., things sold in a supermarket.
- A large set of baskets, each of which is a small set of the items, e.g., the things one customer buys on one day.

3 Support
- Simplest question: find sets of items that appear “frequently” in the baskets.
- Support for itemset I = the number of baskets containing all items in I.
- Given a support threshold s, sets of items that appear in at least s baskets are called frequent itemsets.
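
The definition of support translates almost directly into code. The following is a minimal sketch, not part of the original slides: the function names `support` and `is_frequent` and the toy baskets are hypothetical, and "frequent" is taken to mean support of at least s, the reading that agrees with the worked example in this deck.

```python
from typing import FrozenSet, Iterable, List


def support(itemset: Iterable[str], baskets: List[FrozenSet[str]]) -> int:
    """Count the baskets that contain every item in `itemset`."""
    target = frozenset(itemset)
    # `target <= basket` is the subset test: all items of I are in the basket.
    return sum(1 for basket in baskets if target <= basket)


def is_frequent(itemset: Iterable[str], baskets: List[FrozenSet[str]], s: int) -> bool:
    """Assumption: an itemset is frequent when its support is at least s."""
    return support(itemset, baskets) >= s


# Hypothetical toy data, used only to exercise the two functions.
baskets = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

print(support({"bread", "milk"}, baskets))    # 2
print(is_frequent({"bread"}, baskets, s=3))   # True
```

Storing each basket as a set keeps the subset test short; for large data this one-pass-per-itemset approach would be far too slow, which is the problem the A-priori algorithm addresses.
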

4 Example
- Items = {milk, coke, pepsi, beer, juice}.
- Support threshold = 3 baskets.
    B1 = {m, c, b}       B2 = {m, p, j}
    B3 = {m, b}          B4 = {c, j}
    B5 = {m, p, b}       B6 = {m, c, b, j}
    B7 = {c, b, j}       B8 = {b, c}
- Frequent itemsets: {m}, {c}, {b}, {j}, {m, b}, {c, b}, {j, c}.

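The example can be checked by brute force. The sketch below enumerates every non-empty subset of the five items and keeps those contained in at least 3 baskets. It is only a verification aid for this small example, not the A-priori algorithm named on the title slide; the variable names are illustrative.

```python
from itertools import combinations

# The eight baskets from the slide (m = milk, c = coke, p = pepsi, b = beer, j = juice).
baskets = {
    "B1": {"m", "c", "b"},
    "B2": {"m", "p", "j"},
    "B3": {"m", "b"},
    "B4": {"c", "j"},
    "B5": {"m", "p", "b"},
    "B6": {"m", "c", "b", "j"},
    "B7": {"c", "b", "j"},
    "B8": {"b", "c"},
}
items = sorted(set().union(*baskets.values()))
s = 3  # support threshold


def support(itemset):
    """Number of baskets containing all items of `itemset`."""
    return sum(1 for basket in baskets.values() if set(itemset) <= basket)


# Try every candidate itemset of every size; feasible only because there are 5 items.
frequent = {}
for size in range(1, len(items) + 1):
    for candidate in combinations(items, size):
        count = support(candidate)
        if count >= s:
            frequent[candidate] = count

for itemset, count in frequent.items():
    print(set(itemset), count)
```

The seven itemsets this finds are the ones listed on the slide, with supports 6 for {b}, 5 for {c} and {m}, 4 for {j}, {b, c} and {b, m}, and 3 for {c, j}. Since {c, j} appears in exactly 3 baskets, the example treats the threshold as "at least s".
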
5 Applications --- (1)
- Real market baskets: chain stores keep terabytes of information about what customers buy together.
  - Tells stores how typical customers navigate them, so they can position tempting items.
  - Suggests tie-in “tricks,” e.g., run a sale on diapers and raise the price of beer.
- High support needed, or there is no money to be made.

6 Applications --- (2)
- “Baskets” = documents; “items” = words in those documents.
  - Lets us find words that appear together unusually frequently, i.e., linked concepts.
- “Baskets” = sentences; “items” = documents containing those sentences.
  - Items that appear together too often could represent plagiarism.

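As a rough illustration of the first mapping, each document can be turned into a basket of its distinct words and word pairs counted across baskets. The three documents and the threshold below are hypothetical, and whitespace splitting stands in for real tokenization.

```python
from collections import Counter
from itertools import combinations

# Hypothetical documents; each one becomes a basket of its distinct words.
docs = [
    "market basket analysis finds frequent itemsets",
    "frequent itemsets drive association rules",
    "association rules come from market basket data",
]
baskets = [set(doc.split()) for doc in docs]

# Count, for every pair of words, the number of baskets containing both.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

s = 2  # assumed support threshold
frequent_pairs = {pair: n for pair, n in pair_counts.items() if n >= s}
print(frequent_pairs)
```

A raw co-occurrence count like this only captures "frequently"; judging whether a pair appears together "unusually" often would also require comparing against how common each word is on its own.
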
7 Applications --- (3)
- “Baskets” = Web pages; “items” = linked pages.
  - Pairs of pages with many common references may be about the same topic.
- “Baskets” = Web pages p; “items” = pages that link to p.
  - Pages with many of the same links may be mirrors or about the same topic.
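
Both Web mappings come from the same link data read in opposite directions. The sketch below, using a hypothetical list of (source, target) hyperlinks, builds the two families of baskets and computes support in each.

```python
from collections import defaultdict

# Hypothetical hyperlinks as (source page, target page) pairs.
links = [
    ("a", "x"), ("a", "y"),
    ("b", "x"), ("b", "y"),
    ("c", "x"),
]

out_links = defaultdict(set)  # basket = a page, items = the pages it links to
in_links = defaultdict(set)   # basket = a page p, items = the pages that link to p
for source, target in links:
    out_links[source].add(target)
    in_links[target].add(source)


def support(itemset, baskets):
    """Number of baskets containing every item of `itemset`."""
    return sum(1 for basket in baskets.values() if set(itemset) <= basket)


# First mapping: x and y are referenced together by two pages (a and b).
print(support({"x", "y"}, out_links))  # 2
# Second mapping: a and b share two link targets (x and y).
print(support({"a", "b"}, in_links))   # 2
```

Swapping the roles of "basket" and "item" only requires inverting the relation, which is why the same frequent-itemset machinery serves both questions.
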

8 Important Point
- “Market Baskets” is an abstraction that models any many-many relationship between two concepts: “items” and “baskets.”
  - Items need not be “contained” in baskets.
