02-assoc

Support threshold s then sets of items that appear in


Support threshold s: sets of items that appear in at least s baskets are called frequent itemsets.

Jure Leskovec, Stanford C246: Mining Massive Datasets, 1/5/2011

Items = {milk, coke, pepsi, beer, juice}
Support = 3 baskets
  B1 = {m, c, b}       B2 = {m, p, j}
  B3 = {m, b}          B4 = {c, j}
  B5 = {m, p, b}       B6 = {m, c, b, j}
  B7 = {c, b, j}       B8 = {b, c}
Frequent itemsets: {m}, {c}, {b}, {j}, {m,b}, {b,c}, {c,j}.

Items = products; baskets = sets of products someone bought in one trip to the store
Real market baskets: chain stores keep terabytes of data about what customers buy together
  Tells how typical customers navigate stores, lets them position tempting items
  Suggests tie-in “tricks”, e.g., run sale on diapers and raise the price of beer
  High support needed, or no $$’s
Amazon’s “people who bought X also bought Y”

Baskets = sentences; items = documents containing those sentences
  Items that appear together too often could represent plagiarism
  Notice items do not have to be “in” baskets
Baskets = patients; items = drugs & side-effects
  Has been used to detect combinations of drugs that result in particular side-effects
  But requires extension: absence of an item needs to be observed as well as presence
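The frequent-itemset example above (eight baskets, support threshold 3) can be checked with a brute-force count. The sketch below is only an illustration, not the method the slides go on to present: it enumerates every subset of every basket, which is workable only for tiny inputs like this one. The function name `frequent_itemsets` and the variable names are assumptions made for this example.

```python
from collections import Counter
from itertools import combinations

# Baskets from the example; single letters abbreviate
# milk, coke, pepsi, beer, juice.
baskets = {
    "B1": {"m", "c", "b"},
    "B2": {"m", "p", "j"},
    "B3": {"m", "b"},
    "B4": {"c", "j"},
    "B5": {"m", "p", "b"},
    "B6": {"m", "c", "b", "j"},
    "B7": {"c", "b", "j"},
    "B8": {"b", "c"},
}
SUPPORT = 3  # support threshold s


def frequent_itemsets(basket_list, s):
    """Count every non-empty subset of every basket (brute force) and
    keep the itemsets that appear in at least s baskets.
    Hypothetical helper, written for illustration only."""
    counts = Counter()
    for basket in basket_list:
        items = sorted(basket)  # canonical order so tuples compare equal
        for k in range(1, len(items) + 1):
            for itemset in combinations(items, k):
                counts[itemset] += 1
    return {itemset: n for itemset, n in counts.items() if n >= s}


result = frequent_itemsets(baskets.values(), SUPPORT)

# Print smaller itemsets first, then alphabetically.
for itemset, n in sorted(result.items(), key=lambda kv: (len(kv[0]), kv[0])):
    print("{" + ",".join(itemset) + "}", "appears in", n, "baskets")
```

Itemsets are stored as sorted tuples, so {m,b} shows up as `("b", "m")`. Pepsi appears in only two baskets (B2 and B5), which is why {p} is absent from the frequent itemsets.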