cs345-streams3-2

1 Still More Stream-Mining
Frequent Itemsets
Elephants and Troops
Exponentially Decaying Windows
2 Counting Items
Problem: given a stream, which items appear more than s times in the window?
Possible solution: think of the stream of baskets as one binary stream per item (1 = item present; 0 = not present). Use DGIM to estimate counts of 1’s for all items.
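The per-item idea above can be sketched in Python. This is a minimal illustration, not the lecture's own code: the class name `DGIM`, the helper `frequent_items`, and the choice to count the oldest bucket at half its size (rounded down) are assumptions made for the example. Each item gets its own DGIM summary, and a 0 bit costs nothing because expired buckets are dropped lazily by timestamp.

```python
from collections import defaultdict


class DGIM:
    """Approximate count of 1's in the last `window` bits of a binary stream."""

    def __init__(self, window):
        self.window = window
        self.buckets = []  # newest first; each bucket is (end_time, size)

    def _expire(self, now):
        # Drop buckets whose most recent 1 has left the window.
        while self.buckets and self.buckets[-1][0] <= now - self.window:
            self.buckets.pop()

    def add_one(self, now):
        """Record a 1 bit arriving at time `now` (0 bits need no call)."""
        self._expire(now)
        self.buckets.insert(0, (now, 1))
        i = 0
        # Sizes never decrease toward the old end, so equal sizes at i and
        # i + 2 mean three buckets of one size: merge the two oldest.
        while i + 2 < len(self.buckets) and self.buckets[i][1] == self.buckets[i + 2][1]:
            end_time, size = self.buckets[i + 1]
            self.buckets[i + 1] = (end_time, 2 * size)
            del self.buckets[i + 2]
            i += 1

    def estimate(self, now):
        """All buckets in full, except the oldest, which counts for half."""
        self._expire(now)
        if not self.buckets:
            return 0
        total = sum(size for _, size in self.buckets)
        return total - self.buckets[-1][1] // 2


def frequent_items(baskets, window, s):
    """Items whose estimated count in the last `window` baskets exceeds s."""
    counters = defaultdict(lambda: DGIM(window))
    now = 0
    for now, basket in enumerate(baskets, start=1):
        for item in set(basket):
            counters[item].add_one(now)
    result = {}
    for item, counter in counters.items():
        est = counter.estimate(now)
        if est > s:
            result[item] = est
    return result
```

For example, `frequent_items([["a", "b"], ["a"], ["a", "c"], ["b"], ["a"]], window=4, s=2)` keeps one summary per distinct item and reports only the items whose estimate exceeds `s`.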
3 Extensions
In principle, you could count frequent pairs or even larger sets the same way: one stream per itemset.
Drawbacks:
1. Only approximate.
2. Number of itemsets is way too big.
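To make the second drawback concrete, a quick calculation of how many binary streams "one stream per itemset" would require. The function name `num_streams` and the figure of one million distinct items are hypothetical, chosen only for illustration.

```python
from math import comb


def num_streams(n_items, k):
    """Number of distinct k-item sets, i.e. binary streams needed for size k."""
    return comb(n_items, k)


# Hypothetical catalogue of one million distinct items.
pairs = num_streams(10**6, 2)  # 499_999_500_000 streams for pairs alone
```

Each of those streams would need its own DGIM summary, which is why the slides turn to heuristics instead.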
4 Approaches
1. “Elephants and troops”: a heuristic way to converge on unusually strongly connected itemsets.
2. Exponentially decaying windows: a heuristic for selecting likely frequent itemsets.
5 Elephants and Troops
When Sergey Brin wasn’t worrying about Google, he tried the following experiment.
Goal: find unusually correlated sets of words.
“High correlation” = frequency of occurrence of the set >> product of the frequencies of its members.
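The correlation test can be written as a ratio. The sketch below assumes "frequency" means the fraction of baskets (for example, documents) containing a word or a set of words; the slide does not say so explicitly, and the function name `correlation_ratio` is made up for this example. A ratio far above 1 corresponds to "high correlation" in the sense defined above.

```python
from math import prod


def correlation_ratio(baskets, itemset):
    """Frequency of the whole set divided by the product of member frequencies.

    Assumption: frequency = fraction of baskets containing the item(s).
    Returns 0.0 if some member never occurs (or there are no baskets).
    """
    basket_sets = [set(b) for b in baskets]
    n = len(basket_sets)
    if n == 0:
        return 0.0
    items = set(itemset)
    joint = sum(items <= b for b in basket_sets) / n
    expected = prod(sum(i in b for b in basket_sets) / n for i in items)
    return joint / expected if expected > 0 else 0.0
```

With baskets `[{"new", "york"}, {"new", "york"}, {"the"}, {"the", "new"}]`, the pair `{"new", "york"}` appears in half the baskets while the product of the individual frequencies is 0.75 × 0.5, so the ratio is above 1.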
6 Experimental Setup
The data was an early Google crawl of the Stanford Web.
Each night, the data would be streamed