cs345-streams2

More Stream-Mining
  Counting How Many Elements
  Computing “Moments”

Counting Distinct Elements
  Problem: a data stream consists of elements chosen from a set of size n. Maintain a count of the number of distinct elements seen so far.
  Obvious approach: maintain the set of elements seen.
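As an illustration of the obvious approach, here is a minimal Python sketch that keeps every element seen so far in a set. The class name `ExactDistinctCounter` and its methods are hypothetical names chosen for this example; they do not come from the slides.

```python
class ExactDistinctCounter:
    """Exact distinct count: stores every element seen (hypothetical helper)."""

    def __init__(self):
        # Memory grows with the number of distinct elements, up to n.
        self._seen = set()

    def add(self, element):
        # Adding an element that is already present leaves the set unchanged.
        self._seen.add(element)

    def count(self):
        # Number of distinct elements seen so far.
        return len(self._seen)


counter = ExactDistinctCounter()
for word in ["to", "be", "or", "not", "to", "be"]:
    counter.add(word)
print(counter.count())
```

The count is exact, but the set can grow as large as the universe of n elements, which is what makes this approach impractical when storage is limited.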
Applications
  How many different words are found among the Web pages being crawled at a site?
    Unusually low or high numbers could indicate artificial pages (spam?).
  How many different Web pages does each customer request in a week?

Using Small Storage
  Real Problem: what if we do not have space to store the complete set?
  Estimate the count in an unbiased way.
  Accept that the count may be in error, but limit the probability that the error is large.
Flajolet-Martin* Approach
  Pick a hash function h that maps each of the n elements to log_2 n bits, uniformly.
    Important that the hash function be (almost) a random permutation of the elements.
  For each stream element a, let r(a) be the number of trailing 0’s in h(a).
  Record …
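The hashing and trailing-zero steps described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the slides do not name a hash function, so SHA-256 truncated to `num_bits` bits stands in for h here, and the function names `hash_bits`, `trailing_zeros`, and `r` are hypothetical. Treating an all-zero hash value as having `num_bits` trailing zeros is also a convention chosen for this sketch.

```python
import hashlib


def hash_bits(element: str, num_bits: int) -> int:
    """Stand-in for h: map an element to a num_bits-bit integer.

    Assumption: SHA-256 truncated to its low num_bits bits is used as an
    approximately uniform hash; the slides do not specify one.
    """
    digest = hashlib.sha256(element.encode("utf-8")).digest()
    value = int.from_bytes(digest, "big")
    return value & ((1 << num_bits) - 1)


def trailing_zeros(value: int, num_bits: int) -> int:
    """Number of trailing 0 bits in value, viewed as a num_bits-bit string."""
    if value == 0:
        # Convention for this sketch: every bit is a trailing zero.
        return num_bits
    # value & -value isolates the lowest set bit; its position is the count.
    return (value & -value).bit_length() - 1


def r(element: str, num_bits: int) -> int:
    """r(a): trailing zeros in the hash of element a."""
    return trailing_zeros(hash_bits(element, num_bits), num_bits)


for word in ["to", "be", "or", "not"]:
    print(word, r(word, 16))
```

Because the hash is deterministic, repeated occurrences of the same element always yield the same r(a), so duplicates in the stream do not change what is observed.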

This document was uploaded on 01/25/2012.
