cs345-streams2

# cs345-streams2 - More Stream Mining: Computing Moments

1. More Stream-Mining

Counting How Many Elements
Computing “Moments”

2. Counting Distinct Elements

Problem: a data stream consists of elements chosen from a set of size n. Maintain a count of the number of distinct elements seen so far.

Obvious approach: maintain the set of elements seen.

3. Applications

How many different words are found among the Web pages being crawled at a site? Unusually low or high numbers could indicate artificial pages (spam?).

How many different Web pages does each customer request in a week?
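The obvious approach can be sketched in a few lines of Python. This is only an illustration, not code from the slides: the function name `running_distinct_counts` and the sample stream are assumptions made for the example.

```python
from typing import Hashable, Iterable, Iterator


def running_distinct_counts(stream: Iterable[Hashable]) -> Iterator[int]:
    """Yield the number of distinct elements seen after each arrival.

    Hypothetical helper: it keeps every distinct element in memory,
    so its space grows with the number of distinct elements.
    """
    seen = set()
    for element in stream:
        seen.add(element)   # no effect if the element was already seen
        yield len(seen)     # exact distinct count so far


if __name__ == "__main__":
    sample_stream = ["a", "b", "a", "c", "b"]  # made-up example data
    print(list(running_distinct_counts(sample_stream)))  # [1, 2, 2, 3, 3]
```

The count is exact, but the set can grow as large as n elements.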

4. Using Small Storage

Real problem: what if we do not have space to store the complete set?

Estimate the count in an unbiased way.

Accept that the count may be in error, but limit the probability that the error is large.

5. Flajolet-Martin* Approach

Pick a hash function h that maps each of the n elements to $\log_2 n$ bits, uniformly. It is important that the hash function be (almost) a random permutation of the elements.

For each stream element a, let r(a) be the number of trailing 0’s in h(a).
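A minimal sketch of the two ingredients named on this slide, the hash function h and the trailing-zero count r(a), might look as follows. Several choices here are assumptions rather than anything the slides specify: SHA-256 truncated to the low-order bits stands in for h, elements are taken to be strings, an all-zero hash value is treated as having as many trailing zeros as there are bits, and the names `hash_bits`, `trailing_zeros`, and `r` are hypothetical.

```python
import hashlib


def hash_bits(element: str, num_bits: int) -> int:
    """Stand-in for h: map an element to num_bits roughly uniform bits.

    Assumption: SHA-256 truncated to its low-order bits; the slides do
    not name a particular hash function.
    """
    digest = hashlib.sha256(element.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") & ((1 << num_bits) - 1)


def trailing_zeros(value: int, num_bits: int) -> int:
    """Number of trailing 0 bits in value, viewed as a num_bits-bit word."""
    if value == 0:
        return num_bits  # convention chosen here for the all-zero word
    # value & -value isolates the lowest set bit; its position is the count.
    return (value & -value).bit_length() - 1


def r(element: str, n: int) -> int:
    """r(a): trailing zeros in h(a), with h producing about log2(n) bits."""
    num_bits = max(1, (n - 1).bit_length())  # ceil(log2 n) without floats
    return trailing_zeros(hash_bits(element, num_bits), num_bits)


if __name__ == "__main__":
    for word in ["apple", "banana", "cherry"]:  # made-up stream elements
        print(word, r(word, n=1024))
```

With a uniform hash, about half of the elements have r(a) ≥ 1, a quarter have r(a) ≥ 2, and so on, since each additional trailing zero halves the probability.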
