2020_High_Performance_Python_-or_Humans_O'Reilly_Media,_Inc. 593.pdf

This preview shows page 1 out of 1 page.

K-Minimum Values In the Morris counter, we lose any sort of information about the items we insert. That is to say, the counter’s internal state is the same whether we do .add("micha") or .add("ian") . This extra information is useful and, if used properly, could help us have our counters count only unique items. In this way, calling .add("micha") thousands of times would increase the counter only once. To implement this behavior, we will exploit properties of hashing func‐ tions (see “Hash Functions and Entropy” for a more in-depth discussion of hash functions). The main property we would like to take advantage of is the fact that the hash function takes input and uniformly distributes it. For example, let’s assume we have a hash function that takes in a string and outputs a number between 0 and 1. For that function to be uniform means that when we feed it in a string, we are equally likely to get a value of 0.5 as a value of 0.2 or any other value. This also means that if we feed it in

  • Left Quote Icon

    Student Picture

  • Left Quote Icon

    Student Picture

  • Left Quote Icon

    Student Picture