lec31 - CS 70 Fall 2006 Discrete Mathematics for CS...

CS 70 Discrete Mathematics for CS
Fall 2006, Papadimitriou & Vazirani
Lecture 31

Two Killer Applications

In this lecture, we will see two “killer apps” of elementary probability in Computer Science.

1. Suppose a hash function distributes keys evenly over a table of size n. How many (randomly chosen) keys can we hash before the probability of a collision exceeds (say) 1/2?

2. Consider the following simple load balancing scenario. We are given m jobs and n machines; we allocate each job to a machine uniformly at random and independently of all other jobs. What is a likely value for the maximum load on any machine?

As we shall see, both of these questions can be tackled by an analysis of the balls-and-bins probability space which we have already encountered.

Application 1: Hash functions

As you may recall, a hash table is a data structure that supports the storage of sets of keys from a (large) universe U (say, the names of all 250m people in the US). The operations supported are ADDing a key to the set, DELETEing a key from the set, and testing MEMBERship of a key in the set. The hash function h maps U to a table T of modest size. To ADD a key x to our set, we evaluate h(x) (i.e., apply the hash function to the key) and store x at the location h(x) in the table T. All keys in our set that are mapped to the same table location are stored in a simple linked list. The operations DELETE and MEMBER are implemented in similar fashion, by evaluating h(x) and searching the linked list at h(x).

Note that the efficiency of a hash function depends on having only few collisions, i.e., keys that map to the same location. This is because the search time for DELETE and MEMBER operations is proportional to the length of the corresponding linked list.

The question we are interested in here is the following: suppose our hash table T has size n, and that our hash function h distributes U evenly over T.
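As a concrete reference point, the ADD, DELETE, and MEMBER operations described above might be sketched as follows. This is a minimal illustration under stated assumptions, not the notes' own code: the class name ChainedHashTable and its method names are hypothetical, Python's built-in hash reduced modulo n stands in for h, and a Python list stands in for each linked list.

```python
class ChainedHashTable:
    """Set of keys stored in a table T of n slots, with one chain per slot."""

    def __init__(self, n):
        self.n = n
        # T: n initially empty chains (Python lists standing in for linked lists).
        self.table = [[] for _ in range(n)]

    def h(self, x):
        # Stand-in hash function (an assumption): built-in hash reduced mod n.
        return hash(x) % self.n

    def add(self, x):
        chain = self.table[self.h(x)]
        if x not in chain:          # keep the stored set free of duplicates
            chain.append(x)

    def delete(self, x):
        chain = self.table[self.h(x)]
        if x in chain:              # linear search of the chain at h(x)
            chain.remove(x)

    def member(self, x):
        # Cost is proportional to the length of the chain at h(x).
        return x in self.table[self.h(x)]

    def longest_chain(self):
        return max(len(chain) for chain in self.table)
```

Every operation touches only the chain at h(x), which is why long chains (many collisions) are what slow DELETE and MEMBER down.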
Assume that the keys we want to store are chosen uniformly at random and independently from the universe U. What is the largest number, m, of keys we can store before the probability of a collision reaches 1/2?

Let’s begin by seeing how this problem can be put into the balls and bins framework. The balls will be the m keys to be stored, and the bins will be the n locations in the hash table T. Since the keys are chosen uniformly and independently from U, and since the hash function distributes keys evenly over the table, we can see each key (ball) as choosing a hash table location (bin) uniformly and independently from T. Thus the probability space corresponding to this hashing experiment is exactly the same as the balls and bins space.

We are interested in the event A that there is no collision, or equivalently, that all m balls land in different bins. Clearly Pr[A] will decrease as m increases (with n fixed). Our goal is to find the largest value of m

Footnote 1 (on “h distributes U evenly over T”): I.e., |U| = αn (the size of U is an integer multiple α of the size of T), and for each y ∈ T, the number of keys x ∈ U for which h(x) = y is exactly α.
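To make the event A concrete, here is a small numerical sketch. It assumes the standard balls-and-bins product formula Pr[A] = (1 − 1/n)(1 − 2/n)···(1 − (m−1)/n), which the text above does not state, and it reads "before the probability of a collision reaches 1/2" as "Pr[A] is still at least 1/2". The function names are hypothetical.

```python
def prob_no_collision(m, n):
    """Pr[A]: m balls thrown uniformly and independently into n bins all land in different bins."""
    p = 1.0
    for i in range(m):
        p *= (n - i) / n    # ball i+1 must avoid the i bins already occupied
    return p


def largest_m(n, threshold=0.5):
    """Largest m for which Pr[A] is still at least the threshold."""
    p = 1.0                 # Pr[A] for the current m (m = 0: no balls, no collision)
    m = 0
    while True:
        p_next = p * (n - m) / n    # Pr[A] if one more ball is thrown
        if p_next < threshold:
            return m
        p = p_next
        m += 1
```

For example, largest_m(365) returns 22: with n = 365, Pr[A] is about 0.524 for m = 22 and drops to about 0.493 for m = 23.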