Introduction to Algorithms
Massachusetts Institute of Technology
Singapore-MIT Alliance
Professors Erik Demaine, Lee Wee Sun, and Charles E. Leiserson
Day 14
6.046J/18.410J
SMA5503
Handout 18

Problem Set 4 Solutions

MIT students: This problem set is due in recitation on Day 13.

Reading: Chapters 10 and 11

Both exercises and problems should be solved, but only the problems should be turned in. Exercises are intended to help you master the course material. Even though you should not turn in the exercise solutions, you are responsible for material covered by the exercises.

Mark the top of each sheet with your name, the course number, the problem number, your recitation instructor and time, the date, and the names of any students with whom you collaborated.

MIT students: Each problem should be done on a separate sheet (or sheets) of three-hole punched paper.

You will often be called upon to "give an algorithm" to solve a certain problem. Your write-up should take the form of a short essay. A topic paragraph should summarize the problem you are solving and what your results are. The body of your essay should provide the following:

1. A description of the algorithm in English and, if helpful, pseudocode.
2. At least one worked example or diagram to show more precisely how your algorithm works.
3. A proof (or indication) of the correctness of the algorithm.
4. An analysis of the running time of the algorithm.

Remember, your goal is to communicate. Graders will be instructed to take off points for convoluted and obtuse descriptions.

Exercise 4-1. Do exercise 10.1-6 on page 204 of CLRS.

Solution: Use stack A for ENQUEUE operations and stack B for DEQUEUE operations. Simulate ENQUEUE by pushing the new element onto stack A. Simulate DEQUEUE by popping the top element from stack B; if stack B is empty when a DEQUEUE is requested, first empty stack A into stack B by popping elements one at a time from stack A and pushing them onto stack B. (Note that copying the stack reverses its order, so the oldest element is then on top and can be removed with DEQUEUE.)

ENQUEUE takes time $O(1)$.
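
The two-stack construction described above can be sketched in Python as follows. This is a minimal illustration, not the handout's own code: the class name `TwoStackQueue` and its method names are hypothetical, and Python lists stand in for the stacks.

```python
class TwoStackQueue:
    """FIFO queue built from two LIFO stacks (Python lists used as stacks)."""

    def __init__(self):
        self.a = []  # stack A: receives every enqueued element
        self.b = []  # stack B: serves dequeues, oldest element on top

    def enqueue(self, x):
        self.a.append(x)  # push onto A

    def dequeue(self):
        if not self.b:
            # B is empty: move everything from A to B, which reverses the order
            while self.a:
                self.b.append(self.a.pop())
        if not self.b:
            raise IndexError("dequeue from empty queue")
        return self.b.pop()


q = TwoStackQueue()
for item in (1, 2, 3):
    q.enqueue(item)
first = q.dequeue()  # B was empty, so this call performs the A-to-B transfer
q.enqueue(4)         # lands on A while 2 and 3 still wait on B
rest = [q.dequeue(), q.dequeue(), q.dequeue()]
```

Only a dequeue that finds B empty pays for a transfer; every other operation is a single push or pop.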

DEQUEUE also takes time $O(1)$ whenever stack B is nonempty, but when B is empty it may have to transfer $n$ elements from one stack to the other, so its worst-case time is $O(n)$. However, we can amortize the cost of the transfers over all of the ENQUEUE and DEQUEUE operations: each element is pushed onto A once, moved from A to B at most once, and popped from B at most once, so the amortized cost of DEQUEUE is $O(1)$.

Exercise 4-2. Do exercise 10.2-4 on page 208 of CLRS.

Solution: Store the value you're looking for in the sentinel. The search loop then needs only the key test; if it stops at the sentinel, the key was not in the list.

Exercise 4-3. Do exercise 10.3-4 on page 213 of CLRS.

Solution: Use a variable $m$ that indicates the number of elements currently in the list, and always keep the list in array locations 1 through $m$. Allocating a new object is accomplished by using array entry $A[m + 1]$. Whenever an object that occupies one of the locations 1 through $m$ is freed, take the object at location $m$ and move it there (updating the pointers of its list neighbors to the new location), thus freeing entry $m$. This way we guarantee that the list is always compact.

Exercise 4-4. Suppose we hash elements of a set $U$ of keys into $m$ slots. Show that if $|U| > (n - 1)m$, there is a subset of $U$ of size $n$ consisting of keys that all hash to the same slot, so that the worst-case searching time for hashing with chaining is $\Theta(n)$.

Solution: Mapping $(n - 1)m + 1$ keys into a table of size $m$ must result in at least one slot with $n$ keys or more (pigeonhole principle): if each slot held at most $n - 1$ keys, there would be at most $(n - 1)m$ keys in total. The $((n - 1)m + 1)$th key would have to go in some slot which already had $n - 1$ keys. Therefore, the worst-case searching time for hashing with chaining is $\Theta(n)$.

Exercise 4-5. Do exercise 11.3-3 on page 236 of CLRS.

Solution: All permutations can be generated by a sequence of two-character interchanges. Thus it suffices to show that if two arbitrary characters $i$ and $j$ are switched, then the values hash to the same place.
Now consider two numbers $x$ and $y$ which have characters $i$ and $j$ interchanged, so that $y_i = x_j$ and $y_j = x_i$. W.l.o.g., say $i > j$. Then

\begin{align*}
x - y &= (x_i - y_i)(m + 1)^{i-1} + (x_j - y_j)(m + 1)^{j-1} \\
      &= (x_i - x_j)(m + 1)^{i-1} - (x_i - x_j)(m + 1)^{j-1} \\
      &= (x_i - x_j)\left((m + 1)^{i-1} - (m + 1)^{j-1}\right) \\
      &= (x_i - x_j)(m + 1)^{j-1}\left((m + 1)^{i-j} - 1\right) \\
      &= (x_i - x_j)(m + 1)^{j-1}\left((m + 1) - 1\right)\sum_{k=0}^{i-j-1} (m + 1)^k \\
      &\equiv 0 \pmod{m}.
\end{align*}

Problem 4-1. Comparisons among dynamic sets

For each type of dynamic set in the following table, what is the asymptotic running time for each operation listed, in terms of the number of elements $n$? For operations that have not been explicitly defined, consider how you would implement the operation given the data structure. You do not need to give the algorithm, just the running time. State any assumptions that you make. Assume that the hash tables resolve collisions by chaining with doubly linked lists.

                   unsorted singly   sorted doubly    min-heap,     hash table,    hash table,
                   linked list,      linked list,     worst-case    worst-case     average-case
                   worst-case        worst-case
SEARCH(L, k)
INSERT(L, x)
DELETE(L, x)
SUCCESSOR(L, x)
MINIMUM(L)
MAXIMUM(L)

Solution:

                   unsorted singly   sorted doubly    min-heap,     hash table,    hash table,
                   linked list,      linked list,     worst-case    worst-case     average-case
                   worst-case        worst-case
SEARCH(L, k)       O(n)              O(n)             O(n)          O(n)           O(1)
INSERT(L, x)       O(1)              O(n)             O(lg n)       O(1) or O(n)   O(1)
DELETE(L, x)       O(n)              O(1)             O(lg n)       O(1)           O(1)
SUCCESSOR(L, x)    O(n)              O(1)             O(n)          O(n)           O(n)
MINIMUM(L)         O(n)              O(1)             O(1)          O(n)           O(n)
MAXIMUM(L)         O(n)              O(n) or O(1)     O(n)          O(n)           O(n)

Doubly linked sorted lists can find the maximum in constant time if they maintain a tail attribute or are circular; otherwise, they must scan through the entire list to find the end, taking $O(n)$.

Inserting into a hash table takes worst-case $O(n)$ if you want to ensure there are no duplicate entries, because you have to do a search first. Otherwise, it's $O(1)$.

Problem 4-2. k-universal hashing and authentication

Let $\mathcal{H}$ be a class of hash functions in which each hash function $h \in \mathcal{H}$ maps the universe $U$ of keys to $\{0, 1, \ldots, m - 1\}$. We say that $\mathcal{H}$ is $k$-universal if, for every fixed sequence of $k$ distinct keys $x^{(1)}, x^{(2)}, \ldots, x^{(k)}$ and for any $h$ chosen at random from $\mathcal{H}$, the sequence $h(x^{(1)}), h(x^{(2)}), \ldots$
, $h(x^{(k)})$ is equally likely to be any of the $m^k$ sequences of length $k$ with elements drawn from $\{0, 1, \ldots, m - 1\}$.

(a) Show that if the family $\mathcal{H}$ of hash functions is 2-universal, then it is universal.

Solution: If $\mathcal{H}$ is 2-universal, then for any two fixed keys $x \ne y$, the sequence $\langle h(x), h(y) \rangle$ is equally likely to be any sequence in $\{0, 1, \ldots, m - 1\}^2$. Exactly $m$ of these $m^2$ sequences have $h(x) = h(y)$. Therefore, as $h$ varies over $\mathcal{H}$, the number of functions under which $x$ and $y$ collide ($h(x) = h(y)$) is $(1/m)|\mathcal{H}|$, and $\mathcal{H}$ is universal.

(b) Suppose that the universe $U$ is the set of $n$-tuples of values drawn from $\mathbb{Z}_p = \{0, 1, \ldots, p - 1\}$, where $p$ is prime. Consider an element $x = \langle x_0, x_1, \ldots, x_{n-1} \rangle \in U$. For any $n$-tuple $a = \langle a_0, a_1, \ldots, a_{n-1} \rangle \in U$, define the hash function $h_a$ by

\[ h_a(x) = \left( \sum_{j=0}^{n-1} a_j x_j \right) \bmod p . \]

Let $\mathcal{H} = \{h_a\}$. This is the family of hash functions shown in lecture to be universal. Show that $\mathcal{H}$ is not 2-universal. (Hint: Find a key for which all hash functions in $\mathcal{H}$ produce the same value.)

Solution: Suppose we take $x = \langle 0, 0, \ldots, 0 \rangle$ and some fixed $y \in U$ with $y \ne x$. Then for any $a \in U$, $\langle h_a(x), h_a(y) \rangle = \langle 0, h_a(y) \rangle$. This shows that the class $\mathcal{H}$ is not 2-universal, since not all sequences $\langle h_a(x), h_a(y) \rangle$ are equally likely to occur.

(c) Suppose that we modify $\mathcal{H}$ slightly from part (b): For any $a \in U$ and for any $b \in \mathbb{Z}_p$, define

\[ h_{a,b}(x) = \left( \sum_{j=0}^{n-1} a_j x_j + b \right) \bmod p \]

and $\mathcal{H} = \{h_{a,b}\}$. Argue that $\mathcal{H}$ is 2-universal. (Hint: Consider fixed $x \in U$ and $y \in U$, with $x_i \ne y_i$ for some $i$. What happens to $h_{a,b}(x)$ and $h_{a,b}(y)$ as $a_i$ and $b$ range over $\mathbb{Z}_p$?)

Solution: For each key pair $x, y$ with $x \ne y$, we wish to show that all value pairs $\langle h_{a,b}(x), h_{a,b}(y) \rangle$ are equally likely to occur when $h$ is chosen randomly from $\mathcal{H}$, that is, when $a_0, a_1, \ldots, a_{n-1}$ and $b$ are chosen randomly. If $x \ne y$, we must have $x_i \ne y_i$ for some $i$. Assume w.l.o.g. that $i = 0$. We define $\alpha = h_{a,b}(x)$, $\beta = h_{a,b}(y)$, and

\[ X = \sum_{j=1}^{n-1} a_j x_j , \qquad Y = \sum_{j=1}^{n-1} a_j y_j . \]

This gives us the equations

\[ \alpha = (a_0 x_0 + b + X) \bmod p \qquad \text{and} \qquad \beta = (a_0 y_0 + b + Y) \bmod p . \]
Since $x_0 \ne y_0$ and $p$ is prime, there is a unique solution to the above equations for $a_0$ and $b$ in terms of the hash values $\alpha$ and $\beta$. To see this more explicitly, consider that if we want to be able to generate all possible pairs $\langle \alpha, \beta \rangle$, then it is sufficient to be able to independently control $\alpha$ and $\alpha - \beta$. We have

\[ \alpha - \beta \equiv a_0 (x_0 - y_0) + X - Y \pmod{p} , \]

so

\[ a_0 = (\alpha - \beta - X + Y)(x_0 - y_0)^{-1} \bmod p . \]

We know that $(x_0 - y_0)^{-1}$ exists and is unique because $x_0 \ne y_0$ and $p$ is prime. So we may make $\alpha - \beta$ whatever we want by choosing a particular $a_0$. Having done so, we may make $\alpha$ whatever we want by choosing a particular $b$: this simply applies an identical offset to both $\alpha$ and $\beta$, leaving their difference mod $p$ the same.

Therefore, for any given $a_1, \ldots, a_{n-1}$, we may find a hash function $h_{a,b}$ which generates any possible $\langle \alpha, \beta \rangle$ by choosing the right $a_0$ and $b$. Since there are $p^2$ possible choices for $a_0$ and $b$, and also $p^2$ possible values for $\langle \alpha, \beta \rangle$, each $\langle \alpha, \beta \rangle$ is generated by exactly one choice of $a_0$ and $b$. This is true for all $p^{n-1}$ choices of $a_1, \ldots, a_{n-1}$. Therefore, there are exactly $p^{n-1}$ functions $h_{a,b}$ which generate each value pair $\langle \alpha, \beta \rangle$. All value pairs are then equally likely when $h_{a,b}$ is chosen randomly, so $\mathcal{H}$ is 2-universal.
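
The counting argument for part (c) can be checked by brute force on a tiny instance. The sketch below is only an illustration under assumed values: $p = 5$, $n = 2$, and the keys are arbitrary choices, not taken from the handout, and the helper names `h` and `pair_counts` are hypothetical. It enumerates every function in the family and tallies the value pairs produced for one fixed pair of distinct keys, then repeats the experiment without the offset $b$ to reproduce the failure from part (b).

```python
from collections import Counter
from itertools import product


def h(a, b, x, p):
    """h_{a,b}(x) = (sum_j a_j * x_j + b) mod p."""
    return (sum(aj * xj for aj, xj in zip(a, x)) + b) % p


def pair_counts(x, y, p, with_offset=True):
    """Count how many functions in the family map (x, y) to each value pair."""
    n = len(x)
    offsets = range(p) if with_offset else [0]  # b = 0 only gives the part (b) family
    counts = Counter()
    for a in product(range(p), repeat=n):
        for b in offsets:
            counts[(h(a, b, x, p), h(a, b, y, p))] += 1
    return counts


p, n = 5, 2            # small illustrative values (assumed, not from the handout)
x, y = (1, 2), (3, 2)  # two distinct keys that differ in coordinate 0

with_b = pair_counts(x, y, p)
# 2-universality: all p^2 value pairs occur, each for exactly p^(n-1) functions.
uniform = len(with_b) == p * p and set(with_b.values()) == {p ** (n - 1)}

# Without the offset b, the all-zero key hashes to 0 under every function.
without_b = pair_counts((0, 0), y, p, with_offset=False)
first_values = {alpha for alpha, _ in without_b}
```

Exhaustive enumeration is feasible only because the family has $p^{n+1} = 125$ members here; it confirms the claim for one instance and does not replace the proof.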

(d) Suppose that Alice and Bob secretly agree on a hash function $h$ from a 2-universal family $\mathcal{H}$ of hash functions. Each $h \in \mathcal{H}$ maps from a universe of keys $U$ to $\mathbb{Z}_p$, where $p$ is prime. Later, Alice sends a message $m$ to Bob over the Internet, where $m \in U$. She authenticates this message to Bob by also sending an authentication tag $t = h(m)$, and Bob checks that the pair $(m, t)$ he receives satisfies $t = h(m)$. Suppose that an adversary intercepts $(m, t)$ en route and tries to fool Bob by replacing the pair with a different pair $(m', t')$. Argue that the probability that the adversary succeeds in fooling Bob into accepting $(m', t')$ is at most $1/p$, no matter how much computing power the adversary has, even if the adversary knows the family $\mathcal{H}$ of hash functions used.

Solution: If $m' = m$, then $t' \ne t = h(m)$ and Bob rejects, so assume $m' \ne m$. For any key pair $m \ne m'$, all hash value pairs $\langle h(m), h(m') \rangle$ are equally likely when $h$ is chosen at random. This is what it means for $\mathcal{H}$ to be 2-universal. In particular, all $p$ pairs $\langle t, h(m') \rangle$ are equally likely. So even if the adversary knows $\mathcal{H}$, seeing $(m, t)$ tells him nothing about $h(m')$. Since the probabilities for the $p$ cases are equal and must sum to 1, all possible values of $h(m')$ have probability $1/p$, and the adversary can do no better than guessing. ...
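
The $1/p$ bound in part (d) can be illustrated by exhaustive enumeration, assuming Alice and Bob use the family $h_{a,b}$ from part (c) as their 2-universal family. That choice, the values $p = 7$ and $n = 2$, the sample messages, and the function name `forgery_success` are all assumptions made for this sketch. It takes the adversary's view: among all keys $(a, b)$ consistent with the intercepted pair, what fraction would accept a given forgery?

```python
from fractions import Fraction
from itertools import product


def h(a, b, x, p):
    """h_{a,b}(x) = (sum_j a_j * x_j + b) mod p."""
    return (sum(aj * xj for aj, xj in zip(a, x)) + b) % p


def forgery_success(m, t, m_forged, t_forged, p):
    """Fraction of keys consistent with the observed (m, t) that accept the forgery."""
    n = len(m)
    consistent = accepted = 0
    for a in product(range(p), repeat=n):
        for b in range(p):
            if h(a, b, m, p) == t:  # key agrees with what the adversary saw
                consistent += 1
                if h(a, b, m_forged, p) == t_forged:
                    accepted += 1
    return Fraction(accepted, consistent)


p = 7                # small prime, chosen for illustration
m, t = (2, 5), 3     # intercepted message and tag (hypothetical values)
m_forged = (4, 5)    # a different message the adversary wants accepted

# Try every possible forged tag; none does better than a blind guess.
rates = [forgery_success(m, t, m_forged, guess, p) for guess in range(p)]
best = max(rates)
```

Every candidate tag succeeds for the same fraction of consistent keys, so the best the adversary can do here is `Fraction(1, 7)`, matching the $1/p$ bound.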
Fall '04
Piotr Indyk and Charles E. Leiserson