Introduction to Algorithms
Massachusetts Institute of Technology, Singapore-MIT Alliance
Professors Erik Demaine, Lee Wee Sun, and Charles E. Leiserson
6.046J/18.410J SMA5503, Day 28, Handout 27

Problem Set 7 Solutions
MIT students: This problem set is due in lecture on Day 26.

Reading: Chapter 17

Both exercises and problems should be solved, but only the problems should be turned in. Exercises are intended to help you master the course material. Even though you should not turn in the exercise solutions, you are responsible for material covered by the exercises.

Mark the top of each sheet with your name, the course number, the problem number, your recitation instructor and time, the date, and the names of any students with whom you collaborated.

MIT students: Each problem should be done on a separate sheet (or sheets) of three-hole punched paper.

You will often be called upon to "give an algorithm" to solve a certain problem. Your write-up should take the form of a short essay. A topic paragraph should summarize the problem you are solving and what your results are. The body of your essay should provide the following:

1. A description of the algorithm in English and, if helpful, pseudocode.
2. At least one worked example or diagram to show more precisely how your algorithm works.
3. A proof (or indication) of the correctness of the algorithm.
4. An analysis of the running time of the algorithm.

Remember, your goal is to communicate. Graders will be instructed to take off points for convoluted and obtuse descriptions.

Exercise 7-1. Do exercise 17.1-1 on page 409 of CLRS.

Solution: No, a sequence of MULTIPUSH operations could make the amortized bound $\Theta(k)$, where $k$ is the number of items pushed by each MULTIPUSH.

Exercise 7-2. Do exercise 17.3-4 on page 416 of CLRS.

Solution: The total cost of executing $n$ stack operations, assuming the stack begins with $s_0$ objects and finishes with $s_n$ objects, is bounded by $2n + s_0 - s_n$. This follows by taking the potential to be the number of objects on the stack: the amortized costs sum to at most $2n$, and the total actual cost equals the total amortized cost plus $\Phi(D_0) - \Phi(D_n) = s_0 - s_n$.

Exercise 7-3. Do exercise 17.3-7 on page 416 of CLRS.

Solution: Use an unsorted array, so INSERT takes $O(1)$ worst-case time. For DELETE-LARGER-HALF, use the linear-time median algorithm to find the median, then use PARTITION to partition the array around the median, then delete the larger side of the partition in $O(n)$ time.

For the amortized analysis, insert each item with 2 tokens on it. When you perform a DELETE-LARGER-HALF operation, each item in the list pays 1 token for the operation. When you delete the larger half, the tokens on the deleted items are redistributed to the remaining items. If each item on the list starts with 2 tokens, each has one left after the median finding, and then each item in the deleted half gives its token to one of the remaining items. Thus, there are always two tokens per item, and we get constant amortized time.

Exercise 7-4. Do exercise 17.4-1 on page 424 of CLRS.

Solution: To keep insertion time reasonable. Insertion into a dynamic open-address hash table can be made to run in $O(1)$ expected amortized time by expanding the table when the load factor $\alpha$ reaches a fixed threshold below 1 and contracting it when $\alpha$ falls below a smaller fixed threshold.

Problem 7-1. Reducing the space in the van Emde Boas structure

In this problem, we will use hashing to modify the van Emde Boas data structure presented in lecture in order to reduce its space usage.

Recall the problem statement: In the fixed-universe successor problem, a data structure must maintain a dynamic subset $S$ of the universe $U = \{0, 1, \ldots, u-1\}$. The data structure must support the operations of inserting elements into $S$, deleting elements from $S$, finding the successor (next element in $S$) from any element in $U$, and finding the predecessor (previous element in $S$) from any element in $U$.

Recall the outline of the van Emde Boas data structure: The universe $U$ is represented by a widget of size $u$. Each widget $W$ of size $|W|$ stores an array sub[W] of $\sqrt{|W|}$ recursive subwidgets sub[W][0], sub[W][1], ..., sub[W][$\sqrt{|W|} - 1$], each of size $\sqrt{|W|}$. In addition, each widget $W$ stores a summary widget summary[W] of size $\sqrt{|W|}$, representing which subwidgets are nonempty. Each widget $W$ also stores its minimum element min[W] separately from all the subwidgets. Finally, each widget $W$ maintains the value max[W] of its maximum element.

For reference, the van Emde Boas algorithms for insertion and finding successors in $O(\lg \lg u)$ time are given as follows.
For any widget $W$, and for any $x$ in the universe of possible elements in $W$, define high(x) and low(x) to be nonnegative integers so that $x = \mathrm{high}(x) \sqrt{|W|} + \mathrm{low}(x)$. Thus, high(x) and low(x) are both less than $\sqrt{|W|}$, and represent the high-order and low-order halves of the bits in the binary representation of $x$.
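The definition of high and low can be sketched in a few lines of Python. This is only an illustration: the function names and the explicit `size` parameter are assumptions made here, and the sketch assumes the widget size is a perfect square (as it is when $u = 2^{2^k}$).

```python
import math

def high(x: int, size: int) -> int:
    """High-order half of x: index of the subwidget that holds x."""
    return x // math.isqrt(size)

def low(x: int, size: int) -> int:
    """Low-order half of x: position of x inside its subwidget."""
    return x % math.isqrt(size)

def combine(hi: int, lo: int, size: int) -> int:
    """Inverse of the split: x = high(x) * sqrt(size) + low(x)."""
    return hi * math.isqrt(size) + lo

size = 256                 # widget over the universe {0, ..., 255}
x = 0b10110110             # 182
print(high(x, size), low(x, size))   # prints: 11 6  (0b1011 and 0b0110)
```

Both halves are smaller than $\sqrt{|W|}$, which is what lets a subwidget of size $\sqrt{|W|}$ store low(x) and the summary widget store high(x).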
VEB-INSERT(x, W)
    if x < min[W]
        then exchange x ↔ min[W]
    if subwidget sub[W][high(x)] is nonempty, that is, min[sub[W][high(x)]] ≠ NIL
        then VEB-INSERT(low(x), sub[W][high(x)])
        else min[sub[W][high(x)]] ← low(x)
             VEB-INSERT(high(x), summary[W])
    if x > max[W]
        then max[W] ← x

VEB-SUCCESSOR(x, W)
    if x < min[W]
        then return min[W]
    if low(x) < max[sub[W][high(x)]]
        then j ← VEB-SUCCESSOR(low(x), sub[W][high(x)])
             return high(x) · √|W| + j
        else i ← VEB-SUCCESSOR(high(x), summary[W])
             return i · √|W| + min[sub[W][i]]

(a) Argue that the van Emde Boas data structure uses $\Theta(u)$ space. (Hint: Derive a recurrence for the space $S(u)$ occupied by a widget of size $u$.)

Solution: The space occupied by the data structure is given by the recurrence

$$S(u) = (\sqrt{u} + 1)\, S(\sqrt{u}) + \Theta(\sqrt{u}),$$

because in each widget there are $\sqrt{u}$ recursive subwidgets, 1 recursive summary widget, and an array of size $\sqrt{u}$.

First we prove that $S(u) = O(u)$ by the substitution method. Assume by induction that $S(k) \le c_1 k - c_2$ for all $k < u$, and let $c_3 \sqrt{u}$ bound the $\Theta(\sqrt{u})$ term from above. Then

$$S(u) \le (\sqrt{u} + 1)(c_1 \sqrt{u} - c_2) + c_3 \sqrt{u} = c_1 u - c_2 - (c_2 - c_1 - c_3)\sqrt{u} \le c_1 u - c_2,$$

provided that $c_2$ is chosen large enough (namely $c_2 \ge c_1 + c_3$). The constant $c_1$ must be chosen large enough to satisfy the base case.

Second we prove that $S(u) = \Omega(u)$ by the substitution method. Assume by induction that $S(k) \ge c k$ for all $k < u$. Then

$$S(u) \ge (\sqrt{u} + 1)\, c \sqrt{u} = c u + c \sqrt{u} \ge c u.$$

The constant $c$ must be chosen small enough to satisfy the base case.

Consider the following modifications to the van Emde Boas data structure.

1. Empty widgets are represented by the value NIL instead of being explicitly represented by a recursive construction.
2. The structure sub[W] containing the subwidgets is stored as a dynamic hash table (as in Section 17.4 of CLRS) instead of an array. The key of a subwidget sub[W][i] is $i$, so we can quickly find the $i$th subwidget sub[W][i] by a single search in the hash table sub[W].
3. As a consequence of the first two modifications, the hash table sub[W] only stores the nonempty subwidgets. The NIL values of the empty subwidgets are not even stored in the hash table. Thus, the space occupied by the hash table sub[W] is proportional to the number of nonempty subwidgets of $W$.

Whenever we insert an element into an empty (NIL) widget, we create a widget using the following procedure, which runs in $O(1)$ time:

CREATE-WIDGET(x)    ▷ Returns a new widget containing just the element x.
    allocate a widget structure W
    min[W] ← x
    max[W] ← x
    summary[W] ← NIL
    sub[W] ← a new empty dynamic hash table
    return W

In the next two problem parts, you will develop the insertion and successor operations for this modified van Emde Boas structure. It suffices to describe the necessary changes from the VEB-INSERT and VEB-SUCCESSOR operations detailed above. In any case, you should give special attention to the interaction with the hash table sub[W].
(b) Give an efficient algorithm for inserting an element into the modified van Emde Boas structure, using CREATE-WIDGET as a subroutine.

Solution: The algorithm is similar to VEB-INSERT. One main change is that the two cases are distinguished by testing whether a particular key is stored in the hash table sub[W]. A second main change is that when the key is not in the hash table, a new subwidget is created using CREATE-WIDGET. We summarize with the pseudocode:

MODIFIED-INSERT(x, W)
    if W = NIL
        then W ← CREATE-WIDGET(x)
        else if x < min[W]
                 then exchange x ↔ min[W]
             if the hash table sub[W] has an entry for key high(x)
                 then MODIFIED-INSERT(low(x), sub[W][high(x)])
                 else W' ← CREATE-WIDGET(low(x))
                      insert W' into hash table sub[W] with key high(x)    ▷ Sets sub[W][high(x)]
                      MODIFIED-INSERT(high(x), summary[W])
             if x > max[W]
                 then max[W] ← x

(c) Give an efficient algorithm for finding the successor of an element in the modified van Emde Boas structure.

Solution: The algorithm is identical to VEB-SUCCESSOR, except that references to sub[W][i] translate into searches in the hash table sub[W] for key $i$.

(d) Using known results, argue that your modified insertion and successor algorithms run in $O(\lg \lg u)$ expected time, under the assumption of simple uniform hashing.

Solution: Each recursive call used to perform $O(1)$ instructions, and now additionally performs $O(1)$ hash-table operations. Thus, under the assumption of simple uniform hashing, the total cost goes up by an expected constant factor from the normal van Emde Boas structure.

(e) Prove that the space occupied by the modified data structure is $O(n)$, where $n$ is the number of elements stored. You may ignore the possibility of deletions, and assume that only insertions and successor operations are performed.

Solution: Each widget by itself (ignoring its subwidgets and summary widget) takes $O(1)$ space. We store a widget only if its min field is occupied by an element. The hash table increases the space by a constant factor (amortizing over the constant cost of each subwidget). Thus the space is $O(n)$.
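The modified insertion and successor operations from parts (b) and (c) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the handout's own code: a Python `dict` stands in for the dynamic hash table, `None` stands in for NIL, the universe size is assumed to be of the form $2^{2^k}$, widgets of size 2 are treated as a base case that keeps only min and max, and a duplicate check is added that the pseudocode does not include. The names `Widget`, `insert`, and `successor` are hypothetical.

```python
import math

class Widget:
    """Widget over {0, ..., size-1}; created holding the single element x."""
    def __init__(self, x, size):
        self.size = size
        self.min = x
        self.max = x
        self.summary = None   # NIL until some subwidget exists
        self.sub = {}         # dict stands in for the dynamic hash table

def insert(w, x, size):
    """Insert x into widget w (None means NIL); return the resulting widget."""
    if w is None:
        return Widget(x, size)              # CREATE-WIDGET
    if x == w.min:
        return w                            # duplicate guard (an addition)
    if x < w.min:
        x, w.min = w.min, x                 # new minimum; push the old one down
    if w.size > 2:                          # size-2 widgets need only min and max
        root = math.isqrt(w.size)
        hi, lo = divmod(x, root)
        if hi in w.sub:                     # hash-table search for key high(x)
            w.sub[hi] = insert(w.sub[hi], lo, root)
        else:
            w.sub[hi] = Widget(lo, root)    # create the subwidget in O(1)
            w.summary = insert(w.summary, hi, root)
    if x > w.max:
        w.max = x
    return w

def successor(w, x):
    """Smallest stored element strictly greater than x, or None."""
    if w is None or x >= w.max:
        return None
    if x < w.min:
        return w.min
    if w.size <= 2:
        return w.max                        # here min <= x < max
    root = math.isqrt(w.size)
    hi, lo = divmod(x, root)
    child = w.sub.get(hi)                   # missing key means empty subwidget
    if child is not None and lo < child.max:
        return hi * root + successor(child, lo)
    nxt = successor(w.summary, hi)          # next nonempty subwidget
    return nxt * root + w.sub[nxt].min

w = None
for v in [5, 2, 7, 14, 3]:
    w = insert(w, v, 16)
print(successor(w, 5))    # prints: 7
```

Only one recursive call is made per level in both functions, so the recursion depth matches the unmodified structure, and only nonempty subwidgets ever appear as keys in `sub`.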
Problem 7-2. The cost of restructuring red-black trees

There are four basic operations on red-black trees that perform structural modifications: node insertions, node deletions, rotations, and color modifications. We have seen that RB-INSERT and RB-DELETE use only $O(1)$ rotations, node insertions, and node deletions to maintain the red-black properties, but they may make many more color modifications.

(a) Describe a legal red-black tree with $n$ nodes such that calling RB-INSERT to add the $(n+1)$st node causes $\Omega(\lg n)$ color modifications. Then describe a legal red-black tree with $n$ nodes for which calling RB-DELETE on a particular node causes $\Omega(\lg n)$ color modifications.

Solution: For RB-INSERT, consider a complete red-black tree with an even number of levels in which nodes at odd levels are black and nodes at even levels are red. When a node is inserted as a child of one of the leaves, $\Omega(\lg n)$ color changes will be needed to fix the colors of nodes on the path from the inserted node to the root.

For RB-DELETE, consider a complete red-black tree in which all nodes are black. If a leaf is deleted, then the "double blackness" will be pushed all the way up to the root, with a color change at each level (case 2 of RB-DELETE-FIXUP), for a total of $\Omega(\lg n)$ color changes.

Although the worst-case number of color modifications per operation can be logarithmic, we shall prove that any sequence of $m$ RB-INSERT and RB-DELETE operations on an initially empty red-black tree causes $O(m)$ structural modifications in the worst case.

(b) Some of the cases handled by the main loop of the code of both RB-INSERT-FIXUP and RB-DELETE-FIXUP are terminating: once encountered, they cause the loop to terminate after a constant number of additional operations. For each of the cases of RB-INSERT-FIXUP and RB-DELETE-FIXUP, specify which are terminating and which are not. (Hint: Look at Figures 13.5, 13.6, and 13.7.)
Solution: All cases except for case 1 of RB-INSERT-FIXUP and case 2 of RB-DELETE-FIXUP are terminating.

We shall first analyze the structural modifications when only insertions are performed. Let $T$ be a red-black tree, and define $\Phi(T)$ to be the number of red nodes in $T$. Assume that 1 unit of potential can pay for the structural modifications performed by any of the three cases of RB-INSERT-FIXUP.

(c) Let $T'$ be the result of applying Case 1 of RB-INSERT-FIXUP to $T$. Argue that $\Phi(T') = \Phi(T) - 1$.

Solution:
Case 1 of RB-INSERT-FIXUP reduces the number of red nodes by one, a fact that can be seen in Figure 13.4 in CLRS. Hence, $\Phi(T') = \Phi(T) - 1$.

(d) Node insertion into a red-black tree using RB-INSERT can be broken down into three parts. List the structural modifications and potential changes resulting from lines 1-16 of RB-INSERT, from nonterminating cases of RB-INSERT-FIXUP, and from terminating cases of RB-INSERT-FIXUP.

Solution: Lines 1-16 of RB-INSERT cause one node insertion and a unit increase in potential. The nonterminating case of RB-INSERT-FIXUP (Case 1) makes three color changes and decreases the potential by one. The terminating cases of RB-INSERT-FIXUP (Cases 2 and 3) cause one rotation each and do not affect the potential.

(e) Using part (d), argue that the amortized number of structural modifications performed by any call of RB-INSERT is $O(1)$.

Solution: The number of structural modifications and the amount of potential change resulting from lines 1-16 of RB-INSERT and from the terminating cases of RB-INSERT-FIXUP are constant, so the amortized cost of these parts is constant. The nonterminating case of RB-INSERT-FIXUP may repeat up to $O(\lg n)$ times, but its amortized cost is 0, since by our assumption the unit decrease in the potential pays for the structural modifications needed. Therefore, the amortized cost of RB-INSERT is constant.

We now wish to prove that there are $O(m)$ structural modifications when there are both insertions and deletions. Let us define, for each node $x$,

$$w(x) = \begin{cases} 0 & \text{if } x \text{ is red,} \\ 1 & \text{if } x \text{ is black and has no red children,} \\ 0 & \text{if } x \text{ is black and has one red child,} \\ 2 & \text{if } x \text{ is black and has two red children.} \end{cases}$$

Now we redefine the potential of a red-black tree $T$ as

$$\Phi(T) = \sum_{x \in T} w(x),$$

and let $T'$ be the tree that results from applying any nonterminating case of RB-INSERT-FIXUP or RB-DELETE-FIXUP to $T$.

(f) Show that $\Phi(T') \le \Phi(T) - 1$ for all nonterminating cases of RB-INSERT-FIXUP. Argue that the amortized number of structural modifications performed by any call of RB-INSERT-FIXUP is $O(1)$.

Solution: From Figure 13.4, we see that Case 1 of RB-INSERT-FIXUP makes the following changes to the tree:

- Changes a black node with two red children to a red node (node $C$), resulting in a potential change of $-2$.
- Changes a red node to a black node with one red child (node $A$ in the top diagram; node $B$ in the bottom diagram), resulting in no potential change.
- Changes a red node to a black node with no red children (node $D$), resulting in a potential change of $+1$.

The total change in potential is $-1$, which pays for the structural modifications performed, and thus the amortized cost of Case 1 (the nonterminating case) is 0. The terminating cases of RB-INSERT-FIXUP cause a constant number of structural changes and a constant change in potential, since $w(x)$ is based solely on node color and the number of color changes caused by terminating cases is constant. The amortized cost of the terminating cases is therefore at most constant. Hence, the overall amortized cost of RB-INSERT-FIXUP is constant.

(g) Show that $\Phi(T') \le \Phi(T) - 1$ for all nonterminating cases of RB-DELETE-FIXUP. Argue that the amortized number of structural modifications performed by any call of RB-DELETE-FIXUP is $O(1)$.

Solution: Figure 13.7 shows that Case 2 of RB-DELETE-FIXUP makes the following changes to the tree:

- Changes a black node with no red children to a red node (node $D$), resulting in a potential change of $-1$.
- If $B$ is red, then it loses a black child, with no effect on potential.
- If $B$ is black, then it goes from having no red children to having one red child, resulting in a potential change of $-1$.

The total change in potential is either $-1$ or $-2$, depending on the color of $B$. In either case, one unit of potential pays for the structural modifications performed, and thus the amortized cost of Case 2 (the nonterminating case) is at most 0. The terminating cases of RB-DELETE-FIXUP cause a constant number of structural changes and a constant change in potential, since $w(x)$ is based solely on node color and the number of color changes caused by terminating cases is constant. The amortized cost of the terminating cases is therefore at most constant. Hence, the overall amortized cost of RB-DELETE-FIXUP is constant.

(h) Complete the proof that in the worst case, any sequence of $m$ RB-INSERT and RB-DELETE operations performs $O(m)$ structural modifications.

Solution: Since the amortized cost of each operation is bounded above by a constant, the actual number of structural modifications for any sequence of $m$ RB-INSERT and RB-DELETE operations on an initially empty red-black tree is $O(m)$ in the worst case.
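The potential function used in parts (f) through (h) can be checked numerically on a small example. The following Python sketch is an illustration under assumptions made here: the `Node` class, the helper names, and the convention that NIL leaves are represented by `None` and contribute nothing to the potential are all hypothetical. It builds the Case 1 configuration of RB-INSERT-FIXUP (black grandparent with red parent and red uncle), applies the recoloring, and compares the potential before and after.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    color: str                        # "red" or "black"
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def weight(x: Node) -> int:
    """w(x): 0 if red; for black nodes 1, 0, or 2 by number of red children."""
    if x.color == "red":
        return 0
    red_children = sum(
        1 for c in (x.left, x.right) if c is not None and c.color == "red"
    )
    return {0: 1, 1: 0, 2: 2}[red_children]

def potential(t: Optional[Node]) -> int:
    """Phi(T): sum of w(x) over all real nodes; None (NIL) contributes 0."""
    if t is None:
        return 0
    return weight(t) + potential(t.left) + potential(t.right)

def recolor_case1(grandparent: Node) -> None:
    """Case 1 recoloring: parent and uncle become black, grandparent red."""
    grandparent.color = "red"
    grandparent.left.color = "black"
    grandparent.right.color = "black"

# Black grandparent c with red parent a, red uncle d, and new red node b under a.
b = Node("red")
a = Node("red", right=b)
d = Node("red")
c = Node("black", left=a, right=d)

before = potential(c)
recolor_case1(c)
after = potential(c)
print(before, after)    # prints: 2 1
```

The drop from 2 to 1 matches the $-2 + 0 + 1 = -1$ accounting in the solution to part (f).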
- Fall '04
- Algorithms, Analysis of algorithms, Red-black tree, Self-balancing binary search tree, Associative array