CSE 101 - ALGORITHMS - SUMMER 2000
Lecture Notes 4
Wednesday, July 12, 2000

3.3.3 Bucket Sort

Like counting sort and radix sort, bucket sort also assumes that the input satisfies some property. More precisely, it assumes that the n elements are uniformly distributed over a certain domain; to simplify the exposition we suppose that they are uniformly distributed over the interval [0, 1). Thus, if we divide the interval [0, 1) into n equal-sized subintervals [0, 1/n), [1/n, 2/n), ..., [(n-1)/n, 1), called buckets, then we expect only a few numbers to fall into each bucket. Let B[0..n-1] be an array of n lists, initially empty. Then the following algorithm runs in linear time on the average:

BUCKET-SORT(A)
1  for i ← 1 to n
2      do insert A[i] at its position into the list B[⌊n · A[i]⌋]
3  concatenate the lists B[0], B[1], ..., B[n-1]

Exercise 3.1 Illustrate the execution of BUCKET-SORT(⟨0.87, 0.21, 0.17, 0.92, 0.81, 0.12, 0.67, 0.52, 0.77, 0.85⟩).

Exercise 3.2 Informally analyze BUCKET-SORT.

3.4 Problems Related to Sorting

This section presents two interesting and useful problems which are, or seem to be, closely related to sorting.

3.4.1 Binary Search

Searching is a common operation on databases. The search problem can be formulated in various ways, depending on the type of information to be returned. We only consider an oversimplified version in this subsection:

SEARCH
INPUT: A sequence A = ⟨a1, a2, ..., an⟩ and an element x.
OUTPUT: If x occurs in A then x, otherwise not found.

The obvious method is to visit each element of A and compare it with x; if such an element is found then return it, otherwise return not found. This algorithm clearly runs in O(n) time, which seems good. Unfortunately, linear algorithms are too slow when billions of items are involved (imagine Yahoo's databases). For that reason, databases are in general organized into efficient data structures, so that operations like search (and many others) run in logarithmic time. We only consider one of the simplest data structures in this subsection, a sorted array. Therefore, let us consider the following more refined problem:

SORTED-SEARCH
INPUT: A sorted sequence A = ⟨a1, a2, ..., an⟩ and an element x.
OUTPUT: If x occurs in A then x, otherwise not found.

We can now use a simple degenerate divide-and-conquer algorithm to solve this problem, called binary search:

BINARY-SEARCH(A, i, k, x)
1  if i > k then return not found
2  j ← ⌊(i + k)/2⌋
3  if x = A[j] then return x
4  if x < A[j] then return BINARY-SEARCH(A, i, j - 1, x)
5  return BINARY-SEARCH(A, j + 1, k, x)

Then BINARY-SEARCH(A, 1, n, x) solves the problem SORTED-SEARCH. From now on in the course, we'll write BINARY-SEARCH(A, x) instead of BINARY-SEARCH(A, 1, n, x) whenever possible.

Exercise 3.3 Illustrate the execution of BINARY-SEARCH(A, 3.14) for A = ⟨0.17, 1, 1.27, 2.2, 2.9, 3.14, 3.9, 4.2, 5⟩. Do the same thing for A = ⟨0.17, 1, 1.27, 2.2, 2.9, 3.13, 3.9, 4.2, 5⟩.

Exercise 3.4 Why is the input array of BINARY-SEARCH required to be sorted?

Exercise 3.5 Write the recurrence for BINARY-SEARCH and then show that its running time is O(log n).
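To make the procedures in this section concrete, here is a minimal Python sketch of BUCKET-SORT and BINARY-SEARCH, written under the same assumptions as the notes (bucket sort receives keys uniformly distributed in [0, 1); binary search receives a sorted array). The function names and the 0-based indexing are illustrative, not part of the notes.

import math

def bucket_sort(a):
    # Sort keys assumed to be uniformly distributed in [0, 1).
    n = len(a)
    buckets = [[] for _ in range(n)]           # B[0..n-1], initially empty lists
    for x in a:
        buckets[math.floor(n * x)].append(x)   # place x into bucket floor(n * x)
    result = []
    for b in buckets:
        b.sort()                               # keep each (small) bucket sorted
        result.extend(b)                       # concatenate B[0], B[1], ..., B[n-1]
    return result

def binary_search(a, x, i=0, k=None):
    # Return x if it occurs in the sorted list a, otherwise None ("not found").
    if k is None:
        k = len(a) - 1
    if i > k:
        return None
    j = (i + k) // 2
    if x == a[j]:
        return x
    if x < a[j]:
        return binary_search(a, x, i, j - 1)
    return binary_search(a, x, j + 1, k)

print(bucket_sort([0.87, 0.21, 0.17, 0.92, 0.81, 0.12, 0.67, 0.52, 0.77, 0.85]))
print(binary_search([0.17, 1, 1.27, 2.2, 2.9, 3.14, 3.9, 4.2, 5], 3.14))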
3.4.2 Medians and Order Statistics

3.5 Exercises with Solutions

Exercise 3.6 Let A = ⟨3, 9, 5, 3, 1, 4, 8, 7⟩. Illustrate the execution of

1. INSERTION-SORT(A);
2. SELECTION-SORT(A);
3. QUICK-SORT(A, 1, 8); do not illustrate the execution of PARTITION;
4. MERGE-SORT(A, 1, 8); do not illustrate the execution of MERGE;
5. HEAP-SORT(A); do not illustrate the executions of BUILD-HEAP and HEAPIFY;
6. COUNTING-SORT(A, 9).

Proof: We only show how the array A is modified. This is enough to show that you understood the sorting algorithms. There were many different solutions in your quizzes; I considered them all correct if it was clear that you understood the algorithms.

1. INSERTION-SORT
⟨3, 9, 5, 3, 1, 4, 8, 7⟩
⟨3, 9, 5, 3, 1, 4, 8, 7⟩
⟨3, 5, 9, 3, 1, 4, 8, 7⟩
⟨3, 3, 5, 9, 1, 4, 8, 7⟩
⟨1, 3, 3, 5, 9, 4, 8, 7⟩
⟨1, 3, 3, 4, 5, 9, 8, 7⟩
⟨1, 3, 3, 4, 5, 8, 9, 7⟩
⟨1, 3, 3, 4, 5, 7, 8, 9⟩

2. SELECTION-SORT
⟨3, 9, 5, 3, 1, 4, 8, 7⟩
⟨1, 9, 5, 3, 3, 4, 8, 7⟩
⟨1, 5, 9, 3, 3, 4, 8, 7⟩
⟨1, 3, 9, 5, 3, 4, 8, 7⟩
⟨1, 3, 5, 9, 3, 4, 8, 7⟩
⟨1, 3, 3, 9, 5, 4, 8, 7⟩
⟨1, 3, 3, 5, 9, 4, 8, 7⟩
⟨1, 3, 3, 4, 9, 5, 8, 7⟩
⟨1, 3, 3, 4, 5, 9, 8, 7⟩
⟨1, 3, 3, 4, 5, 8, 9, 7⟩
⟨1, 3, 3, 4, 5, 7, 9, 8⟩
⟨1, 3, 3, 4, 5, 7, 8, 9⟩

3. QUICK-SORT
The array A is first partitioned (taking the pivot to be A[1]) into ⟨1, 3, 3, 9, 5, 4, 8, 7⟩. Then QUICK-SORT(A, 1, 3) and QUICK-SORT(A, 4, 8) are called. The first does not change the array A, so we only illustrate the second. The subarray ⟨9, 5, 4, 8, 7⟩ is again partitioned (the pivot is 9 now), modifying A to ⟨1, 3, 3, 7, 5, 4, 8, 9⟩, and then QUICK-SORT(A, 4, 7) is called. We keep doing this and obtain ⟨1, 3, 3, 4, 5, 7, 8, 9⟩.

4. MERGE-SORT
⟨3, 9, 5, 3, 1, 4, 8, 7⟩
⟨3, 9, 5, 3⟩ ⟨1, 4, 8, 7⟩
⟨3, 9⟩ ⟨5, 3⟩ ⟨1, 4⟩ ⟨8, 7⟩
⟨3⟩ ⟨9⟩ ⟨5⟩ ⟨3⟩ ⟨1⟩ ⟨4⟩ ⟨8⟩ ⟨7⟩
⟨3, 9⟩ ⟨3, 5⟩ ⟨1, 4⟩ ⟨7, 8⟩
⟨3, 3, 5, 9⟩ ⟨1, 4, 7, 8⟩
⟨1, 3, 3, 4, 5, 7, 8, 9⟩

5. HEAP-SORT
The procedure BUILD-HEAP yields A = ⟨9, 7, 8, 3, 1, 4, 5, 3⟩. Then the following changes, generated by swaps and calls to HEAPIFY, end with the sorted array:
⟨3, 7, 8, 3, 1, 4, 5, 9⟩
⟨8, 7, 5, 3, 1, 4, 3, 9⟩
⟨3, 7, 5, 3, 1, 4, 8, 9⟩
⟨7, 3, 5, 3, 1, 4, 8, 9⟩
⟨4, 3, 5, 3, 1, 7, 8, 9⟩
⟨5, 3, 4, 3, 1, 7, 8, 9⟩
⟨1, 3, 4, 3, 5, 7, 8, 9⟩
⟨4, 3, 1, 3, 5, 7, 8, 9⟩
⟨3, 3, 1, 4, 5, 7, 8, 9⟩
⟨1, 3, 3, 4, 5, 7, 8, 9⟩
⟨1, 3, 3, 4, 5, 7, 8, 9⟩
⟨1, 3, 3, 4, 5, 7, 8, 9⟩

6. COUNTING-SORT
The frequency array is F = ⟨1, 0, 2, 1, 1, 0, 1, 1, 1⟩. After the next step, it becomes F = ⟨1, 1, 3, 4, 5, 5, 6, 7, 8⟩. Next, we visit the elements of A from the last one toward the first and output them into B according to F, appropriately decreasing the frequencies. We get B = ⟨1, 3, 3, 4, 5, 7, 8, 9⟩ and F = ⟨0, 1, 1, 3, 4, 5, 5, 6, 7⟩.

Exercise 3.7 Insertion sort can be expressed as a recursive procedure as follows. In order to sort A[1..n], we recursively sort A[1..n-1] and then insert A[n] into the sorted array A[1..n-1]. Write a recurrence for the running time of this recursive version of insertion sort.

Proof: Let T(n) be the time needed to sort an array of n elements using this recursive version of insertion sort. In the recursive step, the same algorithm is run on an array of n-1 elements, and there is an additional step of at most n-1 comparisons in order to insert the last input element into the sorted array of n-1 elements. Then the recurrence for the running time is

T(n) = T(n-1) + O(n).
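For comparison with the recurrence above, here is a small Python sketch of the recursive insertion sort described in Exercise 3.7; the name recursive_insertion_sort and the 0-based indexing are illustrative. Unwinding T(n) = T(n-1) + O(n) gives the familiar O(n²) bound.

def recursive_insertion_sort(a, n=None):
    # Sort a[0..n-1] in place: recursively sort a[0..n-2], then insert a[n-1].
    if n is None:
        n = len(a)
    if n <= 1:
        return a                         # base case: T(1) = O(1)
    recursive_insertion_sort(a, n - 1)   # T(n-1): sort the first n-1 elements
    key = a[n - 1]
    j = n - 2
    while j >= 0 and a[j] > key:         # at most n-1 comparisons: the O(n) term
        a[j + 1] = a[j]
        j -= 1
    a[j + 1] = key
    return a

print(recursive_insertion_sort([3, 9, 5, 3, 1, 4, 8, 7]))   # -> [1, 3, 3, 4, 5, 7, 8, 9]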
Exercise 3.8 (Exercise 2-6 in Skiena) You can use any of the sorting algorithms presented so far as subroutines.

1. Let S be an unsorted array of n integers. Give an algorithm which finds the pair x, y ∈ S that maximizes |x - y|. Your algorithm must run in O(n) worst-case time.

2. Let S be a sorted array of n integers. Give an algorithm which finds the pair x, y ∈ S that maximizes |x - y|. Your algorithm must run in O(1) worst-case time.

3. Let S be an unsorted array of n integers. Give an algorithm which finds the pair x, y ∈ S that minimizes |x - y|, for x ≠ y. Your algorithm must run in O(n lg n) worst-case time.

4. Let S be a sorted array of n integers. Give an algorithm which finds the pair x, y ∈ S that minimizes |x - y|, for x ≠ y. Your algorithm must run in O(n) worst-case time.

Proof:

1. We scan through the array once, keeping track of the smallest and the largest integers found so far. At the end we have the smallest and largest integers, which when subtracted give the maximum absolute difference. The worst-case running time of the algorithm is T(n) = 2n = O(n), because we compare each integer in the array with the smallest and largest integers found so far.

2. We can let y = A[1] and x = A[n]. Since the array is already sorted, y is the smallest element in the array and x is the largest, thus maximizing the difference |x - y|. Accessing an array element requires constant time, so the worst-case running time of this algorithm is O(1).

3. We can sort the array using merge sort, which runs in worst-case time O(n lg n). Then we traverse the array, keeping track of the two consecutive array elements which have the smallest difference and are not equal. A single traversal of the array takes worst-case time O(n). The total worst-case time of the algorithm is O(n lg n).

4. Again we traverse the array, keeping track of the two consecutive array elements which have the smallest difference and are not equal. The worst-case running time of this algorithm is O(n).

Exercise 3.9 Given an array of real numbers S, find a pair of numbers x, y in S that minimizes |x + y|. Give your best algorithm for this problem, argue that it is correct, and then analyze it. For partial credit you can write an O(n²) algorithm.

Proof: A solution in O(n²) time would be to generate all pairs x, y, to calculate |x + y| and to select the pair that gives the minimum. The O(n log n) solution based on sorting that I expected is the following:

MIN-SUM(S)
1  sort S by the absolute value of its elements
2  min ← ∞
3  for i ← 1 to length(S) - 1
4      if |S[i] + S[i+1]| < min then { min ← |S[i] + S[i+1]|; x ← S[i]; y ← S[i+1] }
5  return x, y

The idea is to sort the numbers by their absolute value. This can easily be done by modifying any sorting algorithm to compare |S[i]| with |S[j]| instead of S[i] with S[j]; the running time remains the same. This algorithm is correct because if two numbers x and y minimize the expression |x + y|, then one of the following cases must hold:

1. x, y are both positive; then x, y must be the smallest two positive numbers, so they occur at consecutive positions in the sorted array.

2. x, y are both negative; then x, y must be the largest two negative numbers, so they are consecutive in the sorted array.

3. One of x, y is positive and the other is negative; then |x + y| = ||x| - |y||, and the expression ||x| - |y|| is minimized by two consecutive elements in the sorted array.

Therefore, the expression |x + y| is minimized by two elements x, y which occur at consecutive positions in the array sorted by the absolute value of the numbers. The running time of this algorithm is given by the running time of sorting in step 1, because the other steps take linear time. Hence, the running time is O(n log n).
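As a quick sanity check on MIN-SUM, here is a Python sketch of the same idea (sort by absolute value, then scan consecutive pairs); min_abs_sum and the sample data are illustrative.

def min_abs_sum(s):
    # Return a pair (x, y) from s minimizing |x + y|, in O(n log n) time.
    t = sorted(s, key=abs)               # step 1: sort by absolute value
    best, pair = float('inf'), None
    for i in range(len(t) - 1):          # steps 3-4: check consecutive pairs only
        cur = abs(t[i] + t[i + 1])
        if cur < best:
            best, pair = cur, (t[i], t[i + 1])
    return pair

print(min_abs_sum([4.5, -3.2, 7.1, 3.0, -0.4]))   # -> (3.0, -3.2), with |x + y| ≈ 0.2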
Exercise 3.10 (2.5, Skiena) Given two sets S1 and S2 (each of size n), and a number x, describe an O(n lg n) algorithm for finding whether there exists a pair of elements, one from S1 and one from S2, that add up to x.

Proof: (Solution 1) A Θ(n²) algorithm would entail examining each element yi in S1 and determining whether there is an element yj in S2 such that yi + yj = x.

FINDSUM(S1, S2, x)
1  for yi ∈ S1 do
2      for yj ∈ S2 do
3          if yi + yj = x then return (yi, yj)

Proof: (Solution 2) A more efficient solution takes advantage of sorting as a subroutine. First, we sort the numbers in S2. Next, we loop through each element yi in S1 and do a binary search on the sorted S2 for x - yi.

FINDSUM(S1, S2, x)
1  MERGE-SORT(S2, 1, n)
2  for yi ∈ S1 do
3      yj ← BINARY-SEARCH(S2, x - yi)
4      if yj ≠ not found then return (yi, yj)

MERGE-SORT takes time Θ(n lg n). For each element of S1, the BINARY-SEARCH on S2 takes O(lg n). Since this binary search is done n times, the for loop requires O(n lg n) worst-case time. The worst-case time for the entire algorithm is also O(n lg n).

Exercise 3.11 Give an efficient algorithm for sorting a list of n keys that may each be either 0 or 1. What is the order of the worst-case running time of your algorithm? Write the complete algorithm in pseudo-code. Make sure your algorithm has the stable sorting property.

Proof: Use radix sort (with counting sort) for one-digit numbers (so k = 1). Then the running time is O(n · k), that is, O(n). Alternatively, you can create two arrays, one for the zeros and another one for the ones, scan the input updating the two arrays, and then append one to the other. The pseudo-code is easy; a short sketch follows below.
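Since the proof leaves the pseudo-code as an exercise, here is a minimal Python sketch of the two-array variant; it is stable because records are appended in input order and the zero-keyed list is emitted before the one-keyed list. The name sort_binary_keys and the key parameter are illustrative.

def sort_binary_keys(items, key=lambda r: r):
    # Stable O(n) sort of records whose sort key is 0 or 1.
    # 'key' extracts the 0/1 key from a record (identity by default).
    zeros, ones = [], []
    for r in items:                      # single scan; append preserves input order
        (zeros if key(r) == 0 else ones).append(r)
    return zeros + ones                  # all 0-keys first, then all 1-keys

# records tagged with letters to make the stability visible
print(sort_binary_keys([(1, 'a'), (0, 'b'), (1, 'c'), (0, 'd')], key=lambda r: r[0]))
# -> [(0, 'b'), (0, 'd'), (1, 'a'), (1, 'c')]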
Exercise 3.12 (From Neapolitan and Naimipour, p. 84, Problem 13) Write an algorithm that sorts a list of n items by dividing it into three sublists of about n/3 items, sorting each sublist recursively, and merging the three sorted sublists. Analyze your algorithm, and give the results using order notation.

Proof: We are doing a three-way merging version of merge sort. You may write the pseudocode according to your preferred style, as long as it is clear and correct! I have marked comments with a "#" sign.

Inputs: positive integer n; array of keys S indexed from 1 to n.
Outputs: the array S containing the keys in nondecreasing order.

procedure MERGESORT3(n; var S);
    # var will ensure that the changes we make to S will be retained.
    const third = floor(n/3);
    var U: array [1..third];
        V: array [1..third];
        W: array [1..(n - 2*third)];
        UV: array [1..(2*third)];
        # This array will consist of the keys in both U and V in sorted
        # (nondecreasing) order. This is a stepping stone to merging all three
        # sorted sublists.
begin
    if n = 2 then
        sort(S)
        # We assume you know how to write the code for this case
        # (when S has only two elements).
    elseif n > 1 then
        copy S[1] through S[third] to U;
        copy S[third + 1] through S[2*third] to V;
        copy S[2*third + 1] through S[n] to W;
        MERGESORT3(third, U);
        MERGESORT3(third, V);
        MERGESORT3(n - 2*third, W);
        MERGE(third, third, U, V, UV);
        # The inputs to MERGE are: h and m (the lengths of the two sorted
        # arrays to be merged), the arrays themselves, and the var-ed array that
        # will contain the keys of the two smaller arrays in sorted (non-
        # decreasing) order. MERGE is O(h + m - 1), which means that it is
        # O(n) in the larger context of MERGESORT3.
        MERGE(2*third, n - 2*third, UV, W, S);
    end
end;

The master theorem will aid us in our time complexity analysis. Let us consider W(n), the worst-case time complexity function. The recurrence relation is

W(n) = 3W(n/3) + O(n)

(don't worry about floors and ceilings here). Remember that MERGE is O(n), with its worst case in Θ(n); the copy operations are also O(n), if you would like to count those. So we use the master theorem with a = 3, b = 3, and f(n), which is Θ(n) and O(n). Then n^(log_b a) = n^(log_3 3) = n^1 = n. We are in case 2 of the master theorem, and thus W(n) ∈ Θ(n lg n).

Exercise 3.13 (k-Way Merge Sort)

1. Give an O(n log k) algorithm to merge k sorted lists into one sorted list, where n is the total number of elements in all the input lists. Analyze the time complexity of your algorithm.

2. Analyze a k-way merge sort algorithm which first splits the input array into k arrays (instead of 2) of size n/k and then merges them. Is it better than merge sort?

Proof:

1. (Solution 1) An O(n log k) algorithm for the problem is to pair up the lists and merge each pair, then recursively merge the resulting k/2 lists. The time is given by the recurrence T(n, k) = O(n) + T(n, k/2), since the time to perform the merge for each pair of lists is proportional to the sum of the two sizes, so a constant amount of work is done per element. Unwinding the recurrence gives log k levels, each with O(n) comparisons, for a total time of O(n log k).

(Solution 2) Informally, another solution is the following. Given k sorted lists L1, ..., Lk as input, we want our algorithm to output a sorted array A containing all elements of the lists. As in the algorithm for merging two lists, our algorithm puts one element into the final array A at each step, starting with the smallest and ending with the biggest. The basic idea is to construct a heap of k elements, one from each list, and use it as a priority queue. More precisely, the i-th element in the heap will be the smallest element of list Li which has not yet been inserted into the final array A. At each step our algorithm does the following: it extracts the minimum from the heap and inserts it into array A; then, if the minimum was an element of list Lj, it inserts into the heap the next smallest element of Lj. Notice that, to determine which list the minimum of the heap came from, we start our algorithm by rewriting each element Li[j] of the i-th list as a pair (Li[j], i); that is, by labeling each element of the list with the number of the list to which it belongs. This labeling phase takes n steps, since there are n elements to be labeled. Moreover, there are n basic insertion steps, since there are n elements to be inserted into A, and each of these n steps performs one extraction of the minimum from the heap and one insertion of an element into the heap, which both take O(log k) time. The time to construct the heap in the first step is O(k log k). Then, since k ≤ n, the overall computation time is O(n log k). The algorithm is the following:

K-MERGE(L1, ..., Lk)
1   for i = 1, ..., k, rewrite all elements Li[j] of the i-th list as a pair (Li[j], i)
2   for i = 1, ..., k,
3       HEAP-INSERT(H, Li[1])          (insert the first element of list Li into heap H)
4       set ind_i ← 2
5   for i = 1, ..., n,
6       A[i] ← HEAP-EXTRACT-MIN(H)
7       if A[i] = (Lh[j], h) for some h then
8           HEAP-INSERT(H, Lh[ind_h])  (insert the next element of list Lh into heap H)
9           set ind_h ← ind_h + 1
10  return A
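Here is a minimal Python sketch of the heap-based K-MERGE above, using the standard heapq module as the priority queue; pushing (value, list index) pairs plays the role of the labeling step, and the name k_merge is illustrative.

import heapq

def k_merge(lists):
    # Merge k sorted lists into one sorted list in O(n log k) time.
    heap = [(lst[0], i) for i, lst in enumerate(lists) if lst]   # one labeled element per list
    heapq.heapify(heap)                     # build the k-element heap
    ind = [1] * len(lists)                  # next unconsumed position in each list
    out = []
    while heap:
        val, i = heapq.heappop(heap)        # extract-min: O(log k)
        out.append(val)
        if ind[i] < len(lists[i]):          # insert the next element of list i, if any
            heapq.heappush(heap, (lists[i][ind[i]], i))
            ind[i] += 1
    return out

print(k_merge([[1, 4, 7], [2, 5, 8], [3, 6, 9]]))   # -> [1, 2, 3, 4, 5, 6, 7, 8, 9]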
2. There will be k subproblems of size n/k and a k-way merge which takes O(n log k). Therefore, the recurrence is

T(n) = k · T(n/k) + O(n · log k).

Using the tree method, for example, we get T(n) = h · Θ(n · log k), where h is the height of the tree, that is, log_k n. Hence

T(n) = n · log_k n · log k = n · (log n / log k) · log k = n · log n.

The conclusion is that k-way merge sort is not better than the usual merge sort.

Exercise 3.14 (Convex Hull; page 35, Skiena) Given n points in two dimensions, find the convex polygon of smallest area that contains them all.

Proof: Let (x1, y1), ..., (xn, yn) be n points in two dimensions. Very often in geometrical problems it is a good idea to sort the points by one or both coordinates. In our problem, we sort the points (xi, yi) such that either xi < x(i+1), or xi = x(i+1) and yi ≤ y(i+1). This sorting assures us that the points are visited from left to right and from bottom to top as the counter increases from 1 to n. The strategy for solving this problem is to iteratively obtain the convex hull of the points 1..i, where i increases from 2 to n. Notice that the i-th point is always a vertex of the convex hull of the points 1..i. We have to understand how the convex hull is modified when a new point is added. First, let us consider two arrays high[1..n] and low[1..n], where high[i] and low[i] are going to be the high and low neighbors of i in the convex hull of 1..i, respectively. The key observation is that if there is a point k ∈ 1..i-1 such that k is below the line determined by the points i and high[k], then k cannot be a vertex of the convex hull of 1..i; similarly, k cannot be a vertex of the convex hull of 1..i if k is above the line determined by i and low[k]. Summarizing all this, we get the following algorithm:

Step 1: Sort the pairs (xi, yi) using a standard comparison sorting algorithm (for example MERGE-SORT), where the order relation is given by: (xi, yi) ≤ (xj, yj) iff xi < xj, or xi = xj and yi ≤ yj.

Step 2:
    high[1..n], low[1..n]; high[1] = 0, low[1] = 0
    for i = 2 to n
        k = i - 1
        while BELOW(k, i, high[k]) = true
            k = high[k]
        high[i] = k
        k = i - 1
        while ABOVE(k, i, low[k]) = true
            k = low[k]
        low[i] = k

where BELOW(k, i, j) is true iff k is below the line determined by the points i and j, and ABOVE(k, i, j) is true iff k is above the line determined by the points i and j. [The preview is cut off at this point.]
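The preview ends before BELOW and ABOVE are spelled out; one common way to implement such side-of-line tests is with the cleared-denominator form of the line equation, assuming the line through the points i and j is not vertical (the points are sorted by x, so equal x-coordinates need separate care). The following Python helpers are a hypothetical sketch, not part of the notes.

def below(p, a, b):
    # True iff p = (px, py) lies strictly below the non-vertical line through
    # a = (ax, ay) and b = (bx, by): compare py with ay + (px-ax)*(by-ay)/(bx-ax),
    # with the division cleared (flip the inequality when bx < ax).
    (px, py), (ax, ay), (bx, by) = p, a, b
    lhs = (py - ay) * (bx - ax)
    rhs = (px - ax) * (by - ay)
    return lhs < rhs if bx > ax else lhs > rhs

def above(p, a, b):
    # True iff p lies strictly above the non-vertical line through a and b.
    (px, py), (ax, ay), (bx, by) = p, a, b
    lhs = (py - ay) * (bx - ax)
    rhs = (px - ax) * (by - ay)
    return lhs > rhs if bx > ax else lhs < rhs

print(below((0, -1), (-1, 0), (1, 0)), above((0, 1), (-1, 0), (1, 0)))   # -> True True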