Analysis of Algorithms
Merge Sort and Analysis of Recurrences
© 2011 Aybar C. Acar & Çiğdem Gündüz Demir
Compiled on: February 21, 2011

Divide and Conquer
“Nothing is particularly hard if you divide it into small jobs.”
– Henry Ford
Many useful algorithms follow what is called the “divide-and-conquer” approach:
Divide the problem into smaller subproblems.
Conquer each subproblem by either:
  solving it directly, if it is simple enough, or
  recursively dividing it into smaller sub-subproblems.
Combine the results of the subproblems to get the result of the original problem.

Sorting with Divide-and-Conquer
Given an array A[0 . . . n − 1] of items, one way to approach the problem is:
Divide: Bisect the array into two subarrays of size about n/2.
Conquer: Sort the subarrays directly, or recursively bisect them until they are trivial to sort.
Combine: Merge the individual sorted results until you get back the sorted A.
This approach is called Merge Sort.
Usually, the array is recursively divided into subarrays of size 1, which are trivially sorted.
However, variants exist that only divide down to 5 to 7 items before conquering:
  They use a different algorithm, such as insertion sort, to finish the job.
  This is called “hybrid merge sort”.
The “merge” step is crucial.

The Merge Operation
An array A with two adjacent subarrays A[p, q) and A[q, r) that are each sorted within themselves (called sorted runs):
  A: ... [1 4 5 7][2 3 6 8] ...    where A[p, q) = 1 4 5 7 and A[q, r) = 2 3 6 8
How do you get a combined sorted run A[p, r)?
Split A[p, r) at q and copy the two halves into temporary arrays L and R:
  L: 1 4 5 7 ∞    (indices 0 to 4)
  R: 2 3 6 8 ∞    (indices 0 to 4)
The last items in L and R are sentinels, values that delimit the data. Assume no value can be larger than a sentinel.

The Merge Operation (cont.)
Merge L and R back into A[p, r) as a single sorted run:
[Figure: L = 1 4 5 7 ∞ and R = 2 3 6 8 ∞ are merged back into positions p through r − 1 of A, which initially hold 1 4 5 7 2 3 6 8.]
Question
What is the invariant here?
At the end of the kth iteration, A[p, p + k) contains the k smallest elements of L and R, in sorted order.

The Merge Algorithm
merge(A, p, q, r):
  nL ← q − p
  nR ← r − q
  create L[0 . . . nL] and R[0 . . . nR]
  L[nL] ← ∞; R[nR] ← ∞              {Place sentinels}
  for i ← 0 to nL − 1 do              {Copy A[p, q) to L}
    L[i] ← A[p + i]
  end for
  for j ← 0 to nR − 1 do              {Copy A[q, r) to R}
    R[j] ← A[q + j]
  end for
  i ← 0; j ← 0
  for k ← p to r − 1 do
    if L[i] ≤ R[j] then               {Get the smaller top value of L or R}
      A[k] ← L[i]
      i ← i + 1
    else
      A[k] ← R[j]
      j ← j + 1
    end if
  end for
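
The merge pseudocode above can be sketched in Python as follows. This is a minimal illustration rather than the slides' own code. It assumes the items are numbers, so that math.inf can play the role of the ∞ sentinel. The sample array is an assumption chosen to match the 1 4 5 7 / 2 3 6 8 example.

```python
import math

def merge(A, p, q, r):
    """Merge the sorted runs A[p:q] and A[q:r] in place into one sorted run A[p:r]."""
    # Copy each run and append a sentinel that no real value can exceed.
    L = A[p:q] + [math.inf]
    R = A[q:r] + [math.inf]
    i = j = 0
    for k in range(p, r):
        # Take the smaller of the two front values; ties go to L.
        if L[i] <= R[j]:
            A[k] = L[i]
            i += 1
        else:
            A[k] = R[j]
            j += 1

# Runs A[1:5] = 1 4 5 7 and A[5:9] = 2 3 6 8; the outer items are left untouched.
A = [9, 1, 4, 5, 7, 2, 3, 6, 8, 0]
merge(A, 1, 5, 9)
print(A)  # [9, 1, 2, 3, 4, 5, 6, 7, 8, 0]
```

Because of the sentinels, the loop never needs to test whether L or R has run out: an exhausted run always loses the comparison against any real value.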

Time Complexity of Merge
Question
What is the time T(n) for merge(A, p, q, r) if n = r − p?
  T(n) = c1 + c2 n
Question
What is the Θ bound of T(n)?
  T(n) = Θ(n)

The Merge Sort Algorithm
mergesort(A, p, r):
  if p < r − 1 then                   {Can’t split a single value!}
    q ← ⌊(p + r)/2⌋                   {Divide}
    mergesort(A, p, q)                {Conquer}
    mergesort(A, q, r)                {Conquer}
    merge(A, p, q, r)                 {Combine}
  end if
For the whole array:
  mergesort(A, 0, n)

Merge Sort: Random Input
[Figure: merge sort run on a random input.]

Some Observations
The number of times mergesort recurses is only a function of how many times you can bisect A.
  So, it depends only on n.
The running time of merge is Θ(n) regardless of the numbers themselves.
Merge sort’s running time should depend only on n and not on the configuration of the array A.
  Compare with insertion sort...

Merge Sort: Reverse Sorted Input
[Figure: merge sort run on a reverse-sorted input.]

The Time Requirement of Merge Sort
mergesort(A, p, r):
  if p < r − 1 then                   {Θ(1)}
    q ← ⌊(p + r)/2⌋                   {Θ(1)}
    mergesort(A, p, q)                {T(⌊n/2⌋)}
    mergesort(A, q, r)                {T(⌈n/2⌉)}
    merge(A, p, q, r)                 {Θ(n)}
  end if
This can be expressed with what is known as a “recurrence”:
  T(n) = Θ(1)                              if n ≤ 1   (base case)
  T(n) = T(⌊n/2⌋) + T(⌈n/2⌉) + Θ(n)        if n > 1   (recursion)
This can be simplified with a few harmless assumptions (for example, that n is a power of 2):
  T(n) = 2T(n/2) + Θ(n)
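
As a rough check on the claim that the cost depends only on n, the mergesort procedure can be sketched in Python with a counter of element writes. This is an illustrative sketch under stated assumptions: the items are numbers (so math.inf works as the sentinel), and the write counter and the three sample inputs are additions that do not appear in the slides.

```python
import math
import random

def merge(A, p, q, r):
    """Merge sorted runs A[p:q] and A[q:r] in place; return the number of writes to A."""
    L = A[p:q] + [math.inf]
    R = A[q:r] + [math.inf]
    i = j = 0
    for k in range(p, r):
        if L[i] <= R[j]:
            A[k] = L[i]
            i += 1
        else:
            A[k] = R[j]
            j += 1
    return r - p  # the loop body runs exactly r - p times

def mergesort(A, p, r):
    """Sort A[p:r] in place; return the total number of writes made by merge."""
    if p < r - 1:              # a single value cannot be split
        q = (p + r) // 2       # divide
        writes = mergesort(A, p, q)    # conquer left half
        writes += mergesort(A, q, r)   # conquer right half
        writes += merge(A, p, q, r)    # combine
        return writes
    return 0

n = 16
inputs = {
    "random": random.sample(range(n), n),
    "sorted": list(range(n)),
    "reversed": list(range(n, 0, -1)),
}
counts = {name: mergesort(data, 0, n) for name, data in inputs.items()}
print(counts)  # {'random': 64, 'sorted': 64, 'reversed': 64}
```

For n = 16 every input gives 64 = 16 · log 16 writes, which is what the recurrence W(n) = 2W(n/2) + n with W(1) = 0 predicts.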

Finding Bounds on Recurrences: Substitution Method
T(n) = 2T(n/2) + n
How do we find the bound?
The straightforward way is to “guess” the bound and verify it.
  Called the “Substitution Method”.
  Uses mathematical induction.
  Can be used to find either an upper or a lower bound.

Substitution Method: Induction
T(n) = 2T(n/2) + n
Induction Step:
Assume T(n) = O(n log n), that is, T(m) ≤ cm log m for some constants c and n0 and all m with n0 ≤ m < n. Then, taking m = n/2 (log is base 2, so log 2 = 1):
  T(n) ≤ 2(c (n/2) log(n/2)) + n
       = cn log(n/2) + n
       = cn log n − cn log 2 + n
       = cn log n − cn + n
       ≤ cn log n          (whenever c ≥ 1)

Basis of Induction
T(n) ≤ cn log n
Basis: Prove that this is true for n = 2 and n = 3.
Assume T(1) = 1. From the recurrence T(n) = 2T(⌊n/2⌋) + n:
  T(2) = 2T(1) + 2 = 4
  T(3) = 2T(1) + 3 = 5
Then we need
  T(2) = 4 ≤ c · 2 log 2 = 2c
  T(3) = 5 ≤ c · 3 log 3 ≈ 4.75c
Both hold for any c ≥ 2, so we can select n0 = 4 and c = 3.
Important
Notice that T(n) ≤ cn log n does not hold for n = 1:
  (T(1) = 1) > (c · 1 · log 1 = 0)
But this is not a problem, since we only need to prove T(n) ≤ cn log n for n above some n0. We only need T(2) and T(3) to compute T(n) for n ≥ 4, and the bound holds for both. So we just choose n0 = 4.
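
The basis and the bound can be spot-checked numerically. The sketch below assumes the floor version of the recurrence, T(n) = 2T(⌊n/2⌋) + n with T(1) = 1, which reproduces T(2) = 4 and T(3) = 5. The upper limit of 10000 is an arbitrary choice for illustration. A finite check like this supports the induction but does not replace it.

```python
import math

def T(n: int) -> int:
    """T(n) = 2 T(floor(n/2)) + n with T(1) = 1 (assumed base case)."""
    if n == 1:
        return 1
    return 2 * T(n // 2) + n

c = 3
# Collect every n in [2, 10000] where T(n) exceeds c * n * log2(n).
violations = [n for n in range(2, 10_001) if T(n) > c * n * math.log2(n)]
print(T(2), T(3), violations)  # 4 5 []
```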

Caution
You must obey the exact form of the inductive hypothesis lest you end up with false confirmations.
Assume you guess T(n) = O(n), that is, T(n) ≤ cn:
  T(n) ≤ 2(c(n/2)) + n
       ≤ cn + n
       = O(n)    ⇐ WRONG!!
The induction must establish T(n) ≤ cn exactly, and cn + n is not ≤ cn for any constant c. To see this, assume that T(1) = 1; using the original recurrence we know that T(2) = 4, T(4) = 12, T(8) = 32, T(16) = 80, etc. Hence,
  T(2) = 4 ≤ 2c         only if c ≥ 2
  T(4) = 12 ≤ 4c        only if c ≥ 3
  T(8) = 32 ≤ 8c        only if c ≥ 4
  T(16) = 80 ≤ 16c      only if c ≥ 5
  ...
It is easy to see that there are no constants c and n0 such that the inequality holds for all n ≥ n0.
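
The table above can be reproduced with a few lines of Python. The sketch assumes T(1) = 1 and restricts n to powers of 2. The ratio T(n)/n is the smallest c for which T(n) ≤ cn holds at that n.

```python
def T(n: int) -> int:
    """T(n) = 2 T(n/2) + n with T(1) = 1, for n a power of 2."""
    if n == 1:
        return 1
    return 2 * T(n // 2) + n

for k in range(1, 6):
    n = 2 ** k
    # Smallest c with T(n) <= c * n is T(n) / n.
    print(n, T(n), T(n) // n)
# 2 4 2
# 4 12 3
# 8 32 4
# 16 80 5
# 32 192 6
```

The required c grows by one every time n doubles, so no single constant works for all large n.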

Substitution Method
Substitution confirms that mergesort is indeed O(n log n).
The substitution method cannot “solve” the recurrence.
  It can only confirm that your guess is correct.
So how do we guess?
  Experience
    There are usually only a few possibilities.
    Merge sort was obviously Ω(n) and O(n^2).
    The reasonable guess was n log n, since it is between n and n^2.
  Other methods, such as the iteration method
    also called repeated substitution

Iteration Method
The iteration method expands the recurrence until it reaches the base case, by repeatedly substituting the recurrence into itself.
  T(n) = 2T(n/2) + n
       = 2(2T(n/4) + n/2) + n
       = 4T(n/4) + n + n
       = 4(2T(n/8) + n/4) + n + n
       = 8T(n/8) + n + n + n
       ...
       = 2^k T(n/2^k) + kn
When n = 2^k, k = log n. So, assuming T(1) = 1,
  T(n) = nT(n/n) + n log n = nT(1) + n log n = n + n log n = Θ(n log n)

Recursion trees
Recursion trees are incremental graphical representations of the recursion of an algorithm.
They can be seen as a graphical version of the iteration method.
They are useful in finding the form of the bound.
Assume you have the recurrence:
  T(n) = 3T(n/4) + cn^2
How is the first level of recursion going to work?
  Root: cn^2, with three children T(n/4), T(n/4), T(n/4).

Recursion Tree: Development
  Root: cn^2
  Depth 1: three nodes of cost c(n/4)^2, each with three children T(n/16).

Recursion Tree: Final Tree
The fully expanded tree has height log4 n. Per level:
  Depth 0: 1 node of cost cn^2; level total cn^2
  Depth 1: 3 nodes of cost c(n/4)^2; level total (3/16) cn^2
  Depth 2: 9 nodes of cost c(n/16)^2; level total (3/16)^2 cn^2
  ...
  Depth log4 n − 1: level total (3/16)^(log4 n − 1) cn^2
  Depth log4 n: n^(log4 3) leaves of cost T(1); level total Θ(n^(log4 3))
Summing over all levels:
  T(n) = cn^2 + (3/16) cn^2 + (3/16)^2 cn^2 + . . . + (3/16)^(log4 n − 1) cn^2 + Θ(n^(log4 3))

Simplification of T(n)
  T(n) = cn^2 + (3/16) cn^2 + (3/16)^2 cn^2 + . . . + (3/16)^(log4 n − 1) cn^2 + Θ(n^(log4 3))
       = Σ_{i=0}^{log4 n − 1} (3/16)^i cn^2 + Θ(n^(log4 3))
       = (((3/16)^(log4 n) − 1) / ((3/16) − 1)) cn^2 + Θ(n^(log4 3))
       < Σ_{i=0}^{∞} (3/16)^i cn^2 + Θ(n^(log4 3))
       = (1 / (1 − 3/16)) cn^2 + Θ(n^(log4 3))
       = O(n^2)
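
The level-by-level sum can be compared against the recurrence itself. The sketch below is an illustration under assumptions not fixed by the slides: c = 1, T(1) = 1, and n restricted to powers of 4. It uses exact fractions so the comparison involves no rounding.

```python
from fractions import Fraction

def T(n: int) -> int:
    """T(n) = 3 T(n/4) + n^2 with T(1) = 1, for n a power of 4 (c = 1 assumed)."""
    if n == 1:
        return 1
    return 3 * T(n // 4) + n * n

def level_sum(n: int) -> Fraction:
    """Add up the recursion tree level by level: internal levels plus leaves."""
    k = 0          # k = log4 n, the height of the tree
    m = n
    while m > 1:
        m //= 4
        k += 1
    internal = sum(Fraction(3, 16) ** i * n * n for i in range(k))
    leaves = 3 ** k  # 3^(log4 n) = n^(log4 3) leaves, each costing T(1) = 1
    return internal + leaves

for k in range(1, 8):
    n = 4 ** k
    assert T(n) == level_sum(n)
    # Geometric-series bound: internal levels sum to less than (16/13) n^2.
    assert T(n) < Fraction(16, 13) * n * n + 3 ** k

print(T(4), T(16))  # 19 313
```

Here 16/13 is the value of 1 / (1 − 3/16), the limit of the infinite geometric series used in the last steps of the derivation.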