503_lecture3_S11

UMass Lowell Computer Science 91.503
Graduate Analysis of Algorithms
Prof. Karen Daniels
Spring, 2011

Lecture 3
Friday, 2/11/11
Amortized Analysis

Overview

Amortize: "To pay off a debt, usually by periodic payments" [Webster's]

Amortized Analysis: "creative accounting" for operations
- can show that the average cost of an operation is small (when averaged over a sequence of operations, not over a distribution of inputs) even though a single operation in the sequence is expensive
- result must hold for any sequence of these operations
- guarantee holds in the worst case
- no probability is involved (unlike average-case analysis)
- analysis method only; no effect on code operation

Examples:
- Stack (computational geometry application)
- Dynamic Table
- Binary Counter (in book but not lecture)

More Motivation

- Depth-First and Breadth-First Search
- Dijkstra's Single-Source Shortest Path (demo)
- Fibonacci Heaps
- Knuth-Morris-Pratt String Matching
- Red-Black Tree Restructuring
- ... (see index)

source: 91.503 textbook Cormen et al.

Overview (continued)

3 ways to determine the amortized cost of an operation that is part of a sequence of operations:

Aggregate Method
- find upper bound T(n) on total cost of a sequence of n operations
- amortized cost = average cost per operation = T(n)/n
- same for all the (potentially different types of) operations in the sequence

Accounting Method
- amortized cost can differ across types of operations
- overcharge some operations early in the sequence
- store overcharge as "prepaid credit" on specific data structure objects

Potential Method
- amortized cost can differ across operations (as in accounting method)
- overcharge some operations early in the sequence (as in accounting method)
- store overcharge as "potential energy" of the data structure as a whole (unlike accounting method)

Aggregate Method: Stack Operations

Aggregate Method: find upper bound T(n) on total cost of a sequence of n operations; amortized cost = average cost per operation = T(n)/n, the same for all the operations in the sequence.

Traditional Stack Operations
- PUSH(S,x) pushes object x onto stack S
- POP(S) pops top of stack S, returns popped object
- each takes O(1) time: consider cost as 1 for our discussion
- total actual cost of a sequence of n PUSH/POP operations = n
- STACK-EMPTY(S) takes time in Θ(1)

Aggregate Method: Stack Operations (continued)

New Stack Operation
- MULTIPOP(S,k) pops top k elements off stack S
- pops entire stack if it has < k items

MULTIPOP(S,k)
1  while not STACK-EMPTY(S) and k ≠ 0
2      POP(S)
3      k = k - 1

- Use cost = 1 for each POP
- MULTIPOP actual cost for a stack containing s items: cost = min(s,k)
- Worst-case cost is in O(s), and hence in O(n)
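
The MULTIPOP pseudocode above can be sketched in runnable form. This is a minimal illustration only: it assumes the stack is a Python list whose last element is the top, and the function name `multipop` and its return value (popped items plus actual cost) are choices made for this sketch, not part of the lecture.

```python
def multipop(stack, k):
    """Pop up to k items from a list-backed stack (top = last element).

    Returns (popped_items, actual_cost), charging cost 1 per POP.
    """
    popped = []
    # Mirrors: while not STACK-EMPTY(S) and k != 0
    while stack and k != 0:
        popped.append(stack.pop())  # POP(S)
        k -= 1
    return popped, len(popped)


stack = [1, 2, 3, 4, 5]                  # 5 is on top, so s = 5
popped, cost = multipop(stack, 3)        # cost = min(s, k) = min(5, 3) = 3
rest, cost_all = multipop(stack, 10)     # only 2 items remain: cost = min(2, 10) = 2
```

The second call shows the "pops entire stack if it has < k items" case: the loop stops as soon as the stack is empty, so the actual cost never exceeds the number of items present.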

Aggregate Method: Stack Operations (continued)

Sequence of n PUSH, POP, MULTIPOP operations on an initially empty stack
- Naive bound: a single MULTIPOP has worst-case cost O(n), which gives O(n^2) for the sequence
- Aggregate method yields a tighter upper bound: a sequence of n operations has O(n) worst-case cost
  - each item can be popped at most once for each time it is pushed
  - # POP calls (including ones in MULTIPOP) <= # PUSH calls <= n
- Average cost of an operation = O(n)/n = O(1) = amortized cost of each operation
  - holds for PUSH, POP and MULTIPOP

Accounting Method

Accounting Method
- amortized cost can differ across operations
- overcharge some operations early in the sequence
- store overcharge as "prepaid credit" on specific data structure objects

Let $c_i$ be the actual cost of the ith operation.
Let $\hat{c}_i$ be the amortized cost of the ith operation (what we charge).

Total amortized cost of the sequence of operations must be an upper bound on the total actual cost of the sequence:
  $\sum_{i=1}^{n} \hat{c}_i \ge \sum_{i=1}^{n} c_i$

Total credit in the data structure must be nonnegative for all n:
  $\sum_{i=1}^{n} \hat{c}_i - \sum_{i=1}^{n} c_i \ge 0$

Accounting Method: Stack Operations

Operation   Actual Cost   Assigned Amortized Cost
PUSH        1             2
POP         1             0
MULTIPOP    min(k,s)      0

Paying for a sequence using amortized cost:
- start with an empty stack
- PUSH of an item always precedes the POP or MULTIPOP that removes it
- pay for PUSH & store 1 unit of credit
- the credit for each item pays for the actual POP or MULTIPOP cost of that item
- credit never "goes negative"
- total amortized cost of any sequence of n ops is in O(n)
- total amortized cost is an upper bound on total actual cost
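
The credit argument above can be checked numerically. The following is a minimal sketch, not part of the lecture: `run_sequence` and `random_ops` are hypothetical helper names, the stack is a Python list, and a POP on an empty stack is assumed to be a free no-op so that random sequences are always legal.

```python
import random

# Amortized charges taken from the table above.
AMORTIZED = {"PUSH": 2, "POP": 0, "MULTIPOP": 0}


def run_sequence(ops):
    """Run (name, arg) operations on an initially empty stack.

    Returns (total actual cost, total amortized cost, minimum credit seen).
    """
    stack, actual, charged, min_credit = [], 0, 0, 0
    for name, arg in ops:
        if name == "PUSH":
            stack.append(arg)
            cost = 1
        else:
            # POP behaves like MULTIPOP with k = 1 (assumption: free if the stack is empty).
            k = 1 if name == "POP" else arg
            cost = min(k, len(stack))
            del stack[len(stack) - cost:]
        actual += cost
        charged += AMORTIZED[name]
        min_credit = min(min_credit, charged - actual)  # stored credit so far
    return actual, charged, min_credit


def random_ops(n, seed):
    """Generate n random operations (hypothetical test helper)."""
    rng = random.Random(seed)
    return [(rng.choice(["PUSH", "POP", "MULTIPOP"]), rng.randint(0, 5)) for _ in range(n)]


ops = [("PUSH", 1), ("PUSH", 2), ("PUSH", 3), ("MULTIPOP", 2),
       ("POP", None), ("PUSH", 4), ("MULTIPOP", 5)]
actual, charged, min_credit = run_sequence(ops)
# actual == 8, charged == 8, min_credit == 0
```

In this sketch the stored credit `charged - actual` always equals the number of items currently on the stack, which is why it cannot go negative.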

Potential Method

Potential Method
- amortized cost can differ across operations (as in accounting method)
- overcharge some operations early in the sequence (as in accounting method)
- store overcharge as "potential energy" of the data structure as a whole (unlike accounting method)

Let $c_i$ be the actual cost of the ith operation.
Let $D_i$ be the data structure after applying the ith operation.
Let $\Phi(D_i)$ be the potential associated with $D_i$.

Amortized cost of the ith operation:
  $\hat{c}_i = c_i + \Phi(D_i) - \Phi(D_{i-1})$

Total amortized cost of n operations (terms telescope):
  $\sum_{i=1}^{n} \hat{c}_i = \sum_{i=1}^{n} (c_i + \Phi(D_i) - \Phi(D_{i-1})) = \sum_{i=1}^{n} c_i + \Phi(D_n) - \Phi(D_0)$

Require $\Phi(D_n) \ge \Phi(D_0)$ so that total amortized cost is an upper bound on total actual cost.
Since n might not be known in advance, guarantee "payment in advance" by requiring $\Phi(D_i) \ge \Phi(D_0)$ for all i.

Potential Method: Stack Operations

Potential function value = number of items in stack
- guarantees nonnegative potential after the ith operation

Amortized operation costs (assuming stack has s items):
- PUSH:
    potential difference $= \Phi(D_i) - \Phi(D_{i-1}) = (s + 1) - s = 1$
    amortized cost $= \hat{c}_i = c_i + \Phi(D_i) - \Phi(D_{i-1}) = 1 + 1 = 2$
- MULTIPOP(S,k) pops k' = min(k,s) items off stack:
    potential difference $= \Phi(D_i) - \Phi(D_{i-1}) = -k'$
    amortized cost $= \hat{c}_i = c_i + \Phi(D_i) - \Phi(D_{i-1}) = k' - k' = 0$
- POP: amortized cost also = 0

Amortized cost of each operation is in O(1), so total amortized cost of a sequence of n operations is in O(n).

Dynamic Tables: Overview

Dynamic Table T: array of slots
- Ignore implementation choices: stack, heap, hash table...
- if too full, increase size & copy entries to T'
- if too empty, decrease size & copy entries to T'

Analyze dynamic table insert and delete
- Actual expansion or contraction cost is large
- Show amortized cost of insert or delete is in O(1)

Load factor $\alpha(T) = num[T]/size[T]$
- num[T] = number of items currently stored in table
- empty table: $\alpha(T) = 1$ (convention guaranteeing load factor can be lower bounded by a useful constant)
- full table: $\alpha(T) = 1$

Dynamic Tables: Table Expansion Only

Load factor bounds (double size when T is full): $1/2 \le \alpha(T) \le 1$

Sequence of n inserts on initially empty table
- Worst-case cost of a single insert is in O(n) (why?)
- Worst-case cost of a sequence of n inserts is in O(n^2): a LOOSE bound
- Double only when table is already full
- Cost is counted in "elementary" insertions (placing one item into a slot)

Dynamic Tables: Table Expansion (cont.)

Amortized Analysis (count only elementary insertions)

Aggregate Method:
  $c_i = i$ if $i - 1$ is an exact power of 2 (copy + insert), and $c_i = 1$ otherwise
  total cost of n inserts $= \sum_{i=1}^{n} c_i \le n + \sum_{j=0}^{\lfloor \lg n \rfloor} 2^j < n + 2n = 3n$

Accounting Method:
- charge cost = 3 for each element inserted
- intuition for 3: each item pays for 3 elementary insertions
  - inserting itself into the current table
  - expansion: moving itself
  - expansion: moving another item that has already been moved

Dynamic Tables: Table Expansion (cont.)

Amortized Analysis, Potential Method: $\Phi(T) = 2\,num[T] - size[T]$

Value of potential function $\Phi$:
- 0 right after expansion (then becomes 2; why?)
- builds to table size by the time the table is full
- $\Phi(T) = 0$ when $size[T] = 2\,num[T]$
- $\Phi(T) = 2\,num[T] - size[T]$ is always nonnegative, so sum of amortized costs of n inserts is an upper bound on sum of actual costs

Amortized cost of ith insert
- 3 functions: $size_i$, $num_i$, $\Phi_i$, where $\Phi_i$ = potential after ith operation

Case 1: insert does not cause expansion (so $size_i = size_{i-1}$ and $num_i = num_{i-1} + 1$)
  $\hat{c}_i = c_i + \Phi_i - \Phi_{i-1} = 1 + (2\,num_i - size_i) - (2\,num_{i-1} - size_{i-1}) = 3$

Case 2: insert causes expansion
  $\hat{c}_i = c_i + \Phi_i - \Phi_{i-1} = num_i + (2\,num_i - size_i) - (2\,num_{i-1} - size_{i-1}) = 3$
  use these: $num_{i-1} = size_{i-1}$, $num_i = num_{i-1} + 1$, $size_i = 2\,size_{i-1}$

Dynamic Tables: Table Expansion & Contraction

Load factor bounds: $1/4 \le \alpha(T) \le 1$
- double size when T is full
- halve size when the load factor drops below 1/4 (why?)
- count elementary insertions & deletions
- INSERT: same as before
- DELETE: pseudocode analogous to INSERT

Amortized Analysis, Potential Method:
  $\Phi(T) = 2\,num[T] - size[T]$      if $\alpha(T) \ge 1/2$
  $\Phi(T) = size[T]/2 - num[T]$       if $\alpha(T) < 1/2$   (different from INSERT-only analysis)

Value of potential function $\Phi$:
- = 0 for empty table
- 0 right after expansion or contraction
- builds as $\alpha(T)$ increases to 1 or decreases to 1/4
- always nonnegative, so sum of amortized costs of n operations is an upper bound on sum of actual costs
- = 0 when $\alpha(T) = 1/2$
- = num[T] when $\alpha(T) = 1$
- = num[T] when $\alpha(T) = 1/4$
- motivation for choice of potential function: can pay for moving num[T] items

Dynamic Tables: Table Expansion & Contraction

Amortized Analysis, Potential Method
[Figure: the 3 functions $size_i$, $num_i$, $\Phi_i$; motivation for choice of potential function]

Dynamic Tables: Table Expansion & Contraction

Amortized Analysis, Potential Method:
  $\Phi(T) = 2\,num[T] - size[T]$      if $\alpha(T) \ge 1/2$
  $\Phi(T) = size[T]/2 - num[T]$       if $\alpha(T) < 1/2$

Analyze cost of a sequence of n inserts and/or deletes

Amortized cost of ith operation

Case 1: INSERT

Case 1a: $\alpha_{i-1} \ge 1/2$.
  By previous insert analysis: $\hat{c}_i \le 3$; holds whether or not table expands

Case 1b: $\alpha_{i-1} < 1/2$ and $\alpha_i < 1/2$: no expansion (why?)
  $\hat{c}_i = c_i + \Phi_i - \Phi_{i-1} = 1 + ((size_i/2) - num_i) - ((size_{i-1}/2) - num_{i-1}) = 0$

Case 1c: $\alpha_{i-1} < 1/2$ and $\alpha_i \ge 1/2$: no expansion (why?)
  $\hat{c}_i = c_i + \Phi_i - \Phi_{i-1} = 1 + (2\,num_i - size_i) - ((size_{i-1}/2) - num_{i-1}) < 3$
  (see derivation in textbook)

Dynamic Tables: Table Expansion & Contraction

Amortized Analysis, Potential Method: amortized cost of ith operation (continued)
  $\Phi(T) = 2\,num[T] - size[T]$      if $\alpha(T) \ge 1/2$
  $\Phi(T) = size[T]/2 - num[T]$       if $\alpha(T) < 1/2$

Case 2: DELETE

Case 2a: $\alpha_{i-1} \ge 1/2$: textbook exercise

Case 2b: $\alpha_{i-1} < 1/2$ and $\alpha_i < 1/2$, no contraction
  $\hat{c}_i = c_i + \Phi_i - \Phi_{i-1} = 1 + (size_i/2 - num_i) - (size_{i-1}/2 - num_{i-1}) = 2$

Case 2c: $\alpha_{i-1} < 1/2$ and $\alpha_i < 1/2$, contraction
  $\hat{c}_i = c_i + \Phi_i - \Phi_{i-1} = (num_i + 1) + (size_i/2 - num_i) - (size_{i-1}/2 - num_{i-1}) = 1$

Conclusion: the amortized cost of each operation is bounded above by a constant, so the time for a sequence of n operations is in O(n).

Example: Dynamic Closest Pair

$S = S_1 \cup S_2 \cup S_3$
[Figure: point set partitioned into subsets $S_1$, $S_2$, $S_3$]

Goal: Fast maintenance of closest pair in dynamic setting

source: "Fast hierarchical clustering and other applications of dynamic closest pairs," David Eppstein, Journal of Experimental Algorithmics, Vol. 5, August 2000.

Example: Dynamic Closest Pair (continued)

Rules
- Partition dynamic set S into $k \le \log n$ subsets.
- Each subset $S_i$ has an associated digraph $G_i$ consisting of a set of disjoint, directed paths.
- Total number of edges in all graphs remains linear.
- Combine and rebuild if number of edges reaches 2n.
- Closest pair is always in some $G_i$.
- Initially all points are in a single set $S_1$.
- Operations:
  - Create $G_i$ for a subset $S_i$.
  - Insert a point.
  - Delete a point.
  - Merge subsets until $k \le \log n$.
- We use log base 2.

Example: Dynamic Closest Pair (continued)

Rules: Operations

Create $G_i$ for a subset $S_i$:
- Select starting point (we choose the leftmost, or the higher one in case of a tie).
- Iteratively extend the path P, selecting the next vertex as:
  - Case 1: nearest neighbor in $S \setminus P$ if last point on path belongs to $S_i$
  - Case 2: nearest neighbor in $S_i \setminus P$ if last point on path belongs to $S \setminus S_i$

Insert a point x:
- Create new subset $S_{k+1} = \{x\}$.
- Merge subsets if necessary.
- Create $G_i$ for new or merged subsets.

Delete a point x:
- Create new subset $S_{k+1}$ = all points y such that (y,x) is a directed edge in some $G_i$.
- Remove x and adjacent edges from all $G_i$. (We also remove y from its subset.)
- Merge subsets if necessary.
- Create $G_i$ for new or merged subsets.

Merge subsets until $k \le \log n$:
- Choose subsets $S_i$ and $S_j$ to minimize size ratio $|S_j| / |S_i|$.

See handout for example.

Example: Dynamic Closest Pair (continued)

Potential Function
- Potential for a subset $S_i$: $\Phi_i = n\,|S_i| \log |S_i|$.
- Total potential $\Phi = n^2 \log n - \sum_i \Phi_i$.

Paper proves this Theorem in Section 3:

Theorem: The data structure maintains the closest pair in S in O(n) space, amortized time $O(n \log n)$ per insertion, and amortized time $O(n \log^2 n)$ per deletion.

HW#3 contains a problem related to this paper. Please read up through Section 3. Later in the semester we will have sufficient background for the remainder of the paper.
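
The potential function from the last slide can be evaluated directly for a given partition. The sketch below only illustrates the formula as written on the slide (with log base 2); the function names and the example partitions are assumptions made for this illustration and do not come from the paper.

```python
import math


def subset_potential(n, size):
    """Phi_i = n * |S_i| * log2|S_i| for one subset (zero for a singleton)."""
    return n * size * math.log2(size)


def total_potential(sizes):
    """Phi = n^2 * log2(n) - sum_i Phi_i, where n is the total number of points.

    `sizes` lists the subset sizes |S_1|, ..., |S_k|, each at least 1.
    """
    n = sum(sizes)
    return n * n * math.log2(n) - sum(subset_potential(n, s) for s in sizes)


single = total_potential([8])           # all 8 points in one subset S_1
halves = total_potential([4, 4])        # two subsets of 4 points each
singletons = total_potential([1] * 8)   # extreme case, shown only to illustrate the range
```

With all points in a single set (the initial state in the rules above) the potential is 0, and it grows as the partition becomes finer; in this example, merging the two halves into one subset lowers the potential from 64 to 0. The all-singletons partition is included only to show the upper end of the formula, since it would violate the rule $k \le \log n$.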