Advanced Algorithms 1.5

# Advanced Algorithms 1.5 - 12.1.5 The Expected Value for 2i...


12.1.5 The Expected Value for $\Phi_{2i} - \Phi_{2i-1}$ Over All Cases

Weighting the three cases presented above by the probabilities of their occurrence, we have that:

$$
\begin{aligned}
E[\Phi_{2i} - \Phi_{2i-1}]
&\le \frac{1}{N\,d(a,A)}\left(-\frac{k2^k\rho_a}{2^{n_a}}\right)
 + \sum_{l=1}^{k-n_a} \frac{1}{N\,d(A,b_l)}\left(\frac{k2^k\,d(A,b_l)}{2^{n_a+l}}\right) \\
&\le \frac{k2^k}{N\,2^{n_a}}\left[-1 + \sum_{l=1}^{k-n_a}\frac{1}{2^l}\right] \\
&= -\frac{k}{N} \qquad \text{(geometric series)}
\end{aligned}
$$

where $b_l$ is the server which before the move has $M(b_l) = B_l$, where $B_l$ is the $l$th closest OFF server to $a$ from outside the disk of radius $\rho_a$ centered at $a$. Note that the sum runs to $k - n_a$, which is the number of OFF servers outside of the disk of radius $\rho_a$ centered at $a$. This is the result we wanted.

12.1.6 The Intuition

Recall that $\phi_x = k2^k \min_j R_{x,j}\,2^{-j}$, which for $x = a$ equals $k2^k \rho_a 2^{-n_a}$. Here is some intuition behind the use of the exponential factor $2^{-j}$. If $n_a$ is large then there are not many OFF servers outside of the disk of radius $\rho_a$, so the $2^{n_a}$ term is large, making $\phi_a$ smaller. On the other hand, if $n_a$ is small, then there are many OFF servers outside of the disk, and these contribute to a big potential function by making $k2^k/2^{n_a}$ big.

These two cases correspond to the cost we could expect. If $n_a$ is large, then Harmonic will have a reasonably high probability of moving an ONline server whose matched OFFline server is in the disk centered at $a$ of radius $\rho_a$ to cover a request at $A$. This would result in a "small" expected cost. However, by a similar argument, we would expect a larger expected cost if $n_a$ is small.

13 A k-competitive, deterministic algorithm for trees

The results of this section are due to Chrobak and Larmore [5]. Let $(V, E)$ be a tree (an undirected graph without cycles) with a positive distance function on each edge. We view each edge $e$ as an actual segment of length $d(e)$. Let $W$ denote the infinite set of all points on all edges in $E$, and let requests be any points in $W$. The cost of travelling from an endpoint of $e$ to a non-vertex point in edge $e$ will simply be the fraction of $d(e)$ proportional to the location of the point from the end.
The algorithm presented will of course also apply to the discrete case where all requests are vertices.

Definition 11 At any time, there is some request $\sigma$ which we are trying to service. We say that a server is active if there is no server located between it and $\sigma$. If there are several servers located at the same point, with no servers located between them and $\sigma$, we pick exactly one of them to be active. We can do this deterministically according to some ordering of the servers if we like, calling the highest-priority server the active one.

Note that this definition makes sense because the acyclic tree structure means that there is exactly one path between any two points.

Figure 8: Active servers are marked with an A, inactive ones with an I. Note that if all servers move with a constant speed, the closest one at the bottom will reach the request and the others will become inactive somewhere on the way to the request.

Our algorithm $A$ is simple to describe: to service a request $\sigma_i$, all active servers move towards $\sigma_i$ at constant speed, each one stopping when one of the following is true:

1. the server reaches the destination, $\sigma_i$;
2. the server is eclipsed by another server and becomes inactive.

According to the second condition, a server that is active at the beginning of a request might not be active later on in the request. As soon as a server sees another server between it and $\sigma_i$, it becomes inactive and stops moving.

13.1 Proof of k-competitiveness

We will use a potential function $\Phi$ to show that, for any sequence $\sigma = \sigma_1, \sigma_2, \ldots$ of requests, $C_A(\sigma) \le k\,C_{MIN}(\sigma)$. Let $\Phi_t$ be the value of the potential function after the on-line algorithm has processed $\sigma_t$. Let $\tilde\Phi_t$ be the value of the potential function before the on-line algorithm has processed $\sigma_t$ but after the off-line algorithm has processed it. Suppose that the on-line servers are at positions $s_1, s_2, \ldots, s_k$ and the off-line servers are at positions $a_1, a_2, \ldots, a_k$.
We define the potential function by

$$\Phi = \sum_{i<j} d(s_i, s_j) + k\,M(s,a),$$

where $M(s,a)$ is the cost of the minimum cost matching between the on-line servers and the off-line servers, i.e. $M(s,a) = \min_\pi \sum_{i=1}^{k} d(a_i, s_{\pi(i)})$ where the minimum is taken over all permutations $\pi$. Note that this is a potential function which often arises in the analysis of algorithms for the k-server problem.

[Figure: timeline of the potential values $\tilde\Phi_1, \Phi_1, \tilde\Phi_2, \Phi_2, \ldots$. Each $\tilde\Phi_t$ is taken after the request is made and an off-line server is moved; each $\Phi_t$ is taken after a server is moved by the on-line algorithm.]

Claim 23

$$\tilde\Phi_t - \Phi_{t-1} \le k\,C_{MIN}(\sigma_t), \qquad \Phi_t - \tilde\Phi_t \le -C_A(\sigma_t),$$

where $A$ is the on-line algorithm.

From this claim we can derive that $\Phi_t - \Phi_{t-1} \le k\,C_{MIN}(\sigma_t) - C_A(\sigma_t)$. If we take the sum over $t$ then the left hand side telescopes to give us $\Phi_{final} - \Phi_0 \le k\,C_{MIN}(\sigma) - C_A(\sigma)$. We know that $0 \le \Phi_{final}$ and so $C_A(\sigma) \le k\,C_{MIN}(\sigma) + \Phi_0$. We assume that all of the on-line servers and all of the off-line servers are initially at the same single point. This implies that $\Phi_0 = 0$ and so $C_A(\sigma) \le k\,C_{MIN}(\sigma)$. From this inequality we can see that the algorithm is k-competitive. It only remains for us to prove the claim.

Proof of claim. We first consider what happens when the off-line algorithm moves a server to the request. Choose $\pi$ so that $M(s,a) = \sum_{i=1}^{k} d(a_i, s_{\pi(i)})$. Suppose that the off-line algorithm chooses to move $a_l$ to the request. Since none of the on-line servers are moving, $\sum_{i<j} d(s_i, s_j)$ does not change. Hence

$$\tilde\Phi_t - \Phi_{t-1} = k\,\Delta M(s,a) \le k \cdot (\text{distance traveled by } a_l \text{ to the request}) = k\,C_{MIN}(\sigma_t).$$

We now consider how $\Phi$ changes when the on-line algorithm moves its servers. Number the on-line servers so that $s_1, s_2, \ldots, s_q$ are active and $s_{q+1}, \ldots, s_k$ are inactive. The active servers are all moving at the same speed towards the request. Suppose that they all move a small distance $\delta$. It is easy to see that without loss of generality $a_l$ is matched to an active server in the minimum cost matching. Recall that $a_l$ is the off-line server which is already at the request. Hence the distance between these two servers decreases by $\delta$.
Also, the distance between other pairs of matched servers increases by at most $\delta$. Therefore,

$$\Delta M \le (q-1)\delta - \delta = (q-2)\delta.$$

We also have to consider the change in $\sum_{i<j} d(s_i, s_j)$. Let

$$S_I = \sum_{q<i<j} d(s_i, s_j), \qquad S_{IA} = \sum_{i \le q < j} d(s_i, s_j), \qquad S_A = \sum_{i<j\le q} d(s_i, s_j).$$

Then,

$$
\begin{aligned}
\Delta S_A &= -2\delta\binom{q}{2} = -2\delta\,\frac{q(q-1)}{2} = -\delta\,q(q-1), \\
\Delta S_{IA} &= (k-q)\bigl(\delta - (q-1)\delta\bigr) = (k-q)(2-q)\delta, \\
\Delta S_I &= 0.
\end{aligned}
$$

The first equation comes from the fact that the distance between each pair of active servers decreases by $2\delta$. The second is true because there are $k-q$ inactive servers, each of which has one active server moving away from it and $q-1$ moving towards it. From the above equations we derive that $\sum_{i<j} d(s_i, s_j)$ increases by $(k-q)(2-q)\delta - q(q-1)\delta$. Hence

$$\Delta\Phi \le (k-q)(2-q)\delta - q(q-1)\delta + k(q-2)\delta = -q\delta.$$

Now $q\delta$ is the cost incurred by the on-line algorithm when it moves its $q$ active servers a distance $\delta$. Hence $\Phi_t - \tilde\Phi_t \le -C_A(\sigma_t)$, which is the inequality that we were trying to prove.

13.2 Paging as a case of k-server on a tree

Note that a special case of the k-server problem on a tree is paging. Create $M$ tree-vertices corresponding to the pages of main (slow) memory, create one dummy tree-vertex, $v$, and connect $v$ to all the other $M$ vertices using edges of length $\frac{1}{2}$. Note that the cost of moving a server from one page to another is $\frac{1}{2} + \frac{1}{2} = 1$, the cost of swapping.

Figure 9: Paging as a special case of the k-server problem on a tree.

More generally, we could let the length of the edge from $v$ to page $i$ be any positive function $f(i)$, obtaining the Generalized Paging problem, where the cost of swapping pages $i$ and $j$ is $f(i) + f(j)$.

Let us consider the behavior of the above algorithm $A$ on this special case. The resulting algorithm for paging is known as Flush-When-Full. The interpretation is simple if one keeps track of "marked" pages. When a server is at a vertex corresponding to a page $p$, this page is considered marked. As soon as the server leaves that vertex to go towards $v$, the page will be unmarked. Initially there are $k$ servers on $k$ pages.
These pages are thus marked. Suppose request $\sigma_i$ causes the first page fault. Algorithm $A$ will then move all servers towards $v$, resulting in the unmarking of all pages in fast memory. All these servers move at constant speed towards $v$ and thus will reach the middle vertex $v$ at the same time. One arbitrarily selected server at $v$ will continue to the requested page $\sigma_i$ and the other $k-1$ will become inactive. The page $\sigma_i$ will then be marked. Later, if there is a page fault on, say, $\sigma_j$ and there is at least one server at vertex $v$, one of these servers will be moved to $\sigma_j$. In terms of paging, this is interpreted as swapping $\sigma_j$ with an arbitrarily selected unmarked page of fast memory and then marking $\sigma_j$. The claim of the previous section implies that Flush-When-Full is k-competitive.

Of course we don't have to move more than one server per step. We could "pretend" to move servers simultaneously but actually just keep track of where the servers should be, by keeping track of which pages of fast memory are marked, and move one server to the request-destination per step. The cost per step would be the total distance travelled by that server since the last time it reached a request-destination. This way the cost per page fault is always exactly one.

Flush-When-Full is much like Marking, except that Marking uses randomization to select a server at a tie. Flush-When-Full is k-competitive while Marking is $2H_k$-competitive against an oblivious adversary. This shows how useful a simple randomization step can be. Flush-When-Full applies to Generalized Paging and is the only known k-competitive algorithm for that problem.

Question: Could LRU be an example of Flush-When-Full? Yes, it is. In other words, LRU would never get rid of a marked page and thus, by carefully selecting which unmarked page to remove from fast memory, Flush-When-Full reduces to LRU.
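The marked-page view of Flush-When-Full described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `flush_when_full`, its parameters, and the default eviction rule (`choose=min`, which evicts the smallest unmarked page identifier) are hypothetical choices made here; the notes only say that the evicted unmarked page is selected arbitrarily.

```python
def flush_when_full(k, initial_pages, requests, choose=min):
    """Simulate the marked-page view of Flush-When-Full.

    Returns (number of page faults, pages in fast memory, marked pages).
    `choose` picks which unmarked page to evict (an assumption of this sketch).
    """
    cache = set(initial_pages)      # pages currently in fast memory
    if len(cache) != k:
        raise ValueError("need exactly k distinct initial pages")
    marked = set(cache)             # pages that currently hold a server
    faults = 0
    for page in requests:
        if page in marked:
            continue                # a server already sits on this page
        if page not in cache:
            faults += 1
            if len(marked) == k:
                # No server is waiting at v: all servers walk to v,
                # which unmarks every page in fast memory.
                marked.clear()
            victim = choose(cache - marked)
            cache.remove(victim)
            cache.add(page)
        # A server waiting at v moves to the requested page.
        marked.add(page)
    return faults, cache, marked


faults, cache, marked = flush_when_full(2, [1, 2], [3, 1, 2, 3])
print(faults, sorted(cache), sorted(marked))
```

Passing a different `choose` function changes only which unmarked page is evicted; marked pages are never evicted, which is the property the LRU question above relies on.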
To see that a marked page is never removed from fast memory by LRU, notice that each marked page has been requested after the last request to any unmarked page.

14 Electric Network Theory

We will use electric network theory for a randomized k-server algorithm due to Coppersmith, Doyle, Raghavan, and Snir [8]. Their algorithm will be k-competitive against an adaptive on-line adversary for a subclass of metrics.

An electric network is a graph $G = (V, E)$ such that each edge $e$ has weight $\sigma_e = \frac{1}{R_e}$, where $R_e \in \mathbb{R}^+$ is called the resistance and $\sigma_e \in \mathbb{R}^+$ is called the conductance of edge $e$. We can then ask what the effective resistance (also called the equivalent resistance) between any two vertices is, i.e. the resistance which is felt when applying a difference of voltage between any two vertices. The effective conductance is the inverse of the effective resistance.

For resistances in series, the effective resistance between the endpoints is equal to the sum of the resistances. For resistances in parallel, the effective conductance is equal to the sum of the conductances. See Figures 10 and 11. In general, though, these rules are not enough to determine the effective resistance between any two vertices in an electric network (consider the case when the underlying graph is the complete graph $K_4$ on 4 vertices). In full generality, one has to use Kirchhoff's first law and the relation $V = RI$. Simply stated, the first law says that the sum of the currents entering a junction is equal to the sum of the currents leaving that junction.

Figure 10: Resistances in series. The effective resistance between $k$ and $l$ is $r_1 + r_2 + \cdots + r_x$.

Figure 11: Resistances in parallel. The effective conductance between $k$ and $l$ is $\frac{1}{r_1} + \frac{1}{r_2} + \cdots + \frac{1}{r_y}$.

Definition 12 A distance matrix $D = (d_{ij})$ is resistive if it is the effective resistance matrix of some electric network $G = (V, E)$.

Now let us use the effective resistance as a metric. This is justified by the following proposition.
Proposition 24 If $D$ is resistive then $D$ is symmetric and $D$ satisfies the triangle inequality $d_{ij} + d_{jk} \ge d_{ik}$ for all $i, j, k$.

The converse of this proposition is not necessarily true. A metric does not necessarily induce a resistive distance matrix. In fact, there are metric spaces on four points that aren't resistive. Satisfying the triangle inequality isn't enough. Metrics which correspond to resistive distance matrices will be referred to as resistive.

What are resistive matrices or resistive metrics? Here are two simple examples.

1. If $D$ is $3 \times 3$, symmetric, and satisfies the triangle inequality, then it is always resistive. Hence, the converse of the above proposition is true for $3 \times 3$ matrices. As an example, consider a triangular network with effective resistances 3, 4 and 5 (see Figure 12). We claim that the edge conductances are $\frac{3}{11}$, $\frac{2}{11}$ and $\frac{1}{11}$, i.e. the edge resistances are $\frac{11}{3}$, $\frac{11}{2}$ and $11$ (see figure). We verify it for the vertices 1 and 3. Consider the two paths between these vertices: the direct edge of conductance $\frac{2}{11}$, in parallel with the two remaining edges in series. We need the effective resistance to be 4, and thus the effective conductance to be $\frac{1}{4}$. Verify that

$$\frac{2}{11} + \frac{1}{\frac{11}{1} + \frac{11}{3}} = \frac{11}{44} = \frac{1}{4}.$$

2. A tree metric is resistive. To see this, make a tree of resistances with $R_{ij} = d(i,j)$ along the edges of the tree. Because there are no cycles, every pair of points is connected by a series of resistances. The effective resistance is the sum of the edge-resistances.

Figure 12: The given matrix of effective resistances (left: $d_{12} = 3$, $d_{13} = 4$, $d_{23} = 5$) is relabelled (right) as the corresponding electrical network, with edge conductances $\sigma_{12} = \frac{3}{11}$, $\sigma_{13} = \frac{2}{11}$ and $\sigma_{23} = \frac{1}{11}$.

Another property of resistive metrics is given by the following lemma.

Lemma 25 If $D$ is resistive then any induced submatrix $D'$ is resistive.

How can we compute an array of conductances from $D$? We will not go through the proof here, but the outline of the algorithm is as follows. Assume $D$ is $n \times n$.

- Construct an $(n-1) \times (n-1)$ matrix $\bar D$ such that $\bar d_{ij} = (d_{1i} + d_{1j} - d_{ij})/2$ for $2 \le i, j \le n$.
- Let $\bar\sigma = \bar D^{-1}$.
- Let $\sigma_{ij} = -\bar\sigma_{ij}$ for $i \ne j$, $2 \le i, j \le n$.
- Let $\sum_{j \ne i} \sigma_{ij} = \bar\sigma_{ii}$ for $2 \le i \le n$. This allows us to determine $\sigma_{i1}$ and $\sigma_{1i}$.
- Let $\sigma_{ii} = 0$ for $1 \le i \le n$.

If $D$ is resistive, the above procedure yields a matrix of non-negative conductances.

14.1 The Algorithm: RWALK

Consider the following algorithm for the k-server problem. We have servers at $a_1, \ldots, a_k$, and we get a request at $a_{k+1}$. Consider the distance matrix $D'$ among these $k+1$ elements. Assuming $D'$ is resistive, calculate $\sigma'$ and then move the server at $a_i$ to the request with probability

$$\Pr(a_i \to a_{k+1}) = \frac{\sigma'_{i,k+1}}{\sum_{1 \le j \le k} \sigma'_{j,k+1}}.$$

Notice that a shorter distance corresponds to a higher conductance, which in turn corresponds to a higher probability. Thus, this algorithm is intuitively correct because we are more likely to move a server "close" to the request.

Theorem 26 If every induced subgraph on $k+1$ points is resistive, then RWALK is k-competitive against an adaptive on-line adversary.

Two important cases covered by this theorem are:

- k-server with $k = 2$. Every metric on $k + 1 = 3$ points is resistive, so RWALK can be used and is thus 2-competitive.
- k-server on a tree. We know that $D'$ will always be resistive by Lemma 25 and hence RWALK can be used. The theorem shows that RWALK is k-competitive in this case.

Proof: We need to show

$$E[C_{RWALK}(Q)] \le k\,E[C_Q(RWALK)]$$

where $Q$ is an adaptive on-line adversary. We can rewrite this as

$$E[C_{RWALK} - k\,C_Q] \le 0.$$

We will in fact show that

$$E[\Delta\Phi + C_{RWALK} - k\,C_Q] \le 0$$

where $\Phi$ is a potential function that we will define. We will show that at every step,

$$E[\Delta\Phi + C_{RWALK}(\text{step}) - k\,C_Q(\text{step})] \le 0.$$

In words, this means that, in any single step, the cost of RWALK is at most $k$ times the cost of the adversary, once the costs are amortized according to the potential function $\Phi$. Summing over all steps, we obtain that

$$E[\Phi_n - \Phi_0 + C_{RWALK}(Q) - k\,C_Q(RWALK)] \le 0$$

for a sequence of $n$ requests. Since $\Phi_n \ge 0$ and we shall assume that $\Phi_0 = 0$, we derive the competitive factor of $k$. ($\Phi_0 = 0$ corresponds to the case in which all servers start initially from the same location (see below), and $\Phi_0 \ne 0$ would just result in a weak competitive factor of $k$.)
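As a concrete aside on Section 14.1, the conductance procedure and the RWALK move probabilities can be sketched in Python as follows. This is a minimal sketch, not the authors' implementation: the function names `invert`, `conductances` and `rwalk_probabilities` are hypothetical, exact arithmetic with `fractions.Fraction` is a choice made here for readability, and the code assumes all points are distinct (otherwise $\bar D$ is singular). The example input is the triangle with effective resistances 3, 4 and 5 from the text.

```python
from fractions import Fraction


def invert(matrix):
    """Invert a square matrix of Fractions by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = 1 / aug[col][col]
        aug[col] = [x * scale for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def conductances(dist):
    """Edge conductances sigma for a distance matrix, per the outline above.

    Index 0 here plays the role of point 1 in the text.
    """
    n = len(dist)
    d = [[Fraction(x) for x in row] for row in dist]
    # D-bar: entries (d_1i + d_1j - d_ij) / 2 over the points 2..n.
    d_bar = [[(d[0][i] + d[0][j] - d[i][j]) / 2 for j in range(1, n)]
             for i in range(1, n)]
    s_bar = invert(d_bar)
    sigma = [[Fraction(0)] * n for _ in range(n)]
    for i in range(1, n):
        for j in range(1, n):
            if i != j:
                sigma[i][j] = -s_bar[i - 1][j - 1]
        # Row i of sigma must sum to s_bar_ii, which fixes the edge to
        # point 1 as the row sum of s_bar.
        sigma[i][0] = sigma[0][i] = sum(s_bar[i - 1])
    return sigma


def rwalk_probabilities(dist):
    """Move probabilities for servers 1..k; the last point is the request."""
    sigma = conductances(dist)
    to_request = [row[-1] for row in sigma[:-1]]
    if any(c < 0 for c in to_request):
        raise ValueError("distance matrix is not resistive")
    total = sum(to_request)
    return [c / total for c in to_request]


triangle = [[0, 3, 4], [3, 0, 5], [4, 5, 0]]
sigma = conductances(triangle)
print(sigma[0][1], sigma[0][2], sigma[1][2])   # 3/11 2/11 1/11
print(rwalk_probabilities(triangle))           # servers at points 1 and 2
```

With servers at points 1 and 2 and the request at point 3, the server at distance 4 is chosen with probability 2/3 and the server at distance 5 with probability 1/3, matching the remark that a closer server is more likely to move.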
We want to measure the difference between the locations of RWALK's servers and $Q$'s servers. Let $a = (a_1, \ldots, a_k)$ be the set of locations of RWALK's servers, and let $b = (b_1, \ldots, b_k)$ be the set of locations of $Q$'s servers. Then define a potential function

$$\Phi(a,b) = \sum_{1 \le i < j \le k} d(a_i, a_j) + k\,M(a,b)$$

where $M(a,b)$ is the cost of the minimum cost matching between $a$ and $b$; in other words, $M(a,b) = \min_\pi \sum_{i=1}^{k} d(a_i, b_{\pi(i)})$ where the minimum is taken over all permutations $\pi$. Intuitively, the sum is the amount of "separation" of the elements of $a$, and the matching term is the "difference" between the algorithm and the adversary. An intuitive argument justifying the $M(a,b)$ term goes as follows. If the algorithm has to pay much more than the adversary, it means that some of the algorithm's servers are far away from the adversary's servers, implying that $M(a,b)$ is large. The reduction in $M(a,b)$ can therefore be used to pay for the move.

Let us consider the request $\sigma_i$. We decompose the processing of $\sigma_i$ into two steps.

Step 1: the adversary moves a server to $\sigma_i$.
Step 2: the on-line algorithm RWALK moves a server to $\sigma_i$.

Step 1. The adversary moves from $b_j$ to $b'_j$. Then $\Delta\Phi \le k\,d(b_j, b'_j)$ because, given the minimum matching for the old $a$ and $b$, we can get a matching for the new $a$ and $b$ by using the same matching. So if $b_j$ was matched to $a_i$, then

$$\Delta\Phi \le k\bigl(d(a_i, b'_j) - d(a_i, b_j)\bigr) \le k\,d(b_j, b'_j).$$

We also have

$$C_{RWALK}(\text{step}) = 0 \quad\text{and}\quad C_Q(\text{step}) = d(b_j, b'_j).$$

Therefore,

$$E[\Delta\Phi - k\,C_Q(\text{step})] \le 0.$$

Step 2. The on-line algorithm moves. Since the adversary has already moved a server to the requested node, assume WLOG that the request is at $b_1$ (we can always renumber). Let the minimum matching $M$ before the move be $(a_1, b_1), \ldots, (a_k, b_k)$ (again we can renumber to make this the case). Assume that RWALK moves $a_j$ to the request $b_1$. We claim that

$$\Delta M(a,b) \le d(a_1, b_j) - d(a_1, b_1) - d(a_j, b_j)$$

since a possible matching can be defined from the minimum cost matching between the old $a$'s and $b$'s by simply assigning $a_1$ to $b_j$ and $a_j$ to $b_1$:

$$a_1 - b_j, \quad a_2 - b_2, \quad \ldots, \quad a_j - b_1, \quad \ldots, \quad a_k - b_k.$$

From the triangle inequality, we have $d(a_1, b_j) - d(a_j, b_j) \le d(a_1, a_j)$ and thus $\Delta M(a,b) \le d(a_1, a_j) - d(a_1, b_1)$. Therefore,

$$\Delta\Phi \le \sum_{i \ne j} \bigl(d(a_i, b_1) - d(a_i, a_j)\bigr) + k\bigl(d(a_1, a_j) - d(a_1, b_1)\bigr).$$

We also have $C_{RWALK}(\text{step}) = d(a_j, b_1)$ and hence

$$
\begin{aligned}
\Delta\Phi + C_{RWALK}(\text{step})
&\le \sum_{i \ne j} \bigl(d(a_i, b_1) - d(a_i, a_j)\bigr) + k\bigl(d(a_1, a_j) - d(a_1, b_1)\bigr) + d(a_j, b_1) \\
&= \sum_{i} \bigl(d(a_i, b_1) - d(a_i, a_j) + d(a_1, a_j) - d(a_1, b_1)\bigr).
\end{aligned}
$$

...