Algorithms – Sequence Alignment

Sequence Alignment II
Design and Analysis of Algorithms
Andrei Bulatov

Alignments

Let $X = x_1, x_2, \ldots, x_m$ and $Y = y_1, y_2, \ldots, y_n$ be two strings.
A matching is a set of ordered pairs $(i,j)$, where $i$ is a position in $X$ and $j$ is a position in $Y$, such that each position occurs in at most one pair.
A matching is an alignment if there are no crossing pairs: if $(i,j)$ and $(i',j')$ are in the matching and $i < i'$, then $j < j'$.
Example: the strings "ocurrance" and "occurrence".

The Problem

Let $M$ be an alignment between $X$ and $Y$.
Each position of $X$ or $Y$ that is not matched in $M$ is called a gap.
Each pair $(i,j) \in M$ such that $x_i \ne y_j$ is called a mismatch.
The cost of $M$ is given as follows:
There is $\delta > 0$, a gap penalty. For each gap in $M$ we incur a cost of $\delta$.
For each pair of letters $p, q$ in the alphabet, there is a mismatch cost $\alpha_{pq}$. For each $(i,j) \in M$ we pay the mismatch cost $\alpha_{x_i y_j}$. Usually, $\alpha_{pp} = 0$.
The cost of $M$ is the sum of its gap penalties and mismatch costs.

The Problem (cntd)

The Sequence Alignment Problem
Instance: Sequences $X$ and $Y$
Objective: Find an alignment between $X$ and $Y$ of minimal cost.

Graph Based Approach

Having $X = x_1, x_2, \ldots, x_m$ and $Y = y_1, y_2, \ldots, y_n$, construct a square grid-like graph $G_{XY}$ with nodes $(i,j)$, $0 \le i \le m$, $0 \le j \le n$.
[Figure: the grid graph $G_{XY}$, with one axis labeled $x_1, x_2, x_3$ and the other $y_1, y_2, y_3, y_4$]
Weights:
$\delta$ on each horizontal or vertical arc
$\alpha_{x_{i+1} y_{j+1}}$ on the diagonal arc from $(i,j)$ to $(i+1, j+1)$

Lemma
Let $f(i,j)$ denote the minimum weight of a path from $(0,0)$ to $(i,j)$ in $G_{XY}$. Then for all $i, j$, we have $f(i,j) = OPT(i,j)$.
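The lemma above says that minimum path weights in the grid graph equal optimal alignment costs, so the values $f(i,j)$ can be filled in row by row. The following is a minimal Python sketch of that idea, not the lecture's own code. The names `forward_costs` and `unit_mismatch` are hypothetical, and the cost values ($\delta = 1$, mismatch cost 1 for different letters and 0 for equal letters) are assumptions chosen only for illustration.

```python
def forward_costs(x, y, delta, alpha):
    """Return a table f with f[i][j] = minimum weight of a path (0,0) -> (i,j).

    delta is the gap penalty (weight of horizontal and vertical arcs);
    alpha(p, q) is the mismatch cost (weight of diagonal arcs).
    """
    m, n = len(x), len(y)
    f = [[0] * (n + 1) for _ in range(m + 1)]
    # Along the borders only gap arcs are available.
    for i in range(1, m + 1):
        f[i][0] = i * delta
    for j in range(1, n + 1):
        f[0][j] = j * delta
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            f[i][j] = min(
                alpha(x[i - 1], y[j - 1]) + f[i - 1][j - 1],  # diagonal arc
                delta + f[i - 1][j],                          # gap in y
                delta + f[i][j - 1],                          # gap in x
            )
    return f


def unit_mismatch(p, q):
    """Assumed mismatch cost: 0 for equal letters, 1 otherwise."""
    return 0 if p == q else 1


f = forward_costs("ocurrance", "occurrence", 1, unit_mismatch)
print(f[-1][-1])  # cost of an optimal alignment under the assumed costs
```

Python strings are 0-indexed, so `x[i - 1]` plays the role of $x_i$ in the slides.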

Backward Search

We introduce another function related to OPT.
Let $g(i,j)$ denote the length of a shortest path from $(i,j)$ to $(m,n)$ in $G_{XY}$.
[Figure: the grid graph $G_{XY}$, with one axis labeled $x_1, x_2, x_3$ and the other $y_1, y_2, y_3, y_4$]

Lemma
For all $i < m$ and $j < n$, we have
$g(i,j) = \min\{\alpha_{x_{i+1} y_{j+1}} + g(i+1, j+1),\ \delta + g(i+1, j),\ \delta + g(i, j+1)\}$

Backward Search (cntd)

Lemma
The length of the shortest corner-to-corner path in $G_{XY}$ that passes through $(i,j)$ is $f(i,j) + g(i,j)$.

Proof
Let $k$ denote the length of a shortest corner-to-corner path that passes through $(i,j)$.
It splits into two parts: from $(0,0)$ to $(i,j)$, and from $(i,j)$ to $(m,n)$.
The length of the first part is $\ge f(i,j)$, the length of the second...
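The backward recurrence for $g$ and the $f + g$ lemma can be checked numerically. The sketch below is a hedged illustration in Python rather than the lecture's implementation. The function names, the unit costs, and the border values of $g$ (only gap arcs remain once $i = m$ or $j = n$) are assumptions; the border values follow from the definition of $g$ but are not stated in the slides.

```python
def forward_costs(x, y, delta, alpha):
    """f[i][j] = minimum weight of a path from (0,0) to (i,j)."""
    m, n = len(x), len(y)
    f = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        f[i][0] = i * delta
    for j in range(1, n + 1):
        f[0][j] = j * delta
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            f[i][j] = min(alpha(x[i - 1], y[j - 1]) + f[i - 1][j - 1],
                          delta + f[i - 1][j],
                          delta + f[i][j - 1])
    return f


def backward_costs(x, y, delta, alpha):
    """g[i][j] = length of a shortest path from (i,j) to (m,n)."""
    m, n = len(x), len(y)
    g = [[0] * (n + 1) for _ in range(m + 1)]
    # Assumed border values: from the last row or column only gap arcs remain.
    for i in range(m - 1, -1, -1):
        g[i][n] = (m - i) * delta
    for j in range(n - 1, -1, -1):
        g[m][j] = (n - j) * delta
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            # x[i] and y[j] are x_{i+1} and y_{j+1} in the slides' 1-based notation.
            g[i][j] = min(alpha(x[i], y[j]) + g[i + 1][j + 1],
                          delta + g[i + 1][j],
                          delta + g[i][j + 1])
    return g


def unit_mismatch(p, q):
    """Assumed mismatch cost: 0 for equal letters, 1 otherwise."""
    return 0 if p == q else 1


x, y = "ocurrance", "occurrence"
f = forward_costs(x, y, 1, unit_mismatch)
g = backward_costs(x, y, 1, unit_mismatch)
opt = f[len(x)][len(y)]

# f(i,j) + g(i,j) is the length of a shortest corner-to-corner path through (i,j),
# so it can never be smaller than the overall optimum.
through = [[f[i][j] + g[i][j] for j in range(len(y) + 1)] for i in range(len(x) + 1)]
print(opt, g[0][0], min(min(row) for row in through))
```

Every corner-to-corner path visits each row index $i$ at least once, so the minimum of $f(i,j) + g(i,j)$ over $j$ equals the optimum for every fixed $i$.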
This note was uploaded on 11/11/2009 for the course CS 405/705 taught by Professor Bulatov during the Fall '09 term at Simon Fraser.
