
# Every diagonal edge adds an extra element to common subsequence

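## Every common subsequence is a path in 2-D grid

For the example strings $v$ = ATCTGATC and $w$ = TGCATAC, a common subsequence uses

- positions in $v$: 2 < 3 < 4 < 6 < 8
- positions in $w$: 1 < 3 < 5 < 6 < 7

and corresponds to the grid path

(0,0)(1,0)(2,1)(2,2)(3,3)(3,4)(4,5)(5,5)(6,6)(7,6)(8,7)

| i coords | 0 | 1 | 2 | 2 | 3 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| elements of v | | A | T | - | C | - | T | G | A | T | C |
| elements of w | | - | T | G | C | A | T | - | A | - | C |
| j coords | 0 | 0 | 1 | 2 | 3 | 4 | 5 | 5 | 6 | 6 | 7 |

The matches (shown in red on the slide) are the five columns in which both rows carry the same letter.

## Edit Graph for LCS Problem

- Every path is a common subsequence.
- Every diagonal edge adds an extra element to the common subsequence.
- LCS Problem: find a path with the maximum number of diagonal edges.

## Computing LCS

Let $v_i$ be the prefix of $v$ of length $i$, that is $v_1 \ldots v_i$, and let $w_j$ be the prefix of $w$ of length $j$, that is $w_1 \ldots w_j$. The length of $LCS(v_i, w_j)$ is computed by:

$$
s_{i,j} = \max
\begin{cases}
s_{i-1,j} \\
s_{i,j-1} \\
s_{i-1,j-1} + 1 & \text{if } v_i = w_j
\end{cases}
$$

In the condition $v_i = w_j$, the symbols denote the $i$-th character of $v$ and the $j$-th character of $w$. The three cases correspond to the three edges entering node $(i,j)$ from $(i-1,j)$, $(i,j-1)$ and $(i-1,j-1)$.

## Every Path in the Grid Corresponds to an Alignment

| coords of V | 0 | 1 | 2 | 2 | 3 | 4 |
|---|---|---|---|---|---|---|
| V | | A | T | - | G | T |
| W | | A | T | C | G | - |
| coords of W | 0 | 1 | 2 | 3 | 4 | 4 |

## Edit Distance

Levenshtein (1966) introduced the edit distance between two strings as the minimum number of elementary operations (insertions, deletions, and substitutions) needed to transform one string into the other:

$d(v, w)$ = minimum number of elementary operations to transform $v$ into $w$

## Edit Distance: Example

TGCATAT → ATCCGAT in 5 steps:

TGCATAT → TGCATA → TGCAT → ATGCAT...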
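The recurrence for $s_{i,j}$ and the definition of edit distance above can both be sketched as dynamic programs over the same grid. The following is a minimal Python sketch, not code from the course: the function names `lcs_table`, `lcs` and `edit_distance` are hypothetical, the strings ATCTGATC and TGCATAC are the example pair as reconstructed from the slide, and the edit distance recurrence is the standard textbook one, which the text itself does not spell out.

```python
def lcs_table(v, w):
    """Fill s[i][j] = length of an LCS of the prefixes v[:i] and w[:j]."""
    n, m = len(v), len(w)
    s = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Vertical and horizontal edges add nothing to the subsequence.
            s[i][j] = max(s[i - 1][j], s[i][j - 1])
            # A diagonal edge exists only on a match and adds one element.
            if v[i - 1] == w[j - 1]:
                s[i][j] = max(s[i][j], s[i - 1][j - 1] + 1)
    return s


def lcs(v, w):
    """Backtrack through the table to recover one longest common subsequence."""
    s = lcs_table(v, w)
    i, j = len(v), len(w)
    out = []
    while i > 0 and j > 0:
        if v[i - 1] == w[j - 1] and s[i][j] == s[i - 1][j - 1] + 1:
            out.append(v[i - 1])  # diagonal edge: matched character
            i -= 1
            j -= 1
        elif s[i][j] == s[i - 1][j]:
            i -= 1  # vertical edge
        else:
            j -= 1  # horizontal edge
    return "".join(reversed(out))


def edit_distance(v, w):
    """Minimum number of insertions, deletions and substitutions (assumed
    standard recurrence; each operation costs 1)."""
    n, m = len(v), len(w)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i  # delete all i characters of v[:i]
    for j in range(m + 1):
        d[0][j] = j  # insert all j characters of w[:j]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if v[i - 1] == w[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # match or substitution
            )
    return d[n][m]


if __name__ == "__main__":
    print(lcs_table("ATCTGATC", "TGCATAC")[8][7])  # 5
    print(lcs("ATCTGATC", "TGCATAC"))
    print(edit_distance("TGCATAT", "ATCCGAT"))
```

Both tables have $(n+1)(m+1)$ entries, so each function runs in time proportional to the product of the string lengths. Any explicit sequence of operations, such as the 5-step one started in the example, only gives an upper bound on $d(v, w)$; the table computes the minimum.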

## This note was uploaded on 02/10/2014 for the course CS 548 taught by Professor Asa Ben-Hur during the Spring '12 term at Colorado State.
