
# LCS.efficient(1) - An Efficient Algorithm for the LCS...


An Efficient Algorithm for the LCS Problem

Longest Common Subsequence Problem

The longest common subsequence problem, also called the LCS problem, is a special case of the similarity problem.

Definition: Given a string S of length n, a subsequence is a string $S(i_1)S(i_2)\cdots S(i_k)$ such that $1 \le i_1 < i_2 < i_3 < \cdots < i_k \le n$ for some $k \le n$. A substring is a part of S whose characters are located contiguously, whereas the characters of a subsequence are not necessarily contiguous but remain in order from left to right. Thus every substring is a subsequence, but the converse is not true.
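The distinction between the two notions can be checked mechanically. The following is a minimal sketch; the helper names `is_subsequence` and `is_substring` are illustrative and do not come from the slides.

```python
def is_subsequence(t, s):
    """Return True if the characters of t occur in s in order (gaps allowed)."""
    it = iter(s)
    # `ch in it` advances the iterator until ch is found, so the search for
    # the next character resumes after the previous match: order is preserved.
    return all(ch in it for ch in t)


def is_substring(t, s):
    """Return True if t occurs in s as one contiguous block."""
    return t in s


s = "AATGGCCATA"
print(is_subsequence("AGCA", s), is_substring("AGCA", s))  # subsequence only
print(is_subsequence("GGCC", s), is_substring("GGCC", s))  # both
```

Every string that passes `is_substring` also passes `is_subsequence`, which matches the statement that a substring is always a subsequence but not the other way around.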
Longest Common Subsequence

Definition: The longest common subsequence (LCS) of two strings S1 and S2 is the longest subsequence common to both strings.

S1: A  -- A  T  -- G  G  C  C  -- A  T  A     n = 10
S2: A  T  A  T  A  A  T  T  C  T  A  T  --    m = 12

The LCS is AATCAT, and its length is 6. The solution is not necessarily unique. Consider the pair (ATTA, ATAT): the solutions are ATT and ATA. In general, an arbitrary pair of strings may have many solutions.
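The non-uniqueness of the LCS can be confirmed on the small pair (ATTA, ATAT) by brute force. This is only a sketch for very short strings, since it enumerates exponentially many candidates; the function name `all_lcs_bruteforce` is an illustrative assumption, not something from the slides.

```python
from itertools import combinations


def is_subsequence(t, s):
    """Return True if the characters of t occur in s in order."""
    it = iter(s)
    return all(ch in it for ch in t)


def all_lcs_bruteforce(s1, s2):
    """Return the set of all longest common subsequences of s1 and s2.

    Tries candidate lengths from longest to shortest and stops at the first
    length for which some subsequence of s1 is also a subsequence of s2.
    """
    for k in range(min(len(s1), len(s2)), 0, -1):
        # combinations() keeps the original left-to-right order of s1.
        found = {"".join(c) for c in combinations(s1, k) if is_subsequence(c, s2)}
        if found:
            return found
    return {""}


print(sorted(all_lcs_bruteforce("ATTA", "ATAT")))
```

For this pair the function returns exactly the two solutions named above, ATA and ATT.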

LCS Problem

The LCS can be found with a dynamic programming formulation. Because it uses the general dynamic programming algorithm, its complexity is O(nm). The longest common substring problem, on the other hand, has an O(n+m) solution. Subsequences are much more complex than substrings. Can we do better for the LCS problem? We will see.
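For reference, the standard O(nm) dynamic programming formulation can be sketched as follows. This is the textbook table-filling method, not the more efficient algorithm the slides go on to develop; the function name `lcs` and the tie-breaking rule in the traceback are choices made for this sketch.

```python
def lcs(s1, s2):
    """Return one longest common subsequence of s1 and s2 in O(n*m) time."""
    n, m = len(s1), len(s2)
    # L[i][j] = length of the LCS of s1[:i] and s2[:j].
    L = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if s1[i - 1] == s2[j - 1]:
                L[i][j] = L[i - 1][j - 1] + 1
            else:
                L[i][j] = max(L[i - 1][j], L[i][j - 1])

    # Trace back from the bottom-right corner to recover one solution.
    out = []
    i, j = n, m
    while i > 0 and j > 0:
        if s1[i - 1] == s2[j - 1]:
            out.append(s1[i - 1])
            i -= 1
            j -= 1
        elif L[i - 1][j] >= L[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


result = lcs("AATGGCCATA", "ATATAATTCTAT")
print(result, len(result))
```

Because several solutions of the same length may exist, the string returned depends on how ties are broken during the traceback; only the length is determined by the input.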
LCS for S1 and S2

S1: A  -- A  T  -- G  G  C  C  -- A  T  A     n = 10
S2: A  T  A  T  A  A  T  T  C  T  A  T  --    m = 12

The optimal alignment is shown above. Note that the alignment shows three insert operations (dark), one delete operation (green) and three substitution or replacement operations (blue), which gives an edit distance of 7. However, the 3 replacement operations can be realized by 3 insert and 3 delete operations, because a replacement is equivalent to first deleting the character and then inserting a character in its place, like:

G  -- G  -- C  --
-- A  -- T  -- T
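The operation counts quoted for this alignment can be re-derived by walking over its columns. The sketch below assumes the alignment is written as two equal-length rows with `-` standing for a gap; the helper name `count_operations` and the convention that a gap in the first row counts as an insert are assumptions of this example.

```python
GAP = "-"


def count_operations(row1, row2):
    """Classify each column of a two-row alignment and count the results."""
    if len(row1) != len(row2):
        raise ValueError("alignment rows must have the same length")
    counts = {"match": 0, "insert": 0, "delete": 0, "replace": 0}
    for a, b in zip(row1, row2):
        if a == GAP:
            counts["insert"] += 1    # character present only in S2
        elif b == GAP:
            counts["delete"] += 1    # character present only in S1
        elif a == b:
            counts["match"] += 1
        else:
            counts["replace"] += 1
    return counts


# The alignment of S1 and S2 from the slide, one character per column.
row1 = "A-AT-GGCC-ATA"
row2 = "ATATAATTCTAT-"
ops = count_operations(row1, row2)
print(ops)
print("edit distance:", ops["insert"] + ops["delete"] + ops["replace"])
```

The six matching columns spell out the LCS, and the remaining columns account for the edit distance of 7.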

Edit Distance and LCS are related if we give a cost of 2 for replace operation and cost of 1 for both insert and
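The sentence above is cut off in this preview. Assuming it continues with "delete", the cost model is replace = 2 and insert = delete = 1, and under that model the well-known identity is: weighted edit distance = n + m - 2 × (LCS length). The sketch below checks this identity numerically on the example strings; the function names and default costs are assumptions of this example, not taken from the slides.

```python
def weighted_edit_distance(s1, s2, indel=1, replace=2):
    """Edit distance with configurable insert/delete and replace costs."""
    n, m = len(s1), len(s2)
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        D[i][0] = i * indel          # delete all of s1[:i]
    for j in range(1, m + 1):
        D[0][j] = j * indel          # insert all of s2[:j]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if s1[i - 1] == s2[j - 1] else replace
            D[i][j] = min(D[i - 1][j] + indel,
                          D[i][j - 1] + indel,
                          D[i - 1][j - 1] + sub)
    return D[n][m]


def lcs_length(s1, s2):
    """Length of the LCS, keeping only one previous row of the DP table."""
    prev = [0] * (len(s2) + 1)
    for a in s1:
        cur = [0]
        for j, b in enumerate(s2, 1):
            cur.append(prev[j - 1] + 1 if a == b else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[len(s2)]


s1, s2 = "AATGGCCATA", "ATATAATTCTAT"
d = weighted_edit_distance(s1, s2)
print(d, len(s1) + len(s2) - 2 * lcs_length(s1, s2))
```

With a replacement costing the same as one delete plus one insert, every character outside the LCS is paid for exactly once, which is why the two printed values agree.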