lect25-longest-common-subseq - Lecture Notes CMSC 251...

Lecture Notes CMSC 251

Lecture 25: Longest Common Subsequence
(April 28, 1998)

Read: Section 16.3 in CLR.

Strings: One important area of algorithm design is the study of algorithms for character strings. There are a number of important problems here. Among the most important is efficiently searching for a substring, or more generally a pattern, in a large piece of text. (This is what text editors and functions like "grep" do when you perform a search.) In many instances you do not want to find a piece of text exactly, but rather something that is "similar". This arises, for example, in genetics research. Genetic codes are stored as long DNA molecules. A DNA strand can be broken down into a long sequence of units, each of which is one of four basic types: C, G, T, A.

But exact matches rarely occur in biology because of small changes in DNA replication, and exact substring search will only find exact matches. For this reason, it is of interest to compute similarities between strings that do not match exactly. A measure of string similarity should be insensitive to random insertions and deletions of characters from some originating string. There are a number of measures of similarity in strings. One is the edit distance, that is, the minimum number of single-character insertions, deletions, or transpositions necessary to convert one string into another. The other, which we will study today, is the length of the longest common subsequence.

Longest Common Subsequence: Let us think of character strings as sequences of characters. Given two sequences $X = \langle x_1, x_2, \ldots, x_m \rangle$ and $Z = \langle z_1, z_2, \ldots, z_k \rangle$, we say that $Z$ is a subsequence of $X$ if there is a strictly increasing sequence of $k$ indices $\langle i_1, i_2, \ldots, i_k \rangle$ (with $1 \le i_1 < i_2 < \cdots < i_k \le m$) such that $Z = \langle x_{i_1}, x_{i_2}, \ldots, x_{i_k} \rangle$. For example, let $X = \langle \mathrm{ABRACADABRA} \rangle$ and let $Z = \langle \mathrm{AADAA} \rangle$; then $Z$ is a subsequence of $X$, using for instance the indices $\langle 1, 4, 7, 8, 11 \rangle$.
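The subsequence definition can be checked mechanically. The following is a minimal Python sketch, not part of the original notes; the function name subsequence_indices is a hypothetical helper chosen for illustration. It scans X from left to right and greedily matches each character of Z at the earliest available position, returning the 1-based indices it used, or None if Z is not a subsequence of X.

```python
from typing import List, Optional


def subsequence_indices(z: str, x: str) -> Optional[List[int]]:
    """Return strictly increasing 1-based indices into x that spell out z,
    or None if z is not a subsequence of x.

    Hypothetical helper for illustration; uses greedy leftmost matching.
    """
    indices: List[int] = []
    pos = 0  # 0-based position in x from which the next match may be taken
    for ch in z:
        # Find the earliest occurrence of ch at or after pos.
        pos = x.find(ch, pos)
        if pos == -1:
            return None  # ch cannot be matched, so z is not a subsequence
        indices.append(pos + 1)  # convert to the 1-based indexing of the notes
        pos += 1  # the next match must come strictly later in x
    return indices


if __name__ == "__main__":
    print(subsequence_indices("AADAA", "ABRACADABRA"))  # [1, 4, 7, 8, 11]
    print(subsequence_indices("ABC", "ACB"))            # None
```

Greedy leftmost matching suffices here because taking the earliest possible match for each character never rules out a later match. The sketch runs in time proportional to the length of X.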

This note was uploaded on 01/13/2012 for the course CMSC 351 taught by Professor Staff during the Fall '11 term at University of Louisville.

