
# Lecture12_SuffixTrees - String algorithms 6.046...


## String algorithms II

Prof. Manolis Kellis, 6.046 – Introduction to Algorithms – Spring ’05, Lecture 12

## String algorithms

Last time: Exact string matching

• Naïve algorithm
• Fundamental pre-processing
• Knuth-Morris-Pratt / Boyer-Moore / Z-algorithm
• Semi-numerical string matching
• Rabin-Karp algorithm

Today: String matching II

• Suffix-trees
• Linear time construction
• Applications

Recitation:

• More on Suffix Trees
• Finite State Machines
• Regular Expression Matching

## Where have we gotten so far?

Last time

• Fundamental preprocessing in linear time
• Searching for pattern p in linear time: O(|Text|)

Today’s challenge: Can we do better?

• Searching for any pattern p in linear time O(|pattern|)
• After pre-processing the text once

[Figure: a text T of length n and a pattern P of length m; the slide's example pattern is P = ‘Knuth’]

## More involved pre-processing step

Fundamental pre-processing only searched for:

– Common prefix / suffix at any position
– Redundancy with beginning/end of string

Suffix trees

– Redundancy across all substrings starting at every position over the remainder of the list

• Example:

– Suffix tree of xabxac

[Figure: suffix tree of xabxac]

## Suffix tree definition

Definition: Suffix tree T for string S (of length n)

• Rooted, directed tree T, n leaves, numbered 1..n
• Path to leaf i spells out the suffix S[i..], by concatenating edge labels
• Common prefixes share common paths, diverge to form internal nodes

⇒ Effectively exhibit common prefixes of every suffix
⇒ Explores full substring redundancy structure of S

[Figure: suffix tree of xabxac drawn in successive stages, with leaves numbered 1..6]

## Exact string matching with suffix trees

Given the suffix tree for text T

• Search pattern P in O(|pattern|) time

– For every character in P, traverse the appropriate path of the tree, reading one character each time
– If P is not found in a path, P does not occur in T
– If P is found in its entirety, then all occurrences of P …
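The matching procedure on this slide can be tried out with a small sketch. What follows is an assumption-laden illustration, not the lecture's construction: it builds an uncompressed suffix trie (one character per edge) by inserting every suffix in turn, which takes O(n²) time and space rather than the linear time the lecture refers to. The names `Node`, `build_suffix_trie`, and `find_occurrences` are hypothetical, and the `$` terminator is assumed not to occur in the text or the pattern. It relies on the standard property that, once P has been matched from the root, the leaves below the point reached are exactly the starting positions of P in the text.

```python
class Node:
    """One node of an uncompressed suffix trie (hypothetical helper)."""

    def __init__(self):
        self.children = {}  # next character -> child Node
        self.leaf = None    # 1-based start of the suffix ending here, if any


def build_suffix_trie(s, terminator="$"):
    """Insert every suffix of s + terminator, one character per edge.

    Naive O(n^2) construction, NOT the linear-time algorithm.
    Assumes the terminator does not occur in s, so that every
    suffix ends at its own leaf.
    """
    assert terminator not in s
    text = s + terminator
    root = Node()
    for i in range(len(s)):
        node = root
        for ch in text[i:]:
            node = node.children.setdefault(ch, Node())
        node.leaf = i + 1  # leaves are numbered 1..n, as in the definition
    return root


def find_occurrences(root, pattern):
    """Return the sorted 1-based start positions of pattern in the text."""
    node = root
    # Walk down from the root, reading one character of the pattern per step.
    for ch in pattern:
        node = node.children.get(ch)
        if node is None:
            return []  # the path breaks off: pattern does not occur
    # Pattern fully matched: every leaf below this node is an occurrence.
    found, stack = [], [node]
    while stack:
        current = stack.pop()
        if current.leaf is not None:
            found.append(current.leaf)
        stack.extend(current.children.values())
    return sorted(found)


trie = build_suffix_trie("xabxac")
print(find_occurrences(trie, "xa"))  # [1, 4]
print(find_occurrences(trie, "a"))   # [2, 5]
print(find_occurrences(trie, "xb"))  # []
```

The walk from the root takes one step per pattern character, independent of the text length. Collecting the leaves costs time proportional to the size of the subtree in this uncompressed sketch; in a true suffix tree with compressed edges it is proportional to the number of occurrences.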

