Lecture12_SuffixTrees

String algorithms II
Prof. Manolis Kellis
6.046 – Introduction to Algorithms – Spring ’05
Lecture 12

String algorithms
• Last time: Exact string matching
  – Naïve algorithm
  – Fundamental pre-processing
  – Knuth-Morris-Pratt / Boyer-Moore / Z-algorithm
  – Semi-numerical string matching: Rabin-Karp algorithm
• Today: String matching II
  – Suffix-trees
  – Linear time construction
  – Applications
• Recitation: More on Suffix Trees
  – Finite State Machines
  – Regular Expression Matching

Where have we gotten so far?
• Last time
  – Fundamental preprocessing in linear time
  – Searching for pattern p in linear time: O(|Text|)
• Today’s challenge: Can we do better?
  – Searching for any pattern p in linear time O(|pattern|)
  – After pre-processing the text once
[Figure: text T of length n and pattern P of length m; example pattern P = ‘Knuth’]

More involved pre-processing step
• Fundamental pre-processing only searched for:
  – Common prefix / suffix at any position
  – Redundancy with beginning/end of string
• Suffix trees
  – Redundancy across all substrings starting at every position over the remainder of the list
• Example:
  – Suffix tree of xabxac
[Figure: suffix tree of xabxac]

Suffix tree definition
• Definition: Suffix tree T for string S (of length n)
  – Rooted, directed tree T, n leaves, numbered 1..n
  – Path to leaf i spells out the suffix S[i..], by concatenating edge labels
  – Common prefixes share common paths, diverge to form internal nodes
=> Effectively exhibit common prefixes of every suffix
=> Explores full substring redundancy structure of S
[Figure: suffix trees of xabxac with leaves numbered by suffix start position]

Exact string matching with suffix trees
• Given the suffix tree for text T
• Search pattern P in O(|pattern|) time
  – For every character in P, traverse the appropriate path of the tree, reading one character each time
  – If P is not found in a path, P does not occur in T
  – If P is found in its entirety, then all occurrences of P
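
The definition and the matching procedure above can be tried out with a small sketch. This is an illustration under stated assumptions, not the lecture's construction: the tree is built by naive suffix-by-suffix insertion, which takes quadratic time rather than the linear time the lecture refers to. The names Node, build_suffix_tree and find_all are hypothetical, and a terminator character '$' (assumed not to occur in the text) is appended so that every suffix ends at its own leaf. The search relies on the standard fact that the occurrences of P are the leaves below the point where P ends in the tree.

```python
class Node:
    def __init__(self):
        # first character of an edge label -> (edge label, child Node)
        self.children = {}
        # 1-based start position of the suffix, set only on leaves
        self.leaf = None


def build_suffix_tree(s, terminator="$"):
    """Naive O(n^2) build: insert every suffix of s + terminator in turn.

    Assumes the terminator does not occur in s. This is NOT a
    linear-time construction; it only produces the same kind of tree.
    """
    text = s + terminator
    root = Node()
    for i in range(len(text)):
        node, rest = root, text[i:]
        while True:
            first = rest[0]
            if first not in node.children:
                # No edge starts with this character: hang a new leaf here.
                leaf = Node()
                leaf.leaf = i + 1
                node.children[first] = (rest, leaf)
                break
            label, child = node.children[first]
            # Length of the common prefix of the edge label and the suffix.
            k = 0
            while k < len(label) and k < len(rest) and label[k] == rest[k]:
                k += 1
            if k == len(label):
                # Whole edge matched: continue from the child node.
                node, rest = child, rest[k:]
                continue
            # Mismatch inside the edge: split it at position k.
            mid = Node()
            mid.children[label[k]] = (label[k:], child)
            node.children[first] = (label[:k], mid)
            node, rest = mid, rest[k:]
    return root


def find_all(root, pattern):
    """Return sorted 1-based start positions of pattern in the text."""
    node, rest = root, pattern
    while rest:
        entry = node.children.get(rest[0])
        if entry is None:
            return []          # no path spells the pattern
        label, child = entry
        k = min(len(label), len(rest))
        if label[:k] != rest[:k]:
            return []          # mismatch partway along an edge
        node, rest = child, rest[k:]
    # Pattern fully matched: every leaf below this point is an occurrence.
    starts, stack = [], [node]
    while stack:
        cur = stack.pop()
        if cur.leaf is not None:
            starts.append(cur.leaf)
        stack.extend(child for _, child in cur.children.values())
    return sorted(starts)


tree = build_suffix_tree("xabxac")
print(find_all(tree, "xa"))   # [1, 4]
print(find_all(tree, "xb"))   # []
```

For xabxac the sketch yields the tree from the slides: the edges xa and a each lead to an internal node that branches on b and c. String slicing is used for brevity; an implementation that tracks indices into the text instead would avoid copying and keep the walk at one step per character of P.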