# suffix - Exact String Matching 15-853 Algorithms in the...

15-853: Algorithms in the Real World, Suffix Trees

## Exact String Matching

Given a text T of length m and a pattern P of length n, "quickly" find an occurrence (or all occurrences) of P in T.

A naive solution: compare P with T[i..i+n-1] for every i, which takes O(nm) time.

How about O(n+m) time? (Knuth-Morris-Pratt)

How about O(m) preprocessing time and O(n) search time?

## Suffix Trees

Preprocess the text in O(m) time and search in O(n) time.

• Idea: construct a tree containing all suffixes of the text along the paths from the root to the leaves.

For search, just follow the appropriate path.

[Figure: a suffix tree for the string xabxac, with leaves labeled 1 through 6. Example: search for the string abx.]
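As a baseline for the bounds on the Exact String Matching slide, the naive comparison can be sketched as follows. This is a minimal illustration, not code from the course; the function name `naive_find_all` and the choice to report 1-based positions (to match the leaf labels in the figure) are assumptions.

```python
def naive_find_all(text: str, pattern: str) -> list[int]:
    """Return every 1-based start position at which pattern occurs in text."""
    m, n = len(text), len(pattern)
    hits = []
    # There are m - n + 1 candidate windows; each comparison costs O(n),
    # so the whole scan is O(nm) in the worst case.
    for i in range(m - n + 1):
        if text[i:i + n] == pattern:
            hits.append(i + 1)  # convert the 0-based index to a 1-based position
    return hits


print(naive_find_all("xabxac", "abx"))  # [2]
print(naive_find_all("xabxac", "xa"))   # [1, 4]
```

Every window is re-examined from scratch, which is the redundancy that Knuth-Morris-Pratt and suffix trees avoid.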

## Constructing Suffix Trees

Naive O(m^2) algorithm: for every i, add the suffix S[i..m] to the current tree.

[Figure, built up over three slides: the tree for xabxac after inserting suffixes 1 to 3, then suffixes 1 to 4, then all six suffixes.]

## Ukkonen's Linear-Time Algorithm

We will start with an O(m^3) algorithm and then give a series of improvements.

In stage i, we construct a suffix tree T_i for S[1..i].

Incrementing T_i to T_{i+1} naively takes O(i^2) time because we insert each of the i suffixes, each taking O(i) time. Summed over all m stages, this gives a total of O(m^3) time.
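Returning to the naive O(m^2) construction, before any of Ukkonen's improvements, the following is a minimal sketch of inserting every suffix in turn and then searching by following a path from the root. The names `Node`, `build_suffix_tree`, and `occurrences` are illustrative and not from the slides. The sketch assumes the last character of the string occurs nowhere else (true for xabxac), so that every suffix ends at its own leaf.

```python
class Node:
    def __init__(self, start=0, end=0, leaf=None):
        self.start = start    # the edge entering this node is labeled s[start:end]
        self.end = end
        self.children = {}    # first character of a child edge -> child Node
        self.leaf = leaf      # 1-based suffix number for a leaf, None otherwise


def build_suffix_tree(s: str) -> Node:
    """Naive construction: insert s[i:] for every i, splitting edges on mismatch."""
    if s and s[-1] in s[:-1]:
        raise ValueError("sketch assumes a unique final character, e.g. append '$'")
    root = Node()
    for i in range(len(s)):
        node, pos = root, i
        while True:
            child = node.children.get(s[pos])
            if child is None:
                # No edge starts with s[pos]: hang the rest of the suffix here.
                node.children[s[pos]] = Node(pos, len(s), leaf=i + 1)
                break
            # Walk down the edge for as long as it agrees with the suffix.
            k = child.start
            while k < child.end and pos < len(s) and s[k] == s[pos]:
                k += 1
                pos += 1
            if k == child.end:
                node = child  # edge fully matched, continue below it
                continue
            # Mismatch inside the edge: split it at k and add a new leaf.
            mid = Node(child.start, k)
            node.children[s[child.start]] = mid
            child.start = k
            mid.children[s[k]] = child
            mid.children[s[pos]] = Node(pos, len(s), leaf=i + 1)
            break
    return root


def occurrences(root: Node, s: str, pattern: str) -> list[int]:
    """Follow pattern from the root; return the sorted 1-based start positions."""
    node, p = root, 0
    while p < len(pattern):
        child = node.children.get(pattern[p])
        if child is None:
            return []
        label = s[child.start:child.end]
        chunk = pattern[p:p + len(label)]
        if not label.startswith(chunk):
            return []
        p += len(chunk)
        node = child
    # Every leaf below the point where the pattern ends is one occurrence.
    stack, found = [node], []
    while stack:
        cur = stack.pop()
        if cur.leaf is not None:
            found.append(cur.leaf)
        stack.extend(cur.children.values())
    return sorted(found)


tree = build_suffix_tree("xabxac")
print(occurrences(tree, "xabxac", "abx"))  # [2]
print(occurrences(tree, "xabxac", "xa"))   # [1, 4]
```

Each insertion may walk O(m) characters, which gives the O(m^2) total. Following the pattern costs O(n); collecting all occurrences adds time proportional to the size of the subtree below the match.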
## Going from T_i to T_{i+1}

In the j-th substage of stage i+1, we insert S[j..i+1] into T_i. Let S[j..i] = β.

