More Exact Matching (Following Gusfield, Chapter 2)

Knuth-Morris-Pratt (KMP)

Shift by more than 1 place, if possible, upon mismatch.

Def. spm_i(P) = the length of the longest substring of P that ends at i > 1, matches a prefix of P, and is such that P[i+1] ≠ P[spm_i + 1]. (spm stands for suffix, prefix, mismatch.)

[Figure: P aligned against T. Within P[1..i], a prefix and a suffix of length spm_i match each other; the prefix is followed by x and the suffix by y = P[i+1], with x ≠ y.]

Can shift by: i − spm_i

KMP Algorithm

Suppose the mismatch is at position p = i + 1 of P:

[Figure: P aligned against T with the mismatch at position p; the prefix and suffix of length spm_{p−1} are marked in P.]

Can set new p to spm_{p−1} + 1.

Here n = |P|.

    c = p = 1                            // ptrs into T and P, respectively
    while c ≤ |T| − |P| + p:
        while p ≤ n and P[p] = T[c]:     // compare P and T
            p++
            c++
        if p = n + 1:
            print "Found at", c − n      // if found
        if p = 1:                        // failure at start means inc c
            c++
        else:
            p = spm_{p−1} + 1            // shift by (p − 1) − spm_{p−1} (even if p = n + 1)

KMP Running Time

The pseudocode runs in O(|T|) time, making at most 2|T| comparisons:

In each iteration of the outer while loop, at most one character of T is compared that was already compared in a previous iteration.

Total comparisons: ≤ |T| + s, where s = # of times through the outer while loop.

s ≤ |T|, since P is shifted by ≥ 1 or c advances by 1 in each iteration.

Therefore: O(|T|) for the pseudocode above.

Recall: Fundamental Preprocessing

P = aardvark: Z_2 = 1, Z_6 = 1...
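The KMP pseudocode above can be sketched in Python as follows. This is a minimal illustration, not the slides' own implementation: the function names `spm_values` and `kmp_search` are hypothetical, and the spm table is computed by brute force directly from the definition (slower than linear time, for clarity only) rather than from the Z-values of the fundamental preprocessing. It also assumes that at i = n, where P[i+1] does not exist, the mismatch condition is dropped and the matching substring must be shorter than P. Positions are 1-indexed to mirror the pseudocode.

```python
def spm_values(P):
    """Brute-force spm table from the definition (illustration only).

    spm[i] (1-indexed i) is the largest k < i such that P[i-k+1..i] equals
    P[1..k] and P[i+1] != P[k+1]. Assumption: at i = n there is no P[i+1],
    so the mismatch condition is dropped there. spm[0] is unused padding.
    """
    n = len(P)
    spm = [0] * (n + 1)
    for i in range(1, n + 1):
        for k in range(i - 1, 0, -1):          # try the longest candidate first
            suffix_matches_prefix = P[i - k:i] == P[:k]
            next_chars_differ = (i == n) or (P[i] != P[k])
            if suffix_matches_prefix and next_chars_differ:
                spm[i] = k
                break
    return spm


def kmp_search(P, T):
    """Return the 1-indexed start positions of every occurrence of P in T."""
    n, m = len(P), len(T)
    if n == 0:
        return []
    spm = spm_values(P)
    found = []
    c = p = 1                                   # pointers into T and P
    while c <= m - n + p:
        # Compare P and T; the bound on p is checked first so P[p] is valid.
        while p <= n and P[p - 1] == T[c - 1]:
            p += 1
            c += 1
        if p == n + 1:                          # all of P matched
            found.append(c - n)
        if p == 1:                              # failure at start: advance c
            c += 1
        else:                                   # shift by (p - 1) - spm[p - 1]
            p = spm[p - 1] + 1
    return found


if __name__ == "__main__":
    print(kmp_search("aba", "ababa"))
```

Note that c never moves backward: after a mismatch only p is reset, which is what keeps the number of comparisons linear in |T| once the spm table is available.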
