# StringAlg - STRING ALGORITHMS (Cormen, Leiserson, Rivest, and...

STRING ALGORITHMS (Cormen, Leiserson, Rivest, and Stein, 2001, ISBN: 0-07-013151-1 (McGraw Hill), Chapter 32, p. 906)

String processing problem

Input: Two strings T and P.
Problem: Find whether P is a substring of T.

Example (1):
Input: T = gtgatcagatcact, P = tca
Output: Yes. gtga tca ga tca ct, shift = 4, 9

Example (2):
Input: T = 189342670893, P = 1673
Output: No.

Naïve Algorithm (T, P)

    suppose n = length(T), m = length(P);
    for shift s = 0 through n-m do
        if (P[1..m] == T[s+1 .. s+m]) then   // actually a for-loop runs here
            print shift s;
    End algorithm.

Complexity: $O((n-m+1)m)$, or $O(\max\{nm, m^2\})$.

A special note: we allow O(k+1)-type notation in order to avoid an O(0) term; in such a boundary situation we want O(1) (constant time) instead.

Note: The naïve algorithm repeats too many character comparisons.

Rabin-Karp scheme

Consider a character as a digit in a radix system, e.g., the English alphabet as radix-26. Pick up each m-length "number" starting from shift = 0 through (n-m).

So, T = gtgatcagatcact, in radix-4 (a/0, t/1, g/2, c/3), becomes
gtg = '212' in base-4 = 32+4+2 in decimal,
tga = '120' in base-4 = 16+8+0 in decimal, ….

Then do the comparison with P number-wise.
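The naïve algorithm and the radix-4 encoding described above can be sketched in Python as follows. This is only an illustrative sketch: the names `naive_match`, `window_value`, and `DIGITS` are made up for this example, and shifts are reported 0-based, so shift s corresponds to T[s+1 .. s+m] in the 1-based notation of the notes.

```python
def naive_match(text, pattern):
    """Return every shift s (0-based) at which pattern occurs in text."""
    n, m = len(text), len(pattern)
    shifts = []
    for s in range(n - m + 1):            # s = 0 .. n-m
        # The slice comparison hides an inner loop of up to m comparisons.
        if text[s:s + m] == pattern:
            shifts.append(s)
    return shifts


# Digit values taken from the notes: a/0, t/1, g/2, c/3 (radix 4).
DIGITS = {"a": 0, "t": 1, "g": 2, "c": 3}


def window_value(window, digits=DIGITS):
    """Read a string as a number whose radix is the alphabet size."""
    d = len(digits)
    value = 0
    for ch in window:
        value = value * d + digits[ch]    # Horner's rule
    return value


print(naive_match("gtgatcagatcact", "tca"))       # [4, 9]
print(naive_match("189342670893", "1673"))        # []
print(window_value("gtg"), window_value("tga"))   # 38 24
```

The two printed values 38 and 24 agree with 32+4+2 and 16+8+0 from the notes.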

Advantage: The number for each window can be computed from the number for the previous window, reusing old results. Consider the decimals 4359 and 3592:

3592 = (4359 - 4*1000)*10 + 2

General formula, in radix-d:

$t_{s+1} = d\,(t_s - d^{m-1}\,T[s+1]) + T[s+m+1]$

where $t_s$ is the number corresponding to the substring T[s+1 .. s+m]. Note that m is the size of P.

The first-pass scheme: (1) preprocess to compute the (n-m+1) numbers on T, one per shift from 0 through n-m, and 1 number for P; (2) compare the number for P with those computed on T.

Problem: each number may be too large to compare directly.

Solution: Hash: use modular arithmetic with respect to a prime q.
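A minimal Python sketch of this scheme follows, applying the update formula modulo q. Several details are assumptions that the notes do not state: the function name `rabin_karp`, the default prime `q=13`, the empty result for an empty pattern, and the explicit character check on every hash match. That check is added here because two different windows can share the same value modulo q. Indices are 0-based, so `text[s]` plays the role of T[s+1].

```python
def rabin_karp(text, pattern, digits, q=13):
    """Return every 0-based shift at which pattern occurs in text.

    digits maps each character to its value in radix d = len(digits);
    q is the modulus (an arbitrarily chosen small prime by default).
    """
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []                          # assumption: nothing to report
    d = len(digits)
    h = pow(d, m - 1, q)                   # d^(m-1) mod q
    p = t = 0
    for i in range(m):                     # numbers for P and the first window
        p = (p * d + digits[pattern[i]]) % q
        t = (t * d + digits[text[i]]) % q
    shifts = []
    for s in range(n - m + 1):
        # Equal hashes only suggest a match, so confirm character by character.
        if p == t and text[s:s + m] == pattern:
            shifts.append(s)
        if s < n - m:
            # t_{s+1} = d * (t_s - d^(m-1) * T[s+1]) + T[s+m+1], taken mod q
            t = (d * (t - h * digits[text[s]]) + digits[text[s + m]]) % q
    return shifts


DNA_DIGITS = {"a": 0, "t": 1, "g": 2, "c": 3}
DECIMAL_DIGITS = {str(i): i for i in range(10)}

print(rabin_karp("gtgatcagatcact", "tca", DNA_DIGITS))      # [4, 9]
print(rabin_karp("189342670893", "1673", DECIMAL_DIGITS))   # []
```

Python's `%` operator returns a non-negative result for a positive modulus, so the subtraction inside the update needs no extra correction; in languages where the remainder can be negative, adding q before reducing would be needed.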