stringsearch

stringsearch - String Search String Searching...

Robert Sedgewick and Kevin Wayne • Copyright © 2005 • http://www.Princeton.EDU/~cos226

String Searching

Reference: Chapter 19, Algorithms in C, 2nd Edition, Robert Sedgewick.

String Search

String search. Given a pattern string, find first match in text.
Model. Can't afford to preprocess the text.
Parameters. N = length of text, M = length of pattern.

    Text:     i n a h a y s t a c k a n e e d l e i n a
    Pattern:  n e e d l e

    M = 6, N = 21; typically N >> M.

Applications

- Parsers.
- Lexis/Nexis.
- Spam filters.
- Virus scanning.
- Digital libraries.
- Screen scrapers.
- Word processors.
- Web search engines.
- Natural language processing.
- Carnivore surveillance system.
- Computational molecular biology.
- Feature detection in digitized images.

Brute Force: Typical Case

    Text:     h a y n e e d s a n n e e d l e x
    Pattern:  n e e d l e

[Figure: the pattern shown at successive alignments against the text.]
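The typical-case figure can be tallied with a short sketch that counts character comparisons while scanning. This is a minimal illustration, not the course's code: the class name TypicalCaseCount and the static comparisons counter are assumptions introduced here.

```java
// Sketch: count the character comparisons a brute-force scan makes
// on the typical-case example. Names are illustrative.
class TypicalCaseCount {
    // Comparisons made by the most recent call to search().
    static int comparisons;

    // Returns the offset of the first match of pattern in text, or -1 if none.
    static int search(String pattern, String text) {
        int M = pattern.length();
        int N = text.length();
        comparisons = 0;
        for (int i = 0; i <= N - M; i++) {
            int j = 0;
            while (j < M) {
                comparisons++;                                   // one character comparison
                if (text.charAt(i + j) != pattern.charAt(j)) break;
                j++;
            }
            if (j == M) return i;                                // full match at offset i
        }
        return -1;
    }

    public static void main(String[] args) {
        int offset = search("needle", "hayneedsanneedlex");
        // prints: offset = 10, comparisons = 21
        System.out.println("offset = " + offset + ", comparisons = " + comparisons);
    }
}
```

Most alignments fail on the first character, so the total (21 comparisons for N = 17) stays close to N rather than M N.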
Brute Force

Brute force. Check for pattern starting at every text position.

    public static int search(String pattern, String text) {
        int M = pattern.length();
        int N = text.length();
        // i <= N - M, so a match ending at the last text character is found
        for (int i = 0; i <= N - M; i++) {
            int j;
            for (j = 0; j < M; j++) {
                if (text.charAt(i + j) != pattern.charAt(j)) break;
            }
            if (j == M) return i;   // return offset i of match
        }
        return -1;                  // not found
    }

Brute Force: Worst Case

    Text:     a a a a a a a a a a a a a a a a b
    Pattern:  a a a a a b

[Figure: the pattern shown at successive alignments against the text.]

Analysis of Brute Force

Analysis of brute force.
- Running time depends on pattern and text.
- Slow if M and N are large and the pattern and text have lots of repetition.

Search for M-character pattern in N-character text (character comparisons):

    Implementation   Typical   Worst
    Brute            1.1 N     M N

    † assumes appropriate model

Screen Scraping

Goal. Find current stock price of Google.

    http://finance.yahoo.com/q?s=goog    (goog: stock symbol)
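Returning to the brute-force worst case above: the M N bound can be checked by generating inputs of the same shape as the slide's example (a run of a's ending in b) and counting comparisons. This is a sketch under assumptions: the class WorstCaseCount and its helpers asThenB and countComparisons are hypothetical names introduced here, not part of the course code.

```java
// Sketch: reproduce the brute-force worst case on inputs shaped like "aaa...ab".
class WorstCaseCount {
    // Builds `count` copies of 'a' followed by a single 'b'.
    static String asThenB(int count) {
        StringBuilder sb = new StringBuilder();
        for (int k = 0; k < count; k++) sb.append('a');
        return sb.append('b').toString();
    }

    // Counts character comparisons made by a brute-force scan that stops
    // at the first match (or after the last possible alignment).
    static long countComparisons(String pattern, String text) {
        int M = pattern.length(), N = text.length();
        long count = 0;
        for (int i = 0; i <= N - M; i++) {
            int j = 0;
            while (j < M) {
                count++;
                if (text.charAt(i + j) != pattern.charAt(j)) break;
                j++;
            }
            if (j == M) break;   // first match found; stop scanning
        }
        return count;
    }

    public static void main(String[] args) {
        int M = 6;
        String pattern = asThenB(M - 1);             // "aaaaab"
        for (int N = 17; N <= 17000; N *= 10) {
            String text = asThenB(N - 1);            // N - 1 a's, then b
            long counted = countComparisons(pattern, text);
            long formula = (long) M * (N - M + 1);   // M comparisons at each alignment
            System.out.println("N = " + N + ": counted " + counted + ", M(N-M+1) = " + formula);
        }
    }
}
```

For the slide's example (M = 6, N = 17) every one of the 12 alignments costs 6 comparisons, 72 in total, which is M(N - M + 1) and roughly M N when N >> M.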
Screen Scraping

Goal. Find current stock price of Google.
- s.indexOf(t, i): index of first occurrence of pattern t in string s, starting at offset i.
- Read raw HTML from http://finance.yahoo.com/q?s=goog.
- Find first string delimited by <b> and </b> after Last Trade.

    public class StockQuote {
        public static void main(String[] args) {
            String name = "http://finance.yahoo.com/q?s=";
            In in = new In(name + args[0]);                // In: the course's input class, not part of the JDK
            String input = in.readAll();                   // whole page as one string
            int start = input.indexOf("Last Trade:", 0);
            int from  = input.indexOf("<b>", start);
            int to    = input.indexOf("</b>", from);
            String price = input.substring(from + 3, to);  // skip the 3 characters of "<b>"
            System.out.println(price);
        }
    }

    % java StockQuote goog
    475.90

Algorithmic Challenges

Theoretical challenge. Linear-time guarantee.
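The indexOf-based extraction in StockQuote above can be tried offline on a hard-coded string. The sketch below is an assumption-laden variant, not the slide's program: the class ExtractBetween, its extract method, and the HTML fragment are all made up for illustration, and the added checks for -1 cover the case where a marker is missing (indexOf returns -1 when there is no match).

```java
// Sketch: pull out the text between two delimiters that follow a marker,
// using only String.indexOf and String.substring.
class ExtractBetween {
    // Returns the text between the first open/close pair found after marker,
    // or null if the marker or either delimiter is missing.
    static String extract(String input, String marker, String open, String close) {
        int start = input.indexOf(marker);
        if (start < 0) return null;
        int from = input.indexOf(open, start);
        if (from < 0) return null;
        int to = input.indexOf(close, from + open.length());
        if (to < 0) return null;
        return input.substring(from + open.length(), to);
    }

    public static void main(String[] args) {
        // Hypothetical page fragment, invented for this example; not actual Yahoo markup.
        String html = "<td>Last Trade:</td><td><b>475.90</b></td>";
        System.out.println(extract(html, "Last Trade:", "<b>", "</b>")); // prints 475.90
        System.out.println(extract(html, "Volume:", "<b>", "</b>"));     // prints null
    }
}
```

Returning null on a missing marker is one possible design choice; without such a check, a -1 from indexOf would flow into substring and raise a StringIndexOutOfBoundsException.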