regular - Robert Sedgewick and Kevin Wayne Copyright 2005


Robert Sedgewick and Kevin Wayne, Copyright 2005, http://www.Princeton.EDU/~cos226

Regular Expressions

Reference: Chapter 7.1-7.4, Introduction to Computer Science, R. Sedgewick and K. Wayne.


Pattern Matching

String search. Search for a given string in a large text file.

Regular expression.
- Natural and compact way to express multiple text patterns.
- Quintessential programmer's tool.

Ex. Fragile X syndrome is a common cause of mental retardation.
- Human genome contains triplet repeats of CGG or AGG, starting with GCG and ending with CTG.
- Number of repeats is variable, and correlated with syndrome.
- Use regular expression to specify pattern: GCG(CGG|AGG)*CTG.


Pattern Matching Applications

Test if a string matches some pattern.
- Process natural language.
- Scan for virus signatures.
- Search for information using Google.
- Access information in digital libraries.
- Retrieve information from Lexis/Nexis.
- Search-and-replace in a word processor.
- Filter text (spam, NetNanny, Carnivore, malware).
- Validate data-entry fields (dates, email, URL, credit card).
- Search for markers in human genome using PROSITE patterns.

Parse text files.
- Compile a Java program.
- Crawl and index the Web.
- Read in data stored in ad hoc input file format.
- Automatically create Java documentation from Javadoc comments.


Regular Expressions: Basic Operations

Regular expression. Notation to specify a set of strings.

Operation      Regular Expression  Yes               No
Concatenation  aabaab              aabaab            every other string
Wildcard       .u.u.u.             cumulus, jugulum  succubus, tumultuous
Union          aa | baab           aa, baab          every other string
Closure        ab*a                aa, abbba         ab, ababa
Parentheses    a(a|b)aab           aaaab, abaab      every other string
               (ab)*a              a, ababababa      aa, abbba


Regular Expressions: Examples

Regular expression. Notation is surprisingly expressive.

Regular Expression                Yes                              No
.*spb.*                           raspberry, crispbread            subspace, subspecies
  (contains the trigraph spb)
a* | (a*ba*ba*ba*)*               bbb, aaa, bbbaababbaa            b, bb, baabbbaa
  (multiple of three b's)
.*0....                           1000234, 98701234                111111111, 403982772
  (fifth to last digit is 0)
gcg(cgg|agg)*ctg                  gcgctg, gcgcggctg, gcgcggaggctg  gcgcgg, cggcggcggctg, gcgcaggctg
  (fragile X syndrome indicator)


Generalized Regular Expressions

Generalized regular expressions.
- Additional operations typically added for convenience.
- Ex: [a-e]+ is shorthand for (a|b|c|d|e)(a|b|c|d|e)*.

Operation          Regular Expression  Yes                     No
One or more        a(bc)+de            abcde, abcbcde          ade, bcde
Character classes  [A-Za-z][a-z]*      word, Capitalized       camelCase, 4illegal
Exactly k          [0-9]{5}-[0-9]{4}   08540-1321, 19072-5541  111111111, 166-54-111
Negations          [^aeiou]{6}         rhythm                  decade


Regular Expressions in Java

Validity checking. Is input in the set described by the re?...
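
The validity-checking question can be tried directly against the example strings listed above. The following is a minimal sketch, not the code from the slides (which are cut off at this point): it assumes the standard java.util.regex.Pattern.matches method, and the class name RegexExamples, the helper isMatch, and the constant names are illustrative choices made here.

```java
import java.util.regex.Pattern;

class RegexExamples {
    // Patterns copied from the slides; the constant names are illustrative.
    static final String FRAGILE_X = "gcg(cgg|agg)*ctg";
    static final String ZIP_PLUS_4 = "[0-9]{5}-[0-9]{4}";
    static final String FIFTH_TO_LAST_IS_ZERO = ".*0....";
    // The slides write "a* | (a*ba*ba*ba*)*"; the spaces are dropped here
    // because java.util.regex treats a space as a literal character.
    static final String MULTIPLE_OF_THREE_BS = "a*|(a*ba*ba*ba*)*";

    // True if the entire input is in the set of strings described by regex.
    static boolean isMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(isMatch(FRAGILE_X, "gcgcggaggctg"));          // true
        System.out.println(isMatch(FRAGILE_X, "gcgcaggctg"));            // false
        System.out.println(isMatch(ZIP_PLUS_4, "08540-1321"));           // true
        System.out.println(isMatch(ZIP_PLUS_4, "166-54-111"));           // false
        System.out.println(isMatch(FIFTH_TO_LAST_IS_ZERO, "1000234"));   // true
        System.out.println(isMatch(FIFTH_TO_LAST_IS_ZERO, "403982772")); // false
        System.out.println(isMatch(MULTIPLE_OF_THREE_BS, "bbbaababbaa")); // true
        System.out.println(isMatch(MULTIPLE_OF_THREE_BS, "baabbbaa"));    // false
    }
}
```

Pattern.matches tests the whole input against the pattern, which corresponds to the set-membership reading used in the Yes/No examples; finding a pattern inside a longer text would call for Matcher.find instead.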