Scanners
Wednesday, September 1, 2010

Scanners
• Sometimes called lexers
• Recall: scanners break input stream up into a set of tokens
  • Identifiers, reserved words, literals, etc.
• What do we need to know?
  • How do we define tokens?
  • How can we recognize tokens?
  • How do we write scanners?

Regular expressions
• Regular sets: set of strings defined by regular expressions
• Strings are regular sets (with one element): purdue, 3.14159
• So is the empty string: λ (sometimes use ε instead)
• Concatenations of regular sets are regular: purdue3.14159
  • To avoid ambiguity, can use ( ) to group regexps together
• A choice between two regular sets is regular, using | : (purdue | 3.14159)
• 0 or more of a regular set is regular, using * : (purdue)*
• Some other notation used for convenience:
  • Use Not to accept all strings except those in a regular set
  • Use ? to make a string optional: x? equivalent to (x | λ)
  • Use + to mean 1 or more strings from a set: x+ equivalent to xx*
  • Use [ ] to represent a range of choices: [1-3] equivalent to (1 | 2 | 3)

Examples of regular expressions
• Numbers: D = [0-9]+
• Words: L = [A-Za-z]+
• Literals (integers or floats): -? D+ (. D*)?
• Identifiers: (_ | L)(_ | L | D)*
• Comments (as in Micro): -- Not(\n)* \n
• More complex comments (delimited by ##, can use # inside comment): ## ((# | λ) Not(#))* ##

Finite automata
• Finite state machine which will only accept a string if it is in the set defined by the regular expression
[Figure: FA for (a b c+)+ with transitions labeled a, b, c, a, c; legend: start state, transition, state, final state]

λ transitions
• Transitions between states that aren’t triggered by seeing another character
  • Can optionally take the transition, but do not have to
• Can be used to link states together
[Figure: states linked by a λ transition]
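
The finite automaton for (a b c+)+ on the finite automata slide can be simulated with a small transition table. The following is a minimal Python sketch. The state numbers 0 to 3 and the names `START`, `FINAL_STATES`, `TRANSITIONS`, and `accepts` are assumptions made for illustration, not taken from the slides.

```python
# Table-driven simulation of a DFA for (a b c+)+.
# State numbering is a hypothetical choice for this sketch:
#   0 = start, 1 = seen "a", 2 = seen "ab", 3 = seen "abc..." (final)
START = 0
FINAL_STATES = {3}
TRANSITIONS = {
    (0, "a"): 1,
    (1, "b"): 2,
    (2, "c"): 3,
    (3, "c"): 3,  # c+ : further c's stay in the final state
    (3, "a"): 1,  # outer + : begin another "a b c+" group
}


def accepts(text):
    """Return True if `text` is in the set defined by (a b c+)+."""
    state = START
    for ch in text:
        state = TRANSITIONS.get((state, ch))
        if state is None:  # no transition on this character: reject
            return False
    return state in FINAL_STATES


if __name__ == "__main__":
    for sample in ["abc", "abccabc", "ab", "abca", ""]:
        print(repr(sample), accepts(sample))
```

A string is accepted only if every character has a transition and the machine stops in the final state, so "ab" and "abca" are rejected even though they are prefixes of accepted strings.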

Building an FA from a regexp
[Table: each expression paired with its FA; rows: a, λ, AB, A|B, A* (FA diagrams not recoverable)]
• Mini-exercise: how do we build an FA that accepts Not(A)?

NFAs to DFAs
• Note that if a finite automaton has a λ-transition in it, it may be non-deterministic (do we take the transition or not?)
• More precisely, an FA is non-deterministic if, from one state, reading a single character could result in a transition to multiple states
• How do we deal with non-deterministic finite automata (NFAs)?
  • Group nodes that can be reached by the same character into a single node
  • Algorithm in textbook, page 82
  • Note: this can result in very large DFAs!...
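
The idea of grouping NFA nodes that are reachable on the same character can be sketched as the standard subset construction. This is not a reproduction of the textbook's algorithm on page 82, which may differ in detail. The example NFA (for a*b), its state numbers, and all function names below are assumptions made for illustration. Empty-string (λ) transitions are encoded with the label `None`.

```python
from collections import deque

LAMBDA = None  # label for empty-string transitions (a choice made for this sketch)


def lambda_closure(states, delta):
    """All NFA states reachable from `states` using only λ-transitions."""
    closure = set(states)
    stack = list(states)
    while stack:
        s = stack.pop()
        for t in delta.get((s, LAMBDA), ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)


def nfa_to_dfa(start, finals, delta, alphabet):
    """Subset construction: each DFA state is a frozenset of NFA states."""
    dfa_start = lambda_closure({start}, delta)
    dfa_delta = {}
    seen = {dfa_start}
    work = deque([dfa_start])
    while work:
        current = work.popleft()
        for ch in alphabet:
            moved = set()
            for s in current:
                moved.update(delta.get((s, ch), ()))
            if not moved:
                continue  # no move on ch from this group of states
            target = lambda_closure(moved, delta)
            dfa_delta[(current, ch)] = target
            if target not in seen:
                seen.add(target)
                work.append(target)
    # A DFA state is final if it contains at least one final NFA state.
    dfa_finals = {group for group in seen if group & finals}
    return dfa_start, dfa_finals, dfa_delta


def dfa_accepts(dfa, text):
    start, finals, delta = dfa
    state = start
    for ch in text:
        state = delta.get((state, ch))
        if state is None:
            return False
    return state in finals


# Hypothetical NFA for a*b, built in the style of the A* and AB constructions:
# 0 -λ-> 1, 0 -λ-> 3, 1 -a-> 2, 2 -λ-> 1, 2 -λ-> 3, 3 -b-> 4 (final)
NFA_DELTA = {
    (0, LAMBDA): {1, 3},
    (1, "a"): {2},
    (2, LAMBDA): {1, 3},
    (3, "b"): {4},
}
DFA = nfa_to_dfa(start=0, finals={4}, delta=NFA_DELTA, alphabet="ab")

if __name__ == "__main__":
    for sample in ["b", "aaab", "a", "ba"]:
        print(repr(sample), dfa_accepts(DFA, sample))
```

Because every DFA state is a set of NFA states, an NFA with n states can produce up to 2^n DFA states in the worst case, which is why the resulting DFA can be very large.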
This note was uploaded on 02/19/2012 for the course ECE 468 taught by Professor Test during the Fall '08 term at Purdue.