Scanners
Wednesday, September 1, 2010

Scanners
• Sometimes called lexers
• Recall: scanners break the input stream up into a set of tokens
  • Identifiers, reserved words, literals, etc.
• What do we need to know?
  • How do we define tokens?
  • How can we recognize tokens?
  • How do we write scanners?

Regular expressions
• Regular sets: sets of strings defined by regular expressions
• Strings are regular sets (with one element): purdue 3.14159
• So is the empty string: λ (sometimes ɛ is used instead)
• Concatenations of regular sets are regular: purdue3.14159
  • To avoid ambiguity, can use ( ) to group regexps together
• A choice between two regular sets is regular, using |: ( purdue | 3.14159 )
• 0 or more of a regular set is regular, using *: ( purdue )*
• Some other notation used for convenience:
  • Use Not to accept all strings except those in a regular set
  • Use ? to make a string optional: x? equivalent to ( x | λ )
  • Use + to mean 1 or more strings from a set: x+ equivalent to xx*
  • Use [ ] to present a range of choices: [1-3] equivalent to ( 1 | 2 | 3 )

Examples of regular expressions
• Numbers: D = [0-9]+
• Words: L = [A-Za-z]+
• Literals (integers or floats): -?D+(.D*)?
• Identifiers: ( _ | L )( _ | L | D )*
• Comments (as in Micro): -- Not(\n)* \n
• More complex comments (delimited by ##, can use # inside comment): ## (( # | λ ) Not(#))* ##

Finite automata
• Finite state machine which will only accept a string if it is in the set defined by the regular expression
[Figure: finite automaton for (a b c+)+ with edges labeled a, b, c, a, c; legend: start state, transition, state, final state]

λ transitions
• Transitions between states that aren't triggered by seeing another character
• Can optionally take the transition, but do not have to
• Can be used to link states together
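The automaton for (a b c+)+ described above can be written as a transition table and run one character at a time. The following is a minimal sketch, not the slide's own diagram: the state numbers 0 to 3 and the names TRANSITIONS, START_STATE, FINAL_STATES, and accepts are assumptions made for illustration.

```python
# Table-driven DFA for the regular expression (a b c+)+.
# State numbering is an assumption of this sketch:
#   0 = start, 1 = seen "a", 2 = seen "ab", 3 = seen "abc..." (final)
TRANSITIONS = {
    (0, "a"): 1,
    (1, "b"): 2,
    (2, "c"): 3,
    (3, "c"): 3,  # c+ : stay in the final state on further c's
    (3, "a"): 1,  # outer + : start another "abc..." group
}
START_STATE = 0
FINAL_STATES = {3}


def accepts(text: str) -> bool:
    """Return True if the DFA ends in a final state after reading text."""
    state = START_STATE
    for ch in text:
        key = (state, ch)
        if key not in TRANSITIONS:
            # No transition defined for this character: reject.
            return False
        state = TRANSITIONS[key]
    return state in FINAL_STATES
```

With this table, accepts("abccabc") follows 0, 1, 2, 3, 3, 1, 2, 3 and ends in the final state, while accepts("ab") stops in state 2 and is rejected.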
Building a FA from a regexp
[Table: columns Expression and FA, showing the automaton built for each of a, λ, AB, A|B, A*]
Mini-exercise: how do we build an FA that accepts Not(A)?

NFAs to DFAs
• Note that if a finite automaton has a λ-transition in it, it may be nondeterministic (do we take the transition or not?)
• More precisely, an FA is nondeterministic if, from one state, reading a single character could result in a transition to multiple states
• How do we deal with nondeterministic finite automata (NFAs)?
  • Group nodes that can be reached by the same character into a single node
  • Algorithm in textbook, page 82
  • Note: this can result in very large DFAs!...
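The idea of grouping NFA nodes reachable by the same character into a single DFA node is commonly implemented as the subset construction. The sketch below illustrates that standard technique and is not a transcription of the textbook's algorithm on page 82. The NFA encoding (a dict from (state, label) to a set of states), the use of the empty string as the λ label, and all function names are assumptions made for this example.

```python
from collections import deque

LAMBDA = ""  # label used for λ-transitions in this sketch (an assumption)


def lambda_closure(states, nfa):
    """All NFA states reachable from `states` using only λ-transitions."""
    closure = set(states)
    stack = list(states)
    while stack:
        s = stack.pop()
        for t in nfa.get((s, LAMBDA), ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)


def nfa_to_dfa(nfa, start, finals, alphabet):
    """Subset construction: each DFA state is a frozenset of NFA states."""
    dfa_start = lambda_closure({start}, nfa)
    dfa_trans = {}
    seen = {dfa_start}
    work = deque([dfa_start])
    while work:
        current = work.popleft()
        for ch in alphabet:
            # Every NFA state reachable from the group on character ch.
            moved = set()
            for s in current:
                moved.update(nfa.get((s, ch), ()))
            if not moved:
                continue  # no transition on ch: the DFA rejects here
            target = lambda_closure(moved, nfa)
            dfa_trans[(current, ch)] = target
            if target not in seen:
                seen.add(target)
                work.append(target)
    # A DFA state is final if it contains any final NFA state.
    dfa_finals = {group for group in seen if group & set(finals)}
    return dfa_start, dfa_trans, dfa_finals


def dfa_accepts(dfa, text):
    """Run the constructed DFA over text."""
    start, trans, finals = dfa
    state = start
    for ch in text:
        state = trans.get((state, ch))
        if state is None:
            return False
    return state in finals


# Hypothetical NFA for strings over {a, b} that end in "ab".
# State 0 is nondeterministic on "a"; state 2 has a λ-transition to final state 3.
EXAMPLE_NFA = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
    (2, LAMBDA): {3},
}
EXAMPLE_DFA = nfa_to_dfa(EXAMPLE_NFA, start=0, finals={3}, alphabet="ab")
```

For this small NFA the construction yields three DFA states: {0}, {0, 1}, and {0, 2, 3}. In the worst case the number of DFA states is exponential in the number of NFA states, which is why the resulting DFAs can be very large.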
