
# lecture-03 - Scanners (Wednesday, September 1, 2010)


## Scanners

- Sometimes called lexers
- Recall: scanners break the input stream up into a set of tokens
  - Identifiers, reserved words, literals, etc.
- What do we need to know?
  - How do we define tokens?
  - How can we recognize tokens?
  - How do we write scanners?

## Regular expressions

- Regular sets: sets of strings defined by regular expressions
- Strings are regular sets (with one element): `purdue`, `3.14159`
- So is the empty string: λ (sometimes ɛ is used instead)
- Concatenations of regular sets are regular: `purdue3.14159`
  - To avoid ambiguity, can use ( ) to group regexps together
- A choice between two regular sets is regular, using `|`: `(purdue | 3.14159)`
- 0 or more of a regular set is regular, using `*`: `(purdue)*`
- Some other notation used for convenience:
  - Use Not to accept all strings except those in a regular set
  - Use `?` to make a string optional: `x?` is equivalent to `(x | λ)`
  - Use `+` to mean 1 or more strings from a set: `x+` is equivalent to `xx*`
  - Use `[ ]` to present a range of choices: `[1-3]` is equivalent to `(1 | 2 | 3)`

## Examples of regular expressions

- Numbers: `D = [0-9]+`
- Words: `L = [A-Za-z]+`
- Literals (integers or floats): `-? D+ (. D*)?`
- Identifiers: `(_ | L)(_ | L | D)*`
- Comments (as in Micro): `-- Not(\n)* \n`
- More complex comments (delimited by ##, can use # inside comment): `## ((# | λ) Not(#))* ##`

## Finite automata

- Finite state machine which will only accept a string if it is in the set defined by the regular expression

[Figure: finite automaton for `(a b c+)+`, with labels "start state", "transition", "state", and "final state"]

## λ transitions

- Transitions between states that aren't triggered by seeing another character
  - Can optionally take the transition, but do not have to
  - Can be used to link states together
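As a rough illustration of the token definitions in the regular-expression examples, here is a minimal sketch that translates them into Python's `re` syntax and uses them in a tiny scanner. The token names (`COMMENT`, `LITERAL`, `IDENT`, `SKIP`), the `scan` function, and the choice to discard whitespace and comments are assumptions made for this example, not part of the lecture. `Not(x)` is written as a negated character class, and because Python's `re` picks the first alternative that matches rather than the longest one, the order of the patterns matters.

```python
import re

# Python `re` translations of the slide's token definitions.
# All names here are illustrative assumptions, not from the lecture.
TOKEN_SPEC = [
    ("COMMENT", r"--[^\n]*\n"),              # -- Not(\n)* \n
    ("LITERAL", r"-?[0-9]+(?:\.[0-9]*)?"),   # -? D+ (. D*)?
    ("IDENT",   r"[_A-Za-z][_A-Za-z0-9]*"),  # (_ | L)(_ | L | D)*
    ("SKIP",    r"[ \t\n]+"),                # whitespace (assumed to be ignored)
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

# ## ((# | λ) Not(#))* ##  : a lone # is allowed inside, ## ends the comment
HASH_COMMENT = re.compile(r"##(?:#?[^#])*##")


def scan(text):
    """Return a list of (kind, lexeme) pairs; raise on an unrecognized character."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if m.lastgroup not in ("SKIP", "COMMENT"):
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()  # every pattern consumes at least one character
    return tokens


if __name__ == "__main__":
    print(scan("x1 -3.5 -- note\n_y"))
    print(bool(HASH_COMMENT.fullmatch("## a # b ##")))
```

A scanner generated from a DFA would normally prefer the longest match at each position; the ordered alternation above is a simplification that happens to work for these particular patterns.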
## Building a FA from a regexp

[Figure: table pairing each expression form (`a`, `λ`, `AB`, `A|B`, `A*`) with the FA that accepts it; the diagrams are not recoverable from this preview]

Mini-exercise: how do we build an FA that accepts Not(A)?

## NFAs to DFAs

- Note that if a finite automaton has a λ-transition in it, it may be non-deterministic (do we take the transition or not?)
- More precisely, an FA is non-deterministic if, from one state, reading a single character could result in a transition to multiple states
- How do we deal with non-deterministic finite automata (NFAs)?
  - Group nodes that can be reached by the same character into a single node
  - Algorithm in textbook, page 82
  - Note: this can result in very large DFAs!...
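The idea of grouping NFA states that are reachable by the same character can be sketched with the standard subset construction. The textbook's algorithm on page 82 is not reproduced in this preview, so the code below is a generic version that may differ from it in detail. The NFA encoding (a dictionary keyed by `(state, symbol)`, with `None` standing for λ), the function names, and the example automaton for `(a b c+)+` are all assumptions made for illustration.

```python
from collections import deque

LAMBDA = None  # marker for a λ-transition (an assumption of this sketch)


def lambda_closure(states, delta):
    """All NFA states reachable from `states` using only λ-transitions."""
    closure = set(states)
    work = list(states)
    while work:
        s = work.pop()
        for t in delta.get((s, LAMBDA), ()):
            if t not in closure:
                closure.add(t)
                work.append(t)
    return frozenset(closure)


def nfa_to_dfa(start, accepting, delta, alphabet):
    """Subset construction: each DFA state is a frozenset of NFA states."""
    dfa_start = lambda_closure({start}, delta)
    dfa_delta = {}
    seen = {dfa_start}
    queue = deque([dfa_start])
    while queue:
        current = queue.popleft()
        for ch in alphabet:
            moved = set()
            for s in current:
                moved |= delta.get((s, ch), set())
            if not moved:
                continue  # no transition on ch: the DFA rejects here
            target = lambda_closure(moved, delta)
            dfa_delta[(current, ch)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    dfa_accepting = {S for S in seen if S & accepting}
    return dfa_start, dfa_accepting, dfa_delta, seen


def dfa_accepts(dfa_start, dfa_accepting, dfa_delta, text):
    """Run the DFA over `text`; a missing transition means rejection."""
    state = dfa_start
    for ch in text:
        state = dfa_delta.get((state, ch))
        if state is None:
            return False
    return state in dfa_accepting


# Hypothetical NFA for (a b c+)+ : 0 -a-> 1 -b-> 2 -c-> 3, 3 -c-> 3, 3 -λ-> 0
NFA_DELTA = {
    (0, "a"): {1},
    (1, "b"): {2},
    (2, "c"): {3},
    (3, "c"): {3},
    (3, LAMBDA): {0},
}
DFA_START, DFA_ACCEPTING, DFA_DELTA, DFA_STATES = nfa_to_dfa(0, {3}, NFA_DELTA, "abc")

if __name__ == "__main__":
    for word in ["abc", "abccabc", "ab", ""]:
        print(repr(word), dfa_accepts(DFA_START, DFA_ACCEPTING, DFA_DELTA, word))
```

Because each DFA state is a set of NFA states, an NFA with n states can in the worst case produce a DFA with up to 2^n states, which is the blow-up the last bullet warns about.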

## This note was uploaded on 02/19/2012 for the course ECE 468 taught by Professor Test during the Fall '08 term at Purdue.
