lecture-02

# lecture-02 - Scanners (Friday, August 26, 2011)

## Scanners

- Sometimes called lexers.
- Recall: scanners break the input stream up into a set of tokens: identifiers, reserved words, literals, etc.
- What do we need to know?
  - How do we define tokens?
  - How can we recognize tokens?
  - How do we write scanners?
## Regular expressions

- Regular sets: sets of strings defined by regular expressions.
- Strings are regular sets (with one element): `purdue`, `3.14159`
- So is the empty string: λ (sometimes ɛ is used instead).
- Concatenations of regular sets are regular: `purdue3.14159`
  - To avoid ambiguity, ( ) can be used to group regexps together.
- A choice between two regular sets is regular, using `|`: `(purdue | 3.14159)`
- 0 or more of a regular set is regular, using `*`: `(purdue)*`
- Some other notation used for convenience:
  - Use Not to accept all strings except those in a regular set.
  - Use `?` to make a string optional: `x?` is equivalent to `(x | λ)`.
  - Use `+` to mean 1 or more strings from a set: `x+` is equivalent to `xx*`.
  - Use `[ ]` to represent a range of choices: `[1-3]` is equivalent to `(1 | 2 | 3)`.
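The convenience notation can be checked against its expansion into the core operators. The sketch below uses Python's `re` module as a stand-in for the slide notation (an assumption, since the lecture does not name a tool), with an empty alternative playing the role of λ. The helper `same_language` and the sample list are illustrative names, and comparing on a handful of samples is a spot check, not a proof of equivalence.

```python
import re

# Each pair holds a convenience form and its expansion into the core
# operators (concatenation, |, *). An empty alternative stands in for λ.
EQUIVALENT = [
    (r"x?", r"(x|)"),        # x?    is  (x | λ)
    (r"x+", r"xx*"),         # x+    is  x x*
    (r"[1-3]", r"(1|2|3)"),  # [1-3] is  (1 | 2 | 3)
]

# Hypothetical sample strings used only for the spot check.
SAMPLES = ["", "x", "xx", "xxx", "1", "2", "3", "4", "12", "x1"]


def same_language(left, right, samples):
    """Return True if both patterns accept exactly the same sample strings."""
    return all(
        bool(re.fullmatch(left, s)) == bool(re.fullmatch(right, s))
        for s in samples
    )


for left, right in EQUIVALENT:
    print(left, "vs", right, "->", same_language(left, right, SAMPLES))
```

Each pair agrees on every sample, while a non-equivalent pair such as `x+` and `x*` disagrees on the empty string.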

## Examples of regular expressions

- Numbers: `D = [0-9]+`
- Words: `L = [A-Za-z]+`
- Literals (integers or floats): `-?D+(.D*)?`
- Identifiers: `(_|L)(_|L|D)*`
- Comments (as in LITTLE): `-- Not(\n)* \n`
- More complex comments (delimited by ##, can use # inside comment): `## ((#|λ) Not(#))* ##`
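These definitions can be tried out by translating them into a concrete regex dialect. The sketch below uses Python's `re` syntax, which is an assumption and not the notation of the slides: `Not(\n)` becomes the negated class `[^\n]`, `(# | λ)` becomes `#?`, and the literal `.` is escaped. The variable names are illustrative.

```python
import re

D = r"[0-9]+"      # Numbers
L = r"[A-Za-z]+"   # Words

# Literals (integers or floats): -?D+(.D*)?
LITERAL = rf"-?(?:{D})+(?:\.(?:{D})*)?"

# Identifiers: (_|L)(_|L|D)*
IDENTIFIER = rf"(?:_|{L})(?:_|{L}|{D})*"

# Comments (as in LITTLE): -- Not(\n)* \n
COMMENT = r"--[^\n]*\n"

# Comments delimited by ##, with single # allowed inside:
# ## ((#|λ) Not(#))* ##
BLOCK_COMMENT = r"##(?:#?[^#])*##"

checks = [
    (LITERAL, "3.14159"),
    (LITERAL, ".5"),
    (IDENTIFIER, "_tmp1"),
    (IDENTIFIER, "9lives"),
    (COMMENT, "-- hello\n"),
    (BLOCK_COMMENT, "## a # b ##"),
    (BLOCK_COMMENT, "## a ## b ##"),
]
for pattern, text in checks:
    # fullmatch succeeds only if the pattern covers the entire string.
    print(repr(text), bool(re.fullmatch(pattern, text)))
```

Because `D` already means one or more digits, `D+` accepts the same strings as `D`; the translation keeps the slide's form anyway so the two are easy to compare.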
## How do we build a scanner?

- Idea: represent each token as a regular expression, and match the token if the regular expression matches.
- Big problem: a string of characters can contain multiple tokens.
- Simpler problem for now: decide if a regular expression matches the entire string.
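The simpler whole-string problem might look like the following sketch. The token table, the reserved words in it, and the function name `matching_classes` are all hypothetical; the patterns are written in Python's `re` syntax.

```python
import re

# Hypothetical token classes, one regular expression per class.
TOKEN_PATTERNS = {
    "NUMBER": r"-?[0-9]+(\.[0-9]*)?",
    "IDENTIFIER": r"[_A-Za-z][_A-Za-z0-9]*",
    "KEYWORD": r"if|else|while",  # made-up reserved words
}


def matching_classes(text):
    """Return the names of all token classes whose regex matches all of text."""
    return [
        name
        for name, pattern in TOKEN_PATTERNS.items()
        if re.fullmatch(pattern, text)
    ]


print(matching_classes("42"))     # ['NUMBER']
print(matching_classes("while"))  # ['IDENTIFIER', 'KEYWORD']
print(matching_classes("x = 1"))  # []
```

The last call shows the bigger problem from the slide: `x = 1` contains several tokens, so no single token regex matches the whole string. The second call shows that one string can fit more than one class, which a real scanner has to resolve with some priority rule.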

## Finite automata

- A finite state machine that accepts a string only if it is in the set defined by the regular expression.

[Figure: finite automaton for `(a b c+)+`, with transitions labeled a, b, c, a, c and callouts marking the start state, a transition, a state, and the final state.]
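A table-driven sketch of an automaton for `(a b c+)+` is shown below. The state numbering (0 to 3) and the dictionary layout are assumptions made for illustration; only the regular expression comes from the slide.

```python
START = 0
ACCEPT = {3}

# (state, character) -> next state. Any missing entry means "reject".
TRANSITIONS = {
    (0, "a"): 1,
    (1, "b"): 2,
    (2, "c"): 3,
    (3, "c"): 3,  # c+ : stay in the final state on further c's
    (3, "a"): 1,  # outer + : start another "a b c+" group
}


def accepts(text):
    """Return True if the automaton ends in an accept state on text."""
    state = START
    for ch in text:
        state = TRANSITIONS.get((state, ch))
        if state is None:  # no transition defined for this character
            return False
    return state in ACCEPT


for sample in ["abc", "abccabc", "ab", "abca", ""]:
    print(repr(sample), accepts(sample))
```

The automaton accepts `abc` and `abccabc` and rejects `ab`, `abca`, and the empty string, since each of those either stops outside the final state or never reaches it.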
## λ transitions

- Transitions between states that are not triggered by seeing another character.
- Can optionally take the transition, but do not have to.
- Can be used to link states together.
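One common way to handle λ transitions in code is to compute every state reachable without reading a character, often called the λ-closure. The sketch below assumes a hypothetical automaton whose λ moves are stored as a dictionary from a state to a set of states; the lecture does not prescribe this representation.

```python
# Hypothetical λ moves: state -> states reachable by one λ transition.
LAMBDA_MOVES = {1: {2}, 2: {3}, 4: {1}}


def lambda_closure(states, lambda_moves):
    """Return all states reachable from `states` using only λ transitions.

    The starting states are always included, because taking a λ
    transition is optional.
    """
    closure = set(states)
    stack = list(states)
    while stack:
        state = stack.pop()
        for target in lambda_moves.get(state, ()):
            if target not in closure:
                closure.add(target)
                stack.append(target)
    return closure


print(lambda_closure({1}, LAMBDA_MOVES))  # {1, 2, 3}
print(lambda_closure({3}, LAMBDA_MOVES))  # {3}
```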

## Non-deterministic FAs (NFAs)

- What happens when we have an FA that offers multiple choices?
- An FA is non-deterministic if, from one state, reading a single character could result in a transition to multiple states.
- If a finite automaton has a λ-transition in it, it may be non-deterministic (do we take the transition, or not?).

[Figure: example NFA with states 1 to 5 and edges labeled a, "a, b", b, and λ.]
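The definition can be turned into a simple check over a transition table. In this sketch the table maps `(state, symbol)` to a set of target states and uses `None` for λ; both choices are assumptions, as are the example automata. The check is conservative: it flags every λ transition, even though the slide only says such an automaton may be non-deterministic.

```python
LAMBDA = None  # stand-in symbol for a λ transition


def is_nondeterministic(transitions):
    """Return True if some state offers a choice between moves."""
    for (state, symbol), targets in transitions.items():
        if symbol is LAMBDA and targets:
            return True  # optional move: take it or not
        if len(targets) > 1:
            return True  # same character, several possible next states
    return False


# Hypothetical automata used only to exercise the check.
dfa = {(0, "a"): {1}, (1, "b"): {2}}
nfa_choice = {(1, "a"): {2, 3}, (2, "b"): {4}}
nfa_lambda = {(1, LAMBDA): {2}, (2, "a"): {3}}

print(is_nondeterministic(dfa))         # False
print(is_nondeterministic(nfa_choice))  # True
print(is_nondeterministic(nfa_lambda))  # True
```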
## Simulating NFAs

- To run an NFA, simulate every possible path.
- Intuition: deterministic FAs (DFAs) have a “pointer” that follows the single path from one state to the next.
  - When we come to a non-deterministic choice, we can “split” the pointer into two, one for each path.
- Termination conditions:
  - If any pointer is in an accept state at the end of input, the NFA accepts (intuitively: there was one possible path that ...
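The "split pointer" idea can be sketched by tracking the set of states that some pointer currently occupies. The NFA below is a hypothetical one (it accepts strings over a and b that end in `ab`), and the table layout with `None` standing for λ is an assumption, not the lecture's representation.

```python
LAMBDA = None  # stand-in symbol for a λ transition

# Hypothetical NFA: (state, symbol) -> set of next states.
NFA = {
    (0, LAMBDA): {1},
    (1, "a"): {1, 2},  # non-deterministic choice on 'a'
    (1, "b"): {1},
    (2, "b"): {3},
}
NFA_START = 0
NFA_ACCEPT = {3}


def closure(states, trans):
    """Add every state reachable through λ transitions alone."""
    result = set(states)
    stack = list(states)
    while stack:
        state = stack.pop()
        for target in trans.get((state, LAMBDA), ()):
            if target not in result:
                result.add(target)
                stack.append(target)
    return result


def nfa_accepts(text, trans, start, accept):
    """Simulate all paths at once; each set member is one 'pointer'."""
    current = closure({start}, trans)
    for ch in text:
        moved = set()
        for state in current:
            moved |= trans.get((state, ch), set())
        current = closure(moved, trans)
    # Accept if any pointer sits in an accept state at the end of input.
    return bool(current & accept)


for sample in ["ab", "aab", "abb", ""]:
    print(repr(sample), nfa_accepts(sample, NFA, NFA_START, NFA_ACCEPT))
```

Pointers that reach a state with no matching transition simply drop out of the set, so a dead path never needs special handling.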


## This note was uploaded on 02/19/2012 for the course ECE 468 taught by Professor Test during the Fall '08 term at Purdue.
