
1 Lexical Analysis and Lexical Analyzer Generators
Chapter 3
COP5621 Compiler Construction
Copyright Robert van Engelen, Florida State University, 2007-2011
2 The Reason Why Lexical Analysis is a Separate Phase
• Simplifies the design of the compiler
  – LL(1) or LR(1) parsing with 1 token lookahead would not be possible (multiple characters/tokens to match)
• Provides efficient implementation
  – Systematic techniques to implement lexical analyzers by hand or automatically from specifications
  – Stream buffering methods to scan input
• Improves portability
  – Non-standard symbols and alternate character encodings can be normalized (e.g. trigraphs)
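3 Interaction of the Lexical Analyzer with the Parser
[Figure: the Source Program feeds the Lexical Analyzer. The Parser sends "Get next token" requests to the Lexical Analyzer, which returns "Token, tokenval". Both components report errors, and both access the Symbol Table.]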
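The interaction in this figure can be sketched in Python as a lexer that hands out one token per request. This is only an illustration under stated assumptions: the class `Lexer`, the method `get_next_token`, the driver `parse_all`, the `"eof"` token, and the dictionary used as a symbol table are all hypothetical names, and only identifiers and numbers separated by whitespace are recognized.

```python
import re

# Assumed patterns: optional whitespace, then a number or an identifier.
TOKEN_RE = re.compile(r"\s*(?:(?P<num>[0-9]+)|(?P<id>[A-Za-z][A-Za-z0-9]*))")


class Lexer:
    """Hypothetical lexical analyzer that produces tokens on demand."""

    def __init__(self, source, symbol_table):
        self.source = source
        self.pos = 0
        self.symbol_table = symbol_table  # shared with the parser

    def get_next_token(self):
        """Return the next (token, tokenval) pair, or ("eof", None) at the end."""
        if not self.source[self.pos:].strip():
            return ("eof", None)
        m = TOKEN_RE.match(self.source, self.pos)
        if m is None:
            # Lexical error: no pattern matches at this position.
            raise SyntaxError(f"lexical error at position {self.pos}")
        self.pos = m.end()
        if m.group("num") is not None:
            return ("num", int(m.group("num")))
        name = m.group("id")
        # Enter the identifier into the symbol table on first sight.
        self.symbol_table.setdefault(name, {"kind": "id"})
        return ("id", name)


def parse_all(lexer):
    """Stand-in for the parser: keep requesting tokens until end of input."""
    tokens = []
    while True:
        token = lexer.get_next_token()
        if token[0] == "eof":
            return tokens
        tokens.append(token)
```

A real parser would act on each token as it arrives rather than collect them in a list; the loop only shows who drives the exchange.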
4 Attributes of Tokens
Input to the lexical analyzer: y := 31 + 28*x
Output to the parser, as <token, tokenval> pairs (tokenval is the token attribute):
<id, "y"> <assign, > <num, 31> <+, > <num, 28> <*, > <id, "x">
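A minimal Python sketch of how a lexical analyzer could produce these pairs for y := 31 + 28*x. The function name `tokenize`, the table `TOKEN_SPEC`, the regular expressions, and the use of `None` for an empty attribute are assumptions made for illustration, not taken from the slides.

```python
import re

# Hypothetical token specification: (token, pattern), tried in this order.
TOKEN_SPEC = [
    ("num", re.compile(r"[0-9]+")),
    ("id", re.compile(r"[A-Za-z][A-Za-z0-9]*")),
    ("assign", re.compile(r":=")),
    ("+", re.compile(r"\+")),
    ("*", re.compile(r"\*")),
    ("skip", re.compile(r"[ \t]+")),  # whitespace produces no token
]


def tokenize(text):
    """Yield (token, tokenval) pairs; tokenval is None when there is no attribute."""
    pos = 0
    while pos < len(text):
        for token, pattern in TOKEN_SPEC:
            m = pattern.match(text, pos)
            if m:
                break
        else:
            # No pattern matched at this position.
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        pos = m.end()
        lexeme = m.group()
        if token == "skip":
            continue
        if token == "num":
            yield (token, int(lexeme))  # attribute: the numeric value
        elif token == "id":
            yield (token, lexeme)  # attribute: the identifier's name
        else:
            yield (token, None)  # operators carry no attribute


for pair in tokenize("y := 31 + 28*x"):
    print(pair)
```

Only tokens whose lexemes vary (id and num) need an attribute; for assign, + and * the token name alone tells the parser everything.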
5 Tokens, Patterns, and Lexemes
• A token is a classification of lexical units
  – For example: id and num
• Lexemes are the specific character strings that make up a token
  – For example: abc and 123
• Patterns are rules describing the set of lexemes belonging to a token
  – For example: "letter followed by letters and digits" and "non-empty sequence of digits"
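To make the three terms concrete, the two example patterns can be written as Python regular expressions and used to map a lexeme to its token. This is a sketch: the names `PATTERNS` and `classify` are hypothetical, letters are assumed to be ASCII, and "followed by letters and digits" is read as "followed by zero or more letters or digits".

```python
import re

# Patterns: one rule per token.
PATTERNS = {
    "id": re.compile(r"[A-Za-z][A-Za-z0-9]*"),  # letter followed by letters and digits
    "num": re.compile(r"[0-9]+"),  # non-empty sequence of digits
}


def classify(lexeme):
    """Return the token whose pattern matches the whole lexeme, or None."""
    for token, pattern in PATTERNS.items():
        if pattern.fullmatch(lexeme):
            return token
    return None


print(classify("abc"))  # lexeme abc belongs to token id
print(classify("123"))  # lexeme 123 belongs to token num
```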
6 Specification of Patterns for Tokens: Definitions
• An alphabet $\Sigma$ is a finite set of symbols (characters)
• A string s is a finite sequence of symbols from $\Sigma$
  – $|s|$ denotes the length of string s
  – $\varepsilon$ denotes the empty string, thus $|\varepsilon| = 0$
• A language is a specific set of strings over some fixed alphabet $\Sigma$
7 Specification of Patterns for Tokens: String Operations
• The concatenation of two strings x and y is denoted by xy
• The exponentiation of a string s is defined by
  $s^0 = \varepsilon$
  $s^i = s^{i-1}s$ for $i > 0$
  Note that $s\varepsilon = \varepsilon s = s$
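The recursive definition of string exponentiation translates directly into Python, with the empty Python string standing in for the empty string. The function name `str_power` is a hypothetical choice for this sketch.

```python
def str_power(s, i):
    """Return s^i following s^0 = empty string, s^i = s^(i-1) s for i > 0."""
    if i < 0:
        raise ValueError("exponent must be non-negative")
    if i == 0:
        return ""  # the empty string
    return str_power(s, i - 1) + s  # concatenation


print(repr(str_power("ab", 0)))
print(repr(str_power("ab", 3)))
```

Python's built-in `"ab" * 3` computes the same string; the explicit recursion is only there to mirror the definition.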
8 Specification of Patterns for Tokens: Language Operations
• Union: $L \cup M = \{ s \mid s \in L \text{ or } s \in M \}$
• Concatenation: $LM = \{ xy \mid x \in L \text{ and } y \in M \}$
• Exponentiation: $L^0 = \{ \varepsilon \}$; $L^i = L^{i-1}L$
• Kleene closure: $L^* = \bigcup_{i=0,\dots,\infty} L^i$
• Positive closure: $L^+ = \bigcup_{i=1,\dots,\infty} L^i$
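For finite languages these operations can be tried out with Python sets of strings. The sketch below uses hypothetical function names (`concat`, `lang_power`, `kleene`, `positive`). Because the two closures are infinite unions, the code only approximates them by stopping at an assumed bound `max_i` on the exponent.

```python
def concat(L, M):
    """LM: every string of L followed by every string of M."""
    return {x + y for x in L for y in M}


def lang_power(L, i):
    """L^i, with L^0 = {empty string} and L^i = L^(i-1) L."""
    result = {""}
    for _ in range(i):
        result = concat(result, L)
    return result


def kleene(L, max_i):
    """Finite approximation of L*: union of L^0 .. L^max_i."""
    result = set()
    for i in range(0, max_i + 1):
        result |= lang_power(L, i)
    return result


def positive(L, max_i):
    """Finite approximation of L+: union of L^1 .. L^max_i."""
    result = set()
    for i in range(1, max_i + 1):
        result |= lang_power(L, i)
    return result


L = {"a", "b"}
M = {"0", "1"}
print(sorted(L | M))  # union is the built-in set union
print(sorted(concat(L, M)))
print(sorted(kleene(L, 2)))
```

The only difference between the two closures is whether the exponent starts at 0 or 1, so `positive` contains the empty string only when L itself does.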
9 Specification of Patterns for Tokens: Regular Expressions
• Basis symbols:
  – $\varepsilon$ is a regular expression denoting language $\{ \varepsilon \}$
  – $a \in \Sigma$ is a regular expression denoting $\{ a \}$
• If r and s are regular expressions denoting languages $L(r)$ and $M(s)$ respectively, then
  – $r \mid s$ is a regular expression denoting $L(r) \cup M(s)$
  – $rs$ is a regular expression denoting $L(r)M(s)$
  – $r^*$ is a regular expression denoting $L(r)^*$
  – $(r)$ is a regular expression denoting $L(r)$
• A language defined by a regular expression is called a regular set
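The inductive definition can be followed case by case to compute the language a regular expression denotes. In this sketch a regular expression is a nested tuple, an encoding assumed here for illustration: `("eps",)`, `("sym", a)`, `("alt", r, s)`, `("cat", r, s)`, `("star", r)`. Since $L(r)^*$ is usually infinite, the hypothetical function `language` returns only the strings of length at most `max_len`.

```python
def language(r, max_len):
    """Return the strings of length <= max_len in the language denoted by r."""
    kind = r[0]
    if kind == "eps":  # denotes {empty string}
        return {""}
    if kind == "sym":  # a in the alphabet denotes {a}
        return {r[1]} if max_len >= 1 else set()
    if kind == "alt":  # r | s denotes the union
        return language(r[1], max_len) | language(r[2], max_len)
    if kind == "cat":  # rs denotes the concatenation
        left = language(r[1], max_len)
        right = language(r[2], max_len)
        return {x + y for x in left for y in right if len(x + y) <= max_len}
    if kind == "star":  # r* denotes the Kleene closure
        base = language(r[1], max_len) - {""}
        result = {""}
        frontier = {""}
        while frontier:
            # Append one more non-empty string from L(r) to each new string.
            frontier = {
                x + y for x in frontier for y in base if len(x + y) <= max_len
            } - result
            result |= frontier
        return result
    raise ValueError(f"unknown regular expression node: {kind!r}")


a, b = ("sym", "a"), ("sym", "b")
r = ("cat", ("star", ("alt", a, b)), a)  # (a|b)*a
print(sorted(language(r, 2)))
```

Parentheses need no node of their own in this encoding, because the nesting of the tuples already records the grouping.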
10 Specification of Patterns for Tokens: Regular Definitions
• Regular definitions introduce a naming convention:
  $d_1 \to r_1$
  $d_2 \to r_2$
  $\dots$
  $d_n \to r_n$
  where each $r_i$ is a regular expression over $\Sigma \cup \{ d_1, d_2, \dots, d_{i-1} \}$
• Any $d_j$ in $r_i$ can be textually substituted in $r_i$ to obtain an equivalent set of definitions
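Textual substitution can be sketched in Python by expanding each definition in order, so that every body may refer only to names defined before it. The definitions below (letter, digit, id, num), the Lex-style `{name}` reference syntax, and the function name `expand` are assumptions for illustration; the character classes `[A-Za-z]` and `[0-9]` are Python shorthand for long alternations.

```python
import re

# Hypothetical regular definitions d_i -> r_i, listed in order.
DEFINITIONS = [
    ("letter", r"[A-Za-z]"),
    ("digit", r"[0-9]"),
    ("id", r"{letter}({letter}|{digit})*"),
    ("num", r"{digit}{digit}*"),
]


def expand(definitions):
    """Substitute earlier names into later bodies; return name -> plain regex."""
    expanded = {}
    for name, body in definitions:
        for earlier, regex in expanded.items():
            # Wrap in a non-capturing group so precedence is preserved.
            body = body.replace("{" + earlier + "}", "(?:" + regex + ")")
        expanded[name] = body
    return expanded


patterns = expand(DEFINITIONS)
print(patterns["id"])
print(bool(re.fullmatch(patterns["id"], "x1y2")))
print(bool(re.fullmatch(patterns["num"], "31")))
```

Because a body can mention only earlier names, a single pass in definition order leaves no references unresolved.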

This note was uploaded on 02/01/2012 for the course COP 5621 taught by Professor Vanengelen during the Spring '11 term at FSU.
