Introduction to Compiler Design
Lexical Analysis I

Professor Yi-Ping You
Department of Computer Science
http://www.cs.nctu.edu.tw/~ypyou/
Introduction to Compiler Design, Spring 2010

Supplements to Text Book

http://dragonbook.stanford.edu/
Errata sheet
    http://infolab.stanford.edu/~ullman/dragon/errata1.html
    http://infolab.stanford.edu/~ullman/dragon/errata.html

Outline

Lexical Analysis and Tokens
Regular Expressions
Interaction between Scanner & Parser
Implementation of Lexical Analysis
Transition Diagram

The Structure of a Compiler

Source Code
    -> Lexical Analyzer (scanner)
    -> Syntax Analyzer (parser)
    -> Semantic Analyzer
    -> Intermediate Code Generator
    -> Code Optimizer
    -> Code Generator
    -> Target Code

Symbol Table and Error Handler: shared by the phases above

Lexical Analysis Process

Preprocessed source code, read char by char:
    if (b == 0) a = b;
Lexical Analysis/Scanner splits it into lexemes:
    if ( b == 0 ) a = b ;
Resulting token stream:
    KWif ( ID == NUM ) ID = ID SEMI

Transform multi-character input stream to token stream
Reduce length of program representation (remove spaces)

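The process above can be sketched as a small regex-driven scanner. This is a minimal illustration, not the course's implementation: the names `TOKEN_SPEC`, `KEYWORDS`, and `tokenize` are hypothetical, and only the characters that occur in the slide's example are handled. Token names follow the slide (KWif, ID, NUM, SEMI); operators and parentheses are emitted as themselves.

```python
import re

# Hypothetical token specification covering only the slide's example.
TOKEN_SPEC = [
    ("WS",   r"\s+"),                     # whitespace: recognized, then dropped
    ("NUM",  r"\d+"),                     # integer constants
    ("ID",   r"[A-Za-z_][A-Za-z0-9_]*"),  # identifiers (and keyword candidates)
    ("OP",   r"==|=|\(|\)"),              # "==" is listed before "=" so it wins
    ("SEMI", r";"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))
KEYWORDS = {"if"}  # assumption: the only keyword needed here


def tokenize(source):
    """Turn a character stream into a list of token names."""
    tokens = []
    pos = 0
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
        kind, lexeme = m.lastgroup, m.group()
        pos = m.end()
        if kind == "WS":
            continue                      # spaces never reach the parser
        if kind == "ID" and lexeme in KEYWORDS:
            tokens.append("KW" + lexeme)  # reserved word, not an identifier
        elif kind == "OP":
            tokens.append(lexeme)         # operators stand for themselves
        else:
            tokens.append(kind)
    return tokens


print(" ".join(tokenize("if (b == 0) a = b;")))
# prints: KWif ( ID == NUM ) ID = ID SEMI
```

The 18-character input shrinks to 10 tokens, which is the "reduce length of program representation" effect the slide mentions.
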
Examples of Tokens

Token: smallest logically cohesive sequence of characters of interest in source program

Token                         Lexeme
Single-character operators    = + - >
Multi-character operators     := == <> ->
Keywords                      if else while break
Identifiers                   my_variable flag1
Numeric constants/literals    123 45.67 8.9e+05
Character literals            ‘a’ ‘~’ ‘\’
String literals               “abcd”

Tokens, Patterns and Lexemes

A token is a pair: a token name and an optional attribute value
    E.g., identifier, keyword, operator
A pattern is a description of the form that the lexemes of a token may take
    Regular expression
    E.g., [A-Za-z_][A-Za-z0-9_]*
A lexeme is a sequence of characters in the source program that matches the pattern for a token
    E.g., x, y; if, else, while; +, -, <

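To make the pattern/lexeme relationship concrete, the sketch below tests a few strings against the identifier pattern from the slide using Python's `re` module. The helper name `is_identifier_lexeme` is hypothetical and chosen only for illustration.

```python
import re

# The identifier pattern from the slide, as a compiled regular expression.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_identifier_lexeme(text):
    """Return True if the whole string matches the identifier pattern."""
    return IDENTIFIER.fullmatch(text) is not None


for lexeme in ["x", "my_variable", "flag1", "1flag", "a-b", "if"]:
    print(f"{lexeme!r:15} {is_identifier_lexeme(lexeme)}")
```

Note that `if` also matches the identifier pattern. A scanner therefore usually needs an extra step, such as a reserved-word lookup, to report it as a keyword token rather than an identifier.
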
Attributes for Tokens: Example

E = M * C ** 2

<id, pointer to symbol table entry for E>
<assign-op>
<id, pointer to symbol table entry for M>
<mult-op>
<id, pointer to symbol table entry for C>
<exp-op>
<number, integer value 2>
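
A rough sketch of how a scanner might produce these (name, attribute) pairs is shown below. It rests on several assumptions not in the slides: the symbol table is modeled as a plain Python list, the "pointer" is an index into that list, and the names `Token`, `PATTERNS`, and `scan` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class Token(NamedTuple):
    name: str
    attribute: Optional[int] = None  # symbol-table index or integer value


# Patterns are tried in order; "**" must come before "*" so that the
# exponent operator is recognized as one token rather than two.
PATTERNS = [
    ("ws",        re.compile(r"\s+")),
    ("id",        re.compile(r"[A-Za-z_][A-Za-z0-9_]*")),
    ("number",    re.compile(r"\d+")),
    ("exp-op",    re.compile(r"\*\*")),
    ("mult-op",   re.compile(r"\*")),
    ("assign-op", re.compile(r"=")),
]


def scan(source):
    """Return (tokens, symbol_table) for the given source string."""
    symbol_table = []  # assumption: one entry per distinct identifier name
    tokens = []
    pos = 0
    while pos < len(source):
        for name, pattern in PATTERNS:
            m = pattern.match(source, pos)
            if m:
                break
        else:
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
        pos = m.end()
        lexeme = m.group()
        if name == "ws":
            continue
        if name == "id":
            if lexeme not in symbol_table:
                symbol_table.append(lexeme)
            tokens.append(Token("id", symbol_table.index(lexeme)))
        elif name == "number":
            tokens.append(Token("number", int(lexeme)))
        else:
            tokens.append(Token(name))  # operators carry no attribute
    return tokens, symbol_table


tokens, table = scan("E = M * C ** 2")
for tok in tokens:
    print(tok)
print(table)
```

Here the three `id` tokens carry indices 0, 1, and 2 into the table `['E', 'M', 'C']`, the `number` token carries the integer value 2, and the operator tokens have no attribute, matching the slide's list.
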