lecture03 - CS 143 Lecture 3 1 Lexical Analysis Lecture 3...


CS 143, Lecture 3: Lexical Analysis

Outline
• Informal sketch of lexical analysis
  – Identifies tokens in input string
• Issues in lexical analysis
  – Lookahead
  – Ambiguities
• Specifying lexers
  – Regular expressions
  – Examples of regular expressions

Lexical Analysis
• What do we want to do? Example:
    if (i == j)
        z = 0;
    else
        z = 1;
• The input is just a string of characters:
    \tif (i == j)\n\t\tz = 0;\n\telse\n\t\tz = 1;
• Goal: Partition input string into substrings
  – Where the substrings are tokens

What's a Token?
• A syntactic category
  – In English: noun, verb, adjective, …
  – In a programming language: Identifier, Integer, Keyword, Whitespace, …

Tokens
• Tokens correspond to sets of strings.
• Identifier: strings of letters or digits, starting with a letter
• Integer: a non-empty string of digits
• Keyword: "else" or "if" or "begin" or …
• Whitespace: a non-empty sequence of blanks, newlines, and tabs

What are Tokens For?
• Classify program substrings according to role
• Output of lexical analysis is a stream of tokens . . .
• . . . which is input to the parser
• Parser relies on token distinctions
  – An identifier is treated differently than a keyword

Designing a Lexical Analyzer: Step 1
• Define a finite set of tokens
  – Tokens describe all items of interest
  – Choice of tokens depends on language, design of parser

Example
• Recall
    \tif (i == j)\n\t\tz = 0;\n\telse\n\t\tz = 1;
• Useful tokens for this expression: Integer, Keyword, Relation, Identifier, Whitespace, (, ), =, ;
• Note: (, ), =, ; are tokens, not characters, here

Designing a Lexical Analyzer: Step 2
• Describe which strings belong to each token
• Recall:
  – Identifier: strings of letters or digits, starting with a letter
  – Integer: a non-empty string of digits...
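
The two design steps above (pick a finite set of tokens, then describe which strings belong to each) can be sketched in a few lines of Python. This is only an illustration, not the course's lexer: the names `TOKEN_SPEC` and `tokenize` are hypothetical, the regular expressions are one possible reading of the informal token descriptions, and the rule order (Keyword tried before Identifier, "==" tried before "=") is an assumption made here to settle the overlaps between token classes.

```python
import re

# Step 1: a finite set of tokens (names taken from the slides).
# Step 2: one pattern per token describing which strings belong to it.
# The patterns and their order are assumptions made for this sketch.
TOKEN_SPEC = [
    ("Whitespace", r"[ \t\n]+"),              # non-empty run of blanks, newlines, tabs
    ("Keyword",    r"(?:if|else|begin)\b"),   # \b keeps "iffy" from matching as "if"
    ("Identifier", r"[A-Za-z][A-Za-z0-9]*"),  # letters or digits, starting with a letter
    ("Integer",    r"[0-9]+"),                # non-empty string of digits
    ("Relation",   r"=="),                    # tried before "=" so "==" is not split
    ("(",          r"\("),
    (")",          r"\)"),
    ("=",          r"="),
    (";",          r";"),
]
COMPILED = [(name, re.compile(pattern)) for name, pattern in TOKEN_SPEC]


def tokenize(text):
    """Partition text into (token class, substring) pairs, left to right."""
    tokens = []
    pos = 0
    while pos < len(text):
        for name, pattern in COMPILED:
            match = pattern.match(text, pos)
            if match:
                tokens.append((name, match.group()))
                pos = match.end()
                break
        else:
            # No token class matches at this position.
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
    return tokens


if __name__ == "__main__":
    source = "\tif (i == j)\n\t\tz = 0;\n\telse\n\t\tz = 1;"
    for token_class, lexeme in tokenize(source):
        print(token_class, repr(lexeme))
```

Because every character of the input ends up in exactly one pair, joining the substrings back together reproduces the original string, which is what "partition" means here. A parser would typically consume this stream after discarding the Whitespace entries.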

This note was uploaded on 05/05/2010 for the course COMPILER AC 1 taught by Professor Sergio during the Spring '10 term at Institute of Management Technology.

