lecture03



Lexical Analysis
Lecture 3
Profs. Aiken CS 143 Lecture 3

Outline
  - Informal sketch of lexical analysis
      - Identifies tokens in input string
  - Issues in lexical analysis
      - Lookahead
      - Ambiguities
  - Specifying lexers
      - Regular expressions
      - Examples of regular expressions

Lexical Analysis
  - What do we want to do? Example:
        if (i == j)
            z = 0;
        else
            z = 1;
  - The input is just a string of characters:
        \tif (i == j)\n\t\tz = 0;\n\telse\n\t\tz = 1;
  - Goal: Partition input string into substrings
      - Where the substrings are tokens

What's a Token?
  - A syntactic category
      - In English: noun, verb, adjective, ...
      - In a programming language: Identifier, Integer, Keyword, Whitespace, ...

Tokens
  - Tokens correspond to sets of strings.
  - Identifier: strings of letters or digits, starting with a letter
  - Integer: a non-empty string of digits
  - Keyword: "else" or "if" or "begin" or ...
  - Whitespace: a non-empty sequence of blanks, newlines, and tabs

What are Tokens For?
  - Classify program substrings according to role
  - Output of lexical analysis is a stream of tokens ...
  - ... which is input to the parser
  - Parser relies on token distinctions
      - An identifier is treated differently than a keyword

Designing a Lexical Analyzer: Step 1
  - Define a finite set of tokens
      - Tokens describe all items of interest
      - Choice of tokens depends on language, design of parser

Example
  - Recall
        \tif (i == j)\n\t\tz = 0;\n\telse\n\t\tz = 1;
  - Useful tokens for this expression:
        Integer, Keyword, Relation, Identifier, Whitespace, (, ), =, ;
  - N.B.: (, ), =, ; are tokens, not characters, here

Designing a Lexical Analyzer: Step 2
  - Describe which strings belong to each token
  - Recall:
      - Identifier: strings of letters or digits, starting with a letter
      - Integer: a non-empty string of digits...
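The two design steps (pick a finite set of tokens, then describe which strings belong to each) can be sketched in a few lines of Python. This is only an illustrative sketch, not the lecture's own implementation: the token names come from the slides, but the regular expressions, the `TOKEN_SPEC` table, and the `tokenize` function are assumptions made here. In particular, the keyword list is limited to the three keywords the slides name, and ambiguity is resolved simply by trying the patterns in order (Keyword before Identifier, `==` before `=`).

```python
import re

# Hypothetical token table. The names follow the slides; the patterns are
# one possible reading of the informal definitions, not the lecture's own.
TOKEN_SPEC = [
    ("Whitespace", r"[ \t\n]+"),              # non-empty run of blanks, newlines, tabs
    ("Keyword",    r"(?:if|else|begin)\b"),   # only the keywords named in the slides
    ("Identifier", r"[A-Za-z][A-Za-z0-9]*"),  # letters or digits, starting with a letter
    ("Integer",    r"[0-9]+"),                # non-empty string of digits
    ("Relation",   r"=="),                    # must be tried before the single "="
    ("(",          r"\("),
    (")",          r"\)"),
    ("=",          r"="),
    (";",          r";"),
]
COMPILED = [(name, re.compile(pattern)) for name, pattern in TOKEN_SPEC]


def tokenize(text):
    """Partition text into (token class, substring) pairs, left to right."""
    tokens = []
    pos = 0
    while pos < len(text):
        # Try each token class in table order; the first match wins.
        for name, regex in COMPILED:
            match = regex.match(text, pos)
            if match:
                tokens.append((name, match.group()))
                pos = match.end()
                break
        else:
            raise ValueError(f"no token matches at position {pos}: {text[pos]!r}")
    return tokens


source = "\tif (i == j)\n\t\tz = 0;\n\telse\n\t\tz = 1;"
for name, lexeme in tokenize(source):
    if name != "Whitespace":
        print(name, repr(lexeme))
```

Concatenating the substrings of the returned pairs gives back the input, which is what "partition" means here. The `\b` in the keyword pattern is a small lookahead: it keeps a hypothetical identifier such as `iffy` from being split into the keyword `if` followed by `fy`.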

This note was uploaded on 01/12/2010 for the course CS 143 at Stanford.
