Introduction to Parsing
Lecture 5
Prof. Aiken, CS 143

Outline
- Regular languages revisited
- Parser overview
- Context-free grammars (CFGs)
- Derivations
- Ambiguity

Languages and Automata
- Formal languages are very important in CS
  - Especially in programming languages
- Regular languages
  - The weakest formal languages widely used
  - Many applications
- We will also study context-free languages, tree languages

Beyond Regular Languages
- Many languages are not regular
- Strings of balanced parentheses are not regular: $\{\, (^i \, )^i \mid i \ge 0 \,\}$

What Can Regular Languages Express?
- Languages requiring counting modulo a fixed integer
- Intuition: A finite automaton that runs long enough must repeat states
- A finite automaton can't remember the number of times it has visited a particular state

The Functionality of the Parser
- Input: sequence of tokens from lexer
- Output: parse tree of the program
  (But some parsers never produce a parse tree ...)

Example
- Cool: if x = y then 1 else 2 fi
- Parser input: IF ID = ID THEN INT ELSE INT FI
- Parser output:

        IF-THEN-ELSE
       /     |      \
      =     INT     INT
     / \
    ID  ID

Comparison with Lexical Analysis

    Phase    Input                  Output
    Lexer    String of characters   String of tokens
    Parser   String of tokens       Parse tree

The Role of the Parser
- Not all strings of tokens are programs ...
- ... Parser must distinguish between valid and invalid strings of tokens
- We need
  - A language for describing valid strings of tokens
  - A method for distinguishing valid from invalid strings of tokens

Context-Free Grammars
- Programming language constructs have recursive structure
- An EXPR is
    if EXPR then EXPR else EXPR fi
    while EXPR loop EXPR pool
- Context-free grammars are a natural notation for this recursive structure
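The recursive structure of EXPR, and the parser's job of separating valid from invalid token strings, can be sketched with a small recursive recognizer. This is only an illustration, not the parsing method used in the course. The function names `parse_expr` and `is_valid` are hypothetical, the token names are taken from the example above, and the base cases (`ID`, `INT`, `ID = ID`) are assumptions added so that the recursion can terminate.

```python
def _expect(tokens, kind, pos):
    """Check that tokens[pos] is `kind` and return the next position."""
    if pos >= len(tokens) or tokens[pos] != kind:
        found = tokens[pos] if pos < len(tokens) else "end of input"
        raise SyntaxError(f"expected {kind} at position {pos}, found {found}")
    return pos + 1


def parse_expr(tokens, pos=0):
    """Parse one EXPR starting at tokens[pos]; return (tree, next_pos)."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[pos]
    if tok == "IF":
        # EXPR -> if EXPR then EXPR else EXPR fi
        cond, pos = parse_expr(tokens, pos + 1)
        pos = _expect(tokens, "THEN", pos)
        then_branch, pos = parse_expr(tokens, pos)
        pos = _expect(tokens, "ELSE", pos)
        else_branch, pos = parse_expr(tokens, pos)
        pos = _expect(tokens, "FI", pos)
        return ("IF-THEN-ELSE", cond, then_branch, else_branch), pos
    if tok == "WHILE":
        # EXPR -> while EXPR loop EXPR pool
        cond, pos = parse_expr(tokens, pos + 1)
        pos = _expect(tokens, "LOOP", pos)
        body, pos = parse_expr(tokens, pos)
        pos = _expect(tokens, "POOL", pos)
        return ("WHILE", cond, body), pos
    if tok == "ID":
        # Assumed base case: ID, optionally followed by "= ID".
        if list(tokens[pos + 1:pos + 3]) == ["=", "ID"]:
            return ("=", "ID", "ID"), pos + 3
        return "ID", pos + 1
    if tok == "INT":
        # Assumed base case: an integer literal.
        return "INT", pos + 1
    raise SyntaxError(f"unexpected token {tok} at position {pos}")


def is_valid(tokens):
    """True if the whole token sequence is exactly one EXPR."""
    try:
        _, end = parse_expr(tokens)
    except SyntaxError:
        return False
    return end == len(tokens)


tokens = ["IF", "ID", "=", "ID", "THEN", "INT", "ELSE", "INT", "FI"]
tree, _ = parse_expr(tokens)
print(tree)
```

Each branch of `parse_expr` calls `parse_expr` again wherever the description of EXPR mentions EXPR, so the nesting of the returned tuples mirrors the nesting of the program.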
CFGs (Cont.)
- A CFG consists of
  - A set of terminals T
  - A set of nonterminals N
  - A start symbol S (a nonterminal)
  - A set of productions $X \to Y_1 Y_2 \cdots Y_n$ where $X \in N$ and $Y_i \in T \cup N$

Notational Conventions
- In these lecture notes
  - Nonterminals are written uppercase
  - Terminals are written lowercase
  - The start symbol is the left-hand side of the first production
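The four components of a CFG and the notational conventions can be written down as a small data structure. The following is a minimal sketch under stated assumptions: the names `Grammar`, `grammar_from_productions`, `is_well_formed`, and `COOL_EXPR` are hypothetical, the production list is assumed to be non-empty, and the production `EXPR -> id` is an assumed base case that does not come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    terminals: frozenset     # T
    nonterminals: frozenset  # N
    start: str               # S, a nonterminal
    productions: tuple       # pairs (X, (Y1, ..., Yn))


def grammar_from_productions(productions):
    """Build a Grammar using the notational conventions:
    uppercase symbols are nonterminals, all other symbols are terminals,
    and the start symbol is the left-hand side of the first production."""
    productions = tuple((lhs, tuple(rhs)) for lhs, rhs in productions)
    symbols = {lhs for lhs, _ in productions}
    symbols |= {sym for _, rhs in productions for sym in rhs}
    nonterminals = frozenset(sym for sym in symbols if sym.isupper())
    terminals = frozenset(symbols) - nonterminals
    return Grammar(terminals, nonterminals, productions[0][0], productions)


def is_well_formed(g):
    """Check the definition: S in N, T and N disjoint,
    and every production X -> Y1...Yn has X in N and each Yi in T or N."""
    if g.start not in g.nonterminals:
        return False
    if g.terminals & g.nonterminals:
        return False
    allowed = g.terminals | g.nonterminals
    return all(
        lhs in g.nonterminals and all(sym in allowed for sym in rhs)
        for lhs, rhs in g.productions
    )


COOL_EXPR = grammar_from_productions([
    ("EXPR", ("if", "EXPR", "then", "EXPR", "else", "EXPR", "fi")),
    ("EXPR", ("while", "EXPR", "loop", "EXPR", "pool")),
    ("EXPR", ("id",)),  # assumed base case, not from the slides
])

print(COOL_EXPR.start, sorted(COOL_EXPR.terminals))
```

Deriving T and N from the case of each symbol keeps the sketch close to the conventions in the notes; a grammar built by hand that puts a terminal on a left-hand side would be rejected by `is_well_formed`.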
This document was uploaded on 04/06/2012.
 Fall '09
