Introduction to Parsing
Lecture 5
Prof. Aiken, CS 143

Outline
• Regular languages revisited
• Parser overview
• Context-free grammars (CFG’s)
• Derivations
• Ambiguity

Languages and Automata
• Formal languages are very important in CS
  – Especially in programming languages
• Regular languages
  – The weakest formal languages widely used
  – Many applications
• We will also study context-free languages, tree languages

Beyond Regular Languages
• Many languages are not regular
• Strings of balanced parentheses are not regular: $\{\, (^i \, )^i \mid i \ge 0 \,\}$

What Can Regular Languages Express?
• Languages requiring counting modulo a fixed integer
• Intuition: A finite automaton that runs long enough must repeat states
• Finite automaton can’t remember # of times it has visited a particular state

The Functionality of the Parser
• Input: sequence of tokens from lexer
• Output: parse tree of the program
(But some parsers never produce a parse tree . . .)

Example
• Cool
    if x = y then 1 else 2 fi
• Parser input
    IF ID = ID THEN INT ELSE INT FI
• Parser output
    IF-THEN-ELSE
      =
        ID
        ID
      INT
      INT

Comparison with Lexical Analysis
    Phase    Input                  Output
    Lexer    String of characters   String of tokens
    Parser   String of tokens       Parse tree

The Role of the Parser
• Not all strings of tokens are programs . . .
• . . . Parser must distinguish between valid and invalid strings of tokens
• We need
  – A language for describing valid strings of tokens
  – A method for distinguishing valid from invalid strings of tokens

Context-Free Grammars
• Programming language constructs have recursive structure
• An EXPR is
    if EXPR then EXPR else EXPR fi
    while EXPR loop EXPR pool
    …
• Context-free grammars are a natural notation for this recursive structure

CFGs (Cont.)
• A CFG consists of
  – A set of terminals T
  – A set of non-terminals N
  – A start symbol S (a non-terminal)
  – A set of productions
      $X \to Y_1 Y_2 \cdots Y_n$ where $X \in N$ and $Y_i \in T \cup N \cup \{\varepsilon\}$

Notational Conventions
• In these lecture notes
  – Non-terminals are written upper-case
  – Terminals are written lower-case
  – The start symbol is the left-hand side of the first production
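
The CFG definition and the two language examples from the slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `CFG`, `PARENS`, `is_well_formed`, `matches_parens`, and `count_mod_dfa` are hypothetical and do not come from the lecture, and the grammar S → ( S ) | ε is one possible grammar for the language of i open parentheses followed by i close parentheses. An empty right-hand side stands for ε.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFG:
    """A grammar as the four components listed on the slide (hypothetical encoding)."""
    terminals: frozenset
    nonterminals: frozenset
    start: str
    productions: tuple  # pairs (lhs, rhs); rhs is a tuple of symbols, () means epsilon


# Assumed grammar for { (^i )^i | i >= 0 }:  S -> ( S ) | epsilon
# By the slide's convention, the start symbol is the left-hand side of the first production.
PARENS = CFG(
    terminals=frozenset({"(", ")"}),
    nonterminals=frozenset({"S"}),
    start="S",
    productions=(("S", ("(", "S", ")")), ("S", ())),
)


def is_well_formed(g: CFG) -> bool:
    """Check that S is in N, every left-hand side is in N, and every rhs symbol is in T or N."""
    symbols = g.terminals | g.nonterminals
    return g.start in g.nonterminals and all(
        lhs in g.nonterminals and all(sym in symbols for sym in rhs)
        for lhs, rhs in g.productions
    )


def matches_parens(s: str) -> bool:
    """Recursive-descent recognizer for S -> ( S ) | epsilon."""

    def parse_s(pos):
        # Returns the position just after a complete S, or None on failure.
        if pos < len(s) and s[pos] == "(":
            after = parse_s(pos + 1)          # production S -> ( S )
            if after is not None and after < len(s) and s[after] == ")":
                return after + 1
            return None
        return pos                            # production S -> epsilon

    return parse_s(0) == len(s)


def count_mod_dfa(s: str, k: int, remainder: int = 0) -> bool:
    """A k-state automaton: accepts when the number of '(' in s is congruent to remainder mod k."""
    state = 0
    for ch in s:
        if ch == "(":
            state = (state + 1) % k           # only k states are ever needed
    return state == remainder
```

The recognizer relies on recursion depth to remember how many open parentheses are still unmatched, which is the unbounded memory a finite automaton lacks. `count_mod_dfa`, by contrast, only ever holds one of `k` states, matching the slide's point that regular languages can count modulo a fixed integer. Note that `matches_parens("()()")` is false: the assumed grammar describes only fully nested strings, not every balanced string.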
