lecture05



Introduction to Parsing
Prof. Aiken, CS 143, Lecture 5

Outline
• Regular languages revisited
• Parser overview
• Context-free grammars (CFG’s)
• Derivations
• Ambiguity

Languages and Automata
• Formal languages are very important in CS
  – Especially in programming languages
• Regular languages
  – The weakest formal languages widely used
  – Many applications
• We will also study context-free languages, tree languages

Beyond Regular Languages
• Many languages are not regular
• Strings of balanced parentheses are not regular:
    $\{\, (^i \, )^i \mid i \ge 0 \,\}$

What Can Regular Languages Express?
• Languages requiring counting modulo a fixed integer
• Intuition: A finite automaton that runs long enough must repeat states
• Finite automaton can’t remember # of times it has visited a particular state

The Functionality of the Parser
• Input: sequence of tokens from lexer
• Output: parse tree of the program
  (But some parsers never produce a parse tree ...)

Example
• Cool
    if x = y then 1 else 2 fi
• Parser input
    IF ID = ID THEN INT ELSE INT FI
• Parser output
        IF-THEN-ELSE
        /     |     \
       =     INT    INT
      / \
    ID   ID

Comparison with Lexical Analysis

    Phase     Input                   Output
    Lexer     String of characters    String of tokens
    Parser    String of tokens        Parse tree

The Role of the Parser
• Not all strings of tokens are programs ...
• ... Parser must distinguish between valid and invalid strings of tokens
• We need
  – A language for describing valid strings of tokens
  – A method for distinguishing valid from invalid strings of tokens

Context-Free Grammars
• Programming language constructs have recursive structure
• An EXPR is
    if EXPR then EXPR else EXPR fi
    while EXPR loop EXPR pool
    …
• Context-free grammars are a natural notation for this recursive structure

CFGs (Cont.)
• A CFG consists of
  – A set of terminals T
  – A set of non-terminals N
  – A start symbol S (a non-terminal)
  – A set of productions
      $X \to Y_1 Y_2 \cdots Y_n$ where $X \in N$ and $Y_i \in T \cup N$

Notational Conventions
• In these lecture notes
  – Non-terminals are written upper-case
  – Terminals are written lower-case
  – The start symbol is the left-hand side of the first production
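
As a rough illustration of the CFG definition and notational conventions above, the sketch below stores a grammar as the four components (T, N, S, productions) and uses it to separate valid from invalid token strings. It is not taken from the lecture. The names `Grammar`, `is_well_formed`, `derive`, `match_sequence` and `parse` are hypothetical, and the two productions `EXPR → id` and `EXPR → int` are assumptions added so that the recursion bottoms out (the slides elide the remaining EXPR forms with "…"). The matching strategy is a naive backtracking top-down search, chosen only because it is short; it terminates here because the toy grammar has no left recursion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    """A CFG as the four components listed on the slide."""
    terminals: frozenset      # T
    nonterminals: frozenset   # N
    start: str                # S, must be a member of N
    productions: tuple        # pairs (X, (Y1, ..., Yn))


# Toy grammar: upper-case non-terminal, lower-case terminals.
# The last two productions are assumptions, not from the slides.
EXPR_GRAMMAR = Grammar(
    terminals=frozenset(
        {"if", "then", "else", "fi", "while", "loop", "pool", "id", "int"}
    ),
    nonterminals=frozenset({"EXPR"}),
    start="EXPR",
    productions=(
        ("EXPR", ("if", "EXPR", "then", "EXPR", "else", "EXPR", "fi")),
        ("EXPR", ("while", "EXPR", "loop", "EXPR", "pool")),
        ("EXPR", ("id",)),
        ("EXPR", ("int",)),
    ),
)


def is_well_formed(g):
    """Check S in N, and for each production X in N and every Yi in T ∪ N."""
    symbols = g.terminals | g.nonterminals
    return g.start in g.nonterminals and all(
        x in g.nonterminals and all(y in symbols for y in rhs)
        for x, rhs in g.productions
    )


def derive(g, symbol, tokens, pos):
    """Yield (tree, next_pos) for each way `symbol` matches tokens[pos:]."""
    if symbol in g.terminals:
        if pos < len(tokens) and tokens[pos] == symbol:
            yield symbol, pos + 1
        return
    for lhs, rhs in g.productions:
        if lhs == symbol:
            yield from match_sequence(g, lhs, rhs, tokens, pos)


def match_sequence(g, lhs, rhs, tokens, pos):
    """Match the right-hand side Y1 ... Yn left to right, keeping all options."""
    partial = [((), pos)]
    for y in rhs:
        partial = [
            (children + (tree,), nxt)
            for children, p in partial
            for tree, nxt in derive(g, y, tokens, p)
        ]
    for children, nxt in partial:
        yield (lhs, children), nxt


def parse(g, tokens):
    """Return a parse tree covering all tokens, or None if the string is invalid."""
    for tree, end in derive(g, g.start, tokens, 0):
        if end == len(tokens):
            return tree
    return None


valid = parse(EXPR_GRAMMAR, "if id then int else int fi".split())
invalid = parse(EXPR_GRAMMAR, "if id then int fi".split())
```

Here `valid` is a nested tuple of the form `("EXPR", (children...))`, while `invalid` is `None` because the `else` branch is missing. A production parser would normally use a more efficient, deterministic method; backtracking is used here only to keep the sketch small.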

This note was uploaded on 01/12/2010 for the course CS 143 at Stanford.
