lecture05

lecture05 - 1 Prof. Aiken CS 143 Lecture 5 1 Introduction...


Prof. Aiken, CS 143, Lecture 5

Introduction to Parsing

Outline
- Regular languages revisited
- Parser overview
- Context-free grammars (CFGs)
- Derivations
- Ambiguity

Languages and Automata
- Formal languages are very important in CS
  - Especially in programming languages
- Regular languages
  - The weakest formal languages widely used
  - Many applications
- We will also study context-free languages, tree languages

Beyond Regular Languages
- Many languages are not regular
- Strings of balanced parentheses are not regular: $\{\, (^i \, )^i \mid i \ge 0 \,\}$

What Can Regular Languages Express?
- Languages requiring counting modulo a fixed integer
- Intuition: A finite automaton that runs long enough must repeat states
- A finite automaton can't remember # of times it has visited a particular state

The Functionality of the Parser
- Input: sequence of tokens from lexer
- Output: parse tree of the program
- (But some parsers never produce a parse tree . . .)

Example
- Cool: if x = y then 1 else 2 fi
- Parser input: IF ID = ID THEN INT ELSE INT FI
- Parser output:

        IF-THEN-ELSE
       /     |      \
      =     INT     INT
     / \
   ID   ID

Comparison with Lexical Analysis

  Phase    Input                  Output
  Lexer    String of characters   String of tokens
  Parser   String of tokens       Parse tree

The Role of the Parser
- Not all strings of tokens are programs . . .
- . . . Parser must distinguish between valid and invalid strings of tokens
- We need
  - A language for describing valid strings of tokens
  - A method for distinguishing valid from invalid strings of tokens

Context-Free Grammars
- Programming language constructs have recursive structure
- An EXPR is
    if EXPR then EXPR else EXPR fi
    while EXPR loop EXPR pool
- Context-free grammars are a natural notation for this recursive structure

CFGs (Cont.)
- A CFG consists of
  - A set of terminals T
  - A set of non-terminals N
  - A start symbol S (a non-terminal)
  - A set of productions $X \to Y_1 Y_2 \cdots Y_n$, where $X \in N$ and $Y_i \in T \cup N$

Notational Conventions
- In these lecture notes
  - Non-terminals are written upper-case
  - Terminals are written lower-case
  - The start symbol is the left-hand side of the first production
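The slides' parser example (token string IF ID = ID THEN INT ELSE INT FI producing an IF-THEN-ELSE tree) and the recursive EXPR productions can be made concrete with a small sketch. The grammar fragment below is an assumption chosen for illustration, not the full Cool grammar; the ATOM non-terminal, the function names (`parse`, `parse_expr`, `parse_atom`, `expect`, `is_valid`), and the nested-tuple tree shape are all hypothetical. The recursive-descent technique is one possible "method for distinguishing valid from invalid strings of tokens", not necessarily the one the course develops.

```python
# Hypothetical grammar fragment (an assumption, not the full Cool grammar):
#   EXPR -> IF EXPR THEN EXPR ELSE EXPR FI
#         | WHILE EXPR LOOP EXPR POOL
#         | ATOM = ATOM
#         | ATOM
#   ATOM -> ID | INT
# Tokens are plain strings such as "IF", "ID", "=", "INT".


def expect(tokens, pos, kind):
    """Consume one token of the given kind or raise SyntaxError."""
    if pos >= len(tokens) or tokens[pos] != kind:
        raise SyntaxError(f"expected {kind} at position {pos}")
    return pos + 1


def parse_atom(tokens, pos):
    """ATOM -> ID | INT"""
    if pos < len(tokens) and tokens[pos] in ("ID", "INT"):
        return tokens[pos], pos + 1
    raise SyntaxError(f"expected ID or INT at position {pos}")


def parse_expr(tokens, pos):
    """Parse one EXPR starting at pos; return (tree, next position)."""
    if pos < len(tokens) and tokens[pos] == "IF":
        cond, pos = parse_expr(tokens, pos + 1)  # recursive structure
        pos = expect(tokens, pos, "THEN")
        then_branch, pos = parse_expr(tokens, pos)
        pos = expect(tokens, pos, "ELSE")
        else_branch, pos = parse_expr(tokens, pos)
        pos = expect(tokens, pos, "FI")
        return ("IF-THEN-ELSE", cond, then_branch, else_branch), pos
    if pos < len(tokens) and tokens[pos] == "WHILE":
        cond, pos = parse_expr(tokens, pos + 1)
        pos = expect(tokens, pos, "LOOP")
        body, pos = parse_expr(tokens, pos)
        pos = expect(tokens, pos, "POOL")
        return ("WHILE", cond, body), pos
    left, pos = parse_atom(tokens, pos)
    if pos < len(tokens) and tokens[pos] == "=":
        right, pos = parse_atom(tokens, pos + 1)
        return ("=", left, right), pos
    return left, pos


def parse(tokens):
    """Return a parse tree for the whole token list, or raise SyntaxError."""
    tree, pos = parse_expr(tokens, 0)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token at position {pos}")
    return tree


def is_valid(tokens):
    """True if the token list is a valid EXPR under the fragment above."""
    try:
        parse(tokens)
        return True
    except SyntaxError:
        return False


tree = parse("IF ID = ID THEN INT ELSE INT FI".split())
print(tree)
```

Under these assumptions, the token string from the example yields a tree whose root is IF-THEN-ELSE with the comparison and the two INT leaves as children, while a string such as IF ID THEN INT FI is rejected because the ELSE branch is missing.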

This document was uploaded on 04/06/2012.

