03-parse

03-parse - The role of the parser Syntax analysis source...

The role of the parser

[Figure: source code → scanner → tokens → parser → IR, with errors reported]

Parser
- performs context-free syntax analysis
- guides context-sensitive analysis
- constructs an intermediate representation
- produces meaningful error messages
- attempts error correction

Copyright © 2007 by Antony L. Hosking. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and full citation on the first page. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or fee. Request permission to publish from hosking@cs.purdue.edu.

CS502 Parsing

Syntax analysis

Context-free syntax is specified with a context-free grammar.

Formally, a CFG $G$ is a 4-tuple $(V_t, V_n, S, P)$, where:

- $V_t$ is the set of terminal symbols in the grammar. For our purposes, $V_t$ is the set of tokens returned by the scanner.
- $V_n$, the nonterminals, is a set of syntactic variables that denote sets of (sub)strings occurring in the language. These are used to impose a structure on the grammar.
- $S$ is a distinguished nonterminal ($S \in V_n$) denoting the entire set of strings in $L(G)$. This is sometimes called a goal symbol.
- $P$ is a finite set of productions specifying how terminals and nonterminals can be combined to form strings in the language. Each production must have a single nonterminal on its left-hand side.

The set $V = V_t \cup V_n$ is called the vocabulary of $G$.

Notation and terminology

- $a, b, c, \ldots \in V_t$
- $A, B, C, \ldots \in V_n$
- $U, V, W, \ldots \in V$
- $\alpha, \beta, \gamma, \ldots \in V^*$
- $u, v, w, \ldots \in V_t^*$

If $A \rightarrow \gamma$ then $\alpha A \beta \Rightarrow \alpha \gamma \beta$ is a single-step derivation using $A \rightarrow \gamma$.

Similarly, $\Rightarrow^*$ and $\Rightarrow^+$ denote derivations of $\geq 0$ and $\geq 1$ steps.

If $S \Rightarrow^* \beta$ then $\beta$ is said to be a sentential form of $G$.

$L(G) = \{ w \in V_t^* \mid S \Rightarrow^+ w \}$; $w \in L(G)$ is called a sentence of $G$.

Note, $L(G) = \{ \beta \in V^* \mid S \Rightarrow^* \beta \} \cap V_t^*$.

Syntax analysis

Grammars are often written in Backus-Naur form (BNF).

Example:

    1  <goal> ::= <expr>
    2  <expr> ::= <expr> <op> <expr>
    3          |  num
    4          |  id
    5  <op>   ::= +
    6          |  -
    7          |  *
    8          |  /

This describes simple expressions over numbers and identifiers.

In a BNF for a grammar, we represent
1. nonterminals with angle brackets or capital letters
2. terminals with typewriter font or underline
3. productions as in the example
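The definitions above can be made concrete with a small sketch. The Python below encodes the expression grammar as a dictionary from each nonterminal to its right-hand sides and applies productions one at a time, so that every intermediate list is a sentential form. The names GRAMMAR, derive_step, and is_sentence are illustrative assumptions, not part of the slides, and the chosen sequence of productions is just one possible derivation.

```python
# Illustrative encoding of the slide's expression grammar (names are assumptions).
# P maps each nonterminal to its alternative right-hand sides.
GRAMMAR = {
    "goal": [["expr"]],
    "expr": [["expr", "op", "expr"], ["num"], ["id"]],
    "op": [["+"], ["-"], ["*"], ["/"]],
}
START = "goal"                      # S, the goal symbol
NONTERMINALS = set(GRAMMAR)         # V_n
TERMINALS = {                       # V_t: every symbol that is never a left-hand side
    sym for alts in GRAMMAR.values() for rhs in alts for sym in rhs
} - NONTERMINALS


def derive_step(form, lhs, rhs):
    """One derivation step: rewrite the leftmost occurrence of lhs using lhs -> rhs."""
    if rhs not in GRAMMAR[lhs]:
        raise ValueError(f"{lhs} -> {rhs} is not a production")
    i = form.index(lhs)             # raises ValueError if lhs does not occur in form
    return form[:i] + rhs + form[i + 1:]


def is_sentence(form):
    """A sentential form is a sentence when it contains only terminals."""
    return all(sym in TERMINALS for sym in form)


# One derivation of "id * num", chosen by hand.
steps = [
    ("goal", ["expr"]),
    ("expr", ["expr", "op", "expr"]),
    ("expr", ["id"]),
    ("op", ["*"]),
    ("expr", ["num"]),
]
form = [START]
forms = [form]
for lhs, rhs in steps:
    form = derive_step(form, lhs, rhs)
    forms.append(form)

print(" => ".join(" ".join(f) for f in forms))
```

Each element of forms is a sentential form of the grammar; only the last one, id * num, consists solely of terminals and is therefore a sentence.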
Scanning vs. parsing

Where do we draw the line?

    term ::= [a-zA-Z]([a-zA-Z] | [0-9])* | 0 | [1-9][0-9]*
    op   ::= + | - | * | /
    expr ::= (term op)* term

Regular expressions are used to classify:
- identifiers, numbers, keywords
- REs are more concise and simpler for tokens than a grammar
- more efficient scanners can be built from REs (DFAs) than grammars

Context-free grammars are used to count:
- brackets: (), begin ...
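As a rough illustration of this division of labor, the Python sketch below classifies tokens with regular expressions and then checks bracket nesting with a depth counter. The counter stands in for the unbounded counting that a finite automaton cannot do and that a context-free grammar can describe; it is not a full parser. The names TOKEN_RE, tokenize, and balanced are assumptions for illustration, and the parenthesis token class is added here only so that there is something to count.

```python
import re

# Token classes follow the regular expressions for term and op;
# the "paren" class is an assumption added for the bracket example.
TOKEN_RE = re.compile(
    r"""
    (?P<id>[a-zA-Z][a-zA-Z0-9]*)   # identifier: a letter, then letters or digits
  | (?P<num>0|[1-9][0-9]*)         # number: 0, or a nonzero digit then digits
  | (?P<op>[+\-*/])                # operator
  | (?P<paren>[()])                # bracket
  | (?P<ws>\s+)                    # whitespace, skipped below
    """,
    re.VERBOSE,
)


def tokenize(text):
    """Classify the input into (kind, lexeme) pairs using regular expressions only."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if m.lastgroup != "ws":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


def balanced(tokens):
    """Check bracket nesting by counting depth, which needs unbounded memory."""
    depth = 0
    for _kind, lexeme in tokens:
        if lexeme == "(":
            depth += 1
        elif lexeme == ")":
            depth -= 1
            if depth < 0:          # a closing bracket with no matching opener
                return False
    return depth == 0


print(tokenize("x1 + 42"))
print(balanced(tokenize("((a + b) * c)")))
print(balanced(tokenize("(a + b))")))
```

The scanner never needs to remember how many brackets are open, while the nesting check cannot be done without that count.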