c04 - CS421 COMPILERS AND INTERPRETERS CS421 Syntax...

CS421 COMPILERS AND INTERPRETERS
Copyright 1994 - 2010 Zhong Shao, Yale University
Syntax Analysis : Page 1 of 36

Syntax Analysis
• Convert the list of tokens into a parse tree ("hierarchical" analysis)
• The syntactic structure is specified using context-free grammars [in lexical analysis, the lexical structure is specified using regular expressions]
• A parse tree (also called concrete syntax) is a graphic representation of a derivation that shows the hierarchical structure of the language
• Other secondary tasks: syntax error detection and recovery

[Figure: source program -> lexical analyzer -> parser -> parse tree -> abstract syntax; the lexical analyzer passes a "token" to the parser, and the parser requests the next one with "get next token"]

Tokens ---> Parse Tree
source program:
    function do_nothing1(a:int,b:string) = do_nothing2(1+a)
Tokens:
    FUNCTION ID(do_nothing1) LPAREN ID(a) COLON ID(int) COMMA ID(b) COLON ID(string) RPAREN EQ ID(do_nothing2) LPAREN INT(1) PLUS ID(a) RPAREN
The parse tree captures the syntactic structure!
ParseTree (indentation shows parent/child structure; children are listed left to right):
    fundec
        FUNCTION
        ID
        LPAREN
        tyfields
            tyf
                ID COLON ID
            COMMA
            tyf
                ID COLON ID
        RPAREN
        EQ
        exp
            ID
            LPAREN
            expl
                exp
                    exp
                        INT
                    PLUS
                    exp
                        ID
            RPAREN

Main Problems
• How to specify the syntactic structure of a programming language? By using context-free grammars (CFG)!
• How to parse? That is, given a CFG and a stream of tokens, how to build its parse tree?
    1. bottom-up parsing
    2. top-down parsing
• How to make sure that the parser generates a unique parse tree? (the ambiguity problem)
• Given a CFG, how to build its parser quickly? By using YACC, the parser generator
• How to detect, report, and recover from syntax errors?

Grammars
• A grammar is a precise, understandable specification of programming language syntax (but not semantics!)
• A grammar is normally specified using Backus-Naur Form (BNF):
    1. a set of rewriting rules (also called productions)
           stmt -> if expr then stmt else stmt
           expr -> expr + expr | expr * expr | ( expr ) | id        ("|" is read as "or")
    2. a set of non-terminals and a set of terminals
           non-terminals: stmt, expr
           terminals: if, then, else, +, *, (, ), id
    3. lists are specified using recursion
           stmt -> begin stmt-list end
           stmt-list -> stmt | stmt ; stmt-list
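The BNF rules on the Grammars slide can be written down as plain data, which makes the three ingredients (productions, non-terminals and terminals, recursion for lists) easy to inspect. The following is only a minimal sketch in Python, which the slides themselves do not use; the names GRAMMAR, non_terminals, terminals, to_bnf, and expand_stmt_list are hypothetical helpers introduced for illustration. It assumes that every symbol appearing on a left-hand side is a non-terminal and every other symbol is a terminal, so the computed sets also include stmt-list, begin, end, and ";", which the slide's shorter lists leave out.

```python
# Productions from the slide: non-terminal -> list of alternatives.
# Each alternative is a list of symbols; "|" in BNF separates alternatives.
GRAMMAR = {
    "stmt": [
        ["begin", "stmt-list", "end"],
        ["if", "expr", "then", "stmt", "else", "stmt"],
    ],
    "stmt-list": [
        ["stmt"],                      # base case: a single statement
        ["stmt", ";", "stmt-list"],    # recursive case: one more statement
    ],
    "expr": [
        ["expr", "+", "expr"],
        ["expr", "*", "expr"],
        ["(", "expr", ")"],
        ["id"],
    ],
}


def non_terminals(grammar):
    """Assumption: the non-terminals are exactly the left-hand sides."""
    return set(grammar)


def terminals(grammar):
    """Every right-hand-side symbol that is not a non-terminal."""
    nts = non_terminals(grammar)
    return {
        sym
        for alternatives in grammar.values()
        for alt in alternatives
        for sym in alt
        if sym not in nts
    }


def to_bnf(grammar):
    """Render the table back into the 'lhs -> a | b' notation of the slide."""
    return "\n".join(
        f"{lhs} -> " + " | ".join(" ".join(alt) for alt in alternatives)
        for lhs, alternatives in grammar.items()
    )


def expand_stmt_list(n):
    """Rewrite stmt-list into a list of n stmt symbols.

    Applies the recursive alternative n - 1 times, then the base alternative.
    """
    if n < 1:
        raise ValueError("a stmt-list contains at least one stmt")
    base, recursive = GRAMMAR["stmt-list"]
    form = ["stmt-list"]
    for _ in range(n - 1):
        form = form[:-1] + recursive   # replace the trailing stmt-list
    return form[:-1] + base


if __name__ == "__main__":
    print(to_bnf(GRAMMAR))
    print(sorted(non_terminals(GRAMMAR)))
    print(sorted(terminals(GRAMMAR)))
    print(" ".join(expand_stmt_list(3)))
```

Calling expand_stmt_list(3) rewrites stmt-list twice with the recursive alternative and once with the base alternative, which is how a recursive rule describes a list of any length without a dedicated list notation.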
Context-Free Grammars (CFG)
• A context-free grammar is defined by the 4-tuple (T,N,P,S): T is the vocabulary of terminals, N is the set of non-terminals, P is the set of productions (rewriting rules), and S is the start symbol (which also belongs to N).
• Example: a context-free grammar G=(T,N,P,S)
       T = { +, *, (, ), id },
       N = { E },
       P = { E -> E + E, E -> E * E, E -> ( E ), E -> id },
       S = E
• Written in BNF: E -> E + E | E * E | ( E ) | id
• Every language described by a regular expression can also be described by a CFG

Context-Free Languages (CFL)
• Each context-free grammar G=(T,N,P,S) defines a context-free language L = L(G)
• The CFL L(G) contains all sentences of terminal symbols (from T) derived by repeated application of
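
To make the link between a grammar G=(T,N,P,S) and its language L(G) concrete, the example grammar from the Context-Free Grammars slide can be encoded directly and its short sentences enumerated. This is only a sketch in Python under the standard definition of derivation (start from S and repeatedly replace a non-terminal by the right-hand side of one of its productions until only terminals remain). The function name sentences and the length bound max_len are hypothetical; the bound is needed because L(G) is infinite.

```python
from collections import deque

# The example grammar G = (T, N, P, S) from the slide.
T = {"+", "*", "(", ")", "id"}
N = {"E"}
P = [
    ("E", ("E", "+", "E")),
    ("E", ("E", "*", "E")),
    ("E", ("(", "E", ")")),
    ("E", ("id",)),
]
S = "E"


def sentences(max_len):
    """Return all sentences of L(G) with at most max_len terminal symbols.

    Breadth-first search over sentential forms, always rewriting the
    leftmost non-terminal. No production here shrinks a form, so any form
    longer than max_len can be discarded safely.
    """
    start = (S,)
    seen = {start}
    queue = deque([start])
    found = set()
    while queue:
        form = queue.popleft()
        # Position of the leftmost non-terminal, or None if all are terminals.
        idx = next((i for i, sym in enumerate(form) if sym in N), None)
        if idx is None:
            found.add(" ".join(form))      # only terminals left: a sentence
            continue
        for lhs, rhs in P:
            if lhs != form[idx]:
                continue
            new_form = form[:idx] + rhs + form[idx + 1:]
            if len(new_form) <= max_len and new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return sorted(found)


if __name__ == "__main__":
    for sentence in sentences(3):
        print(sentence)
```

With max_len set to 3 the search finds four sentences: id, id + id, id * id, and ( id ). Raising the bound to 5 adds longer ones such as id + id * id. Rewriting only the leftmost non-terminal is a design choice that keeps the search small; it does not lose any sentences, since every derivable sentence also has a leftmost derivation.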

This document was uploaded on 01/06/2012.
