Syntax

Syntax - COP4020 Programming Languages Syntax Prof. Robert...

COP4020 Programming Languages
Syntax
Prof. Robert van Engelen
COP4020 Spring 2011, 2/17/11

Overview
- Tokens and regular expressions
- Syntax and context-free grammars
- Grammar derivations
- More about parse trees
- Top-down and bottom-up parsing
- Recursive descent parsing

Tokens
- Tokens are the basic building blocks of a programming language
    - Keywords, identifiers, literal values, operators, punctuation
- We saw that the first compiler phase (scanning) splits up a character stream into tokens
- Tokens have a special role with respect to:
    - Free-format languages: source program is a sequence of tokens and horizontal/vertical position of a token on a page is unimportant (e.g. Pascal)
    - Fixed-format languages: indentation and/or position of a token on a page is significant (early Basic, Fortran, Haskell)
    - Case-sensitive languages: upper- and lowercase are distinct (C, C++, Java)
    - Case-insensitive languages: upper- and lowercase are identical (Ada, Fortran, Pascal)

Defining Token Patterns with Regular Expressions
- The makeup of a token is described by a regular expression (RE)
- A regular expression r is one of
    - A character (an element of the alphabet Σ), e.g. with Σ = {a, b, c}: a
    - Empty, denoted by ε
    - Concatenation: a sequence of regular expressions: r1 r2 r3 ... rn
    - Alternation: regular expressions separated by a bar: r1 | r2
    - Repetition: a regular expression followed by a star (Kleene star): r*

Example Regular Definitions for Tokens
- digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
- unsigned_integer → digit digit*
- signed_integer → (+ | - | ε) unsigned_integer
- relop → < | <= | <> | > | >= | =
- letter → a | b | ... | z | A | B | ... | Z
- id → letter (letter | digit)*
- Cannot use recursive definitions! E.g. this is not a regular definition:
    digits → digit digits | digit

Finite State Machines = Regular Expression Recognizers
[Figure: two transition diagrams. The first, with states 1 to 8 and edges labeled <, =, > and other, recognizes relop and returns (relop, LE), (relop, NE), (relop, LT), (relop, EQ), (relop, GE) or (relop, GT); two of its states are marked *. The second, with states 9 to 11 and edges labeled letter, letter or digit, and other, recognizes id and returns (gettoken(), install_id()); its final state is marked *.]
    relop → < | <= | <> | > | >= | =
    id → letter (letter | digit)*

Non-Deterministic Finite State Automata
- An NFA is a 5-tuple (S, Σ, δ, s, F) where
    S is a finite set of states
    Σ is a finite set of symbols, the alphabet
    δ is a mapping from S × Σ to a set of states
    s ∈ S is the start state
    F ⊆ S is the set of accepting (or final) states
[Figure: example NFA transition graph with a start arrow and edges labeled a and b.]
    S = {0, 1, 2, 3}
    Σ = {a, b}
    s0 = 0
    F = {3}

From a Regular Expression to an NFA
[Figure: construction combining N(r1) and N(r2) between a start state i and a final state f.] ...
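The regular definitions for tokens listed on the slides map fairly directly onto a regular expression library. The sketch below transcribes them into Python's `re` syntax and wraps them in a small scanner. The names `tokenize` and `TOKEN_RE`, the whitespace-skipping rule, and the choice to scan `signed_integer` rather than `unsigned_integer` are assumptions made for illustration and are not part of the slides.

```python
import re

# Regular definitions from the slides, written in Python's re syntax.
DIGIT = r"[0-9]"
LETTER = r"[a-zA-Z]"
UNSIGNED_INTEGER = rf"{DIGIT}{DIGIT}*"
SIGNED_INTEGER = rf"(?:\+|-|){UNSIGNED_INTEGER}"   # (+ | - | empty) unsigned_integer
# Python tries alternatives left to right, so two-character operators come first.
RELOP = r"<=|<>|<|>=|>|="
ID = rf"{LETTER}(?:{LETTER}|{DIGIT})*"

# Assumption: whitespace separates tokens and is skipped.
TOKEN_RE = re.compile(
    rf"(?P<id>{ID})"
    rf"|(?P<signed_integer>{SIGNED_INTEGER})"
    rf"|(?P<relop>{RELOP})"
    rf"|(?P<skip>\s+)"
)


def tokenize(text):
    """Split text into (token_kind, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "skip":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


if __name__ == "__main__":
    print(tokenize("x1 <= 42"))
```

One design point worth noting: the slide lists `relop` as `< | <= | <> | ...`, which is fine for a recognizer that takes the longest match, but Python's alternation is ordered, so the sketch reorders the alternatives to put `<=` and `<>` ahead of `<`.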
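The relop recognizer from the finite state machine slide can also be coded by hand, one branch per state. The following is a minimal sketch under assumptions: the diagram itself did not survive extraction, so the branching is inferred from the edge labels (`<`, `=`, `>`, other) and the six return values, and the function name `scan_relop` and its `(token, next_position)` result shape are hypothetical.

```python
def scan_relop(text, pos=0):
    """Recognize a relational operator starting at text[pos].

    Returns (("relop", name), next_pos), or None if no relop starts at pos.
    """
    def peek(i):
        # Empty string stands for end of input, which counts as "other".
        return text[i] if i < len(text) else ""

    first = peek(pos)
    if first == "<":
        second = peek(pos + 1)
        if second == "=":
            return ("relop", "LE"), pos + 2
        if second == ">":
            return ("relop", "NE"), pos + 2
        # "other" edge: the lookahead character is not consumed.
        return ("relop", "LT"), pos + 1
    if first == "=":
        return ("relop", "EQ"), pos + 1
    if first == ">":
        if peek(pos + 1) == "=":
            return ("relop", "GE"), pos + 2
        # "other" edge: the lookahead character is not consumed.
        return ("relop", "GT"), pos + 1
    return None


if __name__ == "__main__":
    print(scan_relop("<>"))
    print(scan_relop("x >= 1", 2))
```

In this sketch the "other" edges look at the next character to decide the token but leave it in the input for the next token, which is why `LT` and `GT` advance the position by only one.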
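The 5-tuple definition of an NFA can be made concrete by simulating one: track the set of states the machine could be in and apply the transition mapping to every state in that set for each input symbol. The sketch below uses the slide's S = {0, 1, 2, 3}, alphabet {a, b}, start state 0 and F = {3}. The transition table is an assumption, since the figure was lost; it is the common textbook NFA for (a|b)*abb, which is consistent with the surviving edge labels. The names `nfa_accepts` and `DELTA` are hypothetical.

```python
def nfa_accepts(delta, start, accepting, word):
    """Simulate an NFA without empty moves by tracking all reachable states."""
    current = {start}
    for symbol in word:
        # Union of delta(state, symbol) over every state we might be in.
        current = {
            target
            for state in current
            for target in delta.get((state, symbol), set())
        }
    return bool(current & accepting)


# Assumed transition mapping (the slide's figure is not available):
# an NFA for (a|b)*abb over states {0, 1, 2, 3}.
DELTA = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
    (2, "b"): {3},
}
START = 0
ACCEPTING = {3}

if __name__ == "__main__":
    for word in ["abb", "aabb", "ab", ""]:
        print(repr(word), nfa_accepts(DELTA, START, ACCEPTING, word))
```

Because the slide defines the transition mapping over states and alphabet symbols only, the sketch does not handle empty (ε) moves; supporting them would require taking an ε-closure of the state set after each step.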