
ICS 313 - Fundamentals of Programming Languages

4. Lexical and Syntax Analysis

4.1 Introduction

- Language implementation systems must analyze source code, regardless of the specific implementation approach
- Nearly all syntax analysis is based on a formal description of the syntax of the source language (BNF)
- The syntax analysis portion of a language processor nearly always consists of two parts:
  - A low-level part called a lexical analyzer (mathematically, a finite automaton based on a regular grammar)
  - A high-level part called a syntax analyzer, or parser (mathematically, a push-down automaton based on a context-free grammar, or BNF)
- Reasons to use BNF to describe syntax:
  - Provides a clear and concise syntax description
  - The parser can be based directly on the BNF
  - Parsers based on BNF are easy to maintain
4.1 Introduction (continued)

- Reasons to separate lexical and syntax analysis:
  - Simplicity - less complex approaches can be used for lexical analysis; separating the two simplifies the parser
  - Efficiency - separation allows optimization of the lexical analyzer
  - Portability - parts of the lexical analyzer may not be portable, but the parser is always portable

4.2 Lexical Analysis

- A lexical analyzer is a pattern matcher for character strings
- A lexical analyzer is a "front-end" for the parser
- It identifies substrings of the source program that belong together; these substrings are called lexemes
- Lexemes match a character pattern, which is associated with a lexical category called a token
- For example, sum is a lexeme; its token may be IDENT
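
The lexeme/token distinction can be illustrated with a small pattern matcher. The following is only a sketch, not the course's implementation: IDENT comes from the slide, while the other token names (INT_LIT, ASSIGN_OP, ADD_OP, SEMICOLON), the patterns, and the function name lexemes_and_tokens are assumptions made for illustration.

```python
import re

# Token categories paired with the character pattern their lexemes match.
# Only IDENT appears in the slides; the other names are hypothetical.
TOKEN_PATTERNS = [
    ("INT_LIT", r"\d+"),
    ("IDENT", r"[A-Za-z][A-Za-z0-9]*"),
    ("ASSIGN_OP", r"="),
    ("ADD_OP", r"\+"),
    ("SEMICOLON", r";"),
]

# One combined pattern with a named group per token category.
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_PATTERNS))


def lexemes_and_tokens(source):
    """Return a list of (lexeme, token) pairs, skipping whitespace."""
    pairs = []
    pos = 0
    while pos < len(source):
        if source[pos].isspace():
            pos += 1
            continue
        match = MASTER.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character {source[pos]!r} at position {pos}")
        # match.lastgroup is the name of the group that matched, i.e. the token.
        pairs.append((match.group(), match.lastgroup))
        pos = match.end()
    return pairs


for lexeme, token in lexemes_and_tokens("sum = sum + 47;"):
    print(f"{lexeme!r:8} {token}")
```

Each printed row pairs a substring of the input (the lexeme) with the category it belongs to (the token), so the lexeme sum is reported with the token IDENT.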
4.2 Lexical Analysis (continued)

- The lexical analyzer is usually a function that is called by the parser when it needs the next token
- Three approaches to building a lexical analyzer:
  1. Write a formal description of the tokens and use a software tool that constructs table-driven lexical analyzers given such a description
  2. Design a state diagram that describes the tokens and write a program that implements the state diagram
  3. Design a state diagram that describes the tokens and hand-construct a table-driven implementation of the state diagram
- Only approach 2 is discussed here
- State diagram design:
  - A naive state diagram would have a transition from every state on every character in the source language; such a diagram would be very large!
- In many cases, transitions can be combined to simplify the state diagram:
  - When recognizing an identifier, all uppercase and lowercase letters are equivalent - use a character class that includes all letters
  - When recognizing an integer literal, all digits are equivalent - use a digit class
  - Reserved words and identifiers can be recognized together (rather than having a part of the diagram for each reserved word)
    - Use a table lookup to determine whether a possible identifier is in fact a reserved word
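
A minimal sketch of approach 2 follows, assuming a tiny language with identifiers, integer literals, and single-character "unknown" symbols. The class name Lexer, the token names, the contents of the RESERVED table, and the rule that identifiers are a letter followed by letters or digits are all assumptions made for illustration; the slides do not specify them.

```python
# Hypothetical reserved-word table; a real language would list its own words.
RESERVED = {"if", "else", "while", "for"}

# Character classes: one transition per class instead of one per character.
LETTER, DIGIT, OTHER, EOF = "LETTER", "DIGIT", "OTHER", "EOF"


def char_class(ch):
    """Map a character (or "" for end of input) to its character class."""
    if ch == "":
        return EOF
    if ch.isascii() and ch.isalpha():
        return LETTER
    if ch.isascii() and ch.isdigit():
        return DIGIT
    return OTHER


class Lexer:
    def __init__(self, source):
        self.source = source
        self.pos = 0

    def _peek(self):
        """Return the current character, or "" at end of input."""
        return self.source[self.pos] if self.pos < len(self.source) else ""

    def next_token(self):
        """Return the next (token, lexeme) pair; a parser would call this on demand."""
        # Start state: skip whitespace.
        while self._peek().isspace():
            self.pos += 1
        start = self.pos
        cls = char_class(self._peek())
        if cls == EOF:
            return ("EOF", "")
        if cls == LETTER:
            # Identifier state: stay here while the input is a LETTER or DIGIT.
            while char_class(self._peek()) in (LETTER, DIGIT):
                self.pos += 1
            lexeme = self.source[start:self.pos]
            # Table lookup decides between a reserved word and an identifier.
            return ("RESERVED" if lexeme in RESERVED else "IDENT", lexeme)
        if cls == DIGIT:
            # Integer-literal state: stay here while the input is a DIGIT.
            while char_class(self._peek()) == DIGIT:
                self.pos += 1
            return ("INT_LIT", self.source[start:self.pos])
        # Any other single character is reported as UNKNOWN in this sketch.
        self.pos += 1
        return ("UNKNOWN", self.source[start:self.pos])


def all_tokens(source):
    """Stand-in for a parser: request tokens until EOF is reached."""
    lexer = Lexer(source)
    tokens = []
    while True:
        token = lexer.next_token()
        tokens.append(token)
        if token[0] == "EOF":
            return tokens


print(all_tokens("while count 42"))
```

Because letters and digits are handled as classes, the identifier and integer-literal states each need only one loop, and reserved words need no states of their own: while is first recognized like any identifier and then reclassified by the lookup in RESERVED.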

This note was uploaded on 09/15/2010 for the course ICS ics103 taught by Professor Alvi during the Spring '07 term at King Fahd University of Petroleum & Minerals.
