02._Lexical_Analysis

Lexical Analysis
Teodor Przymusinski, UC Riverside
Compiler Construction CS152 - Based on the slides by Leonidas Fegaras, UTA


Lexical Analysis

A scanner groups input characters into tokens. For the input

    let x = x *(a+123)

the scanner produces:

    token          value
    keyword        let
    identifier     x
    equal          =
    identifier     x
    star           *
    left-paren     (
    identifier     a
    plus           +
    integer        123
    right-paren    )

Tokens are typically represented by numbers.


Parser

- Each time the parser needs a token, it sends a request to the scanner.
- The scanner reads as many characters from the input stream as necessary to construct a single token.
- When a single token is formed, the scanner is suspended and returns the token to the parser.
- The parser repeatedly calls the scanner until all the tokens have been read from the input stream.

[Figure: source file -> (get next character) -> scanner -> (get token / token) -> parser -> AST]


Tasks of a Scanner

A typical scanner:
- recognizes the keywords of the language (the reserved words that have a special meaning in the language, such as the word class in Java)
- recognizes special characters, such as ( and ), or groups of special characters, such as := and ==
- recognizes identifiers, integers, reals, decimals, strings, etc.
- ignores whitespace (tabs, blanks, etc.) and comments
- recognizes and processes special directives (such as the #include "file" directive in C) and macros


Scanner Generators

Input: a scanner specification
- describes every token using Regular Expressions (REs); e.g., the RE [a-z][a-zA-Z0-9]* matches every identifier made of one or more alphanumeric characters whose first character is a lower-case letter
- handles whitespace and resolves ambiguities

Output: the actual scanner

Scanner generators compile regular expressions into efficient programs (finite state machines).

You will use a scanner generator for Java, called JLex, for the project.


Regular Expressions

Regular expressions are a very convenient form of representing (possibly infinite) sets of strings, called regular sets.
e.g., the RE (a | b)*aa represents the infinite set {aa, aaa, baa, abaa, ... }

An RE is one of the following, where L(A) and L(B) denote the sets designated by the REs A and B:

    Name             RE       Designation
    epsilon          ε        {ε}, the set containing only the empty string
    symbol           a        {a} for some character a
    concatenation    AB       the set { rs | r ∈ L(A), s ∈ L(B) }, where rs is the concatenation of r and s
    alternation      A | B    the set L(A) ∪ L(B)
    repetition       A*       the set ε | A | (AA) | (AAA) | ... (an infinite set)

e.g., the RE (a | b)c designates { rs | r ∈ {a} ∪ {b}, s ∈ {c} }, which is equal to {ac, bc}

Shortcuts:...
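
The token table for let x = x *(a+123) near the top of these slides can be reproduced with a small hand-written scanner. The sketch below is only an illustration in Python using its standard re module; it is not JLex and not the course's project code. The names TOKEN_SPEC and scan are hypothetical, and the tie-breaking rule (longest match wins, earlier rule wins on equal length) is assumed here because it is the usual convention in lex-style generators. The scanner is written as a generator, so it is suspended after each token until the caller asks for the next one, much like the scanner/parser interaction described above.

```python
import re

# Hypothetical token specification (an assumption for illustration).
# Each entry pairs a token name from the slides with a regular expression.
# Order matters only for ties: "let" matches both keyword and identifier,
# and the earlier rule (keyword) wins.
TOKEN_SPEC = [
    (name, re.compile(pattern))
    for name, pattern in [
        ("keyword", r"let"),
        ("identifier", r"[a-z][a-zA-Z0-9]*"),
        ("integer", r"[0-9]+"),
        ("equal", r"="),
        ("star", r"\*"),
        ("plus", r"\+"),
        ("left-paren", r"\("),
        ("right-paren", r"\)"),
        ("whitespace", r"[ \t\n]+"),
    ]
]


def scan(text):
    """Yield (token, value) pairs, one per request from the caller."""
    pos = 0
    while pos < len(text):
        best_name, best_end = None, pos
        for name, regex in TOKEN_SPEC:
            match = regex.match(text, pos)
            # Strictly longer matches replace the current best, so on
            # equal length the rule listed first is kept.
            if match and match.end() > best_end:
                best_name, best_end = name, match.end()
        if best_name is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if best_name != "whitespace":  # whitespace is recognized but ignored
            yield best_name, text[pos:best_end]
        pos = best_end


if __name__ == "__main__":
    for token, value in scan("let x = x *(a+123)"):
        print(token, value)
```

Because the longest match is preferred, an input such as letter is scanned as a single identifier rather than the keyword let followed by ter. A generated scanner would reach the same result with a finite state machine instead of trying each expression in turn.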