ece251-l2 - ECE 251 Assignment #2: Be the ANTLR Patrick Lam...


ECE 251 Assignment #2: Be the ANTLR
Patrick Lam
Due: November 5

1 Problem Description

One of the themes I've been repeating this term is that compiler technology is useful for many tasks besides building compilers for general-purpose programming languages. SQL, or the Structured Query Language, is a ubiquitous domain-specific language for talking to databases. Basic SQL is not difficult to pick up, but it is beyond the scope of this course. However, the parsing of SQL is very much on-topic for this course, and it is actually fairly simple.

In this lab, you will build a lexer and parser for a small SQL subset by hand, using the recursive-descent parser construction techniques we saw in class. Please do not use a parser generator for this assignment; I would like you to build at least one parser by hand in this course.

You may wish to consult the SQLite documentation's syntax diagrams for information on SQL. We will only be implementing a subset of this language.

2 Task 1: Lexical Analysis

We've seen that the first two tasks in creating a compiler are lexical analysis and parsing. The first task will be to create a lexer for your language, which will account for 20% of the marks for this lab. Specifically, I provide a class which splits the stream of characters into a stream of words. Your task is to create Tokens for these words by plugging the appropriate regular expressions into the Token class.

I've put up a simple test suite for lexical analysis. You should also create a couple of test cases (but, this time, you don't need to hand them in). I'll only run the test cases that I post.

Token specifications. SQL is case-insensitive. Your lexer must differentiate between keywords, identifiers, and literals (boolean, numeric and string). The enum type Token.Type contains all of the tokens that you need to recognize. Keywords are obvious. Identifiers either start with a letter (a-z) or an underscore and continue with letters, underscores, and digits, or consist of arbitrary characters between two double-quote marks ("). (To include a double quote, write two double quotes.) Boolean literals may be the strings TRUE or FALSE.
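The identifier and boolean rules above could be expressed as regular expressions roughly as follows. This is only a sketch under stated assumptions: the class name TokenPatternSketch, the enum Kind, and the method classify are hypothetical stand-ins, since the provided Token class and Token.Type enum are not shown here, and the keyword list is an illustrative subset rather than the full set the lab requires. Numeric and string literals are left out because their rules are not given in this part of the handout.

```java
import java.util.regex.Pattern;

// Hypothetical stand-in for the assignment's Token class; the real
// Token.Type enum and its API are not shown in the handout.
final class TokenPatternSketch {
    enum Kind { KEYWORD, BOOLEAN_LITERAL, IDENTIFIER, UNKNOWN }

    // SQL is case-insensitive, so every pattern uses CASE_INSENSITIVE.
    // Assumption: an illustrative subset of keywords, not the full list.
    private static final Pattern KEYWORD =
        Pattern.compile("SELECT|FROM|WHERE", Pattern.CASE_INSENSITIVE);

    private static final Pattern BOOLEAN =
        Pattern.compile("TRUE|FALSE", Pattern.CASE_INSENSITIVE);

    // Bare form: a letter or underscore, then letters, underscores, digits.
    // Quoted form: any characters between double quotes, where "" stands
    // for one embedded double quote. Assumption: an empty quoted
    // identifier is accepted, which the handout does not specify.
    private static final Pattern IDENTIFIER =
        Pattern.compile("[a-z_][a-z0-9_]*|\"(?:[^\"]|\"\")*\"",
                        Pattern.CASE_INSENSITIVE);

    // Keywords and booleans are tested first, because words such as
    // "select" or "true" also match the bare identifier pattern.
    static Kind classify(String word) {
        if (KEYWORD.matcher(word).matches()) return Kind.KEYWORD;
        if (BOOLEAN.matcher(word).matches()) return Kind.BOOLEAN_LITERAL;
        if (IDENTIFIER.matcher(word).matches()) return Kind.IDENTIFIER;
        return Kind.UNKNOWN;
    }

    public static void main(String[] args) {
        String[] words = {"select", "True", "_tmp1", "\"a \"\"b\"\" c\"", "9lives"};
        for (String w : words) {
            System.out.println(w + " -> " + classify(w));
        }
    }
}
```

The order of the checks matters more than the patterns themselves: because the bare identifier pattern accepts any keyword or boolean word, the more specific patterns have to be tried first.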

This note was uploaded on 10/28/2010 for the course ECE 493 taught by Professor Lam during the Spring '09 term at Waterloo.
