csc402-ln004

Multi-Symbol Words - Lexical Analysis

In our exp0 programming language we only had words of length one.
However, most programming languages have words of length greater than one.
The lexical structure of a programming language specifies how symbols are combined to form words.
  Not to be confused with the phrase structure, which tells how words are combined to form phrases and sentences.
The lexical structure of a programming language can be specified with regular expressions.
The parser for the lexical structure of a programming language is called a lexical analyzer.

This gives us the following hierarchy:
  symbol -> word -> phrase -> sentence
  Lexical structure (regular expressions): symbols are combined into words
  Phrase structure (grammars): words are combined into phrases and sentences

Regular Expressions (RE)

REs can be defined inductively as follows:
  Each letter a through z and A through Z constitutes a RE and matches that letter.
  Each digit 0 through 9 constitutes a RE and matches that digit.
  If A is a RE, then (A) is also a RE and matches whatever A matches.
  If A and B are REs, then AB is also a RE and matches the concatenation of A and B.
  If A and B are REs, then A|B is also a RE and matches A or B.
  If A is a RE, then A? is also a RE and matches zero or one instances of A.
  If A is a RE, then A* is also a RE and matches zero or more instances of A.
  If A is a RE, then A+ is also a RE and matches one or more instances of A.

Special RE classes:
  (a..z) - any single character between a and z
  (A..Z) - any single character between A and Z
  (0..9) - any single digit between 0 and 9
  . - the dot matches any character; not to be confused with the .. operator above

Also, any other character can be considered a RE. If it is a character that appears in the syntax of REs, then it needs to be escaped, e.g., \++ matches one or more + characters.

Examples:
  (print|write)
  -?(0..9)+
  (a..z|A..Z)+(0..9)*

Exercises:
  Write a RE for character strings that start and end with a single digit....
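
To experiment with the example REs, they can be translated into Python's re module. The sketch below is only an illustration and is not part of the course material: it assumes that the class notation (a..z) corresponds to Python's [a-z], and the names EXAMPLES, TOKEN_SPEC, matches and tokenize, as well as the token names KEYWORD, NUMBER and NAME, are hypothetical. The second half shows one possible way a small lexical analyzer could combine symbols into words using those REs.

```python
import re

# Notes' notation -> Python re syntax. The mapping (a..z) -> [a-z] is an
# assumption about how the notes' class notation corresponds to Python.
EXAMPLES = {
    "(print|write)": r"(print|write)",
    "-?(0..9)+": r"-?[0-9]+",
    "(a..z|A..Z)+(0..9)*": r"[a-zA-Z]+[0-9]*",
    r"\++": r"\++",  # escaped + followed by the "one or more" operator
}


def matches(notes_re, text):
    """Return True if the whole of `text` is matched by the translated RE."""
    return re.fullmatch(EXAMPLES[notes_re], text) is not None


# Hypothetical word classes for a tiny lexical analyzer. Python tries the
# alternatives in order, so the keyword RE is listed before the name RE.
# \b keeps "printer" from being split into the keyword "print" plus "er".
TOKEN_SPEC = [
    ("KEYWORD", r"(?:print|write)\b"),
    ("NUMBER", r"-?[0-9]+"),
    ("NAME", r"[a-zA-Z]+[0-9]*"),
    ("SKIP", r"[ \t]+"),  # blanks separate words but are not words themselves
]

MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize(text):
    """Combine the symbols of `text` into a list of (kind, word) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected symbol {text[pos]!r} at position {pos}")
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


if __name__ == "__main__":
    print(matches("-?(0..9)+", "-42"))
    print(tokenize("print x1 -42"))
```

Under these assumptions, tokenize("print x1 -42") yields the words print, x1 and -42, tagged KEYWORD, NAME and NUMBER respectively. A phrase-structure parser would then work on these words rather than on individual symbols.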