ECS 120 Lesson 2 – Alphabets, Languages and Grammars, Pt. 2; Finite State Machines, Pt. 1
Oliver Kreylos
Monday, April 2nd, 2001

1 Languages

A (formal) language L over an alphabet Σ is a set of words over Σ; that is, L is a subset of Σ*.

Examples:

- The set of all English words over the Latin alphabet: { a, Aardvark, aback, ..., Zulu, zymotic }.
- The set of all correctly formed arithmetic expressions over the alphabet { a, (, ), +, * }: { a, (a), a+a, a*(a+a), ... }.
- The set of all syntactically correct C programs over the ASCII alphabet: { main() {}, int main() {return 0;}, ... }.

In natural languages, words are only the building blocks for more complex constructs, e.g., phrases and sentences, whereas in formal languages words usually have no relationship to each other and stand on their own.

To specify a language, one has to specify exactly which words over the alphabet are included in the language. There are at least three different methods to do this:

- Provide a mathematical description of the set of words, either by listing all words explicitly or by giving a property each word must have, e.g., L = { w ∈ ASCII* | w is a syntactically correct C program }.
- Provide a grammar that can construct all words in the language.

- Provide an automaton that can decide whether a given word is in the language or not.

The first method is very general; in fact, it is too general, since a language specified only by an arbitrary mathematical description cannot in general be generated or checked by automatic means.

2 Grammars

A grammar describes how to create the words of a language. Formally, a grammar G is a 4-tuple G = (V, Σ, R, S), where

1. V is a finite set of variables or nonterminals.
2. Σ is an alphabet of terminals disjoint from V, i.e., V ∩ Σ = ∅.
3. R is a finite set of substitution rules or productions of the form u → v, where u, v ∈ (V ∪ Σ)* and u ∉ Σ*, i.e., u contains at least one variable.
4.
