ECS 120 Lesson 10 – Context-Free Grammars
Oliver Kreylos
Friday, April 20th, 2001

In the beginning of the course, we encountered the Chomsky Hierarchy, a framework to classify languages into four different categories, CH0 to CH3. Until now, we have only had a closer look at the most restrictive level of the Chomsky Hierarchy: CH3, the class of regular languages. In the last lesson, we found out that there are languages which are not in CH3, for example the language { 0ⁿ1ⁿ | n ≥ 0 } or the language R(Σ) of regular expressions over the alphabet Σ. Today we will look at CH2, the class of context-free languages (CFL), which includes all regular languages, and also the two nonregular languages mentioned above.

Context-free languages are very important in practical applications; almost every programming language has an underlying context-free structure, and a parser, a program to analyze the context-free structure of an input string, is a major component of any compiler for these programming languages. The parser's job is to break up a source program into its syntactical parts, e.g., modules, functions, loops, blocks and expressions, and to invoke the code generator, the program that converts syntactical structures into machine code, on these parts independently.

Context-free languages are specified either by context-free grammars (CFG) or by pushdown automata, a generalization of finite state machines. We will start by having a closer look at context-free grammars.

1 Definition of Context-Free Grammars

We already defined a grammar G as a four-tuple G = (V, Σ, R, S), where V is a finite set of variables, Σ is an alphabet with V ∩ Σ = ∅, R is a finite set of rules and S ∈ V is the start symbol. For context-free grammars, all rules in R are of the form A → w, where A ∈ V is a variable and w ∈ (V ∪ Σ)* is a string of variables and terminals. If there are several rules with the same left-hand side, e.g., A → w₁, A → w₂, ..., A → wₙ, those are often combined into a single rule A → w₁ | w₂ | ··· | wₙ. The vertical bar is treated like an "or."

If u, v, w ∈ (V ∪ Σ)* are strings of variables and terminals, and A → w ∈ R is a rule of G, then uAv ⇒ uwv (uAv yields uwv). If u = v, or there exists an n ≥ 0 and a sequence (u₁, u₂, ..., uₙ) ∈ ((V ∪ Σ)*)ⁿ of strings of variables and terminals such that u ⇒ u₁ ⇒ u₂ ⇒ ··· ⇒ uₙ ⇒ v, then u ⇒* v (u derives v). The language of grammar G is

L(G) := { w ∈ Σ* | S ⇒* w }.

2 Example: The Language L = { 0ⁿ1ⁿ | n ≥ 0 }

Consider the grammar G = ({S}, {0, 1}, R, S), where R = { S → 0S1 | ε }. We will now prove that this grammar generates exactly the nonregular language L ....
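The definitions of ⇒ and ⇒* can be tried out on the example grammar with a short Python sketch. It assumes the rules of the example are S → 0S1 | ε, with ε written as the empty string; the names RULES, START, yields and language_up_to are made up for this illustration. The search stops expanding a string once it contains more than max_len terminals. That is enough for this grammar, but it is not a general membership test: a grammar with a rule such as S → SS could make the search run forever.

```python
from collections import deque

# Grammar rules as a dict: variable -> list of right-hand sides.
# Assumed rules of the example grammar: S -> 0S1 | ε (ε is the empty string).
RULES = {"S": ["0S1", ""]}
START = "S"


def yields(u, rules):
    """Return all strings v with u => v: one rule applied to one variable in u."""
    result = set()
    for i, symbol in enumerate(u):
        for rhs in rules.get(symbol, []):
            result.add(u[:i] + rhs + u[i + 1:])
    return result


def language_up_to(rules, start, max_len):
    """Return all terminal strings w with start =>* w and len(w) <= max_len.

    Breadth-first search over sentential forms. Strings that already
    contain more than max_len terminals are not expanded further.
    """
    variables = set(rules)
    seen = {start}
    queue = deque([start])
    words = set()
    while queue:
        u = queue.popleft()
        if not any(c in variables for c in u):
            words.add(u)  # no variables left: u is a word of L(G)
            continue
        for v in yields(u, rules):
            terminal_count = sum(1 for c in v if c not in variables)
            if terminal_count <= max_len and v not in seen:
                seen.add(v)
                queue.append(v)
    return words


if __name__ == "__main__":
    # Print the derivable words of length at most 6, shortest first.
    for w in sorted(language_up_to(RULES, START, 6), key=len):
        print(repr(w))
```

Under these assumptions, every word the search finds has the form 0ⁿ1ⁿ, which matches the claim of the example. This is only an experiment on short strings and does not replace the proof.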
 Spring '07
 Filkov
 Formal language, Formal grammar, parse tree, start symbol
