Context-Free Languages Sample Solutions
Designing CFLs
Problem 1: Give a context-free grammar that generates the following language over $\{0,1\}$: $L = \{w \mid w \text{ contains more 1s than 0s}\}$.

Solution: Idea: this is similar to the language where the number of 0s is equal to the number of 1s, except we must ensure that we generate at least one 1, and we must allow an arbitrary number of 1s to be generated anywhere in the derivation. The following grammar accomplishes this task:

$S \to T1T$
$T \to 0T1T \mid 1T0T \mid 1T \mid \varepsilon$

Proof of correctness: it should be clear that this grammar cannot generate any strings not in $L$. The production for $S$ guarantees that any string contains at least one 1, and any time a 0 is generated, at least one additional 1 is generated with it.

We must argue that the grammar generates all strings with more 1s than 0s. The productions for $T$ generate all strings containing a number of 1s greater than or equal to the number of 0s (proven below). The production for $S$ asserts that any string $w \in L$ can be written $w = x1y$ where $x \in L(T)$ and $y \in L(T)$. This is true: if $w$ begins with a 1, we can say that $x = \varepsilon$. If $w$ begins with a 0, we can use a counter which is incremented by 1 for each 0 encountered and decremented by 1 for each 1 encountered, and at some point in the string this counter must become $-1$ upon encountering a 1, since $w$ contains more 1s than 0s. Let the part of $w$ prior to this point be $x$ and the part of $w$ after this point be $y$; clearly, this breakdown of $w$ satisfies the requirements stated above.

Now, to show that $T$ generates all strings $w$ in which the number of 1s is at least the number of 0s, the same "counter" argument will work. If $w$ begins with a 0, it must be of the form $0x1y$ where $x \in L(T)$ and $y \in L(T)$. If, on the other hand, $w$ begins with a 1, it must either be the case that $w = 1x0y$ where $x \in L(T)$ and $y \in L(T)$, or it is the case that $w = 1x$ where $x \in L(T)$. Both of these cases are handled by the $T$ transitions.
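The correctness argument for Problem 1 can be sanity-checked by brute force on short strings. The sketch below assumes the grammar has the shape the proof describes, namely $S \to T1T$ and $T \to 0T1T \mid 1T0T \mid 1T \mid \varepsilon$; the function names are illustrative and not part of the original solution. It tests membership by trying every split allowed by each production and compares the result with a direct count of 0s and 1s.

```python
from functools import lru_cache
from itertools import product


@lru_cache(maxsize=None)
def derives_T(w: str) -> bool:
    """True if T =>* w, assuming T -> 0T1T | 1T0T | 1T | epsilon."""
    if w == "":
        return True                      # T -> epsilon
    if w[0] == "1" and derives_T(w[1:]):
        return True                      # T -> 1T
    # T -> 0T1T or T -> 1T0T: the leading symbol is paired with its opposite.
    partner = "1" if w[0] == "0" else "0"
    return any(
        w[i] == partner and derives_T(w[1:i]) and derives_T(w[i + 1:])
        for i in range(1, len(w))
    )


def derives_S(w: str) -> bool:
    """True if S =>* w, assuming S -> T1T."""
    return any(
        w[i] == "1" and derives_T(w[:i]) and derives_T(w[i + 1:])
        for i in range(len(w))
    )


def grammar_mismatches(max_len: int) -> list:
    """Binary strings up to max_len on which grammar and definition disagree."""
    bad = []
    for n in range(max_len + 1):
        for symbols in product("01", repeat=n):
            w = "".join(symbols)
            if derives_S(w) != (w.count("1") > w.count("0")):
                bad.append(w)
    return bad


if __name__ == "__main__":
    print(grammar_mismatches(10))
```

An empty list of mismatches is evidence for short strings only; the counter argument in the proof is what covers all lengths.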
Problem 2: Give a context-free grammar generating the language $L$ = the complement of the language $\{a^n b^n \mid n \ge 0\}$.

Solution: Idea: we can break this language into the union of several simpler languages: $L = \{a^i b^j \mid i > j\} \cup \{a^i b^j \mid i < j\} \cup \overline{a^* b^*}$. That is, all strings of a's followed by b's in which the number of a's and b's differ, unioned with all strings not of the form $a^i b^j$. First, we can achieve the union of the CFGs for the three languages:

$S \to S_1 \mid S_2 \mid S_3$

Now, the set of strings $\{a^i b^j \mid i > j\}$ is generated by a simple CFG:

$S_1 \to aS_1b \mid aS_1 \mid a$

Similarly for $\{a^i b^j \mid i < j\}$:

$S_2 \to aS_2b \mid S_2b \mid b$

Finally, $\overline{a^* b^*}$ is easily generated as follows:

$S_3 \to XbaX$
$X \to aX \mid bX \mid \varepsilon$

Problem 3: Give a CFG to generate $A = \{a^i b^j c^k \mid i, j, k \ge 0 \text{ and either } i = j \text{ or } j = k\}$. Is the grammar ambiguous? Why or why not?

Solution: Idea: this language is simply the union of $A_1 = \{a^i b^j c^k \mid i, j, k \ge 0,\ i = j\}$ and $A_2 = \{a^i b^j c^k \mid i, j, k \ge 0,\ j = k\}$. We can create simple grammars for the separate languages and union them:

$S \to S_1 \mid S_2$

For $A_1$, we simply ensure that the number of a's equals the number of b's:

$S_1 \to S_1c \mid E$
$E \to aEb \mid \varepsilon$

Similarly for $A_2$, ensuring that the number of b's equals the number of c's:

$S_2 \to aS_2 \mid F$
$F \to bFc \mid \varepsilon$

This grammar is ambiguous. For $x = a^n b^n c^n$, we may use either $S_1$ or $S_2$ to generate $x$.

Problem 4: Show that $L = \{xy \mid x, y \in \{0,1\}^*,\ |x| = |y|,\ x \ne y\}$ is a context-free language.

Solution: This was done in class, but it may be useful to see the solution written formally. Idea: any string in $L$ must be even length, and its two halves must differ in at least one bit. This means any string can be written $x0y\,u1v$ or $x1y\,u0v$ where $|x| = |u|$ and $|y| = |v|$. But this is the same as saying $x0u'\,y'1v$ or $x1u'\,y'0v$ where $|x| = |u'|$ and $|y'| = |v|$, because the middle portion $yu$ can be re-split into a piece of length $|x|$ followed by a piece of length $|v|$. Formulated this way, we can easily write a grammar for the language:

$S \to AB \mid BA$
$A \to 0 \mid 0A0 \mid 0A1 \mid 1A0 \mid 1A1$
$B \to 1 \mid 0B0 \mid 0B1 \mid 1B0 \mid 1B1$

Problem 5: Give a simple description of the language generated by the following grammar in English, then use that description to give a CFG for the complement of that language.

$S \to aSb \mid bY \mid Ya$
$Y \to bY \mid aY \mid \varepsilon$

Solution: Clearly, $Y$ generates $(a \cup b)^*$. $S$, then, generates strings like $bY$ and $Ya$. Thus we can get strings like $a^n b\,Y\,b^n$ where $n \ge 0$, and we can also get strings like $a^n\,Y\,a\,b^n$ where $n \ge 0$, but we cannot get $a^n b^n$ where $n \ge 0$. Furthermore, we can generate any string beginning with a $b$ or ending with an $a$, and every string beginning with $a$ and ending with $b$ that is not of the form $a^n b^n$. This, then, is exactly the complement of the language $\{a^n b^n \mid n \ge 0\}$. A grammar for the complement of this language (which is, of course, just $\{a^n b^n \mid n \ge 0\}$) is simply

$S \to aSb \mid \varepsilon$

Problem 6. Write a context-free grammar for the language $\{w \# x \mid w^R \text{ is a substring of } x\}$, where $w, x \in \{0,1\}^*$.

Solution: Strings in this language share the property that they start with a string $w$ followed by a $\#$, followed by anything, followed by $w^R$, followed by anything. So we want strings of the form $w \# x w^R y$. Let $T$ generate the $w \# x w^R$ part, and let $X$ generate the final $y$ part. A grammar accomplishing this is:

$S \to TX$
$T \to 0T0 \mid 1T1 \mid \#X$
$X \to 0X \mid 1X \mid \varepsilon$

Since the recursion with nonterminal $T$ ends only when the transition $T \to \#X$ is used, $T$ must generate a string whose beginning and end are mirror images. Since $X$ generates $(0 \cup 1)^*$, the nonterminal $T$ generates all strings of the form $w \# x w^R$. Note that this also covers the case where $w = \varepsilon$. Since $T$ is followed by $X$ in the transition for the top-level nonterminal $S$, the grammar generates all strings of the form $w \# x w^R y$.

Chomsky Normal Form (CNF)
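Problem 7: Convert the following CFG into CNF, using the procedure given in Theorem 2.6.

$A \to BAB \mid B \mid \varepsilon$
$B \to 00 \mid \varepsilon$

Solution:

1. First add a new start symbol $S_0$ and the rule $S_0 \to A$:

$S_0 \to A$
$A \to BAB \mid B \mid \varepsilon$
$B \to 00 \mid \varepsilon$

2. Next eliminate the rule $B \to \varepsilon$, resulting in new rules corresponding to $A \to BAB$:

$S_0 \to A$
$A \to BAB \mid BA \mid AB \mid A \mid B \mid \varepsilon$
$B \to 00$

3. Now eliminate the redundant rule $A \to A$ and the rule $A \to \varepsilon$:

$S_0 \to A \mid \varepsilon$
$A \to BAB \mid BA \mid AB \mid BB \mid B$
$B \to 00$

4. Now remove the unit rule $A \to B$:

$S_0 \to A \mid \varepsilon$
$A \to BAB \mid BA \mid AB \mid BB \mid 00$
$B \to 00$

5. Then remove the unit rule $S_0 \to A$:

$S_0 \to BAB \mid BA \mid AB \mid BB \mid 00 \mid \varepsilon$
$A \to BAB \mid BA \mid AB \mid BB \mid 00$
$B \to 00$

6. Finally convert the $00$ and $BAB$ rules:

$S_0 \to BC \mid BA \mid AB \mid BB \mid DD \mid \varepsilon$
$A \to BC \mid BA \mid AB \mid BB \mid DD$
$B \to DD$
$C \to AB$
$D \to 0$

This grammar satisfies all the requirements for Chomsky Normal Form.

Closure Properties

Problem 8: Let $C$ be a context-free language and $R$ be a regular language. Prove that the language $C \cap R$ is context-free. Then use the above to show that the language $A$ given below is not a CFL.

$A = \{w \mid w \in \{a,b,c\}^* \text{ and } w \text{ contains equal numbers of } a\text{'s, } b\text{'s and } c\text{'s}\}$

Solution: We have a CFL $C$ and a regular language $R$ and we want to show that $C \cap R$ is context-free. Since $C$ is given to be a CFL we know that there exists a PDA, say $P$, to recognize $C$. Since $R$ is given to be regular we have a DFA, say $D$, to recognize $R$. To prove that $C \cap R$ is a CFL we demonstrate a pushdown automaton, call it $P'$, that recognizes $C \cap R$. The proof is by construction. We construct $P'$ from $P$ and $D$. The construction is similar to the proof in the text that the class of regular languages is closed under the union (or intersection) operation.

Let $P = (Q_P, \Sigma, \Gamma, \delta_P, q_P, F_P)$ recognize $C$, and let $D = (Q_D, \Sigma, \delta_D, q_D, F_D)$ recognize $R$. Construct $P' = (Q', \Sigma, \Gamma, \delta', q_0', F')$ to recognize $C \cap R$, where

1. $Q' = Q_P \times Q_D$.
2. $\Sigma$ is wlog assumed the same in $P$ and $D$.
3. $\Gamma$ is the stack alphabet of $P$.
4. $\delta'$ is defined as follows: for each $(q, r) \in Q'$, each $a \in \Sigma_\varepsilon$ and each $b \in \Gamma_\varepsilon$, let $\delta'((q, r), a, b) = \{((q', r'), c) \mid (q', c) \in \delta_P(q, a, b) \text{ and } r' = \delta_D(r, a)\}$, where $\delta_D(r, \varepsilon) = r$.
5. $q_0' = (q_P, q_D)$.
6. $F' = F_P \times F_D$.

Note that the above construction works only because one of the machines being simulated (the DFA above) does not need a stack. Observe that we may need to maintain two stacks if we attempted to simulate two PDAs instead, and that a PDA cannot do that.

Now to show that the given language $A$ is not a CFL, we will make the assumption that it is and then derive a contradiction. Under this assumption we are guaranteed (from the part above) that if we intersected some regular language with $A$, then the resulting language would be a CFL. So if we show that $A \cap R = B$ for some regular language $R$ and some language $B$ which is not a CFL, then we have derived the contradiction. To see what this $R$ and $B$ might be, consider all these languages, $A$, $R$ and $B$, as capturing "some property".

From the definition of $A$ we see that this property is "equality" of $a$'s, $b$'s and $c$'s. For $B$ let's try the canonical example of the language that is not a CFL, viz. $\{a^n b^n c^n \mid n \ge 0\}$. $B$ has the property of "equality" as well as "order" of (zero or more) $a$'s followed by (zero or more) $b$'s followed by (zero or more) $c$'s. Now it is easy to see what we want of $R$: that $R$ should have the property of "order". This is $R = a^* b^* c^*$ (and we know that this is regular). Since we have a contradiction, it must be that $A$ is not a CFL.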
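The key identity in the second half of Problem 8 is that intersecting the "equal counts" language with the "ordered" regular language leaves exactly the canonical non-context-free language. The sketch below checks this on all short strings; the names `in_A`, `in_R` and `in_B` are illustrative labels for the three languages (equal numbers of a's, b's and c's; $a^* b^* c^*$; and $\{a^n b^n c^n \mid n \ge 0\}$), assumed here rather than taken from the original text.

```python
import re
from itertools import product


def in_A(w: str) -> bool:
    """Equal numbers of a's, b's and c's, in any order."""
    return w.count("a") == w.count("b") == w.count("c")


def in_R(w: str) -> bool:
    """Regular 'order' language a*b*c*."""
    return re.fullmatch(r"a*b*c*", w) is not None


def in_B(w: str) -> bool:
    """Canonical non-context-free language {a^n b^n c^n : n >= 0}."""
    n = len(w) // 3
    return w == "a" * n + "b" * n + "c" * n


def intersection_mismatches(max_len: int) -> list:
    """Strings up to max_len where membership in A-and-R differs from B."""
    bad = []
    for n in range(max_len + 1):
        for symbols in product("abc", repeat=n):
            w = "".join(symbols)
            if (in_A(w) and in_R(w)) != in_B(w):
                bad.append(w)
    return bad


if __name__ == "__main__":
    print(intersection_mismatches(7))
```

This only confirms the set identity on small inputs; the closure argument itself still rests on the product construction of the PDA and the DFA.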
Designing PDAs
Problem 9: Give an informal description and state diagram for the language $L = \{w \mid w = w^R\}$, that is, $w$ is a palindrome.

Solution: This is fairly simple: we can push the first half of $w$, nondeterministically guess where its middle is, and start popping the stack for the second half of $w$, making sure the second half matches what we pop off the stack. We have to worry about whether $|w|$ is odd or even, though. The state diagram for the PDA is shown in figure 1.

[Figure 1: State diagram for the PDA, with states $q_0$, $q_1$, $q_2$, $q_3$, input symbols 0 and 1, and the stack marker \$.]

From the start state $q_0$, we push a \$ onto the stack to mark its bottom. In state $q_1$, we push the first half of $w$ onto the stack, not including the middle symbol if $|w|$ is odd. Then we nondeterministically guess where the middle occurs, at which point we can either move to state $q_2$ without consuming any input if the length of $w$ is even, or simply ignore the middle symbol if $|w|$ is odd. In state $q_2$, we pop each stack symbol from the stack, ensuring that it matches the current input symbol. Finally, if all goes well, we will reach the end of $w$ with an empty stack (top symbol = \$) and accept. Otherwise the PDA will always crash.
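The nondeterministic guessing in Problem 9 can be made concrete by exploring every branch of the machine. The following is a minimal sketch, assuming the four states $q_0$ to $q_3$ behave as the description says and that the input is over $\{0,1\}$; the function names and the encoding of configurations are illustrative choices, not part of the original solution.

```python
from itertools import product


def accepts_palindrome(w: str) -> bool:
    """Search all branches of the four-state PDA; input assumed over {0, 1}."""
    # A configuration is (state, input position, stack as a tuple, top last).
    todo = [("q0", 0, ())]
    seen = set()
    while todo:
        config = todo.pop()
        if config in seen:
            continue
        seen.add(config)
        state, i, stack = config
        if state == "q0":
            # Mark the bottom of the stack without reading input.
            todo.append(("q1", i, stack + ("$",)))
        elif state == "q1":
            # Guess the middle is here (even length): consume nothing.
            todo.append(("q2", i, stack))
            if i < len(w):
                # Still in the first half: push the symbol just read.
                todo.append(("q1", i + 1, stack + (w[i],)))
                # Guess this symbol is the middle (odd length): skip it.
                todo.append(("q2", i + 1, stack))
        elif state == "q2":
            if i < len(w) and stack[-1] == w[i]:
                # Second half: input symbol must match the popped symbol.
                todo.append(("q2", i + 1, stack[:-1]))
            if stack[-1] == "$":
                # Only the marker remains: pop it and go to the final state.
                todo.append(("q3", i, stack[:-1]))
        elif state == "q3" and i == len(w):
            return True  # all input consumed in the accepting state
    return False


def pda_mismatches(max_len: int) -> list:
    """Binary strings up to max_len where the PDA and w == reverse(w) differ."""
    bad = []
    for n in range(max_len + 1):
        for symbols in product("01", repeat=n):
            w = "".join(symbols)
            if accepts_palindrome(w) != (w == w[::-1]):
                bad.append(w)
    return bad


if __name__ == "__main__":
    print(pda_mismatches(8))
```

A branch that reaches $q_3$ with input left over simply dies, which mirrors the remark that the PDA "crashes" on every non-accepting path.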
Problem 10: Give an informal English description of a PDA for the language $L$ = the complement of the language $\{a^n b^n \mid n \ge 0\}$.

Solution: A PDA for this language can be motivated by the CFG for it, which was described in the sample solutions part I, and displayed in Problem 2.25. As a reminder, here is the CFG:

$S \to aSb \mid bY \mid Ya$
$Y \to bY \mid aY \mid \varepsilon$

Recall that this CFG generates strings of the form $a^n\,bY\,b^n$ or $a^n\,Ya\,b^n$. All we have to do to accept strings of this form is to push the first $a$'s onto the stack in a pushing state, and nondeterministically switch to a new state when that is done. At this point we have two branches:

1. If the next symbol is a $b$, we "flush" that input, go to a flushing state, then continue flushing the part of the string corresponding to $Y$. We nondeterministically guess when this is done and move to a popping state, which pops $b$'s corresponding to the number of $a$'s that were pushed at the beginning of the string, finally switching to an accept state if the correct number of $b$'s were matched.
2. If the next symbol was not a $b$, on the other hand, we allow the machine to switch to a second flushing state, nondeterministically "flush" the $Y$ part of the string (in this case our input string must be of the form $a^n\,Ya\,b^n$), then consume the $a$ on the way to the popping state, which as before pops $b$'s and accepts if everything matches correctly.

Problem 11: Give an informal description of a pushdown automaton that recognizes $A = \{a^i b^j c^k \mid i, j, k \ge 0 \text{ and either } i = j \text{ or } j = k\}$.

Solution: Our PDA is fairly simple. First it pushes a bottom-of-stack marker, then nondeterministically proceeds to one of two branch states.

From the first branch state, we accept strings in which the number of $a$'s is equal to the number of $b$'s by pushing each $a$ encountered. Then when the automaton sees the first $b$, it begins matching the $b$'s against the $a$'s already on the stack, one-for-one, until the bottom-of-stack marker is reached. If the number of $a$'s and $b$'s matches (note this could also be 0), the machine then reads and ignores all $c$'s following the last $b$, accepting on the end of input. This part of the machine crashes if the number of $a$'s is not equal to the number of $b$'s or the input string is not of the form $a^* b^* c^*$.

From the second branch state, we accept strings in which the number of $b$'s is equal to the number of $c$'s by first ignoring any $a$'s occurring at the beginning of the string, pushing the following $b$'s onto the stack, then popping a $b$ for each $c$ encountered at the end of the string. Again, this part of the machine must allow strings where the number of $b$'s and $c$'s is zero, and crash when the input string is not of the form $a^* b^* c^*$ or when the number of popped $b$'s is not the same as the number of $c$'s at the end of the string.

Problem 12: Convert the CFG $G$ given in Exercise 2.1 to an equivalent PDA using the procedure given in Theorem 2.12.

Solution: The CFG is:

$E \to E + T \mid T$
$T \to T \times F \mid F$
$F \to (E) \mid a$

Assuming, as in Theorem 2.12, that a shorthand notation allows us to write an entire string to the stack in one PDA step, this task simply reduces to forming transition rules that implement the productions in the grammar. Figure 2 shows the PDA.

[Figure 2: PDA recognizing the language generated by $G$. States: $q_{start}$, $q_{loop}$, $q_{accept}$. Transition from $q_{start}$ to $q_{loop}$: $\varepsilon, \varepsilon \to E\$$. Self-loops on $q_{loop}$ for the rules: $\varepsilon, E \to E{+}T$; $\varepsilon, E \to T$; $\varepsilon, T \to T{\times}F$; $\varepsilon, T \to F$; $\varepsilon, F \to (E)$; $\varepsilon, F \to a$. Self-loops on $q_{loop}$ for the terminals: $a, a \to \varepsilon$; $+, + \to \varepsilon$; $\times, \times \to \varepsilon$; $(, ( \to \varepsilon$; $), ) \to \varepsilon$. Transition from $q_{loop}$ to $q_{accept}$: $\varepsilon, \$ \to \varepsilon$.]

The transitions for the rules of the grammar allow us to nondeterministically replace grammar nonterminals on the stack with their corresponding right-hand sides; the transitions for the terminals of the grammar ($a$, $+$, $\times$, $($, $)$) allow matching of input symbols to grammar terminals. There will be an accepting path through the PDA on string $w$ if and only if $w$ can be generated by the grammar $G$.

Non-Context-Free Languages
Problem 13: Show that $L = \{a^n b^{n^2} \mid n \ge 0\}$ is not context-free.

Solution: Using the pumping lemma, assume the contrary, and let $s = a^p b^{p^2}$. The lemma says $uv^i x y^i z \in L$ for any $i \ge 0$. Clearly, if either $v$ or $y$ straddles the boundary between a's and b's, pumping will generate strings not in $L$. If $v$ and $y$ are composed entirely of a's, then $uv^2xy^2z = a^{p+k} b^{p^2}$ with $k \ge 1$, which is not in $L$ since the number of b's is no longer the square of the number of a's. The same argument holds if $v$ and $y$ are composed entirely of b's. The only remaining case is when $v$ is one or more a's and $y$ is one or more b's. If this is the case, then we have $uv^2xy^2z = a^{p+k} b^{p^2+l}$, where $k, l \ge 1$. But $p^2 < p^2 + l < (p+1)^2$, since $l \ge 1$ and $l < p$ (because $|vxy| \le p$). Since $p^2 + l$ cannot be any perfect square, it certainly cannot be $(p+k)^2$ for any $k$. Since every case results in a contradiction, $L$ is not context-free.

Problem 14: Show that $L = \{a^{n^2} \mid n \ge 0\}$ is not context-free.

Solution: For the sake of contradiction, let us first assume that $L$ is context-free. Hence, the pumping lemma can be applied to it. Let $s = a^{p^2}$, where $p$ is the pumping length. Since $s \in L$ and $|s| \ge p$, we know that $s$ can be broken into 5 parts, $s = uvxyz$, satisfying the following conditions:

1. $uv^i x y^i z \in L$ for each $i \ge 0$;
2. $|vy| > 0$; and
3. $|vxy| \le p$.

If we choose $i = 2$, then

$p^2 < |uv^2xy^2z| \le p^2 + p < (p+1)^2$

As we can see, the length of $uv^2xy^2z$ is between the squares of two consecutive integers and, thus, cannot be a square of an integer. Consequently, $uv^2xy^2z \notin L$. So the assumption that $L$ is a CFL is false.

Problem 15: Decide whether […] is a CFL and prove your answer.

Solution: The language is not context-free. We can prove this with the pumping lemma. Let $s$ = […]. Clearly this string is in the language. If […], then […], which is not in the language because the number of b's is more than twice the number of a's. If […], then […]. This string cannot be in the language because there are at least twice as many b's as a's. If, on the other hand, $vxy$ contains both a's and b's, then the situation is a little more complicated. If […] then we can pump down to get […], for which […] and […] with […]. But […], since […] whenever […], so […] is not in the language. If […], on the other hand, we can pump up to get a number of b's more than the number of a's: if we use the string […], and […] where […]. Then […]

Problem 16: Let $L = \{w \# t \mid w \text{ is a substring of } t\}$. Show that $L$ is not a context-free language.

Solution: In order to show that $L$ is not a CFL, we will proceed by contradiction. Assume $L$ is a CFL. Let $p$ be the pumping length given by the pumping lemma and let $s$ = […]. Because $s$ is a member of $L$ and $|s| \ge p$, the pumping lemma says that $s$ can be split into $uvxyz$ satisfying the following conditions:

1. $uv^i x y^i z \in L$ for each $i \ge 0$,
2. $|vy| > 0$, and
3. $|vxy| \le p$.

For convenience, we also write $s$ as $s_1 \# s_2$, where $s_1$ and $s_2$ stand for the strings to the left and to the right of $\#$, respectively. First of all, we can note that $vy$ cannot contain $\#$, since $uv^2xy^2z$ would contain more than one $\#$ and would not be in $L$. Then, we can think of three possible cases for the string $vy$:

- $vy$ contains more symbols from $s_1$ than from $s_2$. In this case, pumping yields a string […] whose left part cannot be a substring of its right part, and the entire string is not in $L$.
- $vy$ contains more symbols from $s_2$ than from $s_1$. In this case (pumping down), the left part cannot be a substring of the right part, and the entire string is not in $L$.
- $vy$ contains an equal number of symbols from both $s_1$ and $s_2$. In this case, because of conditions 2 and 3 of the pumping lemma, […] and […] such that […]. Therefore […] is not in the language.

Because every possible way of splitting the input $s$ into $uvxyz$ yields a contradiction, the initial assumption that $L$ is a CFL is false and the proof is complete.
 Summer '99
 Paturi