Parse Trees
Definitions
Relationship to Leftmost and Rightmost Derivations
Ambiguity in Grammars

Parse Trees
• Parse trees are trees labeled by symbols of a particular CFG.
• Leaves: labeled by a terminal or ε.
• Interior nodes: labeled by a variable.
  • Children are labeled by the right side of a production for the parent.
• Root: must be labeled by the start symbol.

Example: Parse Tree
S -> SS | (S) | ()

           S
         /   \
       S       S
     / | \    / \
    (  S  )  (   )
      / \
     (   )

Yield of a Parse Tree
• The concatenation of the labels of the leaves in left-to-right order is called the yield of the parse tree.
  • That is, in the order of a preorder traversal.
• Example: the yield of the following tree is (())().

           S
         /   \
       S       S
     / | \    / \
    (  S  )  (   )
      / \
     (   )
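
The definition of yield can be checked mechanically. The following is a minimal sketch, not part of the original slides: the `Node` class, the `EPSILON` marker, and the function name `tree_yield` are illustrative assumptions. It builds the example tree for (())() and reads off its yield.

```python
from dataclasses import dataclass, field
from typing import List

EPSILON = "ε"  # assumed label for a leaf that stands for the empty string


@dataclass
class Node:
    """A parse-tree node: a label plus an ordered list of children."""
    label: str
    children: List["Node"] = field(default_factory=list)


def tree_yield(node: Node) -> str:
    """Concatenate the leaf labels in left-to-right order (ε adds nothing)."""
    if not node.children:
        return "" if node.label == EPSILON else node.label
    return "".join(tree_yield(child) for child in node.children)


# The example tree for the grammar S -> SS | (S) | ()
inner = Node("S", [Node("("), Node(")")])
left = Node("S", [Node("("), inner, Node(")")])
right = Node("S", [Node("("), Node(")")])
root = Node("S", [left, right])

print(tree_yield(root))  # (())()
```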

Parse Trees, Leftmost and Rightmost Derivations
• For every parse tree, there is a unique leftmost and a unique rightmost derivation.
• We’ll prove:
  1. If there is a parse tree with root labeled A and yield w, then A =>*lm w.
  2. If A =>*lm w, then there is a parse tree with root A and yield w.

Proof – Part 1
• Induction on the height (length of the longest path from the root) of the tree.
• Basis: height 1. Tree looks like

         A
       / | \
     a1 ... an

• A -> a1…an must be a production.
• Thus, A =>*lm a1…an.

Part 1 – Induction
• Assume (1) for trees of height < h, and let this tree have height h:

         A
       / | \
     X1 ... Xn
     |       |
     w1 ... wn

  (the subtree rooted at Xi has yield wi)
• By IH, Xi =>*lm wi.
  • Note: if Xi is a terminal, then Xi = wi.
• Thus, A =>lm X1…Xn =>*lm w1X2…Xn =>*lm w1w2X3…Xn =>*lm … =>*lm w1…wn.
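
The construction in Part 1 can be sketched in code: always expand the leftmost variable of the current sentential form using the production recorded at that node of the tree. This is an illustrative sketch only. The tuple encoding of trees (a variable node is `(label, children)`, a terminal leaf is a plain string) and the function names are assumptions, not part of the slides.

```python
def render(form):
    """Show a sentential form: terminals as-is, tree nodes by their label."""
    return "".join(x if isinstance(x, str) else x[0] for x in form)


def leftmost_derivation(tree):
    """List the sentential forms of the leftmost derivation read off a parse tree."""
    form = [tree]                      # sentential form; unexpanded variables are tree nodes
    steps = [render(form)]
    while True:
        # Position of the leftmost unexpanded variable, if any remains.
        idx = next((i for i, x in enumerate(form) if not isinstance(x, str)), None)
        if idx is None:
            return steps               # only terminals are left
        _label, children = form[idx]
        form[idx:idx + 1] = children   # apply the production stored at this node
        steps.append(render(form))


# Parse tree for (())() under S -> SS | (S) | ()
inner = ("S", ["(", ")"])
left = ("S", ["(", inner, ")"])
right = ("S", ["(", ")"])
root = ("S", [left, right])

print(" => ".join(leftmost_derivation(root)))
# S => SS => (S)S => (())S => (())()
```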

Proof: Part 2
• Given a leftmost derivation of a terminal string, we need to prove the existence of a parse tree.
• The proof is an induction on the length of the derivation.

Part 2 – Basis
• If A =>*lm a1…an by a one-step derivation, then there must be a parse tree

         A
       / | \
     a1 ... an

Part 2 – Induction
• Assume (2) for derivations of fewer than k > 1 steps, and let A =>*lm w be a k-step derivation.
• First step is A =>lm X1…Xn.
• Key point: w can be divided so the first portion is derived from X1, the next is derived from X2, and so on.
  • If Xi is a terminal, then wi = Xi.

Induction – (2)
• That is, Xi =>*lm wi for all i such that Xi is a variable.
  • And the derivation takes fewer than k steps.
• By the IH, if Xi is a variable, then there is a parse tree with root Xi and yield wi.
• Thus, there is a parse tree

         A
       / | \
     X1 ... Xn
     |       |
     w1 ... wn

Parse Trees and Rightmost Derivations
• The ideas are essentially the mirror image of the proof for leftmost derivations.
• Left to the imagination.

Parse Trees and Any Derivation
• The proof that you can obtain a parse tree from a leftmost derivation doesn’t really depend on “leftmost.”
• First step still has to be A => X1…Xn.
• And w still can be divided so the first portion is derived from X1, the next is derived from X2, and so on.
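
For the leftmost case, the derivation-to-tree direction can be sketched as follows. The input is assumed to be the list of productions used, in order, by a leftmost derivation. The dictionary encoding of nodes and the names `tree_from_leftmost` and `tree_yield` are illustrative assumptions, not part of the slides. Because each step expands the leftmost variable, a stack of not-yet-expanded nodes is enough to know which node a production belongs to.

```python
EPSILON = "ε"  # assumed label for the leaf created by an ε-production


def tree_from_leftmost(start, productions, variables):
    """Build a parse tree from the productions of a leftmost derivation."""
    root = {"label": start, "children": []}
    pending = [root]                      # unexpanded variables, leftmost on top
    for head, body in productions:
        node = pending.pop()
        if node["label"] != head:
            raise ValueError("production does not expand the leftmost variable")
        symbols = list(body) or [EPSILON]
        node["children"] = [{"label": s, "children": []} for s in symbols]
        # Push right-to-left so the leftmost new variable is expanded next.
        for child in reversed(node["children"]):
            if child["label"] in variables:
                pending.append(child)
    if pending:
        raise ValueError("derivation does not end in a terminal string")
    return root


def tree_yield(node):
    """Concatenate the leaf labels left to right, skipping ε."""
    if not node["children"]:
        return "" if node["label"] == EPSILON else node["label"]
    return "".join(tree_yield(c) for c in node["children"])


# S => SS => (S)S => (())S => (())() under S -> SS | (S) | ()
steps = [("S", "SS"), ("S", "(S)"), ("S", "()"), ("S", "()")]
tree = tree_from_leftmost("S", steps, {"S"})
print(tree_yield(tree))  # (())()
```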

Ambiguous Grammars
• A CFG is ambiguous if there is a string in the language that is the yield of two or more parse trees.
• Example: S -> SS | (S) | ()
• Two parse trees for ()()() on next slide.

Example – Continued

             S
           /   \
         S       S
       /   \    / \
      S     S  (   )
     / \   / \
    (   ) (   )

         S
       /   \
     S       S
    / \    /   \
   (   )  S     S
         / \   / \
        (   ) (   )

Ambiguity, Leftmost and Rightmost Derivations
• If there are two different parse trees, they must produce two different leftmost derivations by the construction given in the proof.
• Conversely, two different leftmost derivations produce different parse trees by the other part of the proof.
• Likewise for rightmost derivations.
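
Ambiguity of S -> SS | (S) | () can be confirmed by counting parse trees for a given string. The sketch below is an illustration only: the function name `count_parse_trees` and the dictionary encoding of the grammar are assumptions, and the method relies on the grammar having no ε-productions, so every symbol derives at least one character. It counts, for each symbol and each substring, the number of parse trees with that root and that yield.

```python
from functools import lru_cache

GRAMMAR = {"S": ["SS", "(S)", "()"]}  # variable -> list of production bodies


def count_parse_trees(grammar, start, w):
    """Count parse trees with root `start` and yield `w` (no ε-productions assumed)."""

    @lru_cache(maxsize=None)
    def derive(symbol, i, j):
        # Number of parse trees rooted at `symbol` whose yield is w[i:j].
        if symbol not in grammar:                 # terminal: must match exactly
            return 1 if w[i:j] == symbol else 0
        return sum(split(body, i, j) for body in grammar[symbol])

    @lru_cache(maxsize=None)
    def split(body, i, j):
        # Number of ways the symbols of `body`, in order, derive w[i:j].
        if not body:
            return 1 if i == j else 0
        first, rest = body[0], body[1:]
        # `first` takes w[i:k]; leave at least one character per remaining symbol.
        return sum(derive(first, i, k) * split(rest, k, j)
                   for k in range(i + 1, j - len(rest) + 1))

    return derive(start, 0, len(w))


print(count_parse_trees(GRAMMAR, "S", "()()()"))  # 2
print(count_parse_trees(GRAMMAR, "S", "(())()"))  # 1
```

A count above 1 for any string in the language shows the grammar is ambiguous; a count of 1 for one particular string says nothing about other strings.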

Ambiguity, etc. – (2)
• Thus, equivalent definitions of “ambiguous grammar” are:
  1. There is a string in the language that has two different leftmost derivations.
  2. There is a string in the language that has two different rightmost derivations.

Ambiguity is a Property of Grammars, not Languages
• For the balanced-parentheses language, here is another CFG, which is unambiguous.

  B -> (RB | ε      B, the start symbol, derives balanced strings.
  R -> ) | (RR      R generates strings that have one more right paren than left.

Example: Unambiguous Grammar
B -> (RB | ε    R -> ) | (RR
• Construct a unique leftmost derivation for a given balanced string of parentheses by scanning the string from left to right.
  • If we need to expand B, then use B -> (RB if the next symbol is “(” and ε if at the end.
  • If we need to expand R, use R -> ) if the next symbol is “)” and (RR if it is “(”.
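
The two rules above translate directly into a small one-symbol-lookahead procedure. The following is a sketch under assumptions: the function name `parse` and the string encoding of the sentential form are illustrative choices, not part of the slides. It returns the steps of the leftmost derivation and raises `ValueError` on an unbalanced string.

```python
def parse(w):
    """Return the leftmost derivation of w under B -> (RB | ε, R -> ) | (RR."""
    matched, rest = "", "B"        # sentential form = matched terminals + rest
    pos = 0                        # index of the next input symbol
    steps = ["B"]
    while rest:
        top, rest = rest[0], rest[1:]
        nxt = w[pos] if pos < len(w) else None   # None means end of input
        if top == "B":
            if nxt == "(":
                rest = "(RB" + rest              # B -> (RB
            elif nxt is not None:
                raise ValueError(f"cannot expand B at position {pos}")
            # at end of input: B -> ε, nothing is added
            steps.append(matched + rest)
        elif top == "R":
            if nxt == ")":
                rest = ")" + rest                # R -> )
            elif nxt == "(":
                rest = "(RR" + rest              # R -> (RR
            else:
                raise ValueError(f"cannot expand R at position {pos}")
            steps.append(matched + rest)
        else:                                    # terminal: must match the input
            if nxt != top:
                raise ValueError(f"unexpected symbol at position {pos}")
            matched += top
            pos += 1
    if pos != len(w):
        raise ValueError("input left over after the derivation finished")
    return steps


print(parse("(())()"))
# ['B', '(RB', '((RRB', '(()RB', '(())B', '(())(RB', '(())()B', '(())()']
```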

The Parsing Process
B -> (RB | ε    R -> ) | (RR

Each row adds one step to the leftmost derivation of (())().

Remaining input   Next symbol   Step of leftmost derivation
(())()            (             B
())()             (             (RB
))()              )             ((RRB
)()               )             (()RB
()                (             (())B
)                 )             (())(RB
(none)            (end)         (())()B
(none)            (end)         (())()

LL(1) Grammars
• As an aside, a grammar such as B -> (RB | ε, R -> ) | (RR, where you can always figure out the production to use in a leftmost derivation by scanning the given string left-to-right and looking only at the next one symbol, is called LL(1).
  • “Leftmost derivation, left-to-right scan, one symbol of lookahead.”

LL(1) Grammars – (2)
• Most programming languages have LL(1) grammars.
• LL(1) grammars are never ambiguous.

Inherent Ambiguity
• It would be nice if for every ambiguous grammar, there were some way to “fix” the ambiguity, as we did for the balanced-parentheses grammar.
• Unfortunately, certain CFL’s are inherently ambiguous, meaning that every grammar for the language is ambiguous.

Example: Inherent Ambiguity
• The language {0^i 1^j 2^k | i = j or j = k} is inherently ambiguous.
• Intuitively, at least some of the strings of the form 0^n 1^n 2^n must be generated by two different parse trees, one based on checking the 0’s and 1’s, the other based on checking the 1’s and 2’s.

One Possible Ambiguous Grammar
S -> AB | CD
A -> 0A1 | 01     A generates equal 0’s and 1’s
B -> 2B | 2       B generates any number of 2’s
C -> 0C | 0       C generates any number of 0’s
D -> 1D2 | 12     D generates equal 1’s and 2’s

And there are two derivations of every string with equal numbers of 0’s, 1’s, and 2’s. E.g.:
S => AB => 01B => 012
S => CD => 0D => 012
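
The two derivations of 012 can be found by brute force. The sketch below is an illustration, with the function name `leftmost_derivations` and the pruning rules as assumptions. It enumerates every leftmost derivation of a target string under this grammar, abandoning a sentential form once it is longer than the target or its terminal prefix disagrees with the target. The length test is sound only because the grammar has no ε-productions.

```python
GRAMMAR = {
    "S": ["AB", "CD"],
    "A": ["0A1", "01"],
    "B": ["2B", "2"],
    "C": ["0C", "0"],
    "D": ["1D2", "12"],
}


def leftmost_derivations(target, start="S"):
    """Return every leftmost derivation of `target` as a list of sentential forms."""
    results = []

    def expand(form, history):
        # Locate the leftmost variable; the for/else branch runs if there is none.
        for i, sym in enumerate(form):
            if sym in GRAMMAR:
                break
        else:
            if form == target:
                results.append(history)
            return
        # Prune: forms never shrink, and the terminal prefix is already final.
        if len(form) > len(target) or not target.startswith(form[:i]):
            return
        for body in GRAMMAR[sym]:
            new_form = form[:i] + body + form[i + 1:]
            expand(new_form, history + [new_form])

    expand(start, [start])
    return results


for derivation in leftmost_derivations("012"):
    print(" => ".join(derivation))
# S => AB => 01B => 012
# S => CD => 0D => 012
```

A string such as 0122, where only the 0’s and 1’s match, has a single leftmost derivation under this grammar.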
 Spring '08
 Motwani,R
