# 140_LALR_Parsing - CS143 Handout 14 Summer 2011 July 6th...

CS143 Handout 14, Summer 2011, July 6th, 2011

LALR Parsing

Handout written by Maggie Johnson, revised by Julie Zelenski.

## Motivation

Because a canonical LR(1) parser splits states based on differing lookahead sets, it can have many more states than the corresponding SLR(1) or LR(0) parser. Potentially it could require splitting a state with just one item into a different state for each subset of the possible lookaheads; in a pathological case, this means the entire power set of its follow set (which theoretically could contain all terminals, yikes!). It never actually gets that bad in practice, but a canonical LR(1) parser for a programming language might have an order of magnitude more states than an SLR(1) parser. Is there something in between?

With LALR (lookahead LR) parsing, we attempt to reduce the number of states in an LR(1) parser by merging similar states. This reduces the number of states to the same as SLR(1), but still retains some of the power of the LR(1) lookaheads. Let’s examine the LR(1) configurating sets from an example given in the LR parsing handout.

Grammar:

- S' –> S
- S –> XX
- X –> aX
- X –> b

LR(1) configurating sets (each item is written as core, lookahead):

| State | Items |
| --- | --- |
| I0 | S' –> •S, \$ ; S –> •XX, \$ ; X –> •aX, a/b ; X –> •b, a/b |
| I1 | S' –> S•, \$ |
| I2 | S –> X•X, \$ ; X –> •aX, \$ ; X –> •b, \$ |
| I3 | X –> a•X, a/b ; X –> •aX, a/b ; X –> •b, a/b |
| I4 | X –> b•, a/b |
| I5 | S –> XX•, \$ |
| I6 | X –> a•X, \$ ; X –> •aX, \$ ; X –> •b, \$ |
| I7 | X –> b•, \$ |
| I8 | X –> aX•, a/b |
| I9 | X –> aX•, \$ |

Notice that some of the LR(1) states look suspiciously similar. Take I3 and I6, for example. These two states are virtually identical: they have the same number of items, the core of each item is identical, and they differ only in their lookahead sets. This observation may make you wonder if it is possible to merge them into one state. The same is true of I4 and I7, and of I8 and I9. If we did merge, we would end up replacing those six states with just these three:

| State | Items |
| --- | --- |
| I36 | X –> a•X, a/b/\$ ; X –> •aX, a/b/\$ ; X –> •b, a/b/\$ |
| I47 | X –> b•, a/b/\$ |
| I89 | X –> aX•, a/b/\$ |

But isn’t this just SLR(1) all over again? In the above example, yes, since after the merging we coincidentally end up with the complete follow sets as the lookahead. This is not always the case, however. Consider this example:

Grammar:

- S' –> S
- S –> Bbb | aab | bBa
- B –> a

| State | Items |
| --- | --- |
| I0 | S' –> •S, \$ ; S –> •Bbb, \$ ; S –> •aab, \$ ; S –> •bBa, \$ ; B –> •a, b |
| I1 | S' –> S•, \$ |
| I2 | S –> B•bb, \$ |
| I3 | S –> a•ab, \$ ; B –> a•, b |

....

In an SLR(1) parser there is a shift-reduce conflict in state 3 when the next input is a. The item S –> a•ab calls for a shift on a, while the item B –> a• calls for a reduction on anything in Follow(B), which includes both a and b. In LALR(1), state 3 will shift on a and reduce on b. Intuitively, this is because the LALR(1) state "remembers" that we arrived at state 3 after seeing an a. Thus we are trying to parse either Bbb or aab. In order for that first a to be a valid reduction to B, the next input has to be exactly b, since that is the only symbol that can follow B in this particular context. Although elsewhere an expansion of B can be followed by an a, we consider only the subset of the follow set that can appear here, and thus avoid the conflict an SLR(1) parser would have.
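The merging step from the first example (combining I3 with I6, I4 with I7, and I8 with I9) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ten LR(1) states are transcribed by hand from the handout rather than computed, the names `LR1_STATES` and `merge_by_core` are hypothetical, and the sketch merges item sets only. A real LALR(1) generator would also have to redirect the transitions between states and check the merged states for new conflicts.

```python
from collections import defaultdict

# LR(1) configurating sets from the first example, transcribed by hand.
# Each item is (core, lookaheads); "." marks the dot position, and the
# lookahead string "ab" stands for the set a/b.
LR1_STATES = {
    0: [("S' -> .S", "$"), ("S -> .XX", "$"), ("X -> .aX", "ab"), ("X -> .b", "ab")],
    1: [("S' -> S.", "$")],
    2: [("S -> X.X", "$"), ("X -> .aX", "$"), ("X -> .b", "$")],
    3: [("X -> a.X", "ab"), ("X -> .aX", "ab"), ("X -> .b", "ab")],
    4: [("X -> b.", "ab")],
    5: [("S -> XX.", "$")],
    6: [("X -> a.X", "$"), ("X -> .aX", "$"), ("X -> .b", "$")],
    7: [("X -> b.", "$")],
    8: [("X -> aX.", "ab")],
    9: [("X -> aX.", "$")],
}


def merge_by_core(states):
    """Merge states whose items have identical cores, unioning lookaheads.

    Hypothetical helper: returns {merged_name: {core: lookahead_string}}.
    Merged names concatenate the original state numbers, mirroring the
    handout's I36 / I47 / I89 labels (only unambiguous for single digits).
    """
    # Group state numbers by the set of cores they contain.
    groups = defaultdict(list)
    for number, items in states.items():
        core_set = frozenset(core for core, _ in items)
        groups[core_set].append(number)

    merged = {}
    for numbers in groups.values():
        name = "".join(str(n) for n in sorted(numbers))
        lookaheads = defaultdict(set)
        for n in numbers:
            for core, la in states[n]:
                lookaheads[core] |= set(la)  # union the lookahead sets
        merged[name] = {c: "".join(sorted(las)) for c, las in lookaheads.items()}
    return merged


merged = merge_by_core(LR1_STATES)
for name in sorted(merged):
    print("I" + name, merged[name])
```

Grouping on a `frozenset` of cores makes two states compare equal exactly when they differ only in lookaheads, which is the merge criterion described above. On this grammar the ten LR(1) states collapse to seven.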