11-LALR-Parsing


CS143 Handout 11
Summer 2008 July 09, 2008

LALR Parsing
Handout written by Maggie Johnson and revised by Julie Zelenski.

Motivation

Because a canonical LR(1) parser splits states based on differing lookahead sets, it can have many more states than the corresponding SLR(1) or LR(0) parser. Potentially it could require splitting a state with just one item into a different state for each subset of the possible lookaheads; in a pathological case, this means the entire power set of its follow set (which theoretically could contain all terminals, yikes!). It never actually gets that bad in practice, but a canonical LR(1) parser for a programming language might have an order of magnitude more states than an SLR(1) parser. Is there something in between?

With LALR (lookahead LR) parsing, we attempt to reduce the number of states in an LR(1) parser by merging similar states. This reduces the number of states to the same as SLR(1), but still retains some of the power of the LR(1) lookaheads. Let’s examine the LR(1) configurating sets from an example given in the LR parsing handout.

    S' –> S
    S –> XX
    X –> aX
    X –> b

    I0: S' –> •S, $
        S –> •XX, $
        X –> •aX, a/b
        X –> •b, a/b

    I1: S' –> S•, $

    I2: S –> X•X, $
        X –> •aX, $
        X –> •b, $

    I3: X –> a•X, a/b
        X –> •aX, a/b
        X –> •b, a/b

    I4: X –> b•, a/b

    I5: S –> XX•, $

    I6: X –> a•X, $
        X –> •aX, $
        X –> •b, $

    I7: X –> b•, $

    I8: X –> aX•, a/b

    I9: X –> aX•, $

Notice that some of the LR(1) states look suspiciously similar. Take I3 and I6, for example. These two states are virtually identical: they have the same number of items, the core of each item is identical, and they differ only in their lookahead sets. This observation may make you wonder if it is possible to merge them into one state. The same is true of I4 and I7, and of I8 and I9. If we did merge, we would end up replacing those six states with just these three:

    I36: X –> a•X, a/b/$
         X –> •aX, a/b/$
         X –> •b, a/b/$

    I47: X –> b•, a/b/$

    I89: X –> aX•, a/b/$

But isn’t this just SLR(1) all over again? In the above example, yes, since after the merging we coincidentally end up with the complete follow sets as the lookahead. This is not always the case, however. Consider this example:

    S' –> S
    S –> Bbb | aab | bBa
    B –> a

    I0: S' –> •S, $
        S –> •Bbb, $
        S –> •aab, $
        S –> •bBa, $
        B –> •a, b

    I1: S' –> S•, $

    I2: S –> B•bb, $

    I3: S –> a•ab, $
        B –> a•, b

    ....

In an SLR(1) parser there is a shift-reduce conflict in state 3 when the next input is a: the item S –> a•ab calls for a shift, while B –> a• calls for a reduce on anything in Follow(B), which includes both a and b. In LALR(1), state 3 will shift on a and reduce on b. Intuitively, this is because the LALR(1) state "remembers" that we arrived at state 3 after seeing an a. Thus we are trying to parse either Bbb or aab. In order for that first a to be a valid reduction to B, the next input has to be exactly b since that is the only symbol that can follow
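
The merging step from the first example (I3 with I6, I4 with I7, I8 with I9) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular parser generator does it: the ten configurating sets are typed in by hand from the listing earlier in this handout, and the names LR1_STATES, merge_by_core, and lalr_states, as well as the dictionary layout, are made up for this sketch.

```python
from collections import defaultdict

# LR(1) configurating sets for S' -> S, S -> XX, X -> aX | b, copied by hand
# from the handout. Each state maps an item core (a production with its dot)
# to that item's lookahead set. The layout and names are illustrative only.
LR1_STATES = {
    0: {"S' -> •S": {"$"}, "S -> •XX": {"$"},
        "X -> •aX": {"a", "b"}, "X -> •b": {"a", "b"}},
    1: {"S' -> S•": {"$"}},
    2: {"S -> X•X": {"$"}, "X -> •aX": {"$"}, "X -> •b": {"$"}},
    3: {"X -> a•X": {"a", "b"}, "X -> •aX": {"a", "b"}, "X -> •b": {"a", "b"}},
    4: {"X -> b•": {"a", "b"}},
    5: {"S -> XX•": {"$"}},
    6: {"X -> a•X": {"$"}, "X -> •aX": {"$"}, "X -> •b": {"$"}},
    7: {"X -> b•": {"$"}},
    8: {"X -> aX•": {"a", "b"}},
    9: {"X -> aX•": {"$"}},
}


def merge_by_core(states):
    """Merge states whose item cores match, unioning their lookahead sets."""
    groups = defaultdict(list)
    for number, items in states.items():
        # The dict keys of a state are exactly its item cores.
        groups[frozenset(items)].append(number)

    merged = {}
    for core, numbers in groups.items():
        # Name the merged state after its parts, e.g. 3 and 6 become "36".
        name = "".join(str(n) for n in sorted(numbers))
        merged[name] = {
            item: set().union(*(states[n][item] for n in numbers))
            for item in core
        }
    return merged


lalr_states = merge_by_core(LR1_STATES)
for name, items in lalr_states.items():
    for item in sorted(items):
        print(f"I{name}: {item}, {'/'.join(sorted(items[item]))}")
```

Running this leaves seven states instead of ten, with I36, I47, and I89 each carrying the lookaheads a, b, and $. The sketch only merges the item sets; a real table builder would also have to redirect every shift and goto that pointed at one of the old states to the merged state.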