Top-Down Parsing and Intro to Bottom-Up Parsing
Predictive Parsers
• Like recursive descent, but the parser can “predict” which production to use
  – By looking at the next few tokens
  – No backtracking
• Predictive parsers accept LL(k) grammars
  – The first L means “left-to-right” scan of input
  – The second L means “leftmost derivation”
  – k means “predict based on k tokens of lookahead”
  – In practice, LL(1) is used
LL(1) vs. Recursive Descent
• In recursive descent,
  – At each step, there are many choices of production to use
  – Backtracking is used to undo bad choices
• In LL(1),
  – At each step, there is only one choice of production
  – That is,
    • When a nonterminal A is leftmost in a derivation
    • And the next input symbol is t
    • There is a unique production A → α to use
  – Or no production to use (an error state)
• LL(1) is a recursive descent variant without backtracking
Predictive Parsing and Left Factoring
• Recall the grammar
    E → T + E | T
    T → int | int * T | ( E )
• Hard to predict because
  – For T, two productions start with int
  – For E, it is not clear how to predict
• We need to left-factor the grammar
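To see why one token of lookahead is not enough for the unfactored grammar, here is a minimal sketch of a backtracking recognizer for it. The function names (`parse_E`, `parse_T`, `accepts`) and the token encoding (a list of strings such as "int", "+", "*") are assumptions made for illustration. Each function tries every production in turn and yields every position where its nonterminal could end, which is one simple way to model backtracking.

```python
# Backtracking recognizer for the unfactored grammar (illustrative sketch):
#   E -> T + E | T
#   T -> int | int * T | ( E )
# Each generator yields every input position at which its nonterminal may end.

def parse_T(tokens, i):
    if i < len(tokens) and tokens[i] == "int":
        yield i + 1                                # try T -> int
        if i + 1 < len(tokens) and tokens[i + 1] == "*":
            yield from parse_T(tokens, i + 2)      # also try T -> int * T
    if i < len(tokens) and tokens[i] == "(":
        for j in parse_E(tokens, i + 1):           # try T -> ( E )
            if j < len(tokens) and tokens[j] == ")":
                yield j + 1

def parse_E(tokens, i):
    for j in parse_T(tokens, i):
        if j < len(tokens) and tokens[j] == "+":
            yield from parse_E(tokens, j + 1)      # try E -> T + E
        yield j                                    # also try E -> T

def accepts(tokens):
    # Accept if some sequence of choices consumes the whole input.
    return any(j == len(tokens) for j in parse_E(tokens, 0))
```

On seeing int, `parse_T` cannot tell whether `T → int` or `T → int * T` is the right choice, so it explores both. Left factoring removes exactly this ambiguity of choice.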
Left-Factoring Example
• Recall the grammar
    E → T + E | T
    T → int | int * T | ( E )
• Factor out common prefixes of productions
    E → T X
    X → + E | ε
    T → ( E ) | int Y
    Y → * T | ε
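With the common prefixes factored out, each nonterminal can pick its production from the next token alone. The following is a minimal sketch of a predictive recursive-descent recognizer for the factored grammar (E → T X, X → + E | ε, T → ( E ) | int Y, Y → * T | ε). The names `parse`, `peek`, `match`, and `ParseError`, and the use of "$" as an end marker, are assumptions for illustration. As a simplification, X and Y choose the ε production whenever the next token is not + or *, and leave it to a later `match` to report bad input.

```python
class ParseError(Exception):
    pass

def parse(tokens):
    """Return True if tokens form a sentence of the factored grammar."""
    toks = list(tokens) + ["$"]   # "$" marks end of input
    pos = 0

    def peek():
        return toks[pos]

    def match(t):
        nonlocal pos
        if toks[pos] != t:
            raise ParseError(f"expected {t!r}, got {toks[pos]!r} at position {pos}")
        pos += 1

    def E():                      # E -> T X
        T()
        X()

    def X():                      # X -> + E | ε
        if peek() == "+":
            match("+")
            E()
        # otherwise X -> ε: consume nothing

    def T():                      # T -> ( E ) | int Y
        if peek() == "(":
            match("(")
            E()
            match(")")
        elif peek() == "int":
            match("int")
            Y()
        else:
            raise ParseError(f"cannot expand T on {peek()!r}")

    def Y():                      # Y -> * T | ε
        if peek() == "*":
            match("*")
            T()
        # otherwise Y -> ε: consume nothing

    E()
    match("$")                    # the whole input must be consumed
    return True
```

No function ever retries a choice: every decision is made by a single `peek()`.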
LL(1) Parsing Table Example
• Left-factored grammar
    E → T X
    X → + E | ε
    T → ( E ) | int Y
    Y → * T | ε
• The LL(1) parsing table (rows: leftmost nonterminal; columns: next input token; entries: rhs of production to use):

      int      *        +        (        )        $
  E   T X                        T X
  X                     + E               ε        ε
  T   int Y                      ( E )
  Y            * T      ε                 ε        ε
LL(1) Parsing Table Example (Cont.)
• Consider the [E, int] entry
  – “When current nonterminal is E and next input is int, use production E → T X”
  – This can generate an int in the first position
• Consider the [Y, +] entry
  – “When current nonterminal is Y and current token is +, get rid of Y”
  – Y can be followed by + only if Y → ε
LL(1) Parsing Tables. Errors
• Blank entries indicate error situations
• Consider the [E, *] entry
  – “There is no way to derive a string starting with * from nonterminal E”
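A blank entry can be turned into a useful diagnostic by listing the tokens for which the row does have an entry. The sketch below assumes the rows of the table for the left-factored grammar, with the ε entries derived from that grammar; the names `DEFINED` and `check_entry` and the message wording are hypothetical.

```python
# For each nonterminal, the tokens whose table entry is non-blank
# (assumed from the left-factored grammar, including its ε entries).
DEFINED = {
    "E": {"int", "("},
    "X": {"+", ")", "$"},
    "T": {"int", "("},
    "Y": {"*", "+", ")", "$"},
}

def check_entry(nonterminal, token):
    """Return None if [nonterminal, token] is defined, else an error message."""
    expected = DEFINED[nonterminal]
    if token in expected:
        return None
    return (f"syntax error: cannot expand {nonterminal} on {token!r}; "
            f"expected one of {sorted(expected)}")
```

For example, `check_entry("E", "*")` reports that E can only be expanded on ( or int, while `check_entry("Y", "+")` returns None because that entry is filled.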
Using Parsing Tables
• Method similar to recursive descent, except
  – For the leftmost nonterminal S
  – We look at the next input token a
  – And choose the production shown at [S, a]
• A stack records the frontier of the parse tree
  – Nonterminals that have yet to be expanded
  – Terminals that have yet to be matched against the input
  – Top of stack = leftmost pending terminal or nonterminal
• Reject on reaching an error state
• Accept on end of input and empty stack
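The stack discipline described above can be sketched in a few lines. This is an illustrative recognizer, not a definitive implementation: the table is the one for the left-factored grammar (with ε encoded as an empty tuple), and the names `TABLE`, `NONTERMINALS`, and `ll1_parse`, as well as the use of a Python list as the stack, are assumptions.

```python
EPS = ()  # empty right-hand side, i.e. ε

# (leftmost nonterminal, next input token) -> rhs of production to use
TABLE = {
    ("E", "int"): ("T", "X"),     ("E", "("): ("T", "X"),
    ("X", "+"):   ("+", "E"),     ("X", ")"): EPS, ("X", "$"): EPS,
    ("T", "int"): ("int", "Y"),   ("T", "("): ("(", "E", ")"),
    ("Y", "*"):   ("*", "T"),     ("Y", "+"): EPS,
    ("Y", ")"):   EPS,            ("Y", "$"): EPS,
}
NONTERMINALS = {"E", "X", "T", "Y"}

def ll1_parse(tokens, start="E"):
    """Return True if tokens are accepted, False on reaching an error state."""
    inp = list(tokens) + ["$"]        # "$" marks end of input
    pos = 0
    stack = ["$", start]              # top of stack is the end of the list
    while stack:
        top = stack.pop()
        tok = inp[pos]
        if top in NONTERMINALS:
            rhs = TABLE.get((top, tok))
            if rhs is None:
                return False          # blank entry: error state
            stack.extend(reversed(rhs))  # push rhs with its first symbol on top
        elif top == tok:
            pos += 1                  # terminal matches the input: consume it
        else:
            return False              # terminal mismatch: error state
    return pos == len(inp)            # empty stack and all input consumed
```

Pushing the right-hand side in reverse keeps the leftmost pending symbol on top of the stack, matching the "top of stack = leftmost pending terminal or nonterminal" invariant.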
LL(1) Parsing Algorithm
initialize stack = <S $> and next
repeat
  case stack of
    <X, rest> : if T[X, *next] = Y1 … Yn
                  then stack ← <Y1 … Yn rest>;
                  else error ();
    <t, rest> : if t == *next++
                  then stack ← <rest>;
