[go: up one dir, main page]

0% found this document useful (0 votes)
19 views91 pages

Bottom Up Parsing

The document discusses bottom-up parsing techniques in computer science, specifically focusing on shift-reduce parsing and LR parsing. It explains key concepts such as handles, reductions, and the mechanics of parsing actions, including how to resolve conflicts that may arise during parsing. The document also outlines the structure and functioning of LR parsers, emphasizing their ability to handle a wide range of context-free grammars.

Uploaded by

Bijay Nag
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
19 views91 pages

Bottom Up Parsing

The document discusses bottom-up parsing techniques in computer science, specifically focusing on shift-reduce parsing and LR parsing. It explains key concepts such as handles, reductions, and the mechanics of parsing actions, including how to resolve conflicts that may arise during parsing. The document also outlines the structure and functioning of LR parsers, emphasizing their ability to handle a wide range of context-free grammars.

Uploaded by

Bijay Nag
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 91

CS 335: Bottom-Up Parsing

Swarnendu Biswas

Department of Computer Science and Engineering,


Indian Institute of Technology Kanpur

Sem 2023-24-II
Rightmost Derivation of abbcde

Grammar Input string: abbcde


S →aABe S →aABe
A →Abc | b →aAde
B →d →aAbcde
→abbcde

S S S S S

a A B e a A B e a A B e a A B e

b A b c b A b c b

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 3 / 84


Bottom-Up Parsing

Definition
Bottom-up parsing constructs the parse tree starting from the leaves and working up
toward the root

Grammar Input string: abbcde

reverse of rightmost derivation


S → aABe abbcde
x
S →aABe
→ aAde → aAbcde
A →Abc | b → aAbcde → aAde 
B →d → abbcde → aABe 
→S

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 4 / 84


Bottom-Up Parsing

Grammar Input string: abbcde

reverse of rightmost
S →aABe S → aABe abbcde x
→ aAde → aAbcde

derivation
A →Abc | b
→ aAbcde → aAde 
B →d → abbcde → aABe 
→S

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 5 / 84


Reduction

Bottom-up parsing reduces a string w to the start symbol S


At each reduction step, a chosen substring that is the RHS (or body) of a production is
replaced by the LHS (or head) nonterminal

rightmost derivation

bottom-up parser

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 6 / 84


Handle

 Handle is a substring that matches the body of a production


 Reducing the handle is one step in the reverse of the rightmost derivation

Right sentential form Handle Reducing Production


E → E +T |T
id1 ∗ id2 id1 F → id
T → T ∗F |F F ∗ id2 F T →F
F → ( E ) | id T ∗ id2 id2 F → id
T ∗F T ∗F T →T ∗F
T T E →T
E

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 7 / 84


Handle

 Handle is a substring that matches the body of a production


 Reducing the handle is one step in the reverse of the rightmost derivation

Right sentential form Handle Reducing Production


E → E +T |T
id1 ∗ id2 id1 F → id
T → T ∗F |F F ∗ id2 F T →F
F → ( E ) | id T ∗ id2 id2 F → id
T ∗F T ∗F T →T ∗F
T T E →T
E

Although T is the body of the production E → T , T is not a handle in the sentential


form T ∗ id2
The leftmost substring that matches the body of some production need not be a handle
Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 7 / 84
Handle


 If S ==⇒ 𝛼Aw ==⇒ 𝛼𝛽w , then A → 𝛽 is a S
rm rm
handle of 𝛼𝛽w
 String w right of a handle must contain
only terminals A

A handle A → 𝛽 in the parse tree for 𝛼𝛽w

 If grammar G is unambiguous, then every right sentential form has only one handle
 If G is ambiguous, then there can be more than one rightmost derivation of 𝛼𝛽w

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 8 / 84


Shift-Reduce Parsing
Shift-Reduce Parsing

 The input string being parsed consists of two parts


▶ Left part is a string of terminals and nonterminals, and is stored in a stack
▶ Right part is a string of terminals to be read from an input buffer
▶ Bottom of the stack and end of the input are represented by $

 Shift-reduce parsing is a type of bottom-up parsing with two primary actions, shift
and reduce
▶ Shift-Reduce actions
Shift Shift the next input symbol from the right string onto the top of the stack
Reduce Identify a string on top of the stack that is the body of a production and replace
the body with the head
▶ Other actions are accept and error

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 9 / 84


Shift-Reduce Parsing

 Initial
Stack Input
$ w$

Shift * Reduce

 Goal
Stack Input
$S $

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 10 / 84


Example of Shift-Reduce Parsing

Stack Input Action


E → E +T |T
$ id1 ∗ id2 $ Shift
T → T ∗F |F $id1 ∗id2 $ Reduce by F → id
F → ( E ) | id $F ∗id2 $ Reduce by T →F
$T ∗id2 $ Shift
$T ∗ id2 $ Shift
$T ∗ id2 $ Reduce by F → id
$T ∗ F $ Reduce by T →T ∗F
$T $ Reduce by E →T
$E $ Accept

or report an error in
case of syntax error

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 11 / 84


Handle on Top of Stack

Is the following scenario possible?

Stack Input Action


...
$𝛼𝛽𝛾 w$ Reduce by A → 𝛾
$𝛼𝛽A w$ Reduce by B → 𝛽
$𝛼BA w$ ...
...

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 12 / 84


Possible Choices in Rightmost Derivation

S S

B B A

y z x y z

1. S ==⇒ 𝛼Az ==⇒ 𝛼𝛽Byz ==⇒ 𝛼𝛽𝛾 yz 2. S ==⇒ 𝛼BxAz ==⇒ 𝛼Bxyz ==⇒ 𝛼𝛾 xyz
rm rm rm rm rm rm

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 13 / 84


Handle on Top of Stack

Is the following scenario possible?

Stack Input Action


...
$𝛼𝛽𝛾 w$ Reduce by A → 𝛾
$𝛼𝛽A w$ Reduce by B → 𝛽
Handle will
$𝛼 BA always. . . eventually
w$ appear
...
on top of the stack, never inside

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 14 / 84


Shift-Reduce Actions

Shift shift the next input symbol from the right string onto the top of the stack
Reduce identify a string on top of the stack that is the body of a production, and
replace the body with the head

How do you decide when to shift and


when to reduce?

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 15 / 84


Steps in Shift-Reduce Parsers

General shift-reduce technique


 If there is no handle on the stack, then shift
 If there is a handle on the stack, then reduce

Bottom-up parsing is essentially the process of identifying handles and reducing


them
 Different bottom-up parsers differ in the way they detect handles

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 16 / 84


Challenges in Bottom-up Parsing

Which action do you pick when both shift and reduce are valid?
Implies a shift-reduce conflict

Which rule to use if reduction is possible by more than one rule?


Implies a reduce-reduce conflict

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 17 / 84


Example of a Shift-Reduce Conflict
E →E + E | E ∗ E | id

id + id ∗ id c+C

Stack Input Action Stack Input Action


$ id + id ∗ id$ Shift $ id + id ∗ id$ Shift
... ...
$E + E ∗id$ Reduce by E → E + E $E + E ∗id$ Shift
$E ∗id$ Shift $E + E ∗ id$ Shift
$E ∗ id$ Shift $E + E ∗ id $ Reduce by E → id
$E ∗ id $ Reduce by E → id $E + E ∗ E $ Reduce by E → E ∗ E
$E ∗ E $ Reduce by E → E ∗ E $E + E $ Reduce by E → E + E
$E $ $E $

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 18 / 84


Resolving Shift-Reduce Conflict

Stmt → if Expr then Stmt


| if Expr then Stmt else Stmt
| other

Stack Input Action


...
$ . . . if Expr then Stmt else . . .

What is a correct thing to do for this grammar


— shift or reduce? We can prioritize shifts.

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 19 / 84


Reduce-Reduce Conflict
M →R + R | R + c | R
R →c

c+c id + id ∗ id

Stack Input Action Stack Input Action


$ c + c$ Shift $ c + c$ Shift
$c + c$ Reduce by R → c $c + c$ Reduce by R → c
$R + c$ Shift $R + c$ Shift
$R + c$ Shift $R + c$ Shift
$R + c $ Reduce by R → c $R + c $ Reduce by M → R + c
$R + R $ Reduce by R → R + R $M $
$M $

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 20 / 84


LR Parsing
LR(k) Parsing

 Popular bottom-up parsing scheme


▶ L is for left-to-right scan of input, R is for reverse of rightmost derivation, k is the number
of lookahead symbols
 LR parsers are table-driven, like the non-recursive LL parser
 LR grammar is one for which we can construct an LR parsing table
 Popularity of LR Parsing
+ Most general non-backtracking shift-reduce parsing method
+ Can recognize almost all language constructs with CFGs
+ Works for a superset of grammars parsed with predictive or LL parsers

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 21 / 84


LR(k) Parsing

 Popular bottom-up parsing scheme


▶ L is for left-to-right scan of input, R is for reverse of rightmost derivation, k is the number
of lookahead symbols
 LR parsers are table-driven, like the non-recursive LL parser
 LR grammar is one for which we can construct an LR parsing table
 Popularity of LR Parsing
+ Most general non-backtracking shift-reduce parsing method
+ Can recognize almost all language constructs with CFGs
+ Works for a superset of grammars parsed with predictive or LL parsers
▶ LL(k) parsing predicts which production to use having seen only the first k tokens of the
right-hand side
▶ LR(k) parsing can decide after it has seen input tokens corresponding to the entire right-hand
side of the production

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 21 / 84


Block Diagram of LR Parser
Input
... ... $

LR Parsing
Stack Output
Program

...

$ ACTION GOTO

Parsing Tables

The LR parsing driver is the same for all LR parsers, only the parsing
tables (i.e., ACTION and GOTO) change across parser types

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 22 / 84


Steps in LR Parsing

 Remember the basic questions: when to shift and when to reduce!


 An LR parser makes shift-reduce decisions by maintaining states
 Information is encoded in a DFA constructed using a canonical LR(0) collection
′ ′ ′
1. Augmented grammar G with new start symbol S and rule S → S
2. Define helper functions Closure() and Goto()

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 23 / 84


LR(0) Item

 An LR(0) item of a grammar G is a production of G


Production Items
with a dot (•) at some position in the body
 An item indicates how much of a production we have A → •XYZ
seen A → X • YZ
A → XYZ
A → XY • Z
▶ Symbols on the left of “•” are already on the stack
A → XYZ •
▶ Symbols on the right of “•” are expected in the input

 A → •XYZ indicates that we expect a string derivable


from XYZ next in the input
 A → X • YZ indicates that we saw a string derivable
from X in the input, and we expect a string derivable
from YZ next in the input
 A → 𝜖 generates only one item A → •

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 24 / 84


Closure Operation

 Let I be a set of items for a grammar G E →E
E → E +T |T
 Closure ( I ) is constructed as follows
T → T ∗F |F
(i) Add every item in I to Closure ( I )
F → ( E ) | id
(ii) If A → 𝛼 • B 𝛽 is in Closure ( I ) and B → 𝛾 is a rule
in G, then add B → •𝛾 to Closure ( I ) if not already ′

added
Suppose I = {E → •E }
(iii) Repeat until no more new items can be added to ′
Closure ( I ) Closure ( I ) = {E → •E ,
E → •E + T ,
E → •T ,
A substring derivable from B 𝛽 will have a prefix derivable
from B by applying one the B productions T → •T ∗ F ,
T → •F ,
F → • (E ) ,
F → •id}

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 25 / 84


Goto Operation

 Suppose I is a set of items and X is a grammar E →E
symbol E → E +T |T
 Goto ( I , X ) is the closure of set all items T → T ∗F |F
[A → 𝛼X • 𝛽] such that [A → 𝛼 • X 𝛽] is in I F → ( E ) | id
▶ If I is a set of items for some valid prefix 𝛼, then Suppose

Goto ( I , X ) is the set of valid items for prefix 𝛼X I = {E → E •,
E → E • +T }
Intuitively, Goto ( I , X ) gives the transition of the state I
under input X in the LR(0) automaton Goto ( I , +) = {E → E + •T ,
T → •T ∗ F ,
T → •F ,
F → • (E ) ,
F → •id}

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 27 / 84


Algorithm to Compute LR(0) Canonical Collection
 ′

C = Closure {[ S → •S]}
repeat
for each set of items I ∈ C
for each grammar symbol X
if Goto ( I , X ) ≠ 𝜙 and Goto ( I , X ) ∉ C
add Goto ( I , X ) to C
until no new sets of items are added to C

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 28 / 84


Example Computation of LR(0) Canonical Collection

I0 = Closure ( E → •E ) I4 = Goto ( I0 , ‘(’) I7 = Goto ( I2 , ∗) I2 = Goto ( I4 , T )

= {E → •E , = {F → (•E ) , = { T → T ∗ •F , I3 = Goto ( I4 , F )
E → •E + T , E → •E + T , F → •( E ), I4 = Goto ( I4 , ‘(’)
E → •T , E → •T , F → •id} I5 = Goto ( I4 , id)
T → •T ∗ F , T → •T ∗ F , I3 = Goto ( I6 , F )
I8 = Goto ( I4 , E )
T → •F , T → •F , I4 = Goto ( I6 , ‘(’)
= { E → E • +T ,
F → • (E ) , F → • (E ) , I5 = Goto ( I6 , id)
F → ( E •)}
F → •id} F → •id} I4 = Goto ( I7 , ‘(’)
I9 = Goto ( I6 , T ) I5 = Goto ( I7 , id)
I1 = Goto ( I0 , E ) I5 = Goto ( I0 , id)
= {E → E + T •, I6 = Goto ( I8 , +)

= {E → E •, = {F → id•} T → T • ∗F } I7 = Goto ( I9 , ∗)
E → E • +T }
I6 = Goto ( I1 , +) I10 = Goto ( I7 , F )
I2 = Goto ( I0 , T ) = { E → E + •T , = {T → T ∗ F •}
= {E → T •, T → •T ∗ F ,
T → T • ∗F } T → •F , I11 = Goto ( I8 , ‘)’)
I3 = Goto ( I0 , F ) F → •( E ), = {F → ( E )•}
= {T → F •} F → •id}

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 30 / 84


LR(0) Automaton

 Canonical LR(0) collection is used for constructing the LR(0) automaton for parsing
 States represent sets of LR(0) items in the canonical LR(0) collection
′  ′
▶ Start state is Closure {[ S → •S]} , where S is the start symbol of the augmented
grammar
▶ State j refers to the state corresponding to the set of items Ij
 By construction, all transitions to state j is for the same symbol X
▶ Each state, except the start state, has a unique grammar symbol associated with it

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 31 / 84


LR(0) Automaton

$
accept I1 T I2 F I3 id I5

E
id

id

id
F *
F +
(
I0 I4 ( I6 I7
( + *
E
(
T
I8 I9 I10 I11
F
)

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 32 / 84


Use of LR(0) Automaton

 How can the LR(0) automaton help with shift-reduce decisions?


 Suppose string 𝛾 of grammar symbols takes the automaton from start state S0 to state
Sj
▶ Shift on next input symbol a if Sj has a transition on a
▶ Otherwise, reduce
▶ Items in state Sj help decide which production to use

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 33 / 84


Structure of LR Parsing Table

 Assume Si is top of the stack and ai is the current input symbol


 Parsing table consists of two parts: an ACTION and a GOTO function
 ACTION table is indexed by state and terminal symbols; ACTION [Si , ai ] can have four
values
(i) Shift ai to the stack, go to state Sj
(ii) Reduce by rule k
(iii) Accept
(iv) Error (empty cell in the table)
 GOTO table is indexed by state and nonterminal symbols

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 34 / 84


Constructing LR(0) Parsing Table

(i) Construct LR(0) canonical collection C = {I0 , I1 , . . . , In } for grammar G
(ii) State i is constructed from Ii
(a) If [A → 𝛼 • A 𝛽] ∈ Ii and GOTO ( Ii , a) = Ij , then set ACTION [i , a] = “Shift j ”
▶ sj means shift and stack state j
(b) If [A → 𝛼•] ∈ Ii , then set ACTION [ i , a] = “Reduce by A → 𝛼” for all a
▶ rj means reduce by rule $j

(c) If [S → S•] ∈ Ii , then set ACTION [i , $] = “Accept”
(iii) If GOTO ( Ii , A) = Ij , then GOTO [i , A] = j
(iv) All entries left undefined are “errors”

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 35 / 84


LR(0) Parsing Table
ACTION GOTO
State
id + ∗ ( ) $ E T F
0 s5 s4 1 2 3
1 s6 Accept
2 r2 r2 s7, r2 r2 r2 r2
3 r4 r4 r4 r4 r4 r4
4 s5 s4 8 2 3
5 r6 r6 r6 r6 r6 r6
6 s5 s4 9 3
7 s5 s4 10
Rule # Rule
8 s6 s11 ′
0 E →E
9 r1 r1 s7, r1 r1 r1 r1 1 E →E+T
2 E →T
10 r3 r3 r3 r3 r3 r3 3 T →T ∗F
4 T →F
11 r5 r5 r5 r5 r5 r5 5 F → (E )
6 F → id

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 36 / 84


LR Parser Configurations

 A LR parser configuration is a pair ⟨s0 s1 . . . sm , ai ai +1 . . . an $⟩


▶ The left half is stack content, and the right half is the remaining input
 Configuration represents the right sentential form X1 X2 . . . Xm ai ai +1 . . . an

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 37 / 84


LR Parsing Algorithm

(i) If ACTION [sm , ai ] = sj , then the new configuration is ⟨s0 s1 . . . sm sj , ai +1 . . . an ⟩


(ii) If ACTION [sm , ai ] = reduce A → 𝛽, then the new configuration is
⟨s0 s1 . . . sm−r s, ai ai +1 . . . an ⟩, where r = |𝛽| and s = GOTO [sm−r , A]
(iii) If ACTION [sm , ai ] = Accept, then parsing is successful
(iv) If ACTION [sm , ai ] = error, then parsing has discovered an error

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 38 / 84


LR Parsing Program

Let a be the first symbol in w $


while (1)
Let s be the top of the stack
if ACTION [ s, a] == shift t
push t onto the stack
let a be the next input symbol
else if ACTION [ s, a] = reduce A → 𝛽
// Reduce with the production A → 𝛽
pop |𝛽| symbols of the stack
let state t now be the top of the stack
push GOTO [ t , A] onto the stack
else if ACTION [ s, a] == Accept
break // parsing is complete
else
invoke error recovery

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 39 / 84


Shift-Reduce Parser with LR(0) Automaton

Stack Input Action


$0 id ∗ id$ Shift
$0 id 5 ∗ id$ Reduce by F → id
$0 F 3 ∗ id$ Reduce by T →F
$0 T 2 ∗ id$ Shift
popped 5 and pushed 3 $0 T 2 ∗ 7 id$ Shift
because I3 = Goto ( I0 , F )
$0 T 2 ∗ 7 id 5 $ Reduce by F → id
$0 T 2 ∗ 7 F 10 $ Reduce by T →T ∗F
$0 T 2 $ Reduce by E →T
$0 E 1 $ Accept

While the stack consisted of only symbols in the shift-


reduce parser, here the stack also contains states from
the LR(0) automaton

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 40 / 84


Viable Prefix
rm rm rm rm rm
 Consider E ==⇒ T ==⇒ T ∗ F ==⇒ T ∗ id ==⇒ F ∗ id ==⇒ id ∗ id
 Not all prefixes of a right sentential form can appear on the stack
▶ id∗ is a prefix of a right sentential form but can never appear on the stack
▶ LR parser will not shift past the handle
▶ Always reduce by F → id before shifting ∗ (see previous slide)
 A viable prefix is a prefix of a right sentential form that can appear on the stack of a
shift-reduce parser
▶ If the stack contains 𝛼, then 𝛼 is a viable prefix if ∃w such that 𝛼w is a right sentential form
 There is no error as long as the parser has viable prefixes on the stack
▶ The parser has not yet read past the handle, and expects that the remaining input could
form a valid sentential form leading to a successful parse

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 41 / 84


Example of a Viable Prefix
Stack Input
S → X1 X2 X3 X4
$ X1 X2 X3 $
A → X1 X2
$ X1 X2 X3 $
$ X1 X2 X3 $
$A X3 $
$AX3 $
X1 X2 X3 can never appear on
the stack

 Suppose there is a production A → 𝛽1 𝛽2 , 𝛼𝛽1 is on the stack, and there is a derivation


′ ∗ ∗
S ==⇒ 𝛼Aw ==⇒ 𝛼𝛽1 𝛽2 w
rm rm
▶ 𝛽2 ≠ 𝜖 implies that the handle 𝛽1 𝛽2 is not at the top of the stack yet, so shift
▶ 𝛽2 = 𝜖 implies that the LR parser can reduce by the handle A → 𝛽1

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 42 / 84


Challenges with LR(0) Parsing

An LR(0) parser works only if each state with a reduce action has only one possible reduce
action and no shift action

Ok Shift-Reduce Conflict Reduce-Reduce Conflict


{L → L, S•} { L →L , S • , {L →S, L• ,
L →S•, L} L →S•}

Takes shift/reduce decisions without any lookahead token


Lacks the power to parse programming language grammars

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 43 / 84


Canonical Collection of Sets of LR(0) Items
Consider the following grammar for adding numbers

Left associative Right associative Shift-Reduce Conflict


S →S + E |E S →E + S |E { S →E • + S ,
E → num E → num S →E •}


FIRST ( S) = {num} I0 = Closure ({S → •S }) I2 = Goto ( I0 , E )

FIRST ( E ) = {num} = {S → •S, = { S → E • +S ,
FOLLOW ( S) = {$} S → •E + S, S → E •}
FOLLOW ( E ) = {+, $} S → •E , I3 = Goto ( I0 , num)
E → •num} = {E → num•}
I1 = Goto ( I0 , S)

= {S → S•} I4 = Goto ( I2 , +)
= { S → E + •S }
Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 44 / 84
Simple LR Parsing
Block Diagram of LR Parser
Input
... ... $

LR Parsing
Stack Output
Program

...

$ ACTION GOTO

Parsing Tables

The LR parsing driver is the same for all LR parsers, only the parsing
tables (i.e., ACTION and GOTO) change across parser types

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 45 / 84


SLR(1) Parsing

 Uses LR(0) items and LR(0) automaton, extends LR(0) parser to eliminate a few
conflicts
▶ For each reduction A → 𝛽, look at the next symbol c
▶ Apply reduction only if c ∈ FOLLOW ( A)

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 46 / 84


Constructing SLR Parsing Table

(i) Construct LR(0) canonical collection C = {I0 , I1 , . . . , In } for grammar G
(ii) State i is constructed from Ii
(a) If [A → 𝛼 • A 𝛽] ∈ Ii and GOTO ( Ii , a) = Ij , then set ACTION [i , a] = “Shift j ”
(b) If [A → 𝛼•] ∈ Ii , then set ACTION [ i , a] = “Reduce by A → 𝛼” for all a in FOLLOW ( A)

(c) If [S → S•] ∈ Ii , then set ACTION [i , $] = “Accept”
(iii) If GOTO ( Ii , A) = Ij , then GOTO [i , A] = j constraints on when
(iv) All entries left undefined are “errors” reductions are applied

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 47 / 84


SLR Parsing for Expression Grammar
Rule # Rule
FIRST ( E ) = {(, id}
1 E →E+T
FIRST ( T ) = {(, id}
2 E →T
3 T →T ∗F FIRST ( F ) = {(, id}
4 T →F FOLLOW ( E ) = {$, +, )}
5 F → (E )
6 F → id FOLLOW ( T ) = {$, +, ∗, )}
FOLLOW ( F ) = {$, +, ∗, )}

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 48 / 84


Canonical Collection of Sets of LR(0) Items

I0 = Closure ( E → •E ) I4 = Goto ( I0 , ‘(’) I7 = Goto ( I2 , ∗) I2 = Goto ( I4 , T )

= {E → •E , = {F → (•E ) , = { T → T ∗ •F , I3 = Goto ( I4 , F )
E → •E + T , E → •E + T , F → •( E ), I4 = Goto ( I4 , ‘(’)
E → •T , E → •T , F → •id} I5 = Goto ( I4 , id)
T → •T ∗ F , T → •T ∗ F , I3 = Goto ( I6 , F )
I8 = Goto ( I4 , E )
T → •F , T → •F , I4 = Goto ( I6 , ‘(’)
= { E → E • +T ,
F → • (E ) , F → • (E ) , I5 = Goto ( I6 , id)
F → ( E •)}
F → •id} F → •id} I4 = Goto ( I7 , ‘(’)
I9 = Goto ( I6 , T ) I5 = Goto ( I7 , id)
I1 = Goto ( I0 , E ) I5 = Goto ( I0 , id)
= {E → E + T •, I6 = Goto ( I8 , +)

= {E → E •, = {F → id•} T → T • ∗F } I7 = Goto ( I9 , ∗)
E → E • +T }
I6 = Goto ( I1 , +) I10 = Goto ( I7 , F )
I2 = Goto ( I0 , T ) = { E → E + •T , = {T → T ∗ F •}
= {E → T •, T → •T ∗ F ,
T → T • ∗F } T → •F , I11 = Goto ( I8 , ‘)’)
I3 = Goto ( I0 , F ) F → •( E ), = {F → ( E )•}
= {T → F •} F → •id}

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 49 / 84


LR(0) Automaton

F I3
$
accept I1 T I2 id I5
F id
E

id

id
+ F *
(
I0 I4 I6 I7
( + *
( (
E
T
I8 I9 I10 I11
F

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 50 / 84


SLR Parsing Table
ACTION GOTO
State
id + ∗ ( ) $ E T F
0 s5 s4 1 2 3
1 s6 Accept
2 r2 s7 r2 r2
3 r4 r4 r4 r4
4 s5 s4 8 2 3
5 r6 r6 r6 r6
6 s5 s4 9 3
7 s5 s4 10 Rule # Rule
8 s6 s11 0 E

→E
1 E →E+T
9 r1 s7 r1 r1 2
3
E
T
→T
→T ∗F
10 r3 r3 r3 r3 4
5
T
F
→F
→ (E )
11 r5 r5 r5 r5 6 F → id

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 51 / 84


Moves of an LR Parser on id ∗ id + id
Stack Input Action
$0 id ∗ id + id$ Shift 5
$0 id 5 ∗ id + id$ Reduce by F → id
$0 F 3 ∗ id + id$ Reduce by T →F
$0 T 2 ∗ id + id$ Shift 7
$0 T 2 ∗ 7 id + id$ Shift 5
$0 T 2 ∗ 7 id 5 + id$ Reduce by F → id
$0 T 2 ∗ 7 F 10 + id$ Reduce by T →T ∗F
$0 T 2 + id$ Reduce by E →T
$0 E 1 + id$ Shift 6
$0 E 1 + 6 id$ Shift 5
$0 E 1 + 6 id 5 $ Reduce by F → id
$0 E 1 + 6 F 3 $ Reduce by T →F
$0 E 1 + 6 T 9 $ Reduce by E →E+T
$0 E 1 $ Accept

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 52 / 84


Limitations of SLR Parsing

 If an SLR parse table for a grammar does not have multiple entries in any cell, then the
grammar is unambiguous
 Every SLR(1) grammar is unambiguous, but there are unambiguous grammars that are
not SLR(1)

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 53 / 84


Example to Highlight Limitations of SLR Parsing

Unambiguous grammar FIRST ( S) = {∗, id}


FIRST ( L) = {∗, id}
S →L = R | R FIRST ( R) = {∗, id}
FOLLOW ( S) = {$, =}
L → ∗ R | id
FOLLOW ( L) = {$, =}
R→L FOLLOW ( R) = {$, =}

Example derivation
S → L = R → ∗R = R

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 54 / 84


Canonical LR(0) Collection

I0 = Closure ( S → •S) I3 = Goto ( I0 , R) I6 = Goto ( I2 , =)

= {S → •S, = {S → R•} = {S → L = •R,
S → •L = R, R → •L,
S → •R, I4 = Goto ( I0 , R) L → • ∗ R,
L → • ∗ R, = {L → ∗ • R, L → id}
L → •id, R → •L,
I7 = Goto ( I4 , R)
R → •L} L → • ∗ R,
= {L → ∗R•}
L → •id}
I1 = Goto ( I0 , S)
′ I5 = Goto ( I0 , id) I8 = Goto ( I4 , L)
= {S → S•}
= {L → •id} = {R → L•}
I2 = Goto ( I0 , L)
I9 = Goto ( I6 , R)
= {S → L• = R,
R → L•}
= {S → L = R•}

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 55 / 84


SLR Parsing Table

ACTION GOTO
State
= ∗ id $ S L R
0 s4 s5 1 2 3
1 Accept
2 s6, r6 r6
3
4 s4 s5 8 7
5 r5 r5
6 s4 s5 8 9
7 r4 r4
8 r6 r6
9 r2

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 56 / 84


Shift-Reduce Conflict with SLR Parsing

I0 = Closure ( S → •S) I3 = Goto ( I0 , R) I6 = Goto ( I2 , =)

= {S → •S, = {S → R•} = {S → L = •R,
S → •L = R, R → •L,
S → •R, I4 = Goto ( I0 , R) L → • ∗ R,
L → • ∗ R, = {L → ∗ • R, L → id}
L → •id, R → •L,
I7 = Goto ( I4 , R)
R → •L} L → • ∗ R,
= {L → ∗R•}
(i) ACTION [ 2, =] = Shift 6, or L → •id}
I1 = Goto ( I0 , S)

(ii) →ACTION
= {S S•} [ 2, =] = Reduce RI5→= LGoto ( I0 , id) = ∈ FOLLOW ( R)
because I8 = Goto ( I4 , L)
= {L → •id} = {R → L•}
I2 = Goto ( I0 , L)
I9 = Goto ( I6 , R)
= {S → L• = R,
R → L•}
= {S → L = R•}

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 57 / 84


Moves of an SLR Parser on id = id

Stack Input Action Stack Input Action


$0 id = id Shift 5 $0 id = id$ Shift 5
$0id5 = id Reduce by L → id $0id5 = id$ Reduce by L → id
$0L2 = id Reduce by R → L $0L2 = id$ Shift 6
$0R3 = id Error $0L2 = 6 id$ Shift 5
$0L2 = 6id5 $ Reduce by L → id
$0L2 = 6L8 $ Reduce by R → L
No right sentential form begins $0L2 = 6R9 $ Reduce by S → L = R
with R = . . . $0S1 $ Accept

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 58 / 84


Moves of an SLR Parser on id = id

Stack Input Action Stack Input Action


$0 id = id Shift 5 $0 id = id$ Shift 5
$0id5 = id Reduce by L → id $0id5 = id$ Reduce by L → id
$0L2 = id Reduce by R → L $0L2 = id$ Shift 6
$0R3 = id Error $0L2 = 6 id$ Shift 5
State i calls for a reduction by A →
$0L2𝛼 =if6id5
the set of $items Ii con-
Reduce by L → id
$ 0L2
tains items [A → 𝛼•] and a ∈ FOLLOW ( A) = 6L8 $ Reduce by R → L
$0L2 = 6R9 $ Reduce by S → L = R
 Suppose 𝛽A is a viable prefix$at 0S1the top of the$ stack
Accept
 There may be no right sentential form where a follows 𝛽A
▶ An LR parser should not reduce by A → 𝛼 in such cases

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 59 / 84


Moves of an SLR Parser on id = id

Stack Input Action Stack Input Action


$0 id = id Shift 5 $0 id = id$ Shift 5
$0id5 = id Reduce by L → id $0id5 = id$ Reduce by L → id
$0L2 = id Reduce by R → L $0L2 = id$ Shift 6
$0R3 = id Error $0L2 = 6 id$ Shift 5
$0L2 = 6id5 $ Reduce by L → id
SLR parser cannot remember the$left0L2 =context
6L8 $ Reduce by R → L
$0L2 = 6R9 $ Reduce by S → L = R
 SLR(1) states only tell us about
$0S1the sequence on$ top of the
Accept
stack, not what is below on the stack

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 59 / 84


Canonical LR Parsing
LR(1) Item
 An LR(1) item of a CFG G is a string of the form [A → 𝛼 • 𝛽, a], with a as one symbol
lookahead
▶ A → 𝛼𝛽 is a production in G, and a ∈ T ∪ {$}
 Suppose [A → 𝛼 • 𝛽, a] where 𝛽 ≠ 𝜖, then the lookahead is not required
 If [A → 𝛼•, a], reduce only if the next input symbol is a
▶ Set of possible terminals will always be a subset of A but can be a proper subset

 An LR(1) item [A → 𝛼 • 𝛽, a] is valid for a Input ... ... $


viable prefix 𝛾 if there is a derivation
∗ Stack
S ==⇒ 𝛿Aw ==⇒ 𝛿𝛼𝛽w , where LR Parsing
rm rm Output
(i) 𝛾 = 𝛿𝛼, and Program
(ii) a is the first symbol in w , or w = 𝜖 and
a=$
... ACTION GOTO

$
Parsing Tables
Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 60 / 84
Computing Closure and Goto for LR(1) Collection

Closure ( I ) Goto ( I , X )

repeat J=𝜙
for each item [ A → 𝛼 • B 𝛽, a] ∈ I for each item [ A → 𝛼 • X 𝛽, a] ∈ I

for each production B → 𝛾 ∈ G add item [ A → 𝛼X • 𝛽, a] to set J
for each terminal b ∈ FIRST (𝛽a) return Closure ( J )
add [ B → •𝛾, b] to set I
until no more items are added to I
return I

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 61 / 84


Constructing LR(1) Sets of Items

C = Closure ({[ S → •S, $]})
repeat
for each set of items I ∈ C
for each grammar symbol X
if Goto ( I , X ) ≠ 𝜙 and Goto ( I , X ) ∉ C
add Goto ( I , X ) to C
until no new sets of items are added to C

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 62 / 84


Example Construction of LR(1) Items

I0 = Closure ({[ S → •S, $]}) I4 = Goto ( I0 , d)
Rule # Rule ′
= {C → d•, c/d}

= {S → •S, $,
0 S →S S → •CC, $,
1 S → CC C → •cC, c/d, I5 = Goto ( I2 , S)
2 C → cC C → •d, c/d} = {S → CC•, $}
3 C→d
I1 = Goto ( I0 , S)
′ I6 = Goto ( I2 , c)
= {S → S•, $} = {C → c • C, $,
generates the regular language C → •cC, $,
c∗ dc∗ d I2 = Goto ( I0 , C) C → •d, $}
= {S → C • C, $,
I7 = Goto ( I2 , d)
C → •cC, $,
C → •d, $}
= {C → d•, $}

I3 = Goto ( I0 , c) I8 = Goto ( I3 , C)
= {C → c • C, c/d, = {C → cC•, c/d}
C → •cC, c/d,
C → •d, c/d} I9 = Goto ( I6 , C)
= {C → cC•, $}
Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 63 / 84
LR(1) Automaton
$
accept I1 I4 I5
C
S d c
d
C
I0 I2 I3 I8
c C

d
c
I7 I6 I9
d C
c

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 64 / 84


Construction of Canonical LR(1) Parsing Tables

 Construct C = {I0 , I1 , . . . , In }
 State i of the parser is constructed from Ii
▶ If [A → 𝛼 • a 𝛽, b] is in Ii and Goto ( Ii , a) = Ij , then set ACTION [i , a] = “Shift j ”

▶ If [A → 𝛼•, a] is in Ii and A ≠ S , then set ACTION [i , a] = “Reduce by A → 𝛼•”

▶ If [S → S•, $] is in Ii , then set ACTION [i , $] = “Accept”
 If Goto ( Ii , A) = Ij , then GOTO [i , A] = j

 Initial state of the parser is constructed from the set of items containing [S → •S, $]

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 65 / 84


Canonical LR(1) Parsing Table and Moves of a CLR Parser on cdcd

ACTION GOTO
State Stack Input Action
c d $ S C
0 s3 s4 1 2 $0 cdcd$ Shift 3
1 Accept $0c3 dcd$ Shift 3
2 s6 s7 5 $0c3d4 cd$ Reduce by C → d
3 s3 s4 8 $0c3C8 cd$ Reduce by C → cC
4 r3 r3 $0C2 cd$ Shift 6
5 r1 $0C2c6 d$ Shift 7
6 s6 s7 9 $0C2c6d7 $ Reduce by C → d
7 r3 $0C2c6C9 $ Reduce by C → cC
8 r2 r2 $0C2C5 $ Reduce by S → CC
9 r2 $0S1 $ Accept

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 66 / 84


Canonical LR(1) Parsing

 If the parsing table has no multiply-defined cells, then the corresponding grammar G is
LR(1)
 Every SLR(1) grammar is an LR(1) grammar
▶ Canonical LR parser may have more states than SLR

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 67 / 84


LALR Parsing
Example Construction of LR(1) Items

I0 = Closure ({[ S → •S, $]}) I4 = Goto ( I0 , d) I7 = Goto ( I2 , d)

= {S → •S, $, = {C → d•, c/d} = {C → d•, $}
S → •CC, $,
C → •cC, c/d, I5 = Goto ( I2 , S) I8 = Goto ( I3 , C)
C → •d, c/d} = {S → CC•, $} = {C → cC•, c/d}
I1 = Goto ( I0 , S)
′ I6 = Goto ( I2 , c) I9 = Goto ( I6 , C)
= {S → S•, $} = {C → c • C, $, = {C → cC•, $}
C → •cC, $,
I2 = Goto ( I0 , C) C → •d, $}
= {S → C • C, $,
C → •cC, $,
C → •d, $}
I3 = Goto ( I0 , c) I3 and I6 , I4 and I7 , and I8 and I9 only
= {C → c • C, c/d,
C → •cC, c/d, differ in the second components
C → •d, c/d}

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 68 / 84


Lookahead LR (LALR) Parsing

 CLR(1) parser has numerous states


 Lookahead LR (LALR) parser merges sets of LR(1) items that have the same core (set
of LR(0) items, i.e., first component)
▶ LALR parsers have fewer states, the same as SLR
 LALR parser is used in many parser generators (e.g., Bison)

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 69 / 84


Construction of LALR Parsing Table

 Construct C = {I0 , I1 , . . . , In }, the collection of set of LR(1) items


 For each core present in LR(1) items, find all sets having the same core and replace
these sets with their union

 Let C = {J0 , J1 , . . . , Jn } be the resulting sets of LR(1) items (also called LALR collection)
 Construct ACTION table as was done earlier, parsing actions for state i is constructed
from Ji
 Let J = I1 ∪ I2 ∪ · · · ∪ Ik , where the cores of I1 , I2 , . . . , Ik are the same
▶ Cores of Goto ( I1 , X ), Goto ( I2 , X ), . . . ,Goto ( Ik , X ) will also be the same
▶ Let K = Goto ( I1 , X ) ∪ Goto ( I2 , X ) ∪ . . . Goto ( Ik , X ), then K = Goto ( J , X )

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 70 / 84


LALR Grammar

If there are no parsing action conflicts, then the grammar is LALR(1)

Rule # Rule I36 = Goto ( I2 , c)


′ = {C → c • C, c/d/$,
0 S →S
C → •cC, c/d/$,
1 S → CC
C → •d, c/d/$}
2 C → cC
3 C→d I47 = Goto ( I0 , d)
= {C → d•, c/d/$}

I89 = Goto ( I3 , C)
= {C → cC•, c/d/$}

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 71 / 84


LALR Parsing Table

ACTION GOTO
State Stack Input Action
c d $ S C
0 s36 s47 1 2 $0 cdcd$ Shift 36
1 Accept $0c36 dcd$ Shift 47
2 s36 s47 5 $0c36d47 cd$ Reduce by C → d
36 s36 s47 89 $0c36C89 cd$ Reduce by C → cC
47 r3 r3 r3 $0C2 cd$ Shift 36
5 r1 $0C2c36 d$ Shift 47
89 r2 r2 r2 $0C2c36d47 $ Reduce by C → d
$0C2c36C89 $ Reduce by C → cC
$0C2C5 $ Reduce by S → CC
$0S1 $ Accept

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 72 / 84


Notes on LALR Parsing

 LALR parser behaves like the CLR parser except for difference in stack states

Merging LR(1) items can never produce shift/reduce conflicts


 Suppose there is a shift-reduce conflict on lookahead a due to items [B → 𝛽 • 𝛼𝛾, b]
and [A → 𝛼•, a]
 But the merged state was formed from states with same cores, which implies
[B → 𝛽 • a𝛾, c ] and [A → 𝛼•, a] must have already been in the same state, for some
value of c

Merging items may produce reduce/reduce conflicts

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 73 / 84


Reduce-Reduce Conflicts due to Merging

LR(1) grammar
′ {[A → c•, d], [ B → c•, e]} {[A → c•, e], [ B → c•, d]}
S →S
S → aAd | bBd | aBe | bAe
A →c
B →c
{[A → c•, d/c], [ B → c•, d/e]}
Example strings: acd, ace, bcd, bce

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 74 / 84


Dealing with Errors with LALR Parsing

CLR Parsing Table LALR Parsing Table

ACTION GOTO ACTION GOTO


State State
c d $ S C c d $ S C
0 s3 s4 1 2 0 s36 s47 1 2
1 Accept 1 Accept
2 s6 s7 5 2 s36 s47 5
3 s3 s4 8 36 s36 s47 89
4 r3 r3 47 r3 r3 r3
5 r1 5 r1
6 s6 s7 9 89 r2 r2 r2
7 r3
8 r2 r2 Rule # Rule
9 r2
0 S →S

1 S → CC
2 C → cC
3 C→d

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 75 / 84


Comparing Moves of CLR and LALR Parsers

Consider an erroneous input ccd

CLR Parser LALR Parser

Stack Input Action Stack Input Action


$0 ccd$ Shift 3 $0 ccd$ Shift 36
$0c3 d$ Shift 3 $0c36 cd$ Shift 36
$0c3c3 d$ Shift 4 $0c36c36 d$ Shift 47
$0c3c3d4 $ Error $0c36c36d47 $ Reduce by C → d
$0c36c36C89 $ Reduce by C → cC
$0c36C89 $ Reduce by C → cC
$0C2 $ Error

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 76 / 84


Comparing Moves of CLR and LALR Parsers

Consider an erroneous input ccd

CLR Parser LALR Parser

Stack Input Action Stack Input Action


$0
 CLR parser will not even$reduce
ccd$ Shift 3 0
before ccd
reporting an error
$ Shift 36
$0c3  SLRdand LALR
$ Shift 3 parser may reduce severalcdtimes
$0c36 before
$ Shift 36
$0c3c3 reporting an 4error, but will
d$ Shift never shift an derroneous
$0c36c36 $ Shift 47input
$0c3c3d4 symbol
$ onto
Error the stack $0c36c36d47 $ Reduce by C → d
$0c36c36C89 $ Reduce by C → cC
$0c36C89 $ Reduce by C → cC
$0C2 $ Error

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 77 / 84


Using Ambiguous Grammars
Dealing with Ambiguous Grammars
LR(1) grammar I2 = Goto ( I0 , ‘(’) I5 = Goto ( I0 , ∗)
= {E → (•E ), = { E → E ∗ •E ,
′ E → •E + E , E → •E + E ,
E →E
E → •E ∗ E , E → •E ∗ E ,
E → E + E | E ∗ E | ( E ) | id
E → •( E ), E → •( E ),
Grammar does not distinguish between E → •id} E → •id}
the associativity and precedence of the
two operators I3 = Goto ( I0 , id) I6 = Goto ( I2 , E )
= {E → id•} = {E → ( E •),

I0 = Closure ({[ E → •E ]}) E → E • +E ,
′ I4 = Goto ( I0 , +)
= {E → •E , E → E • ∗E }
= { E → E + •E ,
E → •E + E , E → •E + E , I7 = Goto ( I4 , E )
E → •E ∗ E , E → •E ∗ E , = {E → E + E •,
E → •( E ), E → •( E ), E → E • +E ,
E → •id} E → •id} E → E • ∗E }
I1 = Goto ( I0 , E ) I9 = Goto ( I6 , ‘)’)
′ I8 = Goto ( I5 , E )
= {E → E •, = {E → ( E )•} = {E → E ∗ E •,
E → E • +E , E → E • +E ,
E → E • ∗E } E → E • ∗E }
Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 78 / 84
SLR Parsing Table

ACTION GOTO
State
id + ∗ ( ) $ E
0 s3 s2 1
1 s4 s5 Accept
2 s3 s2
3 r4 r4 r4 r4
4 s3 s2 7
5 s3 s2 8
6 s4 s5 s9
7 s4, r1 s5, r1 r1 r1
8 s4, r2 s5, r2 r2 r2
9 r3 r3 r3 r3

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 79 / 84


Moves of an SLR Parser on id + id ∗ id

Stack Input Action


$0 id + id ∗ id$ Shift 3
$0id3 + id ∗ id$ Reduce by E → id
$0E1 + id ∗ id$ Shift 4
$0E1+4 id ∗ id$ Shift 3
$0E1+4id3 ∗ id$ Reduce by E → id 3
$0E1+4E7 ∗ id$

What can the parser do to resolve the


ambiguity?

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 80 / 84


SLR Parsing Table
ACTION GOTO
State
id + ∗ ( ) $ E
0 s3 s2 1
1 s4 s5 Accept
2 s3 s2
3 r4 r4 r4 r4
4 s3 s2 7
5 s3 s2 8
6 s4 s5 s9
7 s4, r1 s5, r1 r1 r1
8 s4, r2 s5, r2 r2 r2
9 r3 r3 r3 r3

Why did the parser make these


choices?
Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 81 / 84
Comparison of Parsing Techniques
Relationship Among Grammars

LL(0)
LR(0)
Ambiguous
SLR grammars
LL(1)
LALR(1)
LR(1)
LL(k)
LR(k)

Unambiguous grammars

Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 82 / 84


Comparison of Parsing Techniques

 Ambiguous grammars are not LR


 Among grammars,
▶ LL(0) ⊂ LL(1) ⊂ . . . ⊂ LL(k)1
▶ LR(0) ⊂ SLR(1) ⊂ LALR(1) ⊂ LR(1)
▶ SLR(1) = LR(0) items + FOLLOW
▶ SLR(1) parsers can parse a larger number of grammars than LR(0)
▶ Any grammar that can be parsed by an LR(0) parser can be parsed by an SLR(1) parser
▶ SLR(k) ⊂ LALR(k) ⊂ LR(k)
▶ LL(k) ⊂ LR(k)
▶ Bottom-up parsing is a more powerful technique compared to top-down parsing
▶ LR grammars can handle left recursion
▶ Detects errors as soon as possible, and allows for better error recovery
▶ Automated parser generators such as Yacc and Bison implement LALR parsing

1D. Rosenkrantz and R. Stearns. Properties of Deterministic Top-Down Grammars.


Swarnendu Biswas (IIT Kanpur) CS 335: Bottom-Up Parsing Sem 2023-24-II 83 / 84
References

A. Aho et al. Compilers: Principles, Techniques, and Tools. Sections 4.5–4.8, 2nd edition,
Pearson Education.
K. Cooper and L. Torczon. Engineering a Compiler. Sections 3.4–3.6, 2nd edition, Morgan
Kaufmann.

You might also like