Parsers

Shift-reduce parsing is a bottom-up parsing method that constructs a parse tree from the leaves up by reducing substrings to productions. It uses a stack to shift input symbols and reduce substrings based on grammar productions. Top-down parsing attempts to construct a parse tree from the root down but can have issues with backtracking and left recursion in the grammar. Operator precedence parsing uses relations between operators to resolve ambiguity and avoid shift-reduce conflicts.


Parsing Techniques
Shift-Reduce Parsing
Shift-reduce parsing is a bottom-up parsing method: it attempts to construct a
parse tree for an input string beginning at the leaves and working up towards
the root, i.e., it reduces the input string step by step to the start symbol of
the grammar.

For example, consider a grammar with productions:

S  aAcBe (1)
A  Ab|b (2)
Bd (3)

and the input string, w = abbcde

Then start with the input string as:

abbcde
⇒ aAbcBe reducing by (2) and (3)
⇒ aAcBe reducing by (2)
⇒ S reducing by (1)

A handle is a substring that matches the right side of a production, such that
replacing it by the production's left side is one step in a reduction of the
string to the start symbol. The process of repeatedly finding and reducing
handles is known as handle pruning.

Shift-reduce parsing is implemented with a stack, using four operations:
shift, reduce, accept, and error.

For e.g., consider the following grammar:

E  E + E ; E  E * E ; E  ( E ) ; E  id

and a string is given as: id1 + id2 * id3

Then its implementation would be done as follows:

Given String          Handle    Reducing Production

id1 + id2 * id3       id1       E → id
E + id2 * id3         id2       E → id
E + E * id3           id3       E → id
E + E * E             E * E     E → E * E
E + E                 E + E     E → E + E
E                     —         Accept
The shift-reduce parsing can be implemented using STACK as follows:

STACK            INPUT                 ACTION

$                id1 + id2 * id3 $     Shift id1
$ id1            + id2 * id3 $         Reduce by E → id
$ E              + id2 * id3 $         Shift +
$ E +            id2 * id3 $           Shift id2
$ E + id2        * id3 $               Reduce by E → id
$ E + E          * id3 $               Shift *
$ E + E *        id3 $                 Shift id3
$ E + E * id3    $                     Reduce by E → id
$ E + E * E      $                     Reduce by E → E * E
$ E + E          $                     Reduce by E → E + E
$ E              $                     Accept

STACK IMPLEMENTATION
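The stack trace above can be turned into a small recognizer. The following is a minimal sketch (the names `parse` and `PREC` are ours, not from the text): since the grammar is ambiguous, the sketch consults a small precedence map to decide whether to reduce E op E now or first shift a higher-precedence operator, which reproduces the order of reductions in the trace.

```python
# Minimal shift-reduce recognizer for E -> E + E | E * E | ( E ) | id.
# The grammar is ambiguous, so a precedence map decides whether to reduce
# "E op E" now or shift the higher-precedence operator first.
PREC = {'+': 1, '*': 2}

def parse(tokens):
    stack = ['$']
    tokens = tokens + ['$']          # append the end marker
    i = 0
    while True:
        if stack[-1] == 'id':                      # reduce by E -> id
            stack[-1] = 'E'
        elif stack[-3:] == ['(', 'E', ')']:        # reduce by E -> ( E )
            stack[-3:] = ['E']
        elif (len(stack) >= 3 and stack[-1] == 'E' and stack[-3] == 'E'
              and stack[-2] in PREC
              and PREC.get(tokens[i], 0) <= PREC[stack[-2]]):
            stack[-3:] = ['E']                     # reduce by E -> E op E
        elif stack == ['$', 'E'] and tokens[i] == '$':
            return True                            # accept
        elif tokens[i] != '$':
            stack.append(tokens[i])                # shift
            i += 1
        else:
            return False                           # error
```

With `['id', '+', 'id', '*', 'id']` the sketch performs exactly the reduction sequence of the table above, delaying the reduction of E + E until after E * E.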

Operator Precedence Parsing


Shift-reduce parsing works for a certain class of grammars, but an ambiguous
grammar produces shift-reduce conflicts; operator-precedence parsing resolves
them for operator grammars.

A grammar with the property that no production's right side is ε or has two
adjacent non-terminals is called an operator-precedence grammar.

For example, consider the grammar as follows:

E  E + E | E * E | ( E ) | id

Three precedence relations are used between pairs of terminals in
operator-precedence parsing:

a <· b    b has higher precedence than a (a yields precedence to b)
a =· b    b has the same precedence as a
a ·> b    b has lower precedence than a (a takes precedence over b)

Rules for Operator Precedence

1) Relations from associativity and precedence of operators:

a. If operator θ1 has higher precedence than θ2, make

θ1 ·> θ2 and θ2 <· θ1



b. If operators θ1 and θ2 are of equal precedence, then make

θ1 ·> θ2 and θ2 ·> θ1 if they are left-associative;

θ1 <· θ2 and θ2 <· θ1 if they are right-associative.

c. Make:
θ <· id     id ·> θ     θ <· (     ( <· θ
θ ·> )      ) ·> θ      θ ·> $     $ <· θ
( =· )      ( <· (      ) ·> )     ( <· id
id ·> )     $ <· (      id ·> $    ) ·> $

Remember:
id is ·> with respect to every other terminal;
$ is <· with respect to every other terminal;
an opening parenthesis ( takes <· ;
a closing parenthesis ) takes ·> (except for ( =· ) ).

2) Calculation of Operator Precedence:-

a. For computing the relation =· :

Look for right sides of productions with two terminals separated by nothing
or by a single non-terminal.

For e.g., F → ( E ), here we have ( =· )

b. For the relation <·, look for a right side with a terminal immediately to
the left of a non-terminal; that terminal is <· every symbol in LEADING of
the non-terminal.

For e.g., E → E + T;

Here we have: + <· * ; + <· ( ; + <· id

c. For the relation ·>, look for a right side with a non-terminal immediately
to the left of a terminal; every symbol in TRAILING of the non-terminal is ·>
that terminal.

For example: Consider a grammar E → E + E | E * E | id and a left-associative
string id + id * id, then

1) Insert $ at the left and the right ends of the string as:

$ id + id * id $
2) Insert the precedence relations:

        id    +     *     $
id      —     ·>    ·>    ·>
+       <·    ·>    <·    ·>
*       <·    ·>    ·>    ·>
$       <·    <·    <·    =·

Operator-Precedence Table

(The — entry is an error: id can never appear adjacent to id.)


3) Make relations:-
$ <· id ·> + <· id ·> * <· id ·> $

4) Reduce the symbols enclosed within <· and ·> (each id reduces to E):

$ E + E * E $

5) Remove the non-terminals:-


$ + * $

6) Make relations:-
$ <· + <· * ·> $

7) * occurs within a <· ·> pair, so * is resolved first. Thus E * E is
reduced first, giving:

$ E + E $

8) Again remove the non-terminals:-


$ + $

9) Make relations:-
$ <· + ·> $

10) Now resolve +. Thus reduce E + E, giving:

$ E $
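The whole procedure of steps 1)–10) can be driven mechanically from the operator-precedence table. Below is a minimal sketch of such a driver as a recognizer (the names `PREC` and `op_precedence_parse` are ours, not from the text): it keeps only terminals on the stack, shifts on <· or =·, and pops on ·>. Since every handle of this grammar contains exactly one terminal, a single pop suffices per reduction.

```python
# Operator-precedence relations for the table above,
# encoded as '<' for <·, '=' for =·, '>' for ·>.
PREC = {
    ('id', '+'): '>', ('id', '*'): '>', ('id', '$'): '>',
    ('+', 'id'): '<', ('+', '+'): '>', ('+', '*'): '<', ('+', '$'): '>',
    ('*', 'id'): '<', ('*', '+'): '>', ('*', '*'): '>', ('*', '$'): '>',
    ('$', 'id'): '<', ('$', '+'): '<', ('$', '*'): '<',
}

def op_precedence_parse(tokens):
    stack = ['$']                    # terminals only; E placeholders omitted
    tokens = tokens + ['$']
    i = 0
    while True:
        top, look = stack[-1], tokens[i]
        if top == '$' and look == '$':
            return True              # accept
        rel = PREC.get((top, look))
        if rel in ('<', '='):
            stack.append(look)       # shift
            i += 1
        elif rel == '>':
            stack.pop()              # reduce: pop the handle's terminal
        else:
            return False             # no relation: error
```

Running it on `['id', '+', 'id', '*', 'id']` pops id, id, id, then *, then +, mirroring the reduction order of steps 3)–10).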

Top-Down Parsing
Top-down parsing can be viewed as an attempt to find a leftmost derivation
for an input string. It can be viewed as attempting to construct a parse tree
for the input starting from the root and creating the nodes of the parse tree
in pre-order.

Top-down parsing has several drawbacks, as follows:-

1) Backtracking:

If we make a sequence of erroneous expansions and subsequently


discover a mismatch, we may have to undo the semantic effects of
making these erroneous expansions. This is called backtracking.

Suppose we have a grammar:

S  cAd
A  ab | a

and an input string w = cad, then the tree using top-down parsing will
be:
S

c A d

a b

Here the yield is cabd, which does not match the input w = cad, so the parser
must backtrack: undo the expansion A → ab and try A → a instead. Writing a
set of recursive procedures, one for each non-terminal, that recognize the
input is called recursive-descent parsing; the procedures below handle the
backtracking for A explicitly.

Recursive-descent procedures for the grammar above:-

Procedure S():
begin
if input symbol = ‘c’ then
begin
ADVANCE();
if A() then
if input symbol = ‘d’ then
begin
ADVANCE();
return TRUE;
end;
end;
return FALSE;
end;

Procedure A():
begin
isave = input-pointer;
if input symbol = ‘a’ then
begin
ADVANCE();
if input symbol = ‘b’ then
begin
ADVANCE();
return TRUE;
end;
end;
input-pointer = isave; /* failure to find ab */
if input symbol = ‘a’ then
begin
ADVANCE();
return TRUE;
end;
else
return FALSE;
end;
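For concreteness, the two procedures can be written as runnable Python (the wrapper `match` and its structure are our own rendering of the pseudocode above):

```python
# Recursive-descent recognizer with backtracking for
#   S -> c A d
#   A -> a b | a
def match(s):
    pos = 0

    def S():
        nonlocal pos
        if pos < len(s) and s[pos] == 'c':
            pos += 1                          # ADVANCE past 'c'
            if A() and pos < len(s) and s[pos] == 'd':
                pos += 1                      # ADVANCE past 'd'
                return True
        return False

    def A():
        nonlocal pos
        save = pos                            # isave = input-pointer
        if pos < len(s) and s[pos] == 'a':
            pos += 1
            if pos < len(s) and s[pos] == 'b':
                pos += 1
                return True                   # matched A -> a b
        pos = save                            # failure to find ab: backtrack
        if pos < len(s) and s[pos] == 'a':
            pos += 1
            return True                       # matched A -> a
        return False

    return S() and pos == len(s)
```

Note that A must try the longer alternative a b before a; otherwise cabd would never be recognized, since this sketch does not re-enter A after S fails.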

2) Left Recursion:-

A grammar G is said to be left-recursive if it has a non-terminal A with a
production of the form A → Aα. Such a production sends a top-down parser
into an infinite loop of expansions:

A ⇒ Aα ⇒ Aαα ⇒ Aααα ⇒ …

The solution is to replace the productions A → Aα | β by the two
productions:

A → βA1
A1 → αA1 | ε

For example, consider the grammar:

E  E + T | T;
T  T * F | F;
F  ( E ) | id

Reducing E  E + T | T, we have
E  TE1
E1  +TE1 | Є

Similarly, for T → T * F | F, we have

T → FT1
T1 → *FT1 | ε

Hence, we have the rewritten grammar as:-

E → TE1
E1 → +TE1 | ε
T → FT1
T1 → *FT1 | ε
F → ( E ) | id

Elimination of left recursion in general uses the following algorithm:-

1. Arrange the non-terminals of G in some order A1, A2, …, An.

2. for i := 1 to n do
   begin
       for j := 1 to i-1 do
           replace each production of the form Ai → Aj γ by the
           productions Ai → δ1 γ | δ2 γ | … | δk γ,
           where Aj → δ1 | δ2 | … | δk are all the current Aj-productions;

       eliminate the immediate left recursion among the Ai-productions;
   end;
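The inner step of the algorithm, eliminating the immediate left recursion among the Ai-productions, can be sketched as follows (a minimal sketch: productions are lists of symbols, `'eps'` marks ε, and the function name is our own):

```python
# Eliminate immediate left recursion:
#   A -> A a1 | ... | A am | b1 | ... | bn
# becomes
#   A  -> b1 A1 | ... | bn A1
#   A1 -> a1 A1 | ... | am A1 | eps
def eliminate_immediate_left_recursion(head, alts):
    recursive = [alt[1:] for alt in alts if alt and alt[0] == head]
    nonrecursive = [alt for alt in alts if not alt or alt[0] != head]
    if not recursive:
        return {head: alts}          # nothing to do
    new = head + "1"                 # the A1 of the notation above
    return {
        head: [beta + [new] for beta in nonrecursive],
        new: [alpha + [new] for alpha in recursive] + [['eps']],
    }
```

Applied to E → E + T | T it produces E → TE1, E1 → +TE1 | ε, matching the example above.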

3) Left Factoring:-

If we have productions like A → αβ | αγ and the input begins with a string
derived from α, we do not know whether to expand A to αβ or to αγ. Left
factoring removes this uncertainty by factoring out the common prefix,
replacing the productions with:

A → αA1
A1 → β | γ

For example, suppose we have the productions as:

S  iCtS | iCtSeS | a
Cb

Factoring the first production we have:

S → iCtSS1 | a
S1 → ε | eS
C → b
Predictive Parser
A predictive parser is an efficient way of implementing recursive-descent
parsing by handling the stack of activation records explicitly.
The predictive parser has an input, a stack, a parsing table, and an output.
[Figure: a predictive parser. The input buffer holds a + b $; the stack holds
X Y Z $ with X on top; the driver program consults the parsing table M and
produces the output.]
Predictive Parser

Input: contains the string to be parsed, followed by $, the end marker.

Stack: contains a sequence of grammar symbols, with $, the bottom-of-stack
marker, at the bottom.

Parsing Table: a two-dimensional array M[A, a], where A is a non-terminal
and a is a terminal or the symbol $.

Initially, the stack contains the start symbol of the grammar preceded by $.

The parser is controlled by a program that considers X, the symbol on top of
the stack, and a, the current input symbol. These two symbols determine the
action of the parser. There are three possibilities:-

1. If X = a = $, the parser halts and announces successful completion of
   parsing.
2. If X = a ≠ $, the parser pops X off the stack and advances the input
   pointer to the next input symbol.
3. If X is a non-terminal, the program consults entry M[X,a] of the
parsing table M.

i. If M[X,a] = {X → UVW}, the parser replaces X on top of the
   stack by WVU (with U on top).
ii. If M[X,a] = error, the parser calls an error-recovery routine.

The behavior of the parser can be described in terms of its configurations,
which give the stack contents and the remaining input.
Initially, the parser configuration is:

Stack    Input
$S       w$

where S is the start symbol of the grammar and w is the string to be parsed.



4.5 FIRST and FOLLOW


During top-down parsing, FIRST and FOLLOW allow us to choose which
production to apply, based on the next input symbol. During panic-mode
error recovery, sets of tokens produced by FOLLOW can be used as
synchronizing tokens.

The functions FIRST and FOLLOW indicate the proper entries in the parsing
table for G, if such a parsing table for G exists.

FIRST(X), where X is any string of grammar symbols, is defined to be the set
of terminals that begin the strings derived from X.


Algorithm to calculate FIRST:

1. If X is a terminal, then FIRST(X) is {X}.
2. If X is a non-terminal and X → aα is a production, then add a to
   FIRST(X); if X → ε is a production, then add ε to FIRST(X).
3. If X → Y1 Y2 … Yk is a production, then add every non-ε symbol of
   FIRST(Y1) to FIRST(X); if ε is in FIRST(Y1), …, FIRST(Yi-1), also add
   every non-ε symbol of FIRST(Yi); if ε is in FIRST(Yj) for all
   j = 1, …, k, then add ε to FIRST(X).

FOLLOW(A), for a non-terminal A, is defined to be the set of terminals a
that can appear immediately to the right of A in some sentential form; that
is, the set of terminals a such that there exists a derivation of the form
S ⇒* αAaβ, for some α and β.

Algorithm to calculate FOLLOW:

1. $ is in FOLLOW(S), where S is the start symbol.
2. If there is a production A → αBβ with β ≠ ε, then add everything in
   FIRST(β) except ε to FOLLOW(B).
3. If there is a production A → αB, or a production A → αBβ where FIRST(β)
   contains ε, then add everything in FOLLOW(A) to FOLLOW(B).

For example, consider the following productions of a grammar:

E → TE1 …(1)
E1 → +TE1 | ε …(2)
T → FT1 …(3)
T1 → *FT1 | ε …(4)
F → ( E ) | id …(5)

Here,

FIRST(E) = FIRST(T) = FIRST(F) = { (, id }
FIRST(E1) = { +, ε }
FIRST(T1) = { *, ε }

and

FOLLOW(E) = FOLLOW(E1) = { ), $ }
FOLLOW(T) = FOLLOW(T1) = { +, ), $ }
FOLLOW(F) = { *, +, ), $ }
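These FIRST sets can be computed mechanically as a fixed point of rules 1–3 of the FIRST algorithm. A minimal sketch, assuming the grammar is encoded as a dict of alternatives with `'eps'` standing for ε (all names are ours):

```python
# Grammar of the example: E -> TE1, E1 -> +TE1 | eps, T -> FT1,
# T1 -> *FT1 | eps, F -> ( E ) | id
GRAMMAR = {
    'E':  [['T', 'E1']],
    'E1': [['+', 'T', 'E1'], ['eps']],
    'T':  [['F', 'T1']],
    'T1': [['*', 'F', 'T1'], ['eps']],
    'F':  [['(', 'E', ')'], ['id']],
}

def first_sets(g):
    first = {nt: set() for nt in g}

    def first_of(sym):
        # rule 1: FIRST of a terminal is the terminal itself
        return first[sym] if sym in g else {sym}

    changed = True
    while changed:                      # iterate to a fixed point
        changed = False
        for nt, alts in g.items():
            for alt in alts:
                before = len(first[nt])
                for sym in alt:         # rule 3: scan Y1 Y2 ... Yk
                    f = first_of(sym)
                    first[nt] |= f - {'eps'}
                    if 'eps' not in f:
                        break
                else:                   # every Yj derived eps
                    first[nt].add('eps')
                if len(first[nt]) != before:
                    changed = True
    return first
```

Running `first_sets(GRAMMAR)` reproduces the sets listed above, e.g. FIRST(E) = { (, id }.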

Construction of Parsing Table


Input: Grammar G.

Output: Parsing table M.

Method:
1. For each production A → α of the grammar, perform steps 2 and 3.
2. For each terminal a in FIRST(α), add A → α to M[A, a].
3. If ε is in FIRST(α), add A → α to M[A, b] for each terminal b in
   FOLLOW(A).

   If ε is in FIRST(α) and $ is in FOLLOW(A), add A → α to M[A, $].

4. Make each undefined entry of M an error.

For example, consider the given grammar as:

E  TE1,
E1  +TE1 | Є,
T  FT1,
T1  *FT1 | Є,
F  ( E ) | id.

       id         +            *            (          )         $
E      E → TE1                              E → TE1
E1                E1 → +TE1                            E1 → ε    E1 → ε
T      T → FT1                              T → FT1
T1                T1 → ε       T1 → *FT1               T1 → ε    T1 → ε
F      F → id                               F → ( E )

FIRST(E) = { (, id }     FOLLOW(E) = { ), $ }
FIRST(E1) = { +, ε }     FOLLOW(E1) = { ), $ }
FIRST(T) = { (, id }     FOLLOW(T) = { +, ), $ }
FIRST(T1) = { *, ε }     FOLLOW(T1) = { +, ), $ }
FIRST(F) = { (, id }     FOLLOW(F) = { *, +, ), $ }

Suppose a string is given as:


id + id * id
The stack implementation / parsing table for the given grammar is shown
as follows:

Stack Implementation / Parsing Table:-

Stack            Input              Action

$ E              id + id * id $     Apply E → TE1 (push TE1 in reverse order)
$ E1 T           id + id * id $     Apply T → FT1
$ E1 T1 F        id + id * id $     Apply F → id
$ E1 T1 id       id + id * id $     Match id
$ E1 T1          + id * id $        Apply T1 → ε
$ E1             + id * id $        Apply E1 → +TE1
$ E1 T +         + id * id $        Match +
$ E1 T           id * id $          Apply T → FT1
$ E1 T1 F        id * id $          Apply F → id
$ E1 T1 id       id * id $          Match id
$ E1 T1          * id $             Apply T1 → *FT1
$ E1 T1 F *      * id $             Match *
$ E1 T1 F        id $               Apply F → id
$ E1 T1 id       id $               Match id
$ E1 T1          $                  Apply T1 → ε
$ E1             $                  Apply E1 → ε
$                $                  Accept

If any cell of the parsing table contains more than one entry, then the
grammar is not an LL(1) grammar.
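The stack algorithm described above translates directly into a table-driven driver. A minimal sketch (all names are ours; the dict holds exactly the non-error cells of the parsing table, and a missing key plays the role of an error entry):

```python
# Non-error entries of the LL(1) table above, keyed by (non-terminal, lookahead).
TABLE = {
    ('E', 'id'): ['T', 'E1'], ('E', '('): ['T', 'E1'],
    ('E1', '+'): ['+', 'T', 'E1'], ('E1', ')'): [], ('E1', '$'): [],
    ('T', 'id'): ['F', 'T1'], ('T', '('): ['F', 'T1'],
    ('T1', '+'): [], ('T1', '*'): ['*', 'F', 'T1'],
    ('T1', ')'): [], ('T1', '$'): [],
    ('F', 'id'): ['id'], ('F', '('): ['(', 'E', ')'],
}
NONTERMINALS = {'E', 'E1', 'T', 'T1', 'F'}

def ll1_parse(tokens):
    stack = ['$', 'E']               # start symbol above the bottom marker
    tokens = tokens + ['$']
    i = 0
    while stack:
        x = stack.pop()
        a = tokens[i]
        if x == a == '$':
            return True              # possibility 1: accept
        if x not in NONTERMINALS:    # possibility 2: terminal must match
            if x != a:
                return False
            i += 1
            continue
        rhs = TABLE.get((x, a))      # possibility 3: consult M[X, a]
        if rhs is None:
            return False             # error entry
        stack.extend(reversed(rhs))  # push the right side in reverse
    return False
```

On `['id', '+', 'id', '*', 'id']` the driver goes through exactly the configurations of the stack-implementation table above and accepts.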

You might also like