0% found this document useful (0 votes)

12 views104 pages

Lecture04 TopDownParsing 2

The document discusses top-down parsing techniques, specifically focusing on recursive-descent and LL(1) parsing methods. It explains the structure and construction of LL(1) parsing tables, including error recovery and left recursion removal. Additionally, it provides examples and algorithms for implementing these parsing strategies effectively.

Uploaded by

mahmoudfatma3060

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

12 views104 pages

Lecture04 TopDownParsing 2

Uploaded by

mahmoudfatma3060

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 104

COMPILER CONSTRUCTION

Principles and Practice

Kenneth C. Louden
4. Top-Down Parsing

PART TWO
Contents
PART ONE
4.1 Top-Down Parsing by Recursive-Descent
4.2 LL(1) Parsing [More]

PART TWO
4.3 First and Follow Sets [More]
4.4 A Recursive-Descent Parser for the TINY
Language [More]
4.5 Error Recovery in Top-Down Parsers
4.1 Top-Down Parsing by
Recursive-Descent
4.2 LL(1) Parsing
4.2.1 The Basic Method of LL(1)
Parsing
Main idea
• LL(1) Parsing uses an explicit stack rather than
recursive calls to perform a parse
• An example:
– a simple grammar for the strings of balanced
parentheses:
S→(S) S∣ε
• The following table shows the actions of a top-
down parser given this grammar and the string ( )
Table of Actions
Steps Parsing Stack Input Action
1 $S ()$ S→(S) S
2 $S)S( ()$ match
3 $S)S )$ S→ε
4 $S) )$ match
5 $S $ S→ε
6 $ $ accept
General Schematic
• A top-down parser begins by pushing the start symbol onto
the stack
• It accepts an input string if, after a series of actions, the
stack and the input become empty
• A general schematic for a successful top-down parse:
$ StartSymbol Inputstring$
… … //one of the
two actions
… … //one of the two actions
$ $ accept
Two Actions
• The two actions
– Generate: Replace a non-terminal A at the top of the stack by a
string α(in reverse) using a grammar rule A →α, and
– Match: Match a token on top of the stack with the next input token.
• The list of generating actions in the above table:
S => (S)S [S→(S) S]
=> ( )S [S→ε]
=> ( ) [S→ε]
• Which corresponds precisely to the steps in a leftmost
derivation of string ( ).
• This is the characteristic of top-down parsing.
4.2.2 The LL(1) Parsing Table
and Algorithm
Purpose and Example of LL(1)
Table
• Purpose of the LL(1) Parsing Table:
– To express the possible rule choices for a non-terminal A
when the A is at the top of parsing stack based on the
current input token (the look-ahead).
• The LL(1) Parsing table for the following simple
grammar:
S→(S) S∣∣ε

M[N,T] ( ) $

S S→(S) S S→ε
ε S→ε
ε
The General Definition of Table
• The table is a two-dimensional array
indexed by non-terminals and terminals
• Containing production choices to use at the
appropriate parsing step called M[N,T]
• N is the set of non-terminals of the grammar
• T is the set of terminals or tokens (including $)
• Any entrances remaining empty
• Representing potential errors
Table-Constructing Rule
• The table-constructing rule
– If A→α is a production choice, and there is a
derivation α=>* aβ, where a is a token, then
add A→α to the table entry M[A,a];
– If A→α is a production choice, and there are
derivations α=>* ε and S$=>* βAaγ, where S is
the start symbol and a is a token (or $), then
add A→α to the table entry M[A,a];
A Table-Constructing Case
• The constructing-process of the following table
– For the production : S→(S)S, α=(S)S, where a=(,
this choice will be added to the entry M[S, (].
– Since: S=>(S)S$，，rule 2 applied with
A = S, a = ), so add the choice S→ε to M[S, )]
– Since S$=>* S$, S→ε is also added to M[S, $].

M[N,T] ( ) $

S S→(S) S S→ε
ε S→ε
ε
Properties of LL(1) Grammar
• Definition of LL(1) Grammar
– A grammar is an LL(1) grammar if the
associated LL(1) parsing table has at most
on production in each table entry
• An LL(1) grammar cannot be ambiguous
A Parsing Algorithm Using the
LL(1) Parsing Table
(* assumes $ marks the bottom of the stack and the end of the input *)
push the start symbol onto the top the parsing stack;

while the top of the parsing stack ≠ $ and

the next input token ≠ $ do

if the top of the parsing stack is terminal a and the next

input token = a
then (* match *)
pop the parsing stack;
advance the input;
A Parsing Algorithm Using the
LL(1) Parsing Table
else if the top of the parsing stack is non-terminal A
and the next input token is terminal a
and parsing table entry M[A,a] contains production A→
X1X2…Xn
then (* generate *)
pop the parsing stack;
for i:=n downto 1 do
push Xi onto the parsing stack;
else error;
if the top of the parsing stack = $
and the next input token = $
then accept
else error.
Example: If-Statements
• The LL(1) parsing table for simplified
grammar of if-statements:

Statement → if-stmt | other

If-stmt → if (exp) statement else-part
Else-part → else statement | ε
Exp → 0 | 1
M[N,T] If Other Else 0 1 $
Statement Statement Statement
→ if- →
stmt other
If-stmt If-stmt →
if (exp)
stateme
nt else-
part
Else-part Else-part Else-
→ part
else →ε ε
state
ment

Else-part
→ε ε
Exp Exp → Exp →
0 1
Notice for Example: If-Statement

• The entry M[else-part, else] contains two entries, i.e.

the dangling else ambiguity.
• Disambiguating rule: always prefer the rule that
generates the current look-ahead token over any other,
and thus the production
Else-part → else statement
ove
Else-part →ε
• With this modification, the above table will become
unambiguous
– The grammar can be parsed as if it were an LL(1) grammar
The parsing based LL(1) Table
• The parsing actions for the string:
If (0) if (1) other else other

• ( for conciseness, statement= S, if-stmt=I,

else-part=L, exp=E, if=I, else=e, other=o)
Steps Parsing Stack Input Action
1 $S i(0)i(1)oeo$ S→I
2 $I i(0)i(1)oeo$ I→i(E)SL
3 $LS)E(i i(0)i(1)oeo$ Match
4 $ LS)E( (0)i(1)oeo $ Match
5 $ LS)E 0)i(1)oeo $ E→o
o
Match
Match
S→I
I→i(E)SL
Match
Match
E→1
Match
match
S→o
match
L→eS
Match
S→o
match
L→ε
ε
22 $ $ accept
4.2.3 Left Recursion Removal
and Left Factoring
Repetition and Choice Problem
• Repetition and choice in LL(1) parsing suffer from
similar problems to be those that occur in recursive-
descent parsing
– and for that reason we have not yet been able to give an LL(1)
parsing table for the simple arithmetic expression grammar of
previous sections.
• Solve these problems for recursive-descent by using
EBNF notation
– We cannot apply the same ideas to LL(1) parsing;
– instead, we must rewrite the grammar within the BNF notation into
a form that the LL(1) parsing algorithm can accept.
Two standard techniques for
Repetition and Choice
• Left Recursion removal
exp → exp addop term | term
(in recursive-descent parsing, EBNF: exp→ term
{addop term})
• Left Factoring
If-stmt → if ( exp ) statement
∣ if ( exp ) statement else statement
(in recursive-descent parsing, EBNF:
if-stmt→ if (exp) statement [else statement])
Left Recursion Removal
• Left recursion is commonly used to make operations
left associative, as in the simple expression grammar,
where
exp → exp addop term | term
• Immediate left recursion:
The left recursion occurs only within the production of a
single non-terminal.
exp → exp + term | exp - term |term
• Indirect left recursion:
Never occur in actual programming language grammars,
but be included for completeness.
A → Bb |…
B → Aa |…
CASE 1: Simple Immediate Left
Recursion
• A → Aα| β
Where, αandβare strings of terminals and non-terminals;
βdoes not begin with A.

• The grammar will generate the strings of the form. βα n

• We rewrite this grammar rule into two rules:

A → βA’
To generate β first;
A’ → αA’| ε
To generate the repetitions of α, using right recursion.
Example

• exp → exp addop term | term

• To rewrite this grammar to remove left
recursion, we obtain
exp → term exp’
exp’ → addop term exp’ | ε
CASE2: General Immediate Left
Recursion

A → Aα1| Aα2| … |Aαn|β1|β2|…|βm

Where none ofβ1,…,βm begin with A.

The solution is similar to the simple case:

A →β1A’|β2A’| …|βmA’
A’ → α1A’| α2A’| … |αnA’|ε
Example
• exp → exp + term | exp - term |term

• Remove the left recursion as follows:

exp → term exp’
exp’ → + term exp’ | - term exp’ |ε
CASE3: General Left Recursion
• Grammars with noε-productions and no cycles

(1) A cycle is a derivation of at least one step that

begins and ends with same non-terminal:
A=>α=>A

(2) Programming language grammars do have ε-

productions, but usually in very restricted forms.
Algorithm for General Left
Recursion Removal
For i:=1 to m do
For j:=1 to i-1 do
Replace each grammar rule choice of the form
Ai→ Ajβ by the rule
Ai→α1β|α2β| … |αkβ,
where Aj→α1|α2| … |αk is the current rule for Aj.
Explanation:
(1) Picking an arbitrary order for all non-terminals, say, A1,…,Am;
(2) Eliminates all rules of the form Ai→ Ajγwith j≤i;
(3) Every step in such a loop would only increase the index, and thus
the original index cannot be reached again.
Example
Consider the following grammar:
A→Ba| Aa| c
B→Bb| Ab| d
Where, A1=A, A2=B and m=2
(1) When i=1, the inner loop does not execute, So
only to remove the immediate left recursion of
A
A→BaA’| c A’
A’→aA’| ε
B→Bb| Ab| d
Example
(2) when i=2, the inner loop execute once, with j=1;To eliminate the
rule B→Ab by replacing A with it choices
A→BaA’| c A’
A’→aA’| ε
B→Bb| BaA’b|cAb| d

(3) We remove the immediate left recursion of B to obtain

Now, the grammar has no left recursion.

Notice

• Left recursion removal not changes the

language, but
– Change the grammar and the parse tree
• This change causes a complication for the
parser
Example
Simple arithmetic expression After removal of the left
grammar recursion

exp → term exp’

expr → expr addop term∣∣term
exp’→ addop term exp’∣∣ε
addop → +|-
addop → + -
term → term mulop factor ∣ factor
term → factor term’
mulop →*
term’ → mulop factor term’∣∣ε
factor →(expr) ∣ number
mulop →*
factor →(expr) ∣ number
Parsing Tree
• The parse tree for the expression 3-4-5
– Not express the left associativity of subtraction.
exp

term exp’

addop term
factor
exp’
- factor
number
addop term exp’
(3)
number
(4)
- factor ε

number
(5)
Syntax Tree
• Nevertheless, a parse should still construct
the appropriate left associative syntax tree
-

- 5

3 4

• From the given parse tree, we can see

how the value of 3-4-5 is computed.
Left-Recursion Removed
Grammar and its Procedures
• The grammar with its left recursion removed,
exp and exp’ as follows:
exp → term exp’
exp’→ addop term exp’∣∣ε Procedure exp’
Begin
Case token of
Procedure exp +: match(+);
Begin term;
Term; exp’;
Exp’; -: match(-);
End exp; term;
exp’;
end case;
end exp’
Left-Recursion Removed
Grammar and its Procedures
• To compute the value of the expression, exp’
needs a parameter from the exp procedure
exp → term exp’
exp’→ addop term exp’∣∣ε
function exp’(valsofar:integer):integer;
function exp:integer; Begin
If token=+ or token=- then
var temp:integer;
Case token of
Begin +: match(+);
Temp:=Term; valsofar:=valsofar+term;
Return Exp’(temp); -: match(-);
End exp; valsofar:=valsofar-term;
end case;
return exp’(valsofar);
The LL(1) parsing table for the new expression
M[N,T] ( number ) + - * $
Exp exp exp →
→ term
term exp’
exp’
Exp’ exp’ → exp’ → exp’ → exp’ →
ε addop addop ε
term term
exp’ exp’
Addop addop addop
→ + → -
Term term term →
→ factor
factor term’
term’
Term’ term’ term’ term’ term’ term’
→ε →ε →ε → →ε
mulop
factor
term’
Mulop mulop
→*
factor factor factor
→ →
(expr) number
Left Factoring
• Left factoring is required when two or more grammar rule choices
share a common prefix string, as in the rule
A→αβ|αγ

• Example:
stmt-sequence→stmt; stmt-sequence | stmt
stmt→s
• An LL(1) parser cannot distinguish between the production
choices in such a situation

• The solution in this simple case is to “factor” the α out on the left
and rewrite the rule as two rules:
A→αA’
A’→β|γ
Algorithm for Left Factoring a
Grammar
While there are changes to the grammar do
For each non-terminal A do
Let α be a prefix of maximal length that is shared
By two or more production choices for A

If α≠εthen
Let A →α1|α2|…|αn be all the production choices for A
And suppose thatα1,α2,…,αk shareα, so that
A →αβ1|αβ2|…|αβk|αK+1|…|αn, the βj’s share
No common prefix, andαK+1,…,αn do not share α
Replace the rule A →α1|α2|…|αn by the rules
A →αA’|αK+1|…|αn
A ‘→β1|β2|…|βk
Example 4.4
• Consider the grammar for statement
sequences, written in right recursive
form:
Stmt-sequence→stmt; stmt-sequence | stmt
Stmt→s
• Left Factored as follows:
Stmt-sequence→stmt stmt-seq’
Stmt-seq’→; stmt-sequence | ε
Example 4.4
• Notices:
– if we had written the stmt-sequence rule left
recursively,
– Stmt-sequence→stmt-sequence ;stmt | stmt
• Then removing the immediate left
recursion would result in the same rules:
Stmt-sequence→stmt stmt-seq’
Stmt-seq’→; stmt-sequence | ε
Example 4.5
• Consider the following grammar for if-
statements:
If-stmt → if ( exp ) statement
∣ if ( exp ) statement else statement
• The left factored form of this grammar is:
If-stmt → if (exp) statement else-part
Else-part → else statement | ε
Example 4.6
• An arithmetic expression grammar with right
associativity operation:
exp → term+exp |term
• This grammar needs to be left factored, and we obtain
the rules
exp → term exp’
exp’→ + exp∣∣ε
• Suppose we substitute term exp’ for exp, we then
obtain:
exp → term exp’
exp’→ + term exp’∣∣ε
Example 4.7
• An typical case where a grammar fails to be
LL(1)
Statement → assign-stmt| call-stmt| other
Assign-stmt→identifier:=exp
Call-stmt→indentifier(exp-list)
• Where, identifier is shared as first token of
both assign-stmt and call-stmt and,
• thus, could be the lookahead token for either.
• But not in the form can be left factored.
Example 4.7
• First replace assign-stmt and call-stmt by the
right-hand sides of their definition productions:
Statement → identifier:=exp | indentifier(exp-list)
| other
• Then, we left factor to obtain
Statement → identifier statement’ | other
Statement’ →:=exp |(exp-list)
• Note:
– this obscures the semantics of call and assignment
by separating the identifier from the actual call or
assign action.
4.2.4 Syntax Tree Construction in
LL(1) Parsing
Difficulty in Construction
• It is more difficult for LL(1) to adapt to syntax tree
construction than recursive descent parsing

• The structure of the syntax tree can be obscured by

left factoring and left recursion removal

• The parsing stack represents only predicated

structure, not structure that have been actually
seen
Solution
• The solution
– Delay the construction of syntax tree nodes to the
point when structures are removed from the parsing
stack.
• An extra stack is used to keep track of syntax
tree nodes, and
• The “action” markers are placed in the parsing
stack to indicate when and what actions on the
tree stack should occur
Example
• A barebones expression grammar with
only an addition operation.
E →E + n |n
/* be applied left association*/
• The corresponding LL(1) grammar with
left recursion removal is.
E →n E’
E’ →+nE’|ε
To compute the arithmetic value of
the expression
• Use a separate stack to store the intermediate values of the
computation, called the value stack;
– Schedule two operations on that stack:
• A push of a number;
• The addition of two numbers.
• PUSH can be performed by the match procedure, and
• ADDITION should be scheduled on the stack, by pushing a special
symbol (such as #) on the parsing stack.
– This symbol must also be added to the grammar rule that match a +,
namely, the rule for E’:
• E’ →+n#E’|ε
• Notes: The addition is scheduled just after the next number,
but before any more E’ non-terminals are processed. This
guaranteed left associativity.
The actions of the parser to compute the value of
the expression 3+4+5

Parsing Stack Input Action Value Stack

$E 3+4+5$ E→n E’ $
$E’n 3+4+5$ Match/push $
$E’ +4+5$ E’ →+n#E’ 3$
$E’#n+ +4+5$ Match 3$
$E’#n 4+5$ Match/push 3$
$E’# +5$ Addstack 43$
$E’ +5$ E’ →+n#E’ 7$
$E’#n+ +5$ Match 7$
$E’#n 5$ Match/push 7$
$E’# $ Addstack 57$
$E’ $ E’ →ε 12$
$ $ Accept 12$
4.3 First and Follow Sets

The LL(1) parsing algorithm is based on the LL(1)

parsing table
The LL(1) parsing table construction involves the
First and Follow sets
4.3.1 First Sets
Definition
• Let X be a grammar symbol( a terminal or
non-terminal) or ε. Then First(X) is a set of
terminals or ε, which is defined as follows:
1. If X is a terminal or ε, then First(X) = {X};
2. If X is a non-terminal, then for each production choice
X→X1X2…Xn,
First(X) contains First(X1)-{ε}.
If also for some i<n, all the set First(X1)..First(Xi) contain
ε,the first(X) contains First(Xi+1)-{ε}.
IF all the set First(X1)..First(Xn) contain ε, the First(X)
contains ε.
Definition
• Let α be a string of terminals and non-
terminals, X1X2…Xn. First(α) is defined
as follows:
1.First(α) contains First(X1)-{ε};
2.For each i=2,…,n, if for all k=1,..,i-1, First(Xk)
contains ε, then First(α)
contains First(Xk)-{ε}.
3. IF all the set First(X1)..First(Xn) contain ε, the
First(α) contains ε.
Algorithm Computing First (A)
• Algorithm for computing First(A) for all
non-terminal A:
For all non-terminal A do First(A):={ };
While there are changes to any First(A) do
For each production choice A→X1X2…Xn do
K:=1; Continue:=true;
While Continue= true and k<=n do
Add First(Xk)-{ε} to First(A);
If ε is not in First(Xk) then Continue:= false;
k:=k+1;
If Continue = true then addεto First(A);
Algorithm Computing First (A)

• Simplified algorithm in the absence of

ε-production.
For all non-terminal A do First(A):={ };
While there are changes to any First(A) do
For each production choice A→X1X2…Xn do
Add First(X1) to First(A);
About Nullable Non-Terminal
• Definition: A non-terminal A is nullable if there exists a
derivation A=>ε.
• Theorem: A non-terminal A is nullable if and only if First(A)
contains ε.
• Proof : 1. If A is nullable, then First(A) contains ε.
– As A => *ε, we use induction on the length of a derivation.
– (1) A=> ε, then there must be a production A→ε,by definition,
– First(A) contain First(ε)={ε}.
– (2) Assume the truth of the statement for derivation of length < n,
– and let A=>X1…XK=>*ε be a derivation of length n;
– All the Xi must be non-terminals;
• Implying that each Xi =>*ε, and in fewer than n steps.
• Thus, by the induction assumption, for each i First(Xi )={ ε}
• Finally, by definition, First(A) must containε.
Example
• Simple integer expression• Write out each choice
grammar separately in order:

(1) exp → exp addop term

exp → expr addop term
(2) exp → term
∣
∣term
(3) addop → +
addop → +|- (4) addop → -
term → term mulop factor (5) term → term mulop factor
∣ factor (6) term → factor
mulop →* (7) mulop →*
factor →(expr) ∣ number (8) factor →(exp)
(9) factor →number
First Set for Above Example
• We can use the simplified algorithm as
there exists noε-production
• The First sets are as follows:
First(exp)={(,number}
First(term)={(,number}
First(factor)={(,number}
First(addop)={+,-}
First(mulop)={*}
The computation process for above First Set

Grammar Rule Pass 1 Pass 2 Pass 3

expr → expr addop term
expr → term First(exp)={(,number}
addop → + First(addop)={+}
addop → - First(addop)={+,-}
term → term mulop factor
term → factor First(term)={(,number}
mulop →* First(mulop)={*}
factor →(expr) First(factor)={()
factor →number First(factor)={(,number)
Example
• Left factored grammar of if-statement
Statement → if-stmt | other
If-stmt → if (exp) statement else-part
Else-part → else statement | ε
Exp → 0 | 1
• We write out the grammar rule choice separately and number
them:
(1) Statement → if-stmt
(2) Statement → other
(3) If-stmt → if (exp) statement else-part
(4) Else-part → else statement
(5) Else-part →ε
(6) Exp → 0
(7) Exp → 1
The First Set for Above Example
• Note:
– This grammar does have an ε-production, but the only nullable
non-terminal else-part will not in the beginning of left side of any
rule choice and will not complicate the computation process.

• The First Sets:

First(statement)={if,other}
First(if-stmt)={if}
First(else-part)={else,ε}
First(exp)={0,1}
The computation process for above
First Set

Grammar Rule Pass 1 Pass 2

Statement → if-stmt First(statement)={if,other}
Statement → other First(statement)={other}
If-stmt → if (exp) statement else-part First(if-stmt)={if}
Else-part → else statement First(else-part)={else}
Else-part →ε First(else-part)={else,ε}
Exp → 0 First(exp)={1}
Exp → 1 First(exp)={0,1}
Example
• Grammar for statement sequences:
• Stmt-sequence →stmt stmt-seq’
• Stmt-seq’ →; stmt-sequence|ε
• stmt→s
• We list the production choices individually:
• Stmt-sequence →stmt stmt-seq’
• Stmt-seq’ →; stmt-sequence
• Stmt-seq’ →ε
• stmt→s
• The First sets are as follows:
• First(stmt-sequence)={s}
• First(stmt-seq’)={;, ε}
• First(stmt)={s}
4.3.2 Follow Sets
Definition
Given a non-terminal A, the set Follow(A)
is defined as follows.
(1) if A is the start symbol, the $ is in the
Follow(A).
(2) if there is a production B→αAγ then
First(γ)-{ε} is in Follow(A).
(3) if there is a production B→αAγ such that
ε in First(γ), then Follow(A) contains
Follow(B).
Definition
• Note: The symbol $ is used to mark the end of
the input.
– The empty “pseudotoken” ε is never an element of a
follow set.
– Follow sets are defined only for non-terminal.
– Follow sets work “on the right” in production while
First sets work “on the left”in the production.
• Given a grammar rule A →αB, Follow(B) will
contain Follow(A),
– the opposite of the situation for first sets, if
A →Bα, First(A) contains First(B), except for ε.
Algorithm for the computation of
follow sets
• Follow(start-symbol):={$};
• For all non-terminals A≠start-symbol do
follow(A):={ };
• While there changes to any follow sets do
For each production A→X1X2…Xn do
For each Xi that is a non-terminal do
Add First(Xi+1Xi+2…Xn) – {ε} to Follow(Xi)
fεis in First(Xi+1Xi+2…Xn) then
Add Follow(A) to Follow(Xi)
Example
• The simple expression grammar.
(1) exp → exp addop term
(2) exp → term
(3) addop → +
(4) addop → -
(5) term → term mulop factor
(6) term → factor
(7) mulop →*
(8) factor →(exp)
(9) factor →number
Example
• The first sets:
First(exp)={(,number}
First(term)={(,number}
First(factor)={(,number}
First(addop)={+,-}
First(mulop)={*}
• The Follow sets:
Follow(exp)={$,+,-, } }
Follow(addop)={(,number)
Follow(term)={ $,+,-, *,}}
Follow(mulop)={(,number)
Follow(factor)={ $,+,-, *,}}
The progress of above computation
Grammar rule Pass 1 Pass 2
exp → exp addop term Follow(exp)={$,+,- } Follow(term)={ $,+,-, *,）}
Follow(addop)={(,number)
Follow(term)={ $,+,-}
Exp → term
term → term mulop factor Follow(term)={ $,+,-, *} Follow(factor)={ $,+,-, *,）}
Follow(mulop)={(,number)
Follow(factor)={ $,+,-, *}
term →factor
factor →(exp) Follow(exp)={$,+,-,）}
Example
• The simplified grammar of if-statements:
(1) Statement → if-stmt
(2) Statement → other
(3) If-stmt → if (exp) statement else-part
(4) Else-part → else statement
(5) Else-part →ε
(6) Exp → 0
(7) Exp → 1
Example
• The First sets:
First(statement)={if,other}
First(if-stmt)={if}
First(else-part)={else,ε}
First(exp)={0,1}
• Computing the following Follow sets:
Follow(statement)={$,else}
Follow(if-statement)={$,else}
Follow(else-part)={$,else}
Follow(exp)={)}
Example
• The simplified statement sequence grammar.
(1) Stmt-sequence →Stmt Stmt-seq’
(2) Stmt-seq’ →; Stmt-sequence
(3) Stmt-seq’ →ε
(4) Stmt→s
Example

• The First sets are as follows:

First(Stmt-sequence)={s}
First(Stmt)={s}
First(Stmt-seq’)={; , ε}
• And, the Follow sets:
Follow(Stmt-sequence)={$}
Follow(Stmt)={; , $}
Follow(Stmt-seq’)={$}
4.3.3 Constructing LL(1) Parsing
Tables
The table-constructing rules
(1) If A→α is a production choice, and there is a
derivation α=>* aβ, where a is a token, then add
A→α to the table entry M[A,a]
(2) If A→α is a production choice, and there are
derivations α=>* ε and S$=>* βAaγ, where S is
the start symbol and a is a token (or $), then add
A→α to the table entry M[A,a]
• Clearly, the token a in the rule (1) is in First(α),
and the token a of the rule (2) is in Follow(A).
• Thus we can obtain the following algorithmic
construction of the LL(1) parsing table:
Algorithm and Theorem
• Repeat the following two steps for each non-terminal A
and production choice A→α.
– For each token a in First(α), add A→α to the entry M[A,a].
– If ε is in First(α), for each element a of Follow(A) ( a token or $),
add A→α to M[A,a].
• Theorem:A grammar in BNF is LL(1) if the following
conditions are satisfied.
– For every production A→α1|α2|…|αn, First(αi)∩ First(αj) is empty
for all i and j, 1≦i,j≦n, i≠j.
– For every non-terminal A such that First(A) contains ε,
First(A) ∩Follow(A) is empty.
Example
• The simple expression grammar.
exp → term exp’
∣
exp’→ addop term exp’∣ε
addop → + -
term → factor term’
term’ → mulop factor term’∣ε
mulop →*
factor →(expr) ∣ number
The first and follow set

First Sets Follow Sets

First(exp)={(,number) Follow(exp)={$,） }
First(exp’)={+,-, ε} Follow(exp’)={$,）}
First(term)={(,number) Follow(addop)={(,number)
First(term’)={*, ε} Follow(term)={ $,+,-,）}
First(factor)={(,number} Follow(term’)={$,+,-,）}
First(addop)={+,-} Follow(mulop)={(,number}
Follow(factor)={ $,+,-, *,）}
First(mulop)={*}
the LL(1) parsing table
M[N,T] ( number ) + - * $
Exp exp → exp →
term term exp’
exp’
Exp’ exp’→ε exp’ → exp’ → exp’→ε
addop addop
term exp’ term exp’
Addop addop → addop →
+ -
Term term → term →
factor factor
term’ term’
Term’ term’ → term’ → term’ → term’ → term’ →
ε ε ε mulop ε
factor
term’
Mulop mulop →
*
factor factor factor →
→(expr) number
Example
• The simplified grammar of if-statements
Statement → if-stmt | other
If-stmt → if (exp) statement else-part
Else-part → else statement | ε
Exp → 0 | 1
The first and follow set

First Sets Follow Sets

First(statement)={if,other} Follow(statement)={$,else}
First(if-stmt)={if} Follow(if-statement)={$,else}
First(else-part)={else,ε} Follow(else-part)={$,else}
First(exp)={0,1} Follow(exp)={）}
the LL(1) parsing table

M[N,T] If Other Else 0 1 $

Statement Statement → Statement
if-stmt → other
If-stmt If-stmt → if
(exp)
statement
else-part
Else-part Else-part Else-par
→ else t →ε
statement
Else-part
→ε
exp Exp → 0 Exp →
1
Example
• Consider the following grammar with left
factoring applied.
(1) Stmt-sequence →Stmt Stmt-seq’
(2) Stmt-seq’ →; Stmt-sequence | ε
(3) Stmt→s
The first and follow set

First Sets Follow Sets

First(Stmt-sequence)={s} Follow(Stmt-sequence)={$}
First(Stmt)={s} Follow(Stmt)={; , $}
First(Stmt-seq’)={; , ε} Follow(Stmt-seq’)={$}
the LL(1) parsing table

M[N,T] S ; $
Stmt-sequence Stmt-sequence →
Stmt Stmt-seq’
Stmt Stmt→s
Stmt-seq’ Stmt-seq’ → ; Stmt-seq’ →|ε
Stmt-sequence|
4.3.4 Extending the lookahead:
LL(k) Parsers
Definition of LL(k)
• The LL(1) parsing method can be extend to k
symbols of look-ahead.
• Definitions:
– Firstk(α)={wk | α=>* w}, where, wk is the first k
tokens of the string w if the length of w > k, otherwise
it is the same as w.
– Followk(A)={wk | S$=>*αAw}, where, wk is the first
k tokens of the string w if the length of w > k,
otherwise it is the same as w.
• LL(k) parsing table:
– The construction can be performed as that of LL(1).
Complications in LL(k)
• The complications in LL(k) parsing:
– The parsing table become larger; since the number of
columns increases exponentially with k.
– The parsing table itself does not express the complete
power of LL(k) because the follow strings do not occur
in all contexts.
– Thus parsing using the table as we have constructed it
is distinguished from LL(k) parsing by calling it Strong
LL(k) parsing, or SLL(k) parsing.
Complications in LL(k)

• The LL(k) and SLL(k) parsers are

uncommon.
– Partially because of the added complex;
– Primarily because of the fact that a grammar
fails to be LL(1) is in practice likely not to be
LL(k) for any k.
4.4 A Recursive-Descent Parser
For The Tiny Language
The Grammar of the TINY
language in BNF
• program → stmt-sequence
• stmt-sequence→ stmt-sequence; statement | statement
• statement → if-stmt | repeat-stmt | assign-stmt | read-stmt |
write-stmt
• if-stmt→ if exp then stmt-sequence end
• | if exp then stmt-sequence else stmt-sequence end
• repeat-stmt→ repeat stmt-sequence until exp
• assign-stmt→ identifier := exp
• read-stmt → read identifier
• write-stmt → write exp
The Grammar of the TINY
language in BNF
• exp → simple-exp comparison-op simple-exp |
simple-exp
• comparison-op → < | =
• simple-exp → simple-exp addop term | term
• addop → +|-
• term → term mulop factor factor | factor
• mulop → *|/
• factor → (exp) |number |identifier
TINY PARSER CODES
• The TINY parser consists of two code files:
– parse.h and parse.c
– The parse.h: (see appendix B, lines 850-865)
– TreeNode * parse(void)
– The main routine parse will return a pointer to the
syntax tree constructed by the parser.
– The parse.c(see appendix B, lines 900-1114)
• 11 mutually recursive procedure that correspond
directly to the EBNF grammar.
• The operators non-terminals are recognized as
part of their associated expressions.
TINY PARSER CODES
• The static variable token is used to keep the look-ahead token
• The contents of each recursive procedures should be relatively self-
explanatory except stmt-sequence
• The utility procedures used by the recursive parsing procedures in
util.c:( Appendix B, lines 350-526).

• NewStmtNode(line 405-421): take the type of statement as parameter;

•
• Allocate a new statement node of this kind;
• Return a pointer to the newly allocated node.

• NewExpNode(line 423-440): take the type of exp ad parameter;

TINY PARSER CODES
• Allocate a new exp node of this kind;
• Return a pointer to the new allocated node.
• Copystring(line 442-455): take a string as
parameter;
• Allocate a sufficient space for a copy, and copy
the string;
• Return a pointer to the newly allocated copy.
• A procedure PrintTree in util.c (linge 473-506)
writes a linear version of the syntax tree to the
listing, so that we may view the result of a parse.
End of Part Two

THANKS

Compiler Principle and Technology: Mr. Aruna Malik BIT (Mesra) Ranchi, Off Campus NOIDA
No ratings yet
Compiler Principle and Technology: Mr. Aruna Malik BIT (Mesra) Ranchi, Off Campus NOIDA
86 pages
Chapter 3a - Syntax Analysis
No ratings yet
Chapter 3a - Syntax Analysis
10 pages
LL(1) Parsing Explained
0% (1)
LL(1) Parsing Explained
71 pages
04 Cparsing
No ratings yet
04 Cparsing
26 pages
Top Down Parser 1
No ratings yet
Top Down Parser 1
33 pages
LL 1
No ratings yet
LL 1
73 pages
LL(1) Parser Implementation Guide
No ratings yet
LL(1) Parser Implementation Guide
9 pages
Parsing
No ratings yet
Parsing
33 pages
Top-Down Parsing Techniques
No ratings yet
Top-Down Parsing Techniques
31 pages
Toc Unit 3
No ratings yet
Toc Unit 3
49 pages
Module 4
No ratings yet
Module 4
91 pages
Compiler Design: 7. Top-Down Table-Driven Parsing
No ratings yet
Compiler Design: 7. Top-Down Table-Driven Parsing
9 pages
LL(1) Parsing Techniques Guide
No ratings yet
LL(1) Parsing Techniques Guide
22 pages
03 Syntaxanalysis 2 2012 2013
No ratings yet
03 Syntaxanalysis 2 2012 2013
83 pages
Lecture 5 - LL (1) Parsing Algorithm
No ratings yet
Lecture 5 - LL (1) Parsing Algorithm
23 pages
Unit 2
No ratings yet
Unit 2
11 pages
Compilef Design Unit 2 AKTU As Per 2023-24 Syllabus
No ratings yet
Compilef Design Unit 2 AKTU As Per 2023-24 Syllabus
46 pages
Reserch Paper 4
No ratings yet
Reserch Paper 4
8 pages
Parsers
No ratings yet
Parsers
24 pages
Operator Precedence and LL Parsing
No ratings yet
Operator Precedence and LL Parsing
31 pages
Top Down Parser
No ratings yet
Top Down Parser
111 pages
Top-Down Parsing: - The Parse Tree Is Created Top To Bottom. - Top-Down Parser
No ratings yet
Top-Down Parsing: - The Parse Tree Is Created Top To Bottom. - Top-Down Parser
36 pages
Top Down Parser
No ratings yet
Top Down Parser
5 pages
Module 4 - Top Down Parsing
No ratings yet
Module 4 - Top Down Parsing
31 pages
Non-Recursive Predictive Parsing
No ratings yet
Non-Recursive Predictive Parsing
14 pages
CSC 4181 Compiler Construction Parsing
No ratings yet
CSC 4181 Compiler Construction Parsing
53 pages
Theory of Computation and Compiler Design: Module - 4
No ratings yet
Theory of Computation and Compiler Design: Module - 4
31 pages
Chapter 3 - Compiler Design
No ratings yet
Chapter 3 - Compiler Design
29 pages
Module2 Predictive Parser
No ratings yet
Module2 Predictive Parser
20 pages
Ll1parser 190921075612
No ratings yet
Ll1parser 190921075612
84 pages
Lecture3 Parser Full
No ratings yet
Lecture3 Parser Full
30 pages
Lec 3
No ratings yet
Lec 3
25 pages
Unit 7
No ratings yet
Unit 7
34 pages
Parsing Technique Baar Baar
No ratings yet
Parsing Technique Baar Baar
29 pages
1 LL and LR Parsing
No ratings yet
1 LL and LR Parsing
38 pages
CD Chapter-3
No ratings yet
CD Chapter-3
53 pages
Chapter-4 - CS-411 Compiler Construction
No ratings yet
Chapter-4 - CS-411 Compiler Construction
8 pages
Compiler Parsing Techniques
No ratings yet
Compiler Parsing Techniques
54 pages
Syntax Analysis I 2024
No ratings yet
Syntax Analysis I 2024
38 pages
Syntax Analysis I 2022 Class
No ratings yet
Syntax Analysis I 2022 Class
33 pages
3 Syntax Analysis - Top Down Parsing
No ratings yet
3 Syntax Analysis - Top Down Parsing
9 pages
Chapter 4 - Syntax Analysis
No ratings yet
Chapter 4 - Syntax Analysis
68 pages
Mod 2.1 - (Lec 8) - Syntax Analyzer and CFG
No ratings yet
Mod 2.1 - (Lec 8) - Syntax Analyzer and CFG
39 pages
4 Parsing
No ratings yet
4 Parsing
55 pages
4 Parsing
No ratings yet
4 Parsing
55 pages
Chapter 3 Syntax Analyzer1
No ratings yet
Chapter 3 Syntax Analyzer1
58 pages
Compiler Construction: Parsing: Mandar Mitra
No ratings yet
Compiler Construction: Parsing: Mandar Mitra
33 pages
Unit 3
No ratings yet
Unit 3
117 pages
CD - Ch.2
No ratings yet
CD - Ch.2
39 pages
Compiler Design Syntax Analysis Top Down
No ratings yet
Compiler Design Syntax Analysis Top Down
34 pages
Compiler Design Unit II-1
No ratings yet
Compiler Design Unit II-1
46 pages
Linux
No ratings yet
Linux
12 pages
Lecture Notes On Predictive Parsing: 15-411: Compiler Design Frank Pfenning September 18, 2009
No ratings yet
Lecture Notes On Predictive Parsing: 15-411: Compiler Design Frank Pfenning September 18, 2009
11 pages
Chapter 4 - Syntax Analysis
No ratings yet
Chapter 4 - Syntax Analysis
82 pages
Chapter 3
No ratings yet
Chapter 3
96 pages
1 Types of Parsers in Compiler Design
100% (1)
1 Types of Parsers in Compiler Design
4 pages
Lecture 17
No ratings yet
Lecture 17
57 pages
Compiler Design Unit 2
No ratings yet
Compiler Design Unit 2
84 pages
Reasoning With Horn Clauses
No ratings yet
Reasoning With Horn Clauses
6 pages
Logic-Rondo-2 3 0
No ratings yet
Logic-Rondo-2 3 0
80 pages
Comput Ability Decid Ability Complexity
No ratings yet
Comput Ability Decid Ability Complexity
172 pages
Homework #1 - Discrete Mathematics
No ratings yet
Homework #1 - Discrete Mathematics
4 pages
Wrox Beginning Tutorial Ebooks - Excel 2007 Vba Programmers Reference - 2
No ratings yet
Wrox Beginning Tutorial Ebooks - Excel 2007 Vba Programmers Reference - 2
4 pages
Syntax Directed Translation
No ratings yet
Syntax Directed Translation
3 pages
Regular Language Reversal
No ratings yet
Regular Language Reversal
75 pages
Dowek Principles of Programming Languages c2009
100% (2)
Dowek Principles of Programming Languages c2009
171 pages
Jaeger Thesis
No ratings yet
Jaeger Thesis
161 pages
CD Greeshma
No ratings yet
CD Greeshma
15 pages
Predicative and Impredicative Definitions - Oystein Linnebo
No ratings yet
Predicative and Impredicative Definitions - Oystein Linnebo
6 pages
Ci 10cs56 Flat
No ratings yet
Ci 10cs56 Flat
9 pages
Ambiguity in Grammar 1
No ratings yet
Ambiguity in Grammar 1
23 pages
Lex-Yacc For Exam
100% (1)
Lex-Yacc For Exam
17 pages
Truth-Teller, Liar, Alternator Puzzles
0% (1)
Truth-Teller, Liar, Alternator Puzzles
3 pages
Logic in Computer Science Course Out Line
No ratings yet
Logic in Computer Science Course Out Line
5 pages
Logical Agents I: Introducing The Wumpus
No ratings yet
Logical Agents I: Introducing The Wumpus
15 pages
Lecture 3
No ratings yet
Lecture 3
11 pages
Compiler Assignment
No ratings yet
Compiler Assignment
6 pages
Program To Convert Infix Expression To Postfix in C - Shunting Yard Algorithm
No ratings yet
Program To Convert Infix Expression To Postfix in C - Shunting Yard Algorithm
29 pages
Prolog DCG Exercises
No ratings yet
Prolog DCG Exercises
2 pages
Artificial Intelligence Syllabus
100% (1)
Artificial Intelligence Syllabus
2 pages
Usage
No ratings yet
Usage
389 pages
Eplogic PDF
No ratings yet
Eplogic PDF
137 pages
Class Record: Region Ivision School Name School Id School Year
No ratings yet
Class Record: Region Ivision School Name School Id School Year
26 pages
Introduction To Compiler Design (CD) : Mu-Mit
No ratings yet
Introduction To Compiler Design (CD) : Mu-Mit
22 pages
Pred Resolution
No ratings yet
Pred Resolution
21 pages
This Is Gold
No ratings yet
This Is Gold
47 pages
Lex & Yacc Tutorial for Programmers
No ratings yet
Lex & Yacc Tutorial for Programmers
38 pages
Java Basics and Syntax Guide
No ratings yet
Java Basics and Syntax Guide
77 pages

Lecture04 TopDownParsing 2

Uploaded by

Lecture04 TopDownParsing 2

Uploaded by

COMPILER CONSTRUCTION

Principles and Practice

while the top of the parsing stack ≠ $ and

if the top of the parsing stack is terminal a and the next

Statement → if-stmt | other

• The entry M[else-part, else] contains two entries, i.e.

• ( for conciseness, statement= S, if-stmt=I,

• The grammar will generate the strings of the form. βα n

• We rewrite this grammar rule into two rules:

• exp → exp addop term | term

A → Aα1| Aα2| … |Aαn|β1|β2|…|βm

The solution is similar to the simple case:

• Remove the left recursion as follows:

(1) A cycle is a derivation of at least one step that

(2) Programming language grammars do have ε-

(3) We remove the immediate left recursion of B to obtain

Now, the grammar has no left recursion.

• Left recursion removal not changes the

exp → term exp’

• From the given parse tree, we can see

• The structure of the syntax tree can be obscured by

• The parsing stack represents only predicated

Parsing Stack Input Action Value Stack

The LL(1) parsing algorithm is based on the LL(1)

• Simplified algorithm in the absence of

(1) exp → exp addop term

Grammar Rule Pass 1 Pass 2 Pass 3

• The First Sets:

Grammar Rule Pass 1 Pass 2

• The First sets are as follows:

First Sets Follow Sets

First Sets Follow Sets

M[N,T] If Other Else 0 1 $

First Sets Follow Sets

• The LL(k) and SLL(k) parsers are

• NewStmtNode(line 405-421): take the type of statement as parameter;

• NewExpNode(line 423-440): take the type of exp ad parameter;

You might also like