Compiler
(CS3104)
Syntax Analysis,
Parsing
Taken from: https://cse.iitkgp.ac.in/~bivasm/compiler2024.html
Lex – example-1
Input file – input_first
if + 78 else 0
Tokens: if, else, op (+,-), number, other
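For instance, the token stream produced for the input above would be (token names here are illustrative; the exact names depend on the Lex specification):

  Lexeme   Token
  if       if
  +        op
  78       number
  else     else
  0        number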
Parsing
► Every programming language has precise grammar rules that describe the
syntactic structure of well-formed programs
► In C, the rules state that a program consists of functions, a function
consists of declarations and statements, a statement consists of
expressions, and so on.
► The task of a parser is to
(a) Obtain the string of tokens from the lexical analyzer and verify that the string
follows the rules of the source language
(b) Report errors and sometimes recover from them
• Type checking, semantic analysis and translation actions can be interlinked with parsing
• Implemented as a single module
Parsing
► Two major classes of parsing
► top-down and bottom-up
► Input to the parser is scanned from left to right, one symbol at a
time.
► The syntax of programming language constructs can be specified
by context-free grammars
► Grammars systematically describe the syntax of programming
language constructs like expressions and statements.
► Quick recall
Context free grammar
► A CFG is denoted as G = (N, T, P, S)
N : Finite set of non-terminals -- syntactic variables (stmt, expr)
T : Finite set of terminals ---- Tokens, basic symbols from which strings and
programs are formed
S : The start symbol -- set of strings it generates is the language generated
by the grammar
P : Finite set of productions -- specify the manner in which the terminals and
nonterminals can be combined to form strings
Productions
Each production has the form head -> body: the head is a nonterminal, and the body is a string of terminals and/or nonterminals. One nonterminal is designated as the start symbol.
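For example, a tiny grammar for arithmetic expressions (similar grammars are used throughout these slides) is:

  E -> E + T | T
  T -> T * F | F
  F -> ( E ) | id

Here E, T, F are the nonterminals, +, *, (, ), id are the terminals, E is the start symbol, and each line lists the productions for one nonterminal.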
Task of a parser
The output of the parser is some representation of the parse tree for the stream of
tokens coming from the lexical analyzer.
• Top-down parsers work for LL grammars
• Bottom-up parsers work for LR grammars
• These are only subclasses of context-free grammars
• But they are expressive enough to describe most of the syntactic constructs
of modern programming languages.
Concentrate on parsing expressions
• Constructs that begin with keywords like while or int are relatively easy to
parse, because the keyword guides the parsing decisions
• We therefore concentrate on expressions, which present more of a challenge
because of the associativity and precedence of operators
Derivations
The construction of a parse tree can be conceptualized as derivations
Derivation: Beginning with the start symbol, each rewriting step replaces a
nonterminal by the body of one of its productions.
A sentence of G is a sentential form with no nonterminals.
The language L(G) generated by a grammar G is its set of sentences.
Derivations
Consider a grammar G (presumably the standard expression grammar E -> E + E | E * E | - E | ( E ) | id)
1. Derivation of -(id+id) from the start symbol E
2. -(id+id) is a sentence of G
3. At each step in a derivation, there are two choices to be made:
• Which nonterminal to replace? Always replacing the leftmost nonterminal gives a leftmost derivation
• Accordingly, we must also choose one of its productions
Derivations -- Rightmost derivations
Consider the same grammar G
1. Derivation of -(id+id) from E
2. -(id+id) is a sentence of G
3. At each step in a derivation, there are two choices to be made:
• Which nonterminal to replace? Always replacing the rightmost nonterminal gives a rightmost derivation
• Accordingly, we must also pick one of its productions
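For concreteness, assuming the standard expression grammar E -> E + E | E * E | - E | ( E ) | id, the two derivations of -(id+id) are:

  Leftmost:  E => -E => -(E) => -(E+E) => -(id+E) => -(id+id)
  Rightmost: E => -E => -(E) => -(E+E) => -(E+id) => -(id+id)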
Parse trees
► A parse tree is a graphical representation of a derivation that
exhibits
► the order in which productions are applied to replace
non-terminals
► Each interior node is labelled by the non-terminal A in the head of the
production applied at that node
► The children of the node are labelled, from left to right,
by the symbols in the body of the production by which
A was replaced during the derivation
► A parse tree corresponds to a unique leftmost and a unique rightmost
derivation; for -(id+id) both derivations yield the same parse tree
Figure: parse tree for - ( id + id ); its leaves, read from left to right, form a sentential form.
Ambiguity
► A grammar that produces more than one parse tree for some sentence is
said to be ambiguous
► An ambiguous grammar is one that produces more than one leftmost
derivation or more than one rightmost derivation for the same sentence.
Two distinct leftmost derivations for the sentence id + id * id
Figures: the two distinct parse trees for id + id * id, and an unambiguous grammar for the same language.
Unambiguous grammar
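A commonly used unambiguous version of the expression grammar (presumably the one intended here) encodes precedence and left associativity in the nonterminals:

  E -> E + T | T
  T -> T * F | F
  F -> ( E ) | id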
Top-Down Parsing
• Top-down parsing can be viewed as the problem of
• Constructing a parse tree for the input string,
• starting from the root and creating the nodes of the parse tree in
preorder
• Top-down parsing can be viewed as finding a leftmost derivation for an input
string
Top-Down Parsing
Figure: building the parse tree for - ( id + id ) top-down, following the derivation; choosing the wrong production would instead lead towards - ( + id ), which does not match the input.
Top-Down Parsing
Figures: a left-recursive grammar and an equivalent non-left-recursive grammar.
Eliminating left recursion: the rule for immediate left recursion, its generalization, and unfolding all the left recursions (a sketch of the transformation follows).
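For reference, the standard textbook transformation for immediate left recursion is:

  A -> A α | β        becomes        A -> β A'
                                     A' -> α A' | ε

Applied to the expression grammar, E -> E + T | T becomes E -> T E', E' -> + T E' | ε, and T -> T * F | F becomes T -> F T', T' -> * F T' | ε.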
Top-Down Parsing
Challenges:
At each step of a top-down parse, the key problem is determining
the production to be applied for a nonterminal, say A -- i.e., choosing the correct production.
(a) Recursive-descent parsing: may require backtracking to find the correct
A-production to be applied
(b) Predictive parsing: no backtracking! The correct production is chosen by
looking ahead at the input a fixed number of symbols (the next symbols) -- LL(k),
LL(1) grammars
Recursive-Descent Parsing
Nondeterministic: on failure, try other productions!
(a) A recursive-descent parser consists of a set of procedures, one for each
nonterminal.
(b) Execution begins with the procedure for the start symbol S.
(c) It halts and announces success if S() returns and its procedure body has scanned the
entire input string.
(d) Backtracking may require repeated scans over the input.
(The example traced here is presumably the classic one: grammar S -> c A d, A -> a b | a, with input w = cad.)
• The leftmost leaf, labeled c, matches the first symbol of the input w (i.e. c), so we
advance the input pointer to a.
• We have a match for the second input symbol, a, so we advance the input pointer
to d, the third input symbol.
• We compare d against the next leaf, labeled b. Failure!! Backtrack! We must reset
the input pointer to the position of a and try the other A-production.
• Now the leaf a matches the second input symbol of w (i.e. a) and the leaf d matches
the third input symbol d.
• Since S() returns and we have scanned all of w and produced a parse tree for w,
we halt and announce successful completion of parsing.
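A minimal sketch in C of such a backtracking recursive-descent parser, assuming the grammar S -> c A d, A -> a b | a inferred from the trace above (the function and variable names are invented for illustration):

  #include <stdio.h>
  #include <string.h>

  static const char *input;   /* the string being parsed        */
  static int pos;             /* current position in the input  */

  static int match(char c) {           /* consume c if it is the next symbol */
      if (input[pos] == c) { pos++; return 1; }
      return 0;
  }

  static int A(void) {                 /* A -> a b | a */
      int save = pos;                  /* remember position for backtracking */
      if (match('a') && match('b')) return 1;   /* try A -> a b first */
      pos = save;                      /* backtrack */
      return match('a');               /* then try A -> a */
  }

  static int S(void) {                 /* S -> c A d */
      return match('c') && A() && match('d');
  }

  int main(void) {
      input = "cad";
      pos = 0;
      if (S() && pos == (int)strlen(input))
          printf("parse successful\n");
      else
          printf("syntax error\n");
      return 0;
  }

Note that this naive scheme backtracks only locally inside A(); a fully general backtracking parser would also have to retry alternatives in the calling procedures on failure.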
Top-Down Parsing
Challenges (recap):
At each step of a top-down parse, the key problem is determining
the production to be applied for a nonterminal, say A.
(a) Recursive-descent parsing: may require backtracking to find the correct
A-production to be applied
(b) Predictive parsing: no backtracking! The parser looks ahead at the input a
fixed number of symbols (the next symbols) -- LL(k), LL(1) grammars
Basic concept of Predictive parsing
Suppose the input string is w = abcd and the parse has reached a sentential form S => aXY...; a production must now be chosen for X.
• If the X-productions are, say, X -> bA... and X -> cP..., the first symbol of each body (b vs. c) is compared with the next input symbol: only one production can match, and that is the one to apply. This is the role of First(α) for each production X -> α.
• If instead X -> ε is a production and the sentential form is S => aXb, then b follows X in the sentential form, so on input b the production X -> ε should be applied. Knowing which symbols can follow X is the role of Follow(X).
How to compute First(X)
• If X is a terminal, First(X) = {X}.
• If X -> Y1 Y2 ... Yk is a production, add First(Y1) to First(X); if Y1 can derive ε, also add First(Y2), and so on.
• If X -> ε is a production, add ε to First(X).
How to compute Follow(A)
• Place $ in Follow(S), where S is the start symbol.
• If there is a production whose body contains A followed by something, e.g. S -> xAyz, then what can start the part after A (here the terminal y) is in Follow(A).
• If A appears at the right end of a body, as in B -> xαA, or is followed only by symbols that can derive ε, then everything in Follow(B) is in Follow(A); for instance Follow(T) is contained in Follow(F), from the production T -> F T' with T' nullable.
Predictive parsing
Recall: at each step of a top-down parse, the key problem is determining
the production to be applied for a nonterminal, say A.
(a) Recursive-descent parsing: may require backtracking to find the correct
A-production to be applied
(b) Predictive parsing: no backtracking! The parser looks ahead at the input a
fixed number of symbols (the next symbols) -- LL(k), LL(1) grammars
Predictive parsing
Parsing table M
An LL(1) grammar avoids any confusion: each table entry M[A, a] holds at most one production.
In particular, for every pair of productions A -> α | β, First(α) and First(β) must be disjoint sets (the full condition is given below).
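The full LL(1) condition (in its standard formulation): for every pair of distinct productions A -> α | β,
1. First(α) and First(β) are disjoint;
2. at most one of α and β can derive the empty string ε;
3. if β =>* ε, then First(α) is disjoint from Follow(A) (and symmetrically if α =>* ε).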
Basic concept of Predictive parsing (continued)
Again let the input string be w = abcd and the sentential form S => aXY....
• If two X-productions begin with the same symbol, say X -> bA... and X -> bY..., the next input symbol b no longer determines the production: the First sets of the two bodies overlap, so the grammar is not LL(1). Left factoring removes such common prefixes.
• As before, if X -> ε is a production and the sentential form is S => aXb, we know that b follows X in a sentential form, i.e., b is in Follow(X); the entry for X -> ε goes into the table under the symbols of Follow(X).
Left Factoring
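For reference, the standard left-factoring transformation is:

  A -> α β1 | α β2        becomes        A -> α A'
                                         A' -> β1 | β2

For example, stmt -> if expr then stmt else stmt | if expr then stmt is left-factored into stmt -> if expr then stmt rest and rest -> else stmt | ε.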
Parsing table M
How the entries are filled:
• The obvious case: with input w = bacd and sentential form S => bAY..., if one of the A-productions is A -> aX..., then a is in First(aX...), so the entry M[A, a] is set to that production.
• The ε case: with input w = abcd and sentential form S => aAb, if A has a production A -> α where α derives ε, then b follows A in the sentential form, i.e., b is in Follow(A), so M[A, b] is set to A -> α.
Some of the First and Follow facts needed to fill the table for the non-left-recursive expression grammar:
First(FT') = {(, id}    First(*FT') = {*}    Follow(T') = {+, ), $}    First((E)) = {(}    First(id) = {id}
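For reference, the complete First and Follow sets for that grammar (E -> T E', E' -> + T E' | ε, T -> F T', T' -> * F T' | ε, F -> ( E ) | id) are:

  First(E) = First(T) = First(F) = {(, id}
  First(E') = {+, ε}        First(T') = {*, ε}
  Follow(E) = Follow(E') = {), $}
  Follow(T) = Follow(T') = {+, ), $}
  Follow(F) = {+, *, ), $}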
Example of Non-LL(1) grammar
• For every LL(1) grammar, each parsing-table entry uniquely identifies a
production or signals an error.
• Left-recursive or ambiguous grammars are not LL(1).
Input string: ibtibtaea. With the usual abbreviated dangling-else grammar (presumably S -> iEtSS' | a, S' -> eS | ε, E -> b, with i = if, t = then, e = else), this string reads as:
if b then ( if b then a else a )
Example of Non-LL(1) grammar
Predictive Parsing
• Non-recursive version
• maintaining a stack explicitly, rather than implicitly via
recursive calls
Initial configuration
Predictive Parsing
Initial configuration
• The parser considers (i) the symbol
on top of the stack X, and (ii) the
current input symbol a.
• If X is a nonterminal, the parser
chooses an X-production from M[X, a] of
the parsing table.
• Otherwise, it checks for a match
between the terminal X and current
input symbol a.
Predictive Parsing
The stack contains a sequence of grammar symbols, initially the start symbol on top of the end-marker $. Repeatedly replacing the nonterminal on top of the stack by the body of the production chosen from M (and matching terminals against the input) traces out a leftmost derivation of the input w.
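A compact sketch in C of this non-recursive driver loop. To keep it self-contained it hard-codes the parsing-table entries of a deliberately tiny LL(1) grammar, S -> ( S ) S | ε (balanced parentheses); the grammar, the names and the table here are illustrative, not taken from the slides.

  #include <stdio.h>

  int parse(const char *w) {
      char stack[100];                       /* stack of grammar symbols      */
      int top = -1;
      int i = 0;                             /* input pointer                 */
      stack[++top] = '$';                    /* end marker at the bottom      */
      stack[++top] = 'S';                    /* start symbol on top           */
      while (top >= 0) {
          char X = stack[top--];             /* symbol on top of the stack    */
          char a = w[i] ? w[i] : '$';        /* current input symbol          */
          if (X == 'S') {                    /* nonterminal: consult M[S, a]  */
              if (a == '(') {                /* M[S,(] = S -> ( S ) S         */
                  stack[++top] = 'S';        /* push the body in reverse      */
                  stack[++top] = ')';
                  stack[++top] = 'S';
                  stack[++top] = '(';
              } else if (a == ')' || a == '$') {
                  /* M[S,)] = M[S,$] = S -> epsilon: push nothing */
              } else {
                  return 0;                  /* error entry in the table      */
              }
          } else {                           /* terminal (or $): must match   */
              if (X != a) return 0;
              if (a != '$') i++;             /* advance past the matched symbol */
          }
      }
      return w[i] == '\0';                   /* accept iff all input consumed */
  }

  int main(void) {
      printf("%s\n", parse("(()())") ? "accepted" : "rejected");
      printf("%s\n", parse("(()")    ? "accepted" : "rejected");
      return 0;
  }

The structure is exactly the one described above: pop X from the stack; if X is a nonterminal, push the body given by M[X, a] in reverse order; otherwise match X against the current input symbol a.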
Bottom Up Parsing
• A bottom-up parse corresponds to the construction of a parse tree for an
input string,
• beginning at the leaves (the bottom) and working up towards the
root (the top).
• At each step the parser must choose the correct production with which to reduce a portion of the input.
Bottom Up Parsing -- sentential forms
The sequence of sentential forms produced corresponds to a rightmost derivation; bottom-up parsing is therefore the construction of a rightmost derivation in reverse.
Reduction
• A specific substring of the input matching the body of a production
• is replaced by the nonterminal at the head of that production.
Example: with the production A -> Bcd, the string Bcdxy is reduced to Axy (written Bcdxy => Axy when read in the derivation direction).
• Bottom-up parsing can thus be seen as the process of "reducing" a string w to
the start symbol of the grammar.
Challenges
(a) when to reduce and
(b) what production to apply, as the parse proceeds.
Reduction
Example: consider reducing id * id with the expression grammar. Reducing the leading id all the way to E would give E * id, which is not a right-sentential form of the grammar, so that sequence of reductions is wrong. The correct reduction steps are shown below.
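Assuming the unambiguous expression grammar E -> E + T | T, T -> T * F | F, F -> ( E ) | id, the reduction steps for id * id are:

  id * id  =>  F * id  =>  T * id  =>  T * F  =>  T  =>  E

Read from right to left, this sequence is exactly a rightmost derivation of id * id.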
Challenges
(a) when to reduce and
(b) what production to apply, as the parse proceeds.
Handle
• A "handle" is a substring of a right-sentential form that matches the body of a production,
• and whose reduction is one step of a rightmost derivation in reverse, i.e., a step towards the start symbol.
Example: with the production A -> Bcd and the right-sentential form Bcdxy (x, y terminal strings), the handle is Bcd, and reducing it gives the right-sentential form Axy.
Identifying the handle is a challenge.
Shift Reduce parsing
Bottom-up parsing in which
(a) a stack holds grammar symbols and
(b) an input buffer holds the rest of the string to be parsed;
(c) the handle always appears at the top of the stack.
Initial configuration: stack $, input w$.  Final configuration: stack $S, input $ (accept).
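A typical stack/input/action trace for id * id under the same expression grammar (the standard worked example):

  Stack         Input        Action
  $             id * id $    shift
  $ id          * id $       reduce by F -> id
  $ F           * id $       reduce by T -> F
  $ T           * id $       shift
  $ T *         id $         shift
  $ T * id      $            reduce by F -> id
  $ T * F       $            reduce by T -> T * F
  $ T           $            reduce by E -> T
  $ E           $            accept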
Shift Reduce parsing
Why the handle always appears at the top of the stack: consider two consecutive steps of a rightmost derivation, which can take one of two forms.
Case 1 (productions A -> βBy and B -> ɣ):  S =>* αAz => αβByz => αβɣyz.
In reverse, the parser reduces ɣ to B with ɣ on top of the stack, then shifts y; the next handle βBy is again at the top of the stack.
Case 2 (productions A -> y and B -> ɣ):  S =>* αBxAz => αBxyz => αɣxyz.
In reverse, the parser reduces ɣ to B, then shifts x and y; the next handle y is again at the top of the stack.
In both cases, after a reduction the parser only has to shift before the next handle appears at the top, so the handle never lies buried inside the stack.
Conflict
Shift/reduce conflict: Cannot decide whether to shift or to reduce
Reduce/reduce conflict: Cannot decide which of several reductions to
make
Shift/reduce conflict
Shift Reduce parsing
LR Parsing
Challenges in shift-reduce parsing:
(a) when to reduce and
(b) what production to apply, as the parse proceeds.
Examples of LR parsers: Simple LR (SLR), canonical LR(1), LALR
• An LR parser makes its shift-reduce decisions by maintaining the states of an LR(0) automaton.
• Each state represents a set of items.
Items
Intuitively, an item indicates how much of a production body we
have seen at a given point in the parsing process. For a production A -> XYZ:
A -> .XYZ  indicates that we hope to see a string derivable from XYZ on the next input
A -> X.YZ  indicates that we have just seen on the input a string derivable from X and that we hope next to see a string derivable from YZ
A -> XYZ.  indicates that we have seen the body XYZ on the input and that it may be time to reduce XYZ to A
(A -> XY.Z is the remaining item for this production.)
Canonical LR(0) collection
• A set of items => one state
• The collection of sets of items => the canonical LR(0) collection => the collection of states
LR(0) automaton: a deterministic finite automaton, constructed from these states, that is used to make parsing decisions.
To construct the canonical LR(0) collection for a grammar G,
we define (a) the augmented grammar and (b) two functions, CLOSURE and GOTO.
Augmented grammar: if G is a grammar with start symbol S, then the
augmented grammar G' is G with a new start symbol S' and the added production S' -> S.
Closure of Item Sets
CLOSURE(I): if I is a set of items, then CLOSURE(I) is the set of items constructed by two rules:
(a) initially, add every item of I to CLOSURE(I);
(b) if A -> α.Bβ is in CLOSURE(I) and B -> ɣ is a production, add the item B -> .ɣ (if it is not already there); repeat until no new items can be added.
Kernel items: the initial item S' -> .S and all items whose dots are not at the left end.
Nonkernel items (the others, with the dot at the left end) can be easily derived from the kernel items by taking the closure.
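As a worked instance, for the augmented expression grammar E' -> E, E -> E + T | T, T -> T * F | F, F -> ( E ) | id (the usual textbook example), the closure of the single kernel item E' -> .E is the initial state:

  I0 = CLOSURE({E' -> .E})
     = { E' -> .E,  E -> .E + T,  E -> .T,  T -> .T * F,  T -> .F,  F -> .( E ),  F -> .id }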
GOTO of Item Sets
• The second useful function is GOTO(I, X), where I is a set of items and X
is a grammar symbol.
• GOTO(I, X) is defined as the closure of the set of all items A -> αX.β such that A -> α.Xβ is in I.
• It defines the transitions in the LR(0) automaton: if I is the item set of some state, GOTO(I, X) specifies the transition out of that state under the grammar symbol X.
• Several distinct states may move to the same state on the same symbol: for example, both I2 = GOTO(I1, X) and I2 = GOTO(I3, X) are possible.
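Continuing the same expression-grammar example: if I1 = { E' -> E.,  E -> E. + T } is the state reached from I0 on E, then

  GOTO(I1, +) = CLOSURE({ E -> E + .T })
              = { E -> E + .T,  T -> .T * F,  T -> .F,  F -> .( E ),  F -> .id }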
Canonical LR(0) collection
LR(0) automaton: a deterministic finite automaton, constructed from the item sets, that is used to make parsing decisions.
• A set of items => one state
• The collection of sets of items => the canonical LR(0) collection => the collection of states
To construct the canonical LR(0) collection for a grammar G,
we define (a) the augmented grammar and (b) the two functions CLOSURE and GOTO.
Augmented grammar: if G is a grammar with start symbol S, then the
augmented grammar G' adds a new start symbol S' and the production S' -> S.
Canonical collection of sets of items
Figure: the LR(0) automaton, with states I0, I1, I2, ... and transitions labelled by grammar symbols.
LR(0) automaton
(a) The states of this automaton are the sets of items from the canonical LR(0) collection,
(b) the transitions are given by the GOTO function.
We say "state j" to refer to the state corresponding to the set of items Ij; a grammar symbol is written generically as X.
LR-Parsing Algorithm
Parsing table (ACTION and GOTO)
• The stack holds a sequence of states, with s0 at the bottom.
• Where a shift-reduce parser shifts a symbol, an LR parser shifts a state.
• The state on top of the stack (s_m) represents the state of the parser.
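The parser's moves, in the standard formulation (a configuration is the stack of states s0 s1 ... s_m together with the remaining input ai ... an $):

1. If ACTION[s_m, ai] = shift s: push s; the next input symbol becomes ai+1.
2. If ACTION[s_m, ai] = reduce A -> β: pop |β| states, exposing some state s; push GOTO[s, A]; output the production A -> β.
3. If ACTION[s_m, ai] = accept: parsing is complete.
4. Otherwise: report a syntax error.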
Role of LR(0) automata in shift-reduce decisions
Key idea: let the input be w = yaα, where the prefix y has already been scanned, a is the next input symbol, and the parser is in state j (the state reached after scanning y).
• If state j has a transition on a, shift: move to state k on the next input symbol a (diagram: j --a--> k).
• Otherwise, we choose to reduce; the items in state j tell us which production to use.
• All transitions into a state k are on the same grammar symbol. Thus each state has a unique grammar symbol associated with it (except the start state 0).
• Multiple states may, however, be associated with the same grammar symbol.
Key idea: states and reduction
With symbols, reduction is implemented by popping the body of the production (the
body is id) from the stack and pushing the head of the production (in
this case, F).
With states, (a) we pop state 5, which brings state 0 to the top, (b) look for the transition on F, the head of the production, and (c) push state 3 (= GOTO(0, F)).
Shift Reduce parsing with the LR(0) automaton
Pop and push: a reduction by A -> β pops the states corresponding to β, exposing some state i on top, and then pushes j = GOTO(i, A).
Figures: the SLR parsing table and the LR-parsing algorithm (initial state s0), together with the LR(0) automaton from which the table is built.
Constructing SLR-Parsing Tables
• An LR parser using an SLR parsing table is called an SLR parser.
• The overall procedure is the same for LR(1) and LALR parsers.
• Step 1: Given a grammar G, we augment it to produce G', with
a new start symbol S'.
• Step 2: Construct the LR(0) items and the LR(0) automaton:
• we construct the canonical collection of sets of items for
G' together with the GOTO function.
• Step 3: Construct the parsing table:
• determine the ACTION and GOTO entries.
SLR-Parsing Table: Algorithm -- shift entries
If [A -> α.aβ] is in Ii, with a a terminal, and GOTO(Ii, a) = Ij, then ACTION[i, a] = shift j: the stack now corresponds to ...αa, and we are still looking for a handle.
SLR-Parsing Table: Algorithm -- reduce entries
If [A -> α.] is in Ii, the parser *may* have detected a handle: the stack corresponds to ...α.
Reducing by A -> α is correct only if the result is still part of a right-sentential form, i.e., only if S =>* ...Aa... => ...αa..., where a is the next input symbol. In that case a follows A, so a is in Follow(A).
SLR rule: for every [A -> α.] in Ii and every a in Follow(A), set ACTION[i, a] = reduce A -> α.
SLR-Parsing Table: Algorithm -- accept entry
If [S' -> S.] is in Ii, then the whole input has been reduced to the start symbol: set ACTION[i, $] = accept. Done!!
SLR-Parsing Table: Algorithm -- the complete set of rules is summarized below; applying them yields the SLR parsing table.
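Putting the pieces together, the standard SLR table-construction rules for the augmented grammar G' are:

1. Construct the canonical LR(0) collection {I0, I1, ..., In} for G'.
2. If [A -> α.aβ] is in Ii and GOTO(Ii, a) = Ij for a terminal a, set ACTION[i, a] = shift j.
3. If [A -> α.] is in Ii (A not S'), set ACTION[i, a] = reduce A -> α for every a in Follow(A).
4. If [S' -> S.] is in Ii, set ACTION[i, $] = accept.
5. If GOTO(Ii, A) = Ij for a nonterminal A, set GOTO[i, A] = j.
6. All remaining entries are error; if any entry receives two different actions, the grammar is not SLR(1).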
SLR-Parsing Table: Example (the table is constructed from the LR(0) automaton shown earlier).
Non-SLR: Example
Grammar (presumably the classic one): S -> L = R | R,  L -> * R | id,  R -> L
Conflicting action!! In the state containing both S -> L.=R and R -> L., the symbol = is in Follow(R), so the SLR table prescribes both shift and reduce on input = -- a shift/reduce conflict.
Non-SLR: Where is the problem?
Input: id=*id
Rightmost derivation: S => L=R => L=L => L=*R => L=*L => L=*id => id=*id
SLR parsing, making the correct choice (shift =) at the critical point:
Stack: $          Input: id=*id$
Stack: $ id       Input: =*id$
Stack: $ L        Input: =*id$    (reduce by R -> L here??)
Stack: $ L=       Input: *id$     (no: shift = instead)
...
Stack: $ L=*id    Input: $
...
Stack: $ S        Input: $
Non-SLR: Where is the problem?
With the same input, the SLR table also allows the wrong choice, reducing by R -> L on input =:
Stack: $          Input: id=*id$
Stack: $ id       Input: =*id$
Stack: $ L        Input: =*id$    (reduce by R -> L??)
Stack: $ R        Input: =*id$    Incorrect! No right-sentential form of the grammar begins with R = ..., so the parse cannot be completed.
Viable Prefixes
• The LR(0) automaton characterizes the strings of grammar symbols
that can appear on the stack of a shift-reduce parser for the
grammar.
• The stack contents must be a prefix of a right-sentential form.
• If the stack holds α and the rest of the input is x, then a sequence of
reductions will take αx to S.
Not all prefixes of right-sentential forms can appear on the stack
The prefixes of right sentential forms that can appear on the stack of
a shift reduce parser are called viable prefixes.
Because the handle always appears at the top of the stack (recall the two cases with A -> βBy and B -> ɣ), the stack contents always form a viable prefix.
Example: E+T is a viable prefix, and the LR(0) automaton reaches a definite state after reading it.
Viable Prefixes
• The set of valid items for a viable prefix γ is
• the set of items reached from the initial state along the path labeled γ in the LR(0) automaton.
SLR parsing is based on the fact that LR(0) automata recognize viable prefixes (and their valid items).
SLR says...
Suppose the stack holds βα and, starting from I0, the LR(0) automaton reaches state Ii, which contains the complete item A -> α.
Shift or reduce? SLR says: reduce by A -> α whenever the next input symbol a is in Follow(A).
But if the grammar has no derivation S =>* ...βAa..., then βαa is not part of any right-sentential form, and the reduction would produce an invalid right-sentential form. For this particular viable prefix βα (and lookahead a) the item A -> α. is invalid, and the reduction should be avoided -- yet the SLR table cannot tell.
Non-SLR: Where is the problem?
Since = is in Follow(R) -- for example, S =>* *R=id =>* *id=id, so = can follow R in a right-sentential form -- SLR cannot rule out the reduction R -> L in front of =.
It is possible to carry extra information in the state that will
allow us to rule out some of these invalid reductions.
LR(1) Parser, CLR
• Splitting states:
• each state of an LR(1) parser indicates exactly which input symbols
can follow a handle α for which there is a possible reduction to A.
• This extra information is incorporated into the state by redefining
items to include a terminal symbol (the lookahead) as a second component.
LR(1) Parser
An LR(1) item has the form [A -> α.β, a], where the terminal a is the lookahead. For SLR the lookahead is left implicit (all of Follow(A) is used when deciding to reduce); an LR(1) item carries it explicitly.
LR(1) Sets of Items
Closure of Item Sets – LR(1)
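The LR(1) closure rule (standard formulation) extends the LR(0) rule by also computing lookaheads:
if [A -> α.Bβ, a] is in CLOSURE(I) and B -> ɣ is a production, add [B -> .ɣ, b] for every terminal b in First(βa).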
LR(1) automaton -- GOTO
GOTO(I, X) is computed as for LR(0) -- move the dot over X and take the closure -- with the lookahead of each item carried along unchanged.
Figure: the LR(1) automaton for the example grammar. It may have more states than the LR(0) automaton, but no redundant states: the split states are exactly what rules out the invalid reductions.
LR(1) Parsing table
Shift entries are as before: if [A -> α.aβ, b] is in Ii and GOTO(Ii, a) = Ij, set ACTION[i, a] = shift j. The lookahead b is not important for shift entries; after the shift the stack corresponds to ...αa and we are still expecting a handle.
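The remaining canonical-LR(1) table rules (standard formulation) differ from SLR only in the reduce entries:
• If [A -> α., a] is in Ii (A not S'), set ACTION[i, a] = reduce A -> α -- only for that single lookahead a, not for all of Follow(A).
• If [S' -> S., $] is in Ii, set ACTION[i, $] = accept.
• GOTO entries come from the GOTO function on nonterminals, as before.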
LR(1) Parsing table -- example trace for input dd, assuming the usual example grammar S -> CC, C -> cC | d:

Stack       Input    Action
0           dd$      shift
0 4         d$       reduce by C -> d, then goto on C
0 2         d$       shift
0 2 7       $        reduce by C -> d, then goto on C
0 2 5       $        reduce by S -> CC, then goto on S
0 1         $        accept (acc)
LALR
• Considerably smaller than the canonical LR tables
• Most common syntactic constructs of programming languages
can be expressed conveniently by an LALR grammar
Same core items, different lookaheads:
• sets of LR(1) items having the same core, that is, the same set of first
components,
• are merged into one set of LALR items, taking the union of their lookahead sets.
Example (for the grammar S -> CC, C -> cC | d): I3 and I6 share the core {C -> c.C, C -> .cC, C -> .d} and differ only in the lookaheads (c/d versus $), so they are merged; likewise I4 with I7 and I8 with I9.
LALR -- GOTO
• Since the core of GOTO(I,X) depends only on the core,
• Goto's of merged sets can themselves be merged.
• Thus, there is no problem revising the GOTO function as we
merge sets of items.
LR(1) automaton -- GOTO
LALR Parsing table
LALR conflicts
• Merging cannot introduce a new shift/reduce conflict: if an LALR item set has a shift/reduce conflict on a, then, because the merged LR(1) sets share the same core, one of them already had the same conflict on a.
• Merging can, however, introduce reduce/reduce conflicts: two LR(1) item sets may each be conflict-free (no reduce-reduce conflict on d, e), yet the LALR item set obtained by merging them has a reduce-reduce conflict on d and e!
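A classic textbook grammar that exhibits exactly this behaviour (presumably the one intended here) is:

  S -> aAd | bBd | aBe | bAe
  A -> c
  B -> c

It generates acd, ace, bcd, bce. The LR(1) item sets { [A -> c., d], [B -> c., e] } and { [A -> c., e], [B -> c., d] } are each conflict-free, but they share the same core, and merging them yields reduce/reduce conflicts on both d and e.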
Efficient Construction of LALR Parsing Tables
We must attach the proper lookaheads to the LR(0) items in the
kernels, to create the kernels of the sets of LALR(1) items.
• We are now ready to attach lookaheads to the kernels of the sets of
LR(0) items to form the sets of LALR(1) items.
• First, we know that $ is a lookahead for S' -> .S in the initial set of
LR(0) items.
• The algorithm gives us all the lookaheads generated spontaneously.
• After listing all those lookaheads, we must allow them to propagate
until no further propagation is possible.
• Keep track of "new" lookaheads that have propagated into an item
but which have not yet propagated out.
Using Ambiguous Grammars
• The grammar in question (presumably E -> E + E | E * E | ( E ) | id) is ambiguous because it does not specify the associativity or
precedence of the operators + and *.
• The unambiguous grammar gives + lower precedence than * and makes both
operators left associative.
• We might nevertheless prefer to use the ambiguous grammar:
• the parser for the unambiguous grammar spends a substantial
fraction of its time reducing by the productions E -> T and T -> F,
• whose sole function is to enforce associativity and precedence.
• The parser for the ambiguous grammar will not waste time reducing by these
single productions (productions whose body consists of a single nonterminal).
Conflicts: + and * are in Follow(E), so the states containing the complete items E -> E + E. and E -> E * E. receive both shift and reduce actions on + and * -- shift/reduce conflicts.
Conflict resolution
These conflicts can be resolved using the precedence and associativity information for + and *.
Consider the input id + id * id, which causes the parser to enter state 7 after
processing id + id; at that point the stack holds E + E and the remaining input is * id $, a shift/reduce conflict between reducing by E -> E + E and shifting *.
If * takes precedence over +, the parser should shift * onto the stack.
Thus the relative precedence of + followed by * uniquely determines how
the parsing-action conflict between reducing E -> E + E and shifting on * in
state 7 should be resolved.
Conflict resolution
The same kind of conflict is resolved using the associativity information for +.
Consider the input id + id + id, which causes the parser to enter state 7 after
processing id + id; the stack again holds E + E, now with + id remaining. Since + is left associative, the parser should reduce by E -> E + E rather than shift the second +.
Conflict resolution
Likewise for precedence: consider the input id * id + id, which causes the parser to enter state 8 after
processing id * id; the stack holds E * E with + id remaining. Since * takes precedence over +, the parser should reduce by E -> E * E rather than shift +.
Conflict resolution
For the dangling-else grammar, the shift/reduce conflict arises when the parser sees else: should it shift, or reduce the inner if-then statement?
The answer is that we should shift else, because it is "associated"
with the previous then.
We conclude that the shift/reduce conflict should be
resolved in favor of shift on input else.