Syntax-Directed Translation
Dr. Biswapati Jana
Dept. of Computer Science
Vidyasagar University.
1
Syntax-Directed Translation
Translation of languages guided by context-free grammars.
Attach attributes to each grammar symbols.
Values of the attributes are computed by semantic rules
associated with the grammar productions.
-- Syntax–directed definitions: Each production has a set of
semantic rules associated with it.
--Translation Schemes: Embed the semantic actions within
the productions.
Conceptually we parse the input token stream, build the
parse tree and then traverse the tree as needed to evaluate the
semantic rules. The rules may generate code, save information
in the symbol table, issue error messages, etc.
Syntax-Directed Definitions
A syntax-directed definition is a generalization of a context-free
grammar in which:
– Each grammar symbol is associated with a set of attributes.
– This set of attributes for a grammar symbol is partitioned into
two subsets called
• synthesized and
• inherited attributes of that grammar symbol.
– Each production rule is associated with a set of semantic rules.
Semantic rules set up dependencies between attributes which can be
represented by a dependency graph.
This dependency graph determines the evaluation order of these semantic
rules.
Evaluation of a semantic rule defines the value of an attribute. But a
semantic rule may also have some side effects such as printing a value.
3
Annotated Parse Tree
A parse tree showing the values of attributes at each node is
called an annotated parse tree.
The process of computing the attributes values at the nodes is
called annotating (or decorating) of the parse tree.
Of course, the order of these computations depends on the
dependency graph induced by the semantic rules.
4
Syntax-Directed Definition -- Example
Production Semantic Rules
L → E return print([Link])
E → E1 + T [Link] = [Link] + [Link]
E→T [Link] = [Link]
T → T1 * F [Link] = [Link] * [Link]
T→F [Link] = [Link]
F→(E) [Link] = [Link]
F → digit [Link] = [Link]
1. Symbols E, T, and F are associated with a synthesized attribute val.
2. The token digit has a synthesized attribute lexval (it is assumed that it
is evaluated by the lexical analyzer).
5
Annotated Parse Tree -- Example
Input: 5+3*4 L
[Link]=17 return
[Link]=5 + [Link]=12
[Link]=5 [Link]=3 * [Link]=4
[Link]=5 [Link]=3 [Link]=4
[Link]=5 [Link]=3
6
Dependency Graph
Input: 5+3*4 L
[Link]=17
[Link]=5 [Link]=12
[Link]=5 [Link]=3 [Link]=4
[Link]=5 [Link]=3 [Link]=4
[Link]=5 [Link]=3
7
Syntax-Directed Definition – Inherited Attributes
Production Semantic Rules
D→TL [Link] = [Link]
T → int [Link] = integer
T → float [Link] = float
L → L1 , id [Link] = [Link], addtype([Link],[Link])
L → id addtype([Link],[Link])
Symbol T is associated with a synthesized attribute type.
Symbol L is associated with an inherited attribute in.
8
A Dependency Graph – Inherited Attributes
Input: float p,q
D [Link]=float
T L [Link]=float [Link]=float addtype(q,float)
float L id addtype(p,float) [Link]=q
id [Link]=p
parse tree dependency graph
9
Draw the Syntax Tree a-4+c
id
to entry for c
id num 4
to entry for a
10
Types of Intermediate Languages
Graphical Representations.
Consider the assignment a:=b*-c+b*-c:
assign assign
a + +
a
*
* *
b uminus b uminus uminus
c c
b c
Directed Acyclic Graphs for Expressions
a+ a*(b–c)+(b–c)*d
+ *
* d
a -
b c
12
Intermediate Code Generation
Three Address Code
Three Address Code
Statements of general form x:=y op z
As a result, x:=y + z * w
should be represented as
t1:=z * w
t2:=y + t1
x:=t2
Observe that given the syntax-tree or the dag of the graphical
representation we can easily derive a three address code for
assignments as above.
In fact three-address code is a linearization of the tree.
Three-address code is useful: related to machine-language/ simple/
optimizable.
Three address code
• In a three address code there is at most one
operator at the right side of an instruction
• Example:
a+ a*(b–c)+(b–c)*d
+
t1 = b – c
+ *
t2 = a * t1
* d t3 = a + t2
a - t4 = t1 * d
b c
t5 = t3 + t4
Types of Three-Address Statements.
Assignment Statement: x:=y op z
Assignment Statement: x:=op z
Copy Statement: x:=z
Unconditional Jump: goto L
Conditional Jump: if x relop y goto L
Stack Operations: Push/pop
More Advanced:
Procedure:
param x1
param x2
…
param xn
call p,n
Index Assignments:
x:=y[i]
x[i]:=y
Address and Pointer Assignments:
x:=&y
x:=*y
*x:=y
Data structures for three address codes
• Quadruples
– Has four fields: op, arg1, arg2 and result
• Triples
– Temporaries are not used and instead references
to instructions are made
• Indirect triples
– In addition to triples we use a list of pointers to
triples
Implementations of 3-address statements
Quadruples
op arg1 arg2 result
t1:=- c
t2:=b * t1 (0) uminus c t1
t3:=- c
t4:=b * t3 (1) * b t1 t2
t5:=t2 + t4
a:=t5 (2) uminus c
(3) * b t3 t4
(4) + t2 t4 t5
(5) := t5 a
Temporary names must be entered into the symbol table
as they are created.
Implementations of 3-address statements, II
Triples op arg1 arg2
t1:=- c (0) uminus c
t2:=b * t1
(1) * b (0)
t3:=- c
(2) uminus c
t4:=b * t3
(3) * b (2)
t5:=t2 + t4
a:=t5 (4) + (1) (3)
(5) assign a (4)
Temporary names are not entered into the symbol table.
Implementations of 3-address statements, III
Indirect Triples
op op arg1 arg2
(0) (14) (14) uminus c
(1) (15) (15) * b (14)
(2) (16) (16) uminus c
(3) (17) (17) * b (16)
(4) (18) (18) + (15) (17)
(5) (19) (19) assign a (18)
Write quadruple, triples and indirect triples for following
expression : (x + y) * (y + z) + (x + y + z)
The three address code is:
(1) t1 = x + y
(2) t2 = y + z
(3) t3 = t1 * t2
(4) t4 = t1 + z
(5) t5 = t3 + t4
Boolean Expressions
• compute logical values
• change the flow of control
• boolean operators are: and or not
E → E or E
| E and E
| not E
| (E)
| id relop id
| true
| false
25
Numerical representation
• a or b and not c
t1 = not c
t2 = b and t1
t3 = a or t2
• relational expression a < b is equivalent to
if a < b then 1 else 0
1. if a < b goto 4.
2. t = 0
3. goto 5
4. t = 1
5.
26
Example:
Code for a < b or c < d and e < f
100: if a < b goto 103 if e < f goto 111
101: tl = 0 109: t3 = 0
102: goto 104 110: goto 112
103: tl = 1 111: t3 = 1
104:
112:
if c < d goto 107
t4 = t2 and t3
105: t2 = 0
106: goto 108 113: t5 = tl or t4
107: t2 = 1
108:
27