Intermediate Code Generation
Intermediate Code Generation
Translating source program into an “intermediate language.”
Simple
CPU Independent,
…yet, close in spirit to machine language.
Or, depending on the application other intermediate languages may be
used, but in general, we opt for simple, well structured intermediate
forms.
(and this completes the “Front-End” of Compilation).
Benefits
1. Retargeting is facilitated
2. Machine independent Code Optimization can be
applied.
Intermediate Code Generation (II)
Intermediate codes are machine independent codes, but they are close to
machine instructions.
The given program in a source language is converted to an equivalent program
in an intermediate language by the intermediate code generator.
Intermediate language can be many different languages, and the designer of the
compiler decides this intermediate language.
syntax trees can be used as an intermediate language.
postfix notation can be used as an intermediate language.
three-address code (Quadraples) can be used as an intermediate language
we will use quadraples to discuss intermediate code generation
quadraples are close to machine instructions, but they are not actual machine instructions.
Types of Intermediate Languages
Graphical Representations.
Consider the assignment a:=b*-c+b*-c:
assign assign
+
a + a
*
* *
b uminus uminus
uminus b
c c
b c
Syntax Dir. Definition for Assignment Statements
PRODUCTION Semantic Rule
S id := E { S.nptr = mknode (‘assign’, mkleaf(id, id.entry), E.nptr) }
E E1 + E2 {E.nptr = mknode(‘+’, E1.nptr,E2.nptr) }
E E1 * E2 {E.nptr = mknode(‘*’, E1.nptr,E2.nptr) }
E - E1 {E.nptr = mknode(‘uminus’,E1.nptr) }
E ( E1 ) {E.nptr = E1.nptr }
E id {E.nptr = mkleaf(id, id.entry) }
E num {E.nptr = mkleaf(num,num.val)}
Three Address Code
Statements of general form x:=y op z
No built-up arithmetic expressions are allowed.
As a result, x:=y + z * w
should be represented as
t1:=z * w
t2:=y + t1
x:=t2
Observe that given the syntax-tree or the dag of the graphical
representation we can easily derive a three address code for assignments
as above.
In fact three-address code is a linearization of the tree.
Three-address code is useful: related to machine-language/ simple/
optimizable.
Example of 3-address code
t1:=- c t1:=- c
t2:=b * t1 t2:=b * t1
t3:=- c t5:=t2 + t2
t4:=b * t3 a:=t5
t5:=t2 + t4
a:=t5
Types of Three-Address Statements.
Assignment Statement: x:=y op z
Assignment Statement: x:=op z
Copy Statement: x:=z
Unconditional Jump: goto L
Conditional Jump: if x goto L
ifFalse x goto L
Conditional Jump: if x relop y goto L
Procedures:
param x for parameters.
call p,n for procedure call.
y=call p,n for function call
return y used to represent returned value and is
optional. param x1
p( x1, x2, x3, ……. , xn ) param x2
…
param xn
call p,n
Types of Three Address Statements
Indexed Assignment instructions:
x:=y[i]
x[i]:=y
Address and Pointer Assignments:
x:=&y
x:=*y
*x:=y
Implementation of Three Address Statements
In a compiler Three Address Statements are implemented as
objects or records with fields for the operator and operands.
Three such representations are
Quadruples
Triples (or) Two Address Instructions
Indirect Triples
A Quadruple is a record with 4-fields namely op , arg1 , arg2
and result.
A triple is a record with 3-fields namely op , arg1 , arg2.
An Indirect Triple is a record containing a pointer to a triple.
Consider the assignment a:=b*-c+b*-c in high-level language.
Implementations of 3-address statements
Quadruples
t1:=- c op arg1 arg2 result
t2:=b * t1
t3:=- c
(0) uminus c t1
t4:=b * t3 (1) * b t1 t2
t5:=t2 + t4
a:=t5 (2) uminus c t3
(3) * b t3 t4
(4) + t2 t4 t5
(5) := t5 a
Temporary names must be entered into the symbol
table as they are created.
Implementations of 3-address statements, II
Triples op arg1 arg2
t1:=- c
t2:=b * t1 (0) uminus c
t3:=- c
t4:=b * t3
(1) * b (0)
t5:=t2 + t4 (2) uminus c
a:=t5
(3) * b (2)
(4) + (1) (3)
(5) assign a (4)
Parenthesized numbers represents pointers into the triple
structure itself.
Temporary names are not entered into the symbol table.
Implementations of 3-address statements, III
Indirect Triples
op op arg1 arg2
(0) (14) (14) uminus c
(1) (15) (15) * b (14)
(2) (16) (16) uminus c
(3) (17) (17) * b (16)
(4) (18) (18) + (15) (17)
(5) (19) (19) assign a (18)
Representation of different Three Address Statements
as Quadruples
3-Address Statement Quadruple representation
op arg1 arg2 result
x := y op z op y z x
x := op y op y x
x := y = (or) y x
assign
goto L goto L
if x goto L goto x L
ifFalse x goto goto x L
if x relop y goto L relop x y L
param x param x
call p,n call p n
return y retrun y
x := y[i] =[] y i x
Representation of different 3- Address
Statements as Quadruples
3-Address Statement Quadruple representation
op arg1 arg2 result
x[i] := y []= y i x
x := &y & y x
x := *y * y x
Syntax-Directed Translation into 3-address code.
First deal with assignments.
Use attributes
E.place: the name that will hold the value of E
Identifier will be assumed to already have the place attribute defined.
E.code:hold the three address code statements that evaluate E (this
is the `translation’ attribute).
Use function newtemp that returns a new temporary variable that we
can use.
Use function gen to generate a single three address statement given the
necessary information (variable names and operations).
Syntax-Dir. Definition for 3-address code
PRODUCTION Semantic Rule
S id := E { S.code = E.code||gen(id.place ‘=’ E.place ‘;’) }
E E1 + E2 {E.place= newtemp ;
E.code = E1.code || E2.code ||
|| gen(E.place‘:=’E1.place‘+’E2.place) }
E E1 * E2 {E.place= newtemp ;
E.code = E1.code || E2.code ||
|| gen(E.place‘=’E1.place‘*’E2.place) }
E - E1 {E.place= newtemp ;
E.code = E1.code ||
|| gen(E.place ‘=’ ‘uminus’ E1.place) }
E ( E1 ) {E.place= E1.place ; E.code = E1.code}
E id {E.place = id.entry ; E.code = ‘’ }
e.g. a := b * - (c+d)
What about things that are not assignments?
E.g. while statements of the form “while E do S”
(intepreted as while the value of E is not 0 do S)
Extension to the previous syntax-dir. Def.
PRODUCTION
S while E do S1
Semantic Rule
S.begin = newlabel;
S.after = newlabel ;
S.code = gen(S.begin ‘:’)
|| E.code
|| gen(‘if’ E.place ‘=’ ‘0’ ‘goto’ S.after)
|| S1.code
|| gen(‘goto’ S.begin)
|| gen(S.after ‘:’)
Other types of 3-address statements
e.g. ternary operations like
x[i]:=y x:=y[i]
require two or more entries. e.g.
op arg1 arg2
(0) []= x i
(1) assign (0) y
op arg1 arg2
(0) []= y i
(1) assign x (0)
Dealing with Procedures
P procedure id ‘;’ block ‘;’
Semantic Rule
begin = newlabel;
Enter into symbol-table in the entry of the procedure name the begin
label.
P.code = gen(begin ‘:’) || block.code ||
gen(‘pop’ return_address) || gen(“goto return_address”)
S call id
Semantic Rule
Look up symbol table to find procedure name. Find its begin label called
proc_begin
return = newlabel;
S.code = gen(‘push’return); gen(goto proc_begin) || gen(return “:”)
Declarations
Using a global variable offset
PRODUCTION Semantic Rule
PMD {}
M {offset:=0 }
D id : T { addtype(id.entry, T.type, offset)
offset:=offset + T.width }
T char {T.type = char; T.width = 4; }
T integer {T.type = integer ; T.width = 4; }
T array [ num ] of T1
{T.type=array(1..num.val,T1.type)
T.width = num.val * T1.width}
T ^T1 {T.type = pointer(T1.type);
T1.width = 4}
Nested Procedure Declarations
For each procedure we should create a symbol table.
mktable(previous) – create a new symbol table where previous is the parent
symbol table of this new symbol table
enter(symtable,name,type,offset) – create a new entry for a variable in the
given symbol table.
enterproc(symtable,name,newsymbtable) – create a new entry for the
procedure in the symbol table of its parent.
addwidth(symtable,width) – puts the total width of all entries in the symbol
table into the header of that table.
We will have two stacks:
tblptr – to hold the pointers to the symbol tables
offset – to hold the current offsets in the symbol tables in tblptr stack.
Keeping Track of Scope Information
Consider the grammar fraction:
PD
D D ; D | id : T | proc id ; D ; S
Each procedure should be allowed to use independent names.
Nested procedures are allowed.
Keeping Track of Scope Information
(a translation scheme)
PMD { addwidth(top(tblptr), top(offset)); pop(tblptr); pop(offset) }
M { t:=mktable(null); push(t, tblptr); push(0, offset)}
D D1 ; D2 ...
D proc id ; N D ; S { t:=top(tblpr); addwidth(t,top(offset));
pop(tblptr); pop(offset);
enterproc(top(tblptr), id.name, t)}
N {t:=mktable(top(tblptr)); push(t,tblptr); push(0,offset);}
D id : T {enter(top(tblptr), id.name, T.type, top(offset);
top(offset):=top(offset) + T.width
Example: proc func1; D; proc func2 D; S; S