CST 302: Compiler Design Module V
B.Tech, CSE, S6
MODULE - V
CODE OPTIMIZATION AND GENERATION
SYLLABUS:
Code Optimization - Principal sources of optimization, Machine dependent and
machine independent optimizations, Local and global optimizations. Code generation
- Issues in the design of a code generator, Target Language, A simple code generator.
INTRODUCTION
The code generated by the compiler can be made faster or take less space or both. For
that, some transformations can be applied on the code, and is called optimization or
optimization transformations.
Compilers that can perform optimizations are called optimizing compilers.
Code optimization is an optional phase and it must not change the meaning of the
program.
Need For Optimization Phase in Compiler
Code produced by a compiler may not be perfect in terms of execution speed and
memory space.
Optimization by hand takes much more effort and time.
Machine level details like instructions and addresses are not known to the
programmer.
Advanced architecture features like instruction pipeline requires optimized code.
Structure reusability and maintainability of the code are improved.
Criteria for Code Optimization
It should preserve the meaning of the program i.e., it should not change the output or
produce error for a given input.
Eventually it should improve the efficiency of the program by a reusable amount.
Sometimes it may increase the size of the code or may slow the program slightly but
it should improve the efficiency.
It must be worth with the effort, i.e., the effort put on optimization must be worthy
when compared with the improvement.
1|P age
CST 302: Compiler Design Module V
B.Tech, CSE, S6
Types of Optimization:
There are two types of optimization:
1. Machine dependent optimization – run only in particular machine.
2. Machine independent optimization – used for any machine.
Optimization can be done in 2 phases:
1. Local optimization:
Transformations are applied over a small segment of the program called basic block, in
which statements are executed in sequential order.
2. Global optimization
Transformations are applied over a large segment of the program like loop, procedures,
functions etc. Local optimization must be done before applying global optimization.
Basic Block
Basic block is a sequence of consecutive 3 address statements which may be entered only at
the beginning and when entered statements are executed in sequence without halting or
branching.
2|P age
CST 302: Compiler Design Module V
B.Tech, CSE, S6
To identify basic block, we have to find leader statements. Rules for leader statements are
Flow Graphs
3|P age
CST 302: Compiler Design Module V
B.Tech, CSE, S6
4|P age
CST 302: Compiler Design Module V
B.Tech, CSE, S6
Principle Sources of Optimization
5|P age
CST 302: Compiler Design Module V
B.Tech, CSE, S6
We can avoid re-computing the expression if we can use the previously
computed value.
6|P age
CST 302: Compiler Design Module V
B.Tech, CSE, S6
7|P age
CST 302: Compiler Design Module V
B.Tech, CSE, S6
Note that in this example, the “if” statement is a dead code, whereas, “a=b+5” is an
unreachable code.
8|P age
CST 302: Compiler Design Module V
B.Tech, CSE, S6
Loop optimization techniques are:
Code Motion (or Code movement)
Induction Variables
Strength Reduction
Loop jamming (merging/combining)
Loop unrolling
9|P age
CST 302: Compiler Design Module V
B.Tech, CSE, S6
10 | P a g e
CST 302: Compiler Design Module V
B.Tech, CSE, S6
Loop Jamming:
If the operations performed can be done in a single loop then, merge or combine
the loops.
11 | P a g e
CST 302: Compiler Design Module V
B.Tech, CSE, S6
The above program code can be converted to,
Loop Unrolling:
If there exists, simple code which can reduce the number of times the loop
executes then, the loop can be replaced with these codes.
This can be converted to,
12 | P a g e
CST 302: Compiler Design Module V
B.Tech, CSE, S6
Optimization of Basic Blocks
There are two types of basic block optimizations:
1. Structure preserving transformations
2. Algebraic transformations
(Note: In structure preserving transformations, write about:
Dead code elimination
Copy propagation (variable propagation and constant propagation)
Common sub expression elimination
Strength reduction
Constant folding
Interchange of two independent statements )
Copy Propagation:
Strength Reduction:
13 | P a g e
CST 302: Compiler Design Module V
B.Tech, CSE, S6
Constant Folding:
Solve the numeric expressions and use the result in its place, so that the compiler
need not calculate the result all the time.
Algebraic Transformations:
Local and Global Optimization: (If asked about the different techniques used -
Write about the basic block optimization techniques in local optimization and the
loop optimization techniques in the global optimization)
14 | P a g e
CST 302: Compiler Design Module V
B.Tech, CSE, S6
(Many of the structure preserving transformations can be done with the help of
Directed Acyclic Graphs - DAG)
Directed Acyclic Graph
15 | P a g e
CST 302: Compiler Design Module V
B.Tech, CSE, S6
16 | P a g e
CST 302: Compiler Design Module V
B.Tech, CSE, S6
Machine Independent Optimization:
(All the above discussed optimization techniques are machine independent. i.e., we
did the optimization on three address code statements.)
Machine Dependent Optimization:
(When asked for 5 or 10 marks write about Peephole optimization)
Machine dependent optimization is done on the target code. This is an optional step. The
most common machine dependent optimization technique is the Peephole optimization
technique.
Peephole Optimization:
Peephole optimization is a simple and effective technique for locally improving target code. This
technique is applied to improve the performance of the target program by examining the
short sequence of target instructions (called the peephole) and replace these instructions
replacing by shorter or faster sequence whenever possible. Peephole is a small, moving
window on the target program.
The peephole optimization can be applied to the target code using the following characteristic.
1. Redundant Load/Store
2. Remove Unreachable Code
3. Flow of Control Optimization
4. Algebraic Simplifications
5. Use of Machine Idioms
Redundant Load/Store
The redundant load and store instructions can be eliminated. For example,
17 | P a g e
CST 302: Compiler Design Module V
B.Tech, CSE, S6
We can eliminate the second instruction since x is in already R0.
Unreachable Code
Unreachable code is a part of the program code that is never accessed because of
programming constructs. Programmers may have accidently written a piece of code that can
never be reached.
For example,
void fun()
{
int a=10, b=20, c;
c = a * 10;
return c;
b = b * 15;
return b;
}
Here when the statement „return c‟ is encountered, its flow will be transferred to the
caller of this function. Hence the two statements below it will never get executed.
Thus we can consider those two statements at unreachable and rewrite the code as,
void fun()
{
int a=10, b=20, c;
c = a * 10;
return c;
}
Now, we can see that in the above optimized code, „b=20‟ has become a dead code since
it is no longer used inside this function. Hence we can eliminate it.
Flow of Control Optimization:
There are instances in a code where the program control jumps back and forth
without performing any significant task. These jumps can be removed. Consider the
following chunk of code:
18 | P a g e
CST 302: Compiler Design Module V
B.Tech, CSE, S6
In this code, label L1 can be removed as it passes the control to L2. So instead of jumping
to L1 and then to L2, the control can directly reach L2, as shown below:
Algebraic Simplifications
Use of Machine Idioms
Some target instructions have equivalent machine instructions to perform some type
of operations. Hence we can replace these target instructions with the corresponding
machine instructions to improve the efficiency.
For example, some machines have auto-increment and auto-decrement addressing
modes. These modes can be used in the code for a statement like i = i+1.
CODE GENERATION
19 | P a g e
CST 302: Compiler Design Module V
B.Tech, CSE, S6
Issues in the Design of a Code Generator:
Input to Code Generator
The input to the code generator consists of the intermediate representation of the
source program produced by the front end, together with information in the
symbol table.
There are several choices for the intermediate language including postfix notation,
three address representation such as quadruple, virtual machine
20 | P a g e
CST 302: Compiler Design Module V
B.Tech, CSE, S6
representations such as stack machine code, and graphical representations
such as syntax trees and DAGS.
We assume that prior to code generation the front end scanned, parsed and
translated the source program into a reasonably detailed intermediate
representation, along with type checking.
The code generation phase can therefore proceed on the assumption that its input
is free of errors.
Target Programs
The output of the code generator is the target program. This output may take
on a variety of forms- absolute machine language, relocatable machine language
or assembly language.
Producing an absolute machine language as output has the advantage that it
can be placed in a fixed location in memory and intermediate executed.
Producing a relocatable machine language program as output allows subprograms
to be compiled separately. A set of relocatable object modules can be linked
together and loaded for execution by a linking loader.
Producing an assembly language program as output makes the process of code
generation somewhat easier.
The instruction set architecture of the target machine has a significant impact
on the difficulty of constructing a good code generator that produces high quality
machine code. The most common target machine architectures are RISC
(reduced instruction set computer), CISC (complex instruction set computer) and
stack based.
The RISC machine has many registers, three-address instructions, simple
addressing modes and a relatively simple instruction set architecture. In contrast,
a CISC machine has few registers, two-address instructions, a variety of
addressing modes, several register classes, variable length instructions.
In stack based machine, operations are done by pushing operands onto the
stack and then performing the operations on the operands at the top of the stack.
To achieve high performance, the top of the stack is typically kept in registers.
Memory Management
Mapping of variable names to address is done co-operatively by the front end and
code generator. Name and width are obtained from symbol table. Width is the amount
of storage needed for that variable. Each three-address code is translated to addresses
and instructions during code generation. A relative addressing is done for each
instruction.
21 | P a g e
CST 302: Compiler Design Module V
B.Tech, CSE, S6
Instruction Selection
The code generator must map the IR (Intermediate Representation) into a code sequence that can
be executed by the target machine. The complexity of performing this mapping is determined by
factors such as,
the level of the IR
the nature of the instruction-set architecture
the desired quality of the generated code
If the IR is high level, the code generator may translate each IR statement into a sequence of
machine instructions using code templates. Such statement-by-statement code generation
often produces poor code that needs further optimization. If IR reflects some of the low
level details of the underlying machine, then the code generator can use this information to
generate more efficient code sequence.
The nature of the instruction set of the target machine has a strong effect on the difficulty of
instruction selection. Uniformity and completeness of the instruction set are important factors.
Instruction speed and machine idioms are other important factors. If we do not care
about the efficiency of the target program, instruction selection is straightforward.
For each common three-address statement, a general code can be designed.
But if we are looking for the efficiency of the program, we need to look for several
other factors too. For example, consider the below statements.
22 | P a g e
CST 302: Compiler Design Module V
B.Tech, CSE, S6
The quality of the generated code is usually determined by its speed and size. For
example, if the target machine has an increment instruction INC, then the three-
address statement a = a+1 may be implemented more efficiently by the single
instruction INC a, rather than by a more obvious sequence that loads a into a register,
adds one to the register, and then store the result back into a.
That is, instead of writing
We can simply write, INC a.
Register allocation:
23 | P a g e
CST 302: Compiler Design Module V
B.Tech, CSE, S6
Choice of Evaluation Order:
The order of evaluation can affect the efficiency of target code. Some order requires
fewer registers and instructions than others.
Picking the best order is an NP-complete problem. This can be solved up to an extent
by code optimization in which the order of instruction may change.
Target Machine
The source and destination of an instruction are specified by combining registers
and memory locations with address modes. The address modes together with their
assembly-language forms and associated costs are as follows:
24 | P a g e
CST 302: Compiler Design Module V
B.Tech, CSE, S6
Immediate
25 | P a g e
CST 302: Compiler Design Module V
B.Tech, CSE, S6
Simple Code Generator
Code generator is used to produce the target code for three-address statements. It
uses registers to store the operands of the three address statement.
Code Generation Algorithm
The algorithm takes a sequence of three-address statements as input. For each three
address statement of the form x := y op z perform the various actions. These are as
follows:
y op z should
Code generator uses getReg function to determine the status of available registers
and the location of name values. getReg works as follows:
If variable Y is already in register R, it uses that register.
Else if some register R is available, it uses that register.
Else if both the above options are not possible, it chooses a register that
requires minimal number of load and store instructions. That is, it takes a
26 | P a g e
CST 302: Compiler Design Module V
B.Tech, CSE, S6
register which is already occupied, move its contents into some memory using
the instruction MOV R, M. and then uses the register R.
27 | P a g e
CST 302: Compiler Design Module V
B.Tech, CSE, S6
28 | P a g e