[go: up one dir, main page]

0% found this document useful (0 votes)
140 views20 pages

Unit 5 - Compiler Design - WWW - Rgpvnotes.in

This document provides information about code optimization and generation in compiler design. It discusses the organization of a code optimizer and basic blocks and flow graphs. The key points are: 1. Code optimization aims to produce efficient target code while maintaining semantic equivalence and improving program efficiency. Both programmers and compilers play a role in generating efficient code. 2. Optimization techniques can be classified as machine dependent or machine independent. Basic blocks, flow graphs, and dominators are important concepts for code optimization and generation. 3. Common optimization techniques include loop optimization, common subexpression elimination, dead code elimination, and variable propagation. Basic blocks are sequences of statements without branches. Flow graphs represent basic blocks and control flow.

Uploaded by

Akash
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
140 views20 pages

Unit 5 - Compiler Design - WWW - Rgpvnotes.in

This document provides information about code optimization and generation in compiler design. It discusses the organization of a code optimizer and basic blocks and flow graphs. The key points are: 1. Code optimization aims to produce efficient target code while maintaining semantic equivalence and improving program efficiency. Both programmers and compilers play a role in generating efficient code. 2. Optimization techniques can be classified as machine dependent or machine independent. Basic blocks, flow graphs, and dominators are important concepts for code optimization and generation. 3. Common optimization techniques include loop optimization, common subexpression elimination, dead code elimination, and variable propagation. Basic blocks are sequences of statements without branches. Flow graphs represent basic blocks and control flow.

Uploaded by

Akash
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 20

Program : B.

Tech
Subject Name: Compiler Design
Subject Code: IT-603
Semester: 6th
Downloaded from www.rgpvnotes.in

Department of Information Technology


Subject Notes
IT603 (A) – Compiler Design
B.Tech, IT-6th Semester

Unit V
Syllabus: Code Optimization and Generation: organization of code optimizer, basic blocks and
flow graphs, DAG representation of basic blocks, loops in flow graph, peephole optimization,
Basic of block optimization.

Unit Outcome: Design the structures and support required for compiling advanced language
features.

Unit 5: Code Optimization and Generation


The code optimization is required to produce an efficient target code. There are two
important issues that need to be considered while applying the techniques for code
optimization:
1. The semantic equivalence of the source program must not be changed.
2. The important over the program efficiency must be achieved without changing the
algorithm of the program.

The code become inefficient because of two factors; one is programmer and the other factor
is compiler. This is the responsibility of both programmer and the compiler to generate
efficient code using proper usage of target code.

Front End Intermediate Code Code generation

•Programmer •Apply •Use registers,


should make use transformations select
of efficient on loops, appropriate
algorithm procedure calls instructions and
and address do peephole
calculations optimization

Figure 5.1: Code Optimization

Classification of Optimization:
The classification of optimization can be done in two categories:

Page no: 1 Get real-time updates from RGPV


Downloaded from www.rgpvnotes.in

Machine
Dependent
Code
Optimization
Machine
Independent

Figure 5.2: Classification of Code Optimization


The machine dependent optimization is based on characteristics of the target machine for the
instruction set used and addressing modes used for the instruction to produce the efficient
target code. The machine dependent optimization can be achieved using the following
criteria:

 Allocation of sufficient number of resources to improve the execution efficiency of the


program.
 Using immediate instructions whenever necessary.
 The use of intermix instructions. The intermixing of instruction along with the data
increases the speed of execution.
The machine independent optimization is based on the characteristics of the programming
languages for appropriate programming structure and usage of efficient arithmetic
properties in order to reduce the execution time. The machine independent optimization
can be achieved using the following criteria:

 The code should be analyzed completely and use alternative equivalent sequence
of source code that will produce a minimum amount of target code.
 Use appropriate program structure in order to improve the efficiency of target
code.
 From the source program eliminate the unreachable code.
 Move two or more identical computations at one place and make use of the result
instead of each time computing the expressions.

Principle Sources of Optimization:


The optimization can be done locally or globally. If the transformation is applied on the same
basic block then that kind of transformation is done locally otherwise transformation is done
globally. Generally local transformation is done first.

Page no: 2 Get real-time updates from RGPV


Downloaded from www.rgpvnotes.in

Compile Time
Evaluation

Common Sub
Loop Optimization Expression
Elimination

Dead Code Variable


Elimination Propagation

Strength Reduction Code Movement

Figure 5.3: Principle Sources of Optimization

Basic Blocks:
The basic block is a sequence of consecutive statements in which flow of control enters at the
beginning and leaves at the end without halt or possibility of branching. Basic blocks are
important concepts from both code generation and optimization point of view.
Some Terminologies used in Basic Blocks

 Define and use: The three address statement a: = b+c is said to define a and to use b
and c.
 Live and dead: The name in the basic block is said to be live at a given point if its value
is used after that point in the program. And the name (variable) in the basic block is
said to be dead at a given point if its value is never used after that point in the program.
Algorithm for Partitioning into blocks:
Any given program can be partitioned into basic blocks by using following algorithm.
1. First determine the leaders by using following rules:
 The first statement is a leader.
 Any target statement of conditional or unconditional goto is a leader.
 Any statement that immediately follow a goto or unconditional goto is a
leader.

Page no: 3 Get real-time updates from RGPV


Downloaded from www.rgpvnotes.in

2. The basic block is formed starting at the leader statement and ending just before the
next leader statement appearing.

Figure 5.4: Source Code and Basic Block


Basic blocks play an important role in identifying variables, which are being used more than
once in a single basic block. If any variable is being used more than once, the register memory
allocated to that variable need not be emptied unless the block finishes execution.
Flow Graph:
Basic blocks in a program can be represented by means of control flow graphs. A control flow
graph depicts how the program control is being passed among the blocks. It is a useful tool
that helps in optimization by help locating any unwanted loops in the program. A control flow
graph is a directed graph in which the flow control information is added to the basic blocks.

 The nodes to the control flow graph are represented by basic block.
 The block whose leader is the first statement is called initial block.
 There is a directed edge from block B1 to block B2 if B2 immediately follows B1 in the
given sequence. We can say that B1 is a predecessor if B2.

Page no: 4 Get real-time updates from RGPV


Downloaded from www.rgpvnotes.in

Figure 5.5: Basic Block and Flow Graph

Loop:
Loop is a collection of nodes in the flow graph such that,

 All such nodes are strongly connected. That means always there is a path from any
node to any other node within that loop.
 The collection of nodes has unique entry. That means always there is only one path
from a node outside the loop to the node inside the loop.
 The loop that contains no other loop is called inner loop.
Loops on Flow Graph:
A graph representation of three-address statements, called a flow graph, is useful for
understanding code-generation algorithms, even if the graph is not explicitly constructed by
a code-generation algorithm. Nodes in the flow graph represent computations, and the edges
represent the flow of control.

Dominators:
In a flow graph, a node d dominates node n, if every path from initial node of the flow graph
to n goes through d. This will be denoted by d dom n. Every initial node dominates all the
remaining nodes in the flow graph and the entry of a loop dominates all nodes in the loop.
Similarly every node dominates itself.

Example:
In the flow graph below, initial node, node1 dominates every node; node 2 dominates itself;
node 3 dominates all but 1 and 2; node 4 dominates all but 1, 2 and 3; node 5 and 6 dominates
only themselves, since flow of control can skip around either by going through the other; node

Page no: 5 Get real-time updates from RGPV


Downloaded from www.rgpvnotes.in

7 dominates 7, 8, 9 and 10; node 8 dominates 8,9 and 10; node 9 and 10 dominates only
themselves.

Figure 5.6: Flow Graph and Dominator Tree


The way of presenting dominator information is in a tree, called the dominator tree, in which

• The initial node is the root.


• The parent of each other node is its immediate dominator.
• Each node d dominates only it’s descendent in the tree.
The existence of dominator tree follows from a property of dominators; each node has a
unique immediate dominator in that is the last dominator of n on any path from the initial
node to n node. In terms of the dom relation, the immediate dominator m has the property
is d=!n and d dom n, then d dom m.
Natural Loops:
One application of dominator information is in determining the loops of a flow graph suitable
for improvement. There are two essential properties of loops:

 A loop must have a single entry point, called the header. This entry point-dominates
all nodes in the loop, or it would not be the sole entry to the loop.
 There must be at least one way to iterate the loop(i.e.)at least one path back to the
header One way to find all the loops in a flow graph is to search for edges in the flow
graph whose heads dominate their tails. If a→b is an edge, b is the head and a is the
tail. These types of edges are called as back edges.
Inner loops:
If we use the natural loops as “the loops”, then we have the useful property that unless two
loops have the same header, they are either disjointed or one is entirely contained in the

Page no: 6 Get real-time updates from RGPV


Downloaded from www.rgpvnotes.in

other. Thus, neglecting loops with the same header for the moment, we have a natural notion
of inner loop: one that contains no other loop.
When two natural loops have the same header, but neither is nested within the other, they
are combined and treated as a single loop.

Pre-Headers:

Several transformations require us to move statements “before the header”. Therefore begin
treatment of a loop L by creating a new block, called the pre-header. The pre-header has only
the header as successor, and all edges which formerly entered the header of L from outside L
instead enter the pre-header. Edges from inside loop L to the header are not changed. Initially
the pre-header is empty, but transformations on L may place statements in it.

Figure 5.7: (a) Two loops with the same header (b) Introduction of the pre-header
Reducible flow graphs:

Reducible flow graphs are special flow graphs, for which several code optimization
transformations are especially easy to perform, loops are unambiguously defined, dominators
can be easily calculated, and data flow analysis problems can also be solved efficiently.
Exclusive use of structured flow-of-control statements such as if-then-else, while-do,
continue, and break statements produces programs whose flow graphs are always reducible.
The most important properties of reducible flow graphs are that

 There are no jumps into the middle of loops from outside;


 The only entry to a loop is through its header
Global Optimization:
The global optimization is applied over a broad scope such as procedure or function body. For
a global optimization a program is represented in the form of program flow graph. The
program flow graph is a graphical representation in which each node represents the basic
block and edge represent the flow of control from one block to another.

Page no: 7 Get real-time updates from RGPV


Downloaded from www.rgpvnotes.in

Control Flow Analysis

Data Flow Analysis

Figure 5.8: Types of Global Optimization


Control & Data Flow Analysis:
The control flow analysis determines the information regarding arrangement of graph nodes
(basic blocks), presence of loops, nesting of loops and nodes visited before execution of a
specific node. Using this analysis optimization can be performed. Thus in control flow analysis
the analysis is made on the flow of control be carefully examining the program flow graph.
In data flow analysis the analysis is made on the data flow. That is the data flow analysis
determines the information regarding the definition and use of the data in the program. Using
this kind of analysis optimization can be done. The data flow analysis is basically a process in
which the values are computed using data flow properties.
DAG Representation of Block:
The directed acyclic graph is used to apply transformations to the basic block. A DAG gives a
picture of how the value computed by each statement in a basic block used in subsequent
statements of the block. To apply the transformations on a basic block a DAG is constructed
from three address statement.
A DAG can be constructed for the following type of labels on nodes:

 Leaf nodes are labeled by identifiers or variable names or constants. Generally, leaves
represent R-values.
 Interior nodes store operator values.
 Nodes are also optionally given a sequence of identifiers for a label.
The DAG and flow graphs are two different pictorial representations. Each node of the flow
graph can be represented by DAG because each node of the flow graph is a basic block.
Example:

Page no: 8 Get real-time updates from RGPV


Downloaded from www.rgpvnotes.in

Figure 5.9: Basic Block

Page no: 9 Get real-time updates from RGPV


Downloaded from www.rgpvnotes.in

Page no: 10 Get real-time updates from RGPV


Downloaded from www.rgpvnotes.in

Figure 5.10: Steps in DAG Construction Process


Algorithm for Construction of DAG:
We assume the three address statement could of following typestyles,

Case (i) x: = y op z
Case (ii) x: = op y
Case (iii) x: = y

Page no: 11 Get real-time updates from RGPV


Downloaded from www.rgpvnotes.in

With the help of the following step, the DAG can be constructed.
Step 1: If y is undefined then create a node (y). Similarly, if z is undefined create a: node (z)
Step 2: For the case (i) create a node (op) whose left child is node (y) and node (z) will be) the
right child. Also, check for any common sub-expressions. For the case (ii) determine whether
is a node labeled op, such node will have a child node (y). In case (iii) node n win be node
(node(y).
Step 3: Delete x from list of identifiers for node (x). Append x to the list of attached identifiers
for node n found in 2.
Applications of DAG:
1. Determining the common sub-expressions (expressions computed more than once).

2. Determining which names are used inside the block and computed outside the block.
3. Determining which statements of the block could have their computed value outside the
block.
4. Simplifying the list of quadruples by elimination the common sub-expressions and not
eliminating expressions performing the assignment of form x: =y unless and until it is a must.
Optimization of Basic Blocks:
There are two types of basic block optimizations. They are:

Structure
Preserving
Transformations

Algebraic
Transformations

Figure 5.11: Types of Basic Block Optimization


Structure-Preserving Transformations:

Page no: 12 Get real-time updates from RGPV


Downloaded from www.rgpvnotes.in

Common Sub-
expression Renaming of
Elimination Temporary Variables

Dead Code Interchange of two


Elimination independent
adjacent statements

Figure 5.12: Primary Structure-Preserving Transformation


Common Sub-expression Elimination:

Common sub expressions need not be computed over and over again. Instead they can be
computed once and kept in store from where it’s referenced.

Example:
a: =b+c
b: =a-d
c: =b+c
d: =a-d

The 3rd and 4th statements compute the same expression: b+c and a-d
Basic block can be transformed to
a: = b+c
b: = a-d
c: = a

d: = b
Dead Code Elimination:
It is possible that a large amount of dead (useless) code may exist in the program. This might
be especially caused when introducing variables and procedures as part of construction or
error-correction of a program - once declared and defined, one forgets to remove them in
case they serve no purpose. Eliminating these will definitely optimize the code.
Renaming of Temporary Variables:
A statement t:=b+c where t is a temporary name can be changed to u:=b+c where u is another
temporary name, and change all uses of t to u. In this a basic block is transformed to its
equivalent block called normal-form block.
Interchange of Two Independent Adjacent Statements:

Page no: 13 Get real-time updates from RGPV


Downloaded from www.rgpvnotes.in

Two statements
t1:=b+c
t2:=x+y

can be interchanged or reordered in its computation in the basic block when value of t1 does
not affect the value of t2.

Algebraic Transformations:
Algebraic identities represent another important class of optimizations on basic blocks. This
includes simplifying expressions or replacing expensive operation by cheaper ones i.e.
reduction in strength. Another class of related optimizations is constant folding. Here we
evaluate constant expressions at compile time and replace the constant expressions by their
values. Thus the expression 2*3.14 would be replaced by 6.28.

The relational operators <=, >=, <, >, + and = sometimes generate unexpected common sub
expressions. Associative laws may also be applied to expose common sub expressions.

For example, if the source code has the assignments


a: =b+c
e: =c+d+b
the following intermediate code may be generated:
a: =b+c; t :=c+d; e :=t+b

Example:
x: =x+0 can be removed
x: =y**2 can be replaced by a cheaper statement x:=y*y
The compiler writer should examine the language specification carefully to determine what
rearrangements of computations are permitted, since computer arithmetic does not always
obey the algebraic identities of mathematics. Thus, a compiler may evaluate x*y-x*z as x*(y-
z) but it may not evaluate a+(b-c) as (a+b)-c.
Peephole Optimization:
A statement-by-statement code-generations strategy often produces target code that
contains redundant instructions and suboptimal constructs. The quality of such target code
can be improved by applying “optimizing” transformations to the target program.
A simple but effective technique for improving the target code is peephole optimization, a
method for trying to improving the performance of the target program by examining a short
sequence of target instructions (called the peephole) and replacing these instructions by a
shorter or faster sequence, whenever possible.

Page no: 14 Get real-time updates from RGPV


Downloaded from www.rgpvnotes.in

The peephole is a small, moving window on the target program. The code in the peephole
need not be contiguous, although some implementations do require this. It is characteristic
of peephole optimization that each improvement may spawn opportunities for additional
improvements.
Characteristics of Peephole Optimization:

Machine
Idioms
Reduction in
Strength
Algebric
Simplification
The Flow of
Control
Unreachable Optimization
Code
Redundant
Instruction
Elimination

Figure 5.13: Characteristics of Peephole Optimization

Redundant Instruction Elimination:


Especially the redundant loads and stores can be eliminated in this type of transformations.
Example:
MOV R0, x
MOV x, R0

We can eliminate the second instruction since x is in already R0. But if MOV x, R0 is a label
statement then we cannot remove it.

Unreachable Code:

 Especially the redundant loads and stores can be eliminated in this type of
transformations.
 An unlabeled instruction immediately following an unconditional jump may be
removed.
 This operation can be repeated to eliminate the sequence of instructions.
Example:
#define debug 0
If(debug) {
Print debugging information
}
In the intermediate representation the if statement may be translated as if-

Page no: 15 Get real-time updates from RGPV


Downloaded from www.rgpvnotes.in

If debug=1 goto L1
goto L2
L1: print debugging information

L2:
Additionally, one obvious peephole optimization is to eliminate jumps over jumps. Thus no
matter what the value of debugging, can replaced by:
If debug goto L2debug≠1
Print debugging information
L2:
Now, since debug set to 0 at the beginning of the program, constant propagation program,
should replace by

If 0≠1 goto L2≠1


Print debugging information
L2:
As the argument of the first statement of evaluates to a constant true, it can replace by goto
L2.
Then all the statement that prints debugging aids are manifestly unreachable and can
manifestly eliminate one at a time.
The Flow of Control Optimization:
The unnecessary jumps can eliminate in either the intermediate code or the target code by
the following types of peephole optimizations.

We can replace the jump sequence.


Goto L1
……
L1: goto L2
By the sequence

Goto L2
…….
L1: goto L2
If there are no jumps to L1 then it may be possible to eliminate the statement L1: goto L2
provided it preceded by an unconditional jump. Similarly, the sequence

Page no: 16 Get real-time updates from RGPV


Downloaded from www.rgpvnotes.in

If a<b goto L1
……
L1: goto L2

Can replaced by
If a<b goto L2
……
L1: goto L2
Algebraic Simplification:

 So Peephole optimization is an effective technique for algebraic simplification.


 The statements such as x = x + 0 or x:= x* 1 can eliminated by peephole optimization.
Reduction in Strength:

 Certain machine instructions are cheaper than the other.


 In order to improve the performance of the intermediate code, we can replace these
instructions by equivalent cheaper instruction.
 So For example, x2 is cheaper than x * x. Similarly, addition and subtraction are
cheaper than multiplication and division. So we can add an effectively equivalent
addition and subtraction for multiplication and division.
Machine Idioms:

 So the target instructions have equivalent machine instructions for performing some
have operations.
 Hence we can replace these target instructions by equivalent machine instructions in
order to improve the efficiency.
 Example: Some machines have auto-increment or auto-decrement addressing modes.
These modes can use in a code for a statement like i=i+1.

Basic of Block Optimization:


Optimization process can be applied on a basic block. While optimization, we don't need to
change the set of expressions computed by the block.
There are two type of basic block optimization. These are as follows:
Structure-Preserving Transformations
Algebraic Transformations

1. Structure preserving transformations:


The primary Structure-Preserving Transformation on basic blocks is as follows:

Page no: 17 Get real-time updates from RGPV


Downloaded from www.rgpvnotes.in

 Common sub-expression elimination


 Dead code elimination
 Renaming of temporary variables
 Interchange of two independent adjacent statements
Algebraic transformations:
In the algebraic transformation, we can change the set of expression into an algebraically
equivalent set. Thus the expression x: = x + 0 or x: = x *1 can be eliminated from a basic block
without changing the set of expression.
Constant folding is a class of related optimization. Here at compile time, we evaluate constant
expressions and replace the constant expression by their values. Thus the expression 5*2.7
would be replaced by13.5.
Sometimes the unexpected common sub expression is generated by the relational operators
like <=, >=, <, >, +, = etc.
Sometimes associative expression is applied to expose common sub expression without
changing the basic block value. If the source code has the assignments

Page no: 18 Get real-time updates from RGPV


We hope you find these notes useful.
You can get previous year question papers at
https://qp.rgpvnotes.in .

If you have any queries or you want to submit your


study notes please write us at
rgpvnotes.in@gmail.com

You might also like