Dpco Unit 4
Ans.: Degradation of performance in an instruction pipeline may be due to address dependency, where an
operand address cannot be calculated without information supplied by the addressing mode. For example, an
instruction with register indirect mode cannot proceed to fetch the operand if the previous instruction is
loading that address into the register. Operand access is therefore delayed, degrading the performance of the
pipeline.
6. What is data hazard in pipelining? What are the solutions ? AU: Dec.-07, May-12, 13
Ans.: When either the source or the destination operands of an instruction are not available at the time
expected in the pipeline, and the pipeline is stalled as a result, we say such a situation is a data hazard.
1. The easiest way to handle data hazards is to stall the pipeline.
2. The second simple hardware technique which can handle data hazards is called forwarding or register
bypassing.
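As a sketch, the cost of the two remedies can be modelled for a hypothetical 5-stage pipeline (IF, ID, EX, MEM, WB) in which the register file is written in the first half of WB and read in the second half; the stall counts below follow from that assumption and are illustrative, not from any specific processor.

```python
# Toy model of a RAW data hazard between an ALU producer and a
# consumer issued `gap` instructions later (gap=1 means back-to-back).

def stall_cycles(gap, forwarding=False):
    """Stall cycles the consumer suffers before it can read its operand.

    Without forwarding, the consumer's decode must wait for the
    producer's write-back; with a split-cycle register file that
    costs max(0, 3 - gap) stalls. An EX->EX forwarding path removes
    the stall entirely for ALU-to-ALU dependences.
    """
    if forwarding:
        return 0
    return max(0, 3 - gap)

for gap in (1, 2, 3):
    print(gap, stall_cycles(gap), stall_cycles(gap, forwarding=True))
```

A load-use dependence would still need one stall even with forwarding, since the loaded value only exists after MEM; that case is omitted here.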
7. What are the classifications of data hazards ? AU: May-17
Ans.: The data hazard can be classified as,
1. RAW (read after write) hazard
2. WAW (write after write) hazard
3. WAR (write after read) hazard
8. What is meant by speculative execution ? AU: May-12
Ans.: Speculative execution means that instructions are executed before the processor is certain that they are
in the correct execution sequence. Hence, care must be taken that no processor registers or memory
locations are updated until it is confirmed that these instructions should indeed be executed. If the branch
decision indicates otherwise, the instructions and all their associated data in the execution units must be
purged, and the correct instructions fetched and executed.
9. What is called static and dynamic branch prediction ?
OR Differentiate between the static and dynamic techniques. AU: May-13
Ans.: If the branch prediction decision is always the same every time a given instruction is executed, the
approach is called static branch prediction. Another approach, in which the prediction decision may change
depending on execution history, is called dynamic branch prediction.
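The difference can be illustrated with a toy simulation (the outcome trace and the predictor's initial state are invented for illustration): a static rule predicts the same way every time, while a 1-bit dynamic predictor adapts to the recent history.

```python
# True = branch taken. The trace models a branch that is taken for a
# while and then stops being taken.

def static_accuracy(trace, prediction=True):
    """Accuracy of a fixed 'always taken' (or 'never taken') rule."""
    return sum(outcome == prediction for outcome in trace) / len(trace)

def dynamic_1bit_accuracy(trace):
    """1-bit predictor: predict whatever the branch did last time."""
    pred, hits = True, 0
    for taken in trace:
        hits += (pred == taken)
        pred = taken              # update the history bit
    return hits / len(trace)

trace = [True] * 8 + [False] * 8
print(static_accuracy(trace))        # 0.5
print(dynamic_1bit_accuracy(trace))  # 0.9375 (one miss when behaviour flips)
```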
10. What is the role of cache in pipelining? AU: Dec.-11.
Ans.: Each pipeline stage is expected to complete in one clock cycle. The clock period should be long enough
for the slowest pipeline stage to complete; faster stages have to wait for the slowest one. Since main memory
is very slow compared to execution, the pipeline would be almost useless if each instruction had to be fetched
from main memory. The cache memory reduces the memory access time and makes pipelining useful.
1. Explain about building a data path. (13)
2. Explain about pipelining. (13)
3. What are pipeline hazards? Outline the types of pipeline hazards. (13)
Hazards
There are situations in pipelining when the next instruction cannot execute in the following
clock cycle. These events are called hazards, and there are three different types.
1) Structural Hazard
2) Data Hazard
3) Control Hazard
For example, in cycle 4 of a pipelined sequence, instructions I1 and I4 may both try to access the same
resource (memory), which introduces a resource conflict.
● To avoid this problem, we have to keep the instruction waiting until the required resource (memory in our
case) becomes available. This wait introduces stalls in the pipeline.
When dependent instructions are executed in a pipelined processor, a data dependency condition can occur:
for example, I2 tries to read data before I1 writes it, so I2 incorrectly gets the old value instead of the
result of I1.
● To minimize data dependency stalls in the pipeline, operand forwarding is used.
Operand Forwarding : In operand forwarding, the interface registers present between the stages are used to
hold intermediate output, so that a dependent instruction can access the new value directly from the
interface register.
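A minimal sketch of the idea, with invented register names and values: the dependent instruction's EX stage reads the fresh result from the EX/MEM interface register instead of the not-yet-updated register file.

```python
regfile = {"R1": 0, "R2": 10, "R3": 20}   # R1 still holds a stale 0
ex_mem = {"dest": "R1", "value": regfile["R2"] + regfile["R3"]}
# I1: ADD R1, R2, R3 has just left EX; its result (30) sits in EX/MEM.

def read_operand(reg):
    """Forwarding-unit check: bypass from EX/MEM when it holds `reg`."""
    if ex_mem.get("dest") == reg:
        return ex_mem["value"]            # forwarded, fresh value
    return regfile[reg]                   # normal register-file read

# I2: SUB R4, R1, R2 executes in the very next cycle.
result = read_operand("R1") - read_operand("R2")
print(result)   # 20 -- uses the forwarded 30, not the stale 0 in R1
```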
The output sequence we get is I1 -> I2 -> I3 -> BI1, which is not equal to the expected sequence; the
instructions fetched after the branch should not have executed, so the branch is not handled correctly by
the pipeline.
1. Using Stalls - To correct this problem, we need to stop the instruction fetch until we get the target
address of the branch instruction. This can be implemented by introducing delay slots (bubbles) until the
target address is available.
2. Branch Prediction
a. Predicting at run time whether a branch will be taken, based on its recent behaviour, is called dynamic
branch prediction.
b. A branch prediction buffer or branch history table is a small memory indexed by the lower portion of the
address of the branch instruction. The memory contains a bit that says whether the branch was recently
taken or not.
3. Delayed Branching
1) The slot directly after a delayed branch instruction is called the branch delay slot; in the MIPS
architecture it is filled with an instruction that does not affect the branch.
2) The instruction placed in the branch delay slot always executes, whether or not the branch is taken.
4. (i) Outline a control unit with a diagram and state the functions performed by a control unit.(8)
(ii) Outline the difference between hardwired and microprogrammed control. (5)
Hardwired Control Unit
● It is a method of generating control signals with the help of finite state machines
(FSMs). It is built as a sequential logic circuit by physically connecting components
such as flip-flops, gates, and decoders, resulting in the finished circuit. As a result,
it is known as a hardwired controller.
● The control unit also receives information about interrupts, which will affect the execution of an
instruction.
Microprogrammed Control Unit
● A control unit whose binary control values are saved as words in memory is called a
microprogrammed control unit.
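The organisation can be sketched as follows; the control signal names and the three-word fetch routine are invented for illustration, not taken from any particular machine.

```python
# Each control word is a tuple of signal bits stored in a control
# memory; a micro-program counter steps through them in sequence.

CONTROL_MEMORY = [
    # (PCout, MARin, Read, IRin)
    (1, 1, 1, 0),   # gate PC onto the bus, load MAR, start memory read
    (0, 0, 1, 0),   # keep the read asserted while memory responds
    (0, 0, 0, 1),   # latch the fetched word into IR
]

def run_microprogram(memory):
    upc = 0                         # micro program counter
    while upc < len(memory):
        yield memory[upc]           # these bits drive the datapath
        upc += 1                    # simple sequencer: fall through

for step, word in enumerate(run_microprogram(CONTROL_MEMORY)):
    print(step, word)
```

Changing the machine's behaviour then means rewriting words in the control memory rather than rewiring logic, which is the main contrast with a hardwired unit.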
5. (i) Explain about Instruction Execution. (6)
I. Instruction Execution
Steps in detail:
❖ All instructions start by using the program counter to supply the instruction address to
the instruction memory.
❖ After the instruction is fetched, the register operands used by an instruction are
specified by fields of that instruction.
❖ Once the register operands have been fetched, all the instruction classes, except jump,
use the ALU after reading the registers.
➢ Memory reference instructions (load or store) use the ALU for an address
calculation.
➢ Arithmetic Logical instructions use the ALU for the operation execution.
❖ The second input to the ALU can come from a register or the immediate field of the
instruction.
❖ After using the ALU, the actions required to complete the various instruction classes are
not the same.
➢ Branches require the use of the ALU output to determine the next instruction
address, which comes either from the ALU (where the PC and branch offset
are summed) or from an adder that increments the current PC by 4.
Main 5 steps
1. Fetch an instruction and increment the program counter.
2. Decode the instruction and read registers from the register file.
3. Perform an ALU operation.
4. Read or write memory data if the instruction involves a memory operand.
5. Write the result into the destination register, if needed.
Load Instruction
Steps as follows:
1. Fetch the instruction and increment the program counter.
2. Decode the instruction and read the base register.
3. Compute the effective address.
4. Read the memory operand.
5. Load the data into the destination register.
❖ Depending on how the hardware is organized, some of these actions can be performed
at the same time.
Arithmetic and Logic Instruction
● There are either two source registers, or a source register and an immediate source
operand.
● No access to memory operands is required.
Eg. Add R3, R4, R5
Steps as follows
1. Fetch the instruction and increment the program counter.
2. Decode the instruction and read registers R4 and R5.
3. Compute the sum [R4] + [R5].
4. No action.
5. Load the result into the destination register, R3.
Store Instruction
Steps as follows:
1. Fetch the instruction and increment the program counter.
2. Decode the instruction and read registers R6 and R8.
3. Compute the effective address X + [R8].
4. Store the contents of register R6 into memory location X + [R8].
5. No action.
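The Add and Store walkthroughs above can be traced on a toy register/memory model; the register values and the offset X = 20 are chosen arbitrarily.

```python
regs = {"R3": 0, "R4": 7, "R5": 8, "R6": 42, "R8": 100}
mem = {}
pc = 0

# Add R3, R4, R5
pc += 4                              # 1. fetch, increment PC
a, b = regs["R4"], regs["R5"]        # 2. decode, read R4 and R5
total = a + b                        # 3. compute [R4] + [R5]
                                     # 4. no memory action
regs["R3"] = total                   # 5. write result into R3

# Store R6, X(R8) with X = 20
pc += 4                              # 1. fetch, increment PC
src, base = regs["R6"], regs["R8"]   # 2. decode, read R6 and R8
addr = 20 + base                     # 3. effective address X + [R8]
mem[addr] = src                      # 4. store R6 at X + [R8]
                                     # 5. no register write-back

print(regs["R3"], mem[120])          # 15 42
```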
5. (ii) An instruction pipeline has five stages, where each stage takes 2 nanoseconds and all instructions
use all five stages. Branch instructions are not overlapped, i.e., the instruction after a branch is not
fetched until the branch instruction is completed. Under ideal conditions: (7)
(a) Calculate the average instruction execution time assuming that 20% of all instructions executed are
branch instructions. Ignore the fact that some branch instructions may be conditional.
(b) Repeat assuming that 80% of the branch instructions are conditional and 50% of the conditional branches
result in the branch being taken.
Each stage takes 2 ns, so after 5 time units of 2 ns each, the first instruction finishes (i.e., after 10 ns);
after that, a new instruction finishes every 2 ns, assuming no branch instructions. Once the pipeline is full,
the initial fill time does not affect our calculations, and the average execution time for each instruction
is 2 ns, assuming no branch instructions.
A. Now, we are given that 20% of instructions are branches (like JMP) and, when a branch instruction is
executed, no further instruction enters the pipeline. So we can assume every 5th instruction is a branch
instruction. With this assumption, the total time to finish 5 instructions will be 5 ∗ 2 + 8 = 18 ns
(when a branch instruction enters the pipeline and before it finishes, 4 pipeline stages will be empty,
totaling 4 ∗ 2 = 8 ns, since the question states that the next instruction fetch starts only when the
branch instruction completes). This is the same for every set of 5 instructions, hence the average
instruction execution time = 18/5 = 3.6 ns.
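The arithmetic in part (A) can be checked directly:

```python
stage_time = 2                               # ns per pipeline stage
stages = 5
branch_penalty = (stages - 1) * stage_time   # 4 empty stages = 8 ns
# one branch per 5 instructions: 5 issue slots plus one penalty
avg = (5 * stage_time + branch_penalty) / 5
print(avg)   # 3.6
```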
B. Here we need to identify the percentage of branch instructions that actually cause a branch to be taken,
since the others have no effect on the pipeline flow.
20% of instructions are branch instructions, and 80% of branch instructions are conditional.
That means 0.2 ∗ 0.8 = 16% of instructions are conditional branch instructions, and it is given that 50% of
those result in a branch being taken.
So, 8% of instructions are conditional branches that are taken, and we also have 20% of 20% = 4%
unconditional branch instructions, which are always taken.
So, the percentage of instructions where a branch is taken is 8 + 4 = 12%, instead of 20% as in part (A).
So, in 100 instructions there will be 12 taken branches. We can do the calculation over 100 instructions
here, unlike in (A). Each taken branch causes a pipeline delay of 4 ∗ 2 = 8 ns, so 12 taken branches cause a
delay of 12 ∗ 8 = 96 ns. Without any delay, 100 instructions need 100 ∗ 2 = 200 ns; with the delay we require
200 + 96 = 296 ns for 100 instructions, so the average instruction execution time = 296/100 = 2.96 ns.
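The arithmetic in part (B) can be checked directly:

```python
branch_frac = 0.20
cond_taken = branch_frac * 0.80 * 0.50   # taken conditional branches: 8%
uncond = branch_frac * 0.20              # unconditional (always taken): 4%
taken = cond_taken + uncond              # 12% of all instructions
penalty_ns = 4 * 2                       # 4 empty stages of 2 ns each
total_ns = 100 * 2 + 100 * taken * penalty_ns
print(round(taken, 2), round(total_ns), round(total_ns / 100, 2))
# 0.12 296 2.96
```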