Ut2 QB
2) How does a Hardwired control unit differ from a Microprogrammed control
unit?
A Hardwired Control Unit and a Microprogrammed Control Unit differ mainly in
how they generate control signals to execute instructions in a computer's
CPU.
A Hardwired Control Unit uses fixed logic circuits, including gates, flip-flops,
and other combinational logic, to generate specific control signals directly.
The control signals are predetermined and hard-coded into the hardware.
A Microprogrammed Control Unit stores a set of microinstructions (microcode)
in a control memory. These microinstructions are executed one by one to
generate the control signals, providing a more flexible way of controlling CPU
operations.
A Hardwired Control Unit is difficult to modify because the control signals are
hardcoded: any change requires redesigning and replacing hardware
components, making it less adaptable. A Microprogrammed Control Unit is easy
to modify, since the control signals are generated from microinstructions
stored in memory; to modify or add new operations, you can simply update
the microprogram, making it more flexible.
Comparison:
The Hardwired Control Unit generates control signals using fixed logic
circuits such as combinational logic gates, flip-flops, and decoders.
Control signals are created directly from hardware based on the current
instruction and the status of the CPU.
The control logic is "hardwired" or permanently fixed, meaning the design
is static and cannot be easily modified once implemented.
Main Idea: Instructions are executed by activating specific control signals
through predefined hardware paths.
Advantages (of a microprogrammed control unit):
1) A microprogrammed control unit is flexible and allows designers to
incorporate new and more powerful instructions as VLSI technology increases
the available chip area for the CPU.
2) It allows any design errors discovered during the prototyping stage to be
removed.
Disadvantages:
1) It requires several clock cycles to execute each instruction, due to the
access time of the microprogram memory.
2) It occupies a large portion (typically 55%) of the CPU chip area.
• Here the behaviour of the control unit is represented in the form of a
table, which is known as the state table.
• Each row represents a T-state and each column represents an
instruction.
• Each intersection of a column and a row indicates which control signal
is produced in the corresponding T-state of that instruction.
• The hardware circuitry is designed per column (i.e. per instruction) to
produce the control signals in the different T-states.
Advantage –
• It is the simplest method.
• This method is mainly used for processors with small instruction sets
(e.g. RISC processors).
Drawback –
• Modern processors have very large instruction sets, so the circuit
becomes complicated to design and difficult to debug, and any
modification to the state table forces large parts of the circuit to
be changed.
• Therefore, this method is not widely used for such processors.
• There is much redundancy in the circuit design: the control signals
required for fetching an instruction are common to all instructions,
yet they are repeated for each of the N instructions. So the cost of
the circuitry may increase.
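As a rough sketch (in Python, with made-up instruction and control-signal names, not a real processor's), the state table can be modelled as a lookup from (instruction, T-state) to the list of control signals asserted in that state:

```python
# Minimal sketch of a hardwired control state table: each
# (instruction, T-state) pair maps to the control signals asserted
# in that T-state. Instruction and signal names are illustrative.
STATE_TABLE = {
    # Fetch (T1-T2) is common to every instruction.
    ("ADD", 1): ["PC_out", "MAR_in"],
    ("ADD", 2): ["Mem_read", "MDR_out", "IR_in", "PC_inc"],
    ("ADD", 3): ["R2_out", "TEMP_in"],
    ("ADD", 4): ["R1_out", "ALU_add", "R1_in"],
    ("LOAD", 1): ["PC_out", "MAR_in"],
    ("LOAD", 2): ["Mem_read", "MDR_out", "IR_in", "PC_inc"],
    ("LOAD", 3): ["Addr_out", "MAR_in"],
    ("LOAD", 4): ["Mem_read", "MDR_out", "Reg_in"],
}

def control_signals(instruction, t_state):
    """Return the control signals for one T-state of an instruction."""
    return STATE_TABLE.get((instruction, t_state), [])
```

Note how the fetch rows for T1 and T2 repeat for every instruction: this is exactly the redundancy that drives up the cost of the hardwired design.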
The instruction ADD R1, R2 means "Add the value in register R2 to the value in
register R1 and store the result in R1." This process can be divided into the
following steps in a microprogrammed control unit.
Instruction Fetch:
MAR ← PC (Move the content of the Program Counter to the Memory Address
Register)
MDR ← Memory[MAR] (Move the content of the memory at the address in MAR
to the Memory Data Register)
IR ← MDR (Load the fetched instruction into the Instruction Register)
PC ← PC + 1 (Increment the Program Counter to point to the next instruction)
Instruction Decode:
Decode the instruction in IR to recognize the ADD R1, R2 operation. The control
unit identifies that the operation is an addition and that the registers involved
are R1 and R2.
Instruction Execute:
TEMP ← R2 (Copy the contents of R2 into a temporary register)
R1 ← R1 + TEMP (Add the temporary value to R1 and store the result back in R1)
Explanation:
- Step 1-4 (Fetch): These steps retrieve the instruction from memory and
prepare for execution by updating the Program Counter.
- Step 5 (Decode): The control unit decodes the instruction to identify the
operation (ADD) and registers (R1 and R2).
- Step 6-7 (Execute): The value of R2 is fetched into a temporary register,
and then it's added to the value in R1, storing the result back into R1.
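The fetch, decode, and execute steps above can be sketched as a small Python simulation; the tuple encoding of the instruction and the use of a dict as the register file are illustrative conveniences, not a real instruction format:

```python
# Sketch of the register transfers for ADD R1, R2. Registers live in a
# dict; memory maps addresses to pre-decoded instruction tuples.
def run_add(memory, regs):
    # Fetch
    regs["MAR"] = regs["PC"]            # MAR <- PC
    regs["MDR"] = memory[regs["MAR"]]   # MDR <- Memory[MAR]
    regs["IR"] = regs["MDR"]            # IR <- MDR
    regs["PC"] += 1                     # PC <- PC + 1
    # Decode (the instruction is already a ready-made tuple here)
    op, dst, src = regs["IR"]
    assert op == "ADD"
    # Execute: copy the source into a temporary, then add into dst
    regs["TEMP"] = regs[src]            # TEMP <- R2
    regs[dst] = regs[dst] + regs["TEMP"]  # R1 <- R1 + TEMP
    return regs

regs = {"PC": 0, "R1": 5, "R2": 7}
memory = {0: ("ADD", "R1", "R2")}
run_add(memory, regs)
# R1 now holds 12, and PC points at the next instruction
```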
—---------------------------------------------------------------------------------------------------------
CHAPTER 5
3) A block-set-associative cache memory consists of 128 blocks divided into
sets of 4 blocks each. The main memory consists of 16384 blocks, and each
block contains 256 eight-bit words. i) How many bits are required for
addressing the main memory? ii) How many bits are needed to represent the
TAG, SET, and WORD fields?
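A worked solution for question 3, sketched in Python; the arithmetic follows directly from the sizes given in the question:

```python
from math import log2

# 4-way set-associative cache: split the main-memory address into
# TAG | SET | WORD fields.
cache_blocks = 128
blocks_per_set = 4
mm_blocks = 16384
words_per_block = 256

addr_bits = int(log2(mm_blocks * words_per_block))    # i) 22 bits
word_bits = int(log2(words_per_block))                # 8 bits
set_bits = int(log2(cache_blocks // blocks_per_set))  # 5 bits (32 sets)
tag_bits = addr_bits - set_bits - word_bits           # ii) 9 bits
```

So the main memory needs 22 address bits, split as TAG = 9, SET = 5, WORD = 8.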
4) Compare with suitable parameters SRAM with DRAM
5) Consider a direct mapped cache of size 512 KB with block size 1 KB.
There are 7 bits in the tag. Find-
Size of main memory
Tag directory size
6) Consider a direct mapped cache with block size 4 KB. The size of main
memory is 16 GB and there are 10 bits in the tag. Find-
Size of cache memory
Tag directory size
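Worked sketches for questions 5 and 6 in Python, assuming the standard direct-mapped split of a physical address into tag, line, and block-offset fields:

```python
from math import log2

KB, MB, GB = 2**10, 2**20, 2**30

# Q5: cache 512 KB, block 1 KB, 7 tag bits -> main memory size.
line_bits = int(log2((512 * KB) // (1 * KB)))  # 9 bits (512 lines)
offset_bits = int(log2(1 * KB))                # 10 bits
addr_bits = 7 + line_bits + offset_bits        # 26 -> main memory = 2^26 B
main_memory = 2 ** addr_bits                   # 64 MB
tag_dir_q5 = (2 ** line_bits) * 7              # 3584 bits = 448 bytes

# Q6: block 4 KB, main memory 16 GB, 10 tag bits -> cache size.
addr_bits_q6 = int(log2(16 * GB))              # 34 bits
offset_q6 = int(log2(4 * KB))                  # 12 bits
line_bits_q6 = addr_bits_q6 - 10 - offset_q6   # 12 bits -> 4096 lines
cache_size = (2 ** line_bits_q6) * 4 * KB      # 16 MB
tag_dir_q6 = (2 ** line_bits_q6) * 10          # 40960 bits = 5 KB
```

So question 5 gives a 64 MB main memory and a 448-byte tag directory, and question 6 gives a 16 MB cache and a 5 KB tag directory.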
—---------------------------------------------------------------------------------------------------------
CHAPTER 6
Instruction / Cycle   1         2         3         4         5
I1                    IF(Mem)   ID        EX        Mem
I2                              IF(Mem)   ID        EX
I3                                        IF(Mem)   ID        EX
I4                                                  IF(Mem)   ID
In the above scenario, in cycle 4, instructions I1 and I4 both try to access the
same resource (memory), which introduces a resource conflict.
To avoid this problem, one option is to keep the later instruction waiting until
the required resource (memory in our case) becomes available, which introduces
stalls into the pipeline. A better solution, shown below, is to split memory
access between an instruction cache (CM) and a separate data memory (DM), so
instruction fetch and data access no longer conflict:
Instruction / Cycle   1        2        3        4        5        6        7
I1                    IF(CM)   ID       EX       DM       WB
I2                             IF(CM)   ID       EX       DM       WB
I3                                      IF(CM)   ID       EX       DM       WB
I4                                               IF(CM)   ID       EX       DM
I5                                                        IF(CM)   ID       EX
I6                                                                 IF(CM)   ID
I7                                                                          IF(CM)
Control Dependency
Solution for control dependency: Branch Prediction is the method through which
stalls due to control dependency can be eliminated. In this method, a prediction
is made in the first stage about which way the branch will go. If the prediction
is correct, the branch penalty is zero.
Branch penalty : The number of stalls introduced during the branch operations in
the pipelined processor is known as branch penalty.
NOTE : As we see that the target address is available after the ID stage, so the
number of stalls introduced in the pipeline is 1. Suppose, the branch target
address would have been present after the ALU stage, there would have been 2
stalls. Generally, if the target address is present after the kth stage, then there
will be (k – 1) stalls in the pipeline.
Total number of stalls introduced in the pipeline due to branch instructions
= Branch frequency * Branch Penalty
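Applying this formula with assumed numbers (the 20% branch frequency and 2-cycle penalty below are illustrative, not from the text):

```python
# Average stall overhead per instruction due to branches.
branch_frequency = 0.20  # assumed: 20% of instructions are branches
branch_penalty = 2       # assumed: target known after the ALU stage
stalls_per_instruction = branch_frequency * branch_penalty  # 0.4 cycles
```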
Data Dependency
Occurs when an instruction depends on the result of a previous instruction that
has not yet completed, i.e., when there is a conflict in access to an operand
location.
E.g., two instructions I1 and I2, where I2 depends on I1:
Consider A = 10
I1: A <- A + 5
I2: B <- A x 2
There are 3 types of data hazards:
a) RAW
b) WAR
c) WAW
1) A RAW hazard occurs when instruction J tries to read data before instruction
I writes it.
Eg:
I: R2 <- R1 + R3
J: R4 <- R2 + R3
2) A WAR hazard occurs when instruction J tries to write data before instruction
I reads it.
Eg:
I: R2 <- R1 + R3
J: R3 <- R4 + R5
3) A WAW hazard occurs when instruction J tries to write its output before
instruction I writes it.
Eg:
I: R2 <- R1 + R3
J: R2 <- R4 + R5
4) WAR and WAW hazards occur during out-of-order execution of the
instructions.
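The three hazard patterns above can be sketched as a small classifier; the function name and the set-based encoding of each instruction's source and destination registers are illustrative:

```python
# Classify the hazard between instruction I and a later instruction J,
# given each instruction's destination register and source registers.
def hazard(i_dst, i_srcs, j_dst, j_srcs):
    if i_dst in j_srcs:
        return "RAW"  # J reads what I writes
    if j_dst in i_srcs:
        return "WAR"  # J writes what I reads
    if j_dst == i_dst:
        return "WAW"  # both write the same register
    return None       # no dependency between I and J

# The three examples above:
# I: R2 <- R1 + R3 ; J: R4 <- R2 + R3  -> RAW
# I: R2 <- R1 + R3 ; J: R3 <- R4 + R5  -> WAR
# I: R2 <- R1 + R3 ; J: R2 <- R4 + R5  -> WAW
```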
Non-Pipeline Processor:
In a non-pipeline processor, each instruction is executed sequentially. Since
each instruction has 4 stages (Fetch, Decode, Execute, Write), and each stage
takes 1 nanosecond (nsec), the total time for one instruction is:
Time for one instruction = 4 × 1 nsec = 4 nsec
Time for 10 instructions = 10 × 4 nsec = 40 nsec
Thus, the total time required on a non-pipelined processor is 40 nsec.
Pipeline Processor:
In a pipeline processor, after the first instruction enters the second stage, the
next instruction can enter the first stage, allowing for overlapping execution. The
pipeline allows each instruction to start every 1 nsec, but the first instruction still
takes 4 nsec to fully complete. The remaining instructions follow one after
another with a 1 nsec gap between them.
Thus, the total time required for the pipeline processor is:
Total time = 4 nsec + 9 nsec = 13 nsec
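The timing argument above generalises to k stages and n instructions; a small Python sketch:

```python
# Execution time for n instructions on a k-stage machine with one
# clock period t per stage.
def non_pipelined_time(n, k, t):
    # every instruction runs all k stages back to back
    return n * k * t

def pipelined_time(n, k, t):
    # the first instruction fills the pipe (k cycles), then one
    # instruction completes per cycle
    return (k + (n - 1)) * t

# 10 instructions, 4 stages, 1 nsec per stage:
# non-pipelined -> 40 nsec, pipelined -> 13 nsec
```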
7) Draw and explain 4 stage instruction pipelining and briefly describe the
hazards associated with it.
Fetch (IF):
The CPU fetches the instruction located at the address given by the PC
(Program Counter) from memory, increments the PC, and stores the instruction
in the Instruction Register (IR).
Decode (ID):
The CPU decodes the fetched instruction to understand the operation and
determine the sources of data (registers, memory, or immediate values). Control
signals are generated to execute the instruction.
Execute (EX):
The decoded instruction is executed. For example, the ALU performs operations
like addition, subtraction, or bitwise logic. If the instruction is a memory
load/store, memory addresses are calculated, or branch conditions are
evaluated.
Write-Back (WB):
The result of the operation is written to the appropriate register or memory
location, completing the instruction’s execution. For example, the result of an
arithmetic operation might be written to a general-purpose register.
Hazards:
Data Hazards:
Occur when an instruction depends on the result of a previous instruction that
has not yet completed.
Example: If I2 needs data from I1 before I1 has completed its execution.
Solutions:
Use techniques like forwarding (bypassing) or pipeline stalls to resolve data
hazards.
Control Hazards:
Occur when the pipeline makes wrong assumptions about the next instruction to
execute, usually due to branches (e.g., if/else conditions or loops).
Example: A branch instruction changes the flow of execution, and the pipeline
has already fetched the next sequential instruction, which may not be correct.
Solutions:
Use branch prediction or pipeline flushing to handle control hazards.
Structural Hazards:
Occur when two or more instructions require the same hardware resource at the
same time.
Example: Both the fetch and write-back stages need access to the memory
simultaneously.
Solutions:
Use resource duplication (e.g., separate caches for instruction and data memory)
to mitigate structural hazards.