COA (Computer Organization & Architecture)

A digital system consists of functional units including input, CPU, memory, output, and buses that interconnect these components for data processing. These notes detail various architectures, types of buses, bus arbitration methods, and processor organization, emphasizing the importance of registers, memory transfers, and addressing modes. They also explain the Arithmetic and Logic Unit (ALU) and its operations, including fast addition techniques and signed-operand multiplication methods.

Mod 1

• Functional units of a digital system and their interconnections

A digital system is composed of several functional units that work together to process, store,
and transmit digital data. The main functional units and their interconnections are:

1. Input Unit

• Converts external data (analog or digital) into a digital form that the system can
process.

• Examples: Keyboard, Mouse, Sensors, A/D Converter.

2. Central Processing Unit (CPU)

The brain of the system, responsible for executing instructions. It consists of:

a) Arithmetic and Logic Unit (ALU)

• Performs arithmetic (addition, subtraction, multiplication, division) and logical operations (AND, OR, NOT, XOR).

b) Control Unit (CU)

• Directs data flow within the system.

• Manages instruction execution by sending control signals to other components.

c) Registers

• Small, fast storage locations for temporary data storage during processing.

3. Memory Unit

• Stores data and instructions required for processing.

• Types:

o Primary Memory: RAM (volatile), ROM (non-volatile).

o Secondary Memory: Hard Disk, SSD, Flash Memory.

4. Output Unit

• Converts processed digital data into human-readable or machine-readable form.

• Examples: Monitor, Printer, D/A Converter.


5. Bus System (Interconnections)

A set of parallel wires that interconnect different units. It consists of:

• Data Bus: Transfers actual data between components.

• Address Bus: Carries addresses to specify memory locations.

• Control Bus: Sends control signals (Read/Write, Interrupts) between units.

6. Input/Output (I/O) Interfaces

• Facilitates communication between the CPU and external devices.

• Examples: USB, PCI, UART, GPIO interfaces.

Buses, Bus Architecture, Types of Buses, and Bus Arbitration
A bus is a communication pathway used to transfer data, addresses, and
control signals between different components of a digital system. It is a set of
parallel lines (wires) that interconnect different parts of the system.
1. Bus Architecture
A digital system typically follows a bus architecture, which consists of three
main types of buses:
• Data Bus
o Transfers actual data between the CPU, memory, and I/O devices.
o Bi-directional in nature.
o The width of the data bus (e.g., 8-bit, 16-bit, 32-bit, or 64-bit)
determines the amount of data transferred in one cycle.
• Address Bus
o Carries the address of the memory location or I/O device to be
accessed.
o Unidirectional (CPU → Memory/I/O).
o The width of the address bus determines the memory capacity
(e.g., a 16-bit address bus can access 2¹⁶ = 64K memory locations).
• Control Bus
o Carries control signals to coordinate read/write operations.
o Includes signals such as Read (RD), Write (WR), Interrupts, Clock,
and Reset.
2. Types of Bus Architectures
a) Single Bus Architecture
• All units (CPU, memory, and I/O) share a common bus.
• Simple design but can lead to bus contention and bottlenecks.
• Used in small microcontroller-based systems.
b) Multiple Bus Architecture
• Separate buses for CPU-memory and CPU-I/O communication.
• Reduces congestion and improves performance.
• Used in high-performance computing and multiprocessor systems.
c) System Bus Architecture
• Used in modern processors.
• Consists of three main buses:
o Front Side Bus (FSB): Connects CPU to main memory.
o Back Side Bus (BSB): Connects CPU to cache memory.
o Peripheral Bus: Connects CPU to I/O devices (PCI, USB, etc.).
3. Types of Buses
a) Processor Bus (System Bus)
• Connects the CPU, memory, and I/O devices.
• Used for data transfer inside the system.
b) Memory Bus
• Dedicated bus between the CPU and RAM.
• Used in high-speed computing systems.
c) Input/Output Bus (I/O Bus)
• Connects the CPU to peripheral devices (printers, hard disks, etc.).
• Examples: PCI, USB, SATA, Ethernet.
d) Expansion Bus
• Used to connect additional devices (graphics cards, network cards).
• Examples: PCIe, AGP, Thunderbolt.
4. Bus Arbitration
Bus arbitration is the process of resolving conflicts when multiple devices
request access to a shared bus. It ensures efficient and fair access to the bus.
Types of Bus Arbitration
1. Daisy-Chaining Arbitration
o Devices are connected in series (priority-based).
o Higher-priority devices get access first.
o Simple but lower-priority devices may face starvation.
2. Polling (Software Arbitration)
o The CPU checks each device in sequence to grant access.
o Simple but slow due to CPU intervention.
3. Fixed Priority Arbitration
o Each device is assigned a fixed priority level.
o Higher-priority devices always get access first.
o Can cause starvation for lower-priority devices.
4. Dynamic Priority Arbitration (Fair Arbitration)
o The priority of a device changes based on usage.
o Ensures fairness and avoids starvation.
5. Bus Mastering (Distributed Arbitration)
o Multiple devices can take control of the bus.
o A device (bus master) controls access instead of the CPU.
o Used in high-speed systems (PCI, DMA).
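
As a minimal illustration of fixed-priority arbitration, the C sketch below treats the request lines as bits of a byte and grants the bus to the highest-priority requester. The function name, the 8-line width, and the "lowest index = highest priority" rule are all assumptions for illustration.

#include <stdint.h>

/* Fixed-priority arbiter sketch: bit i of `request` set means device i
   wants the bus; lowest index is assumed to have highest priority. */
int grant(uint8_t request) {
    for (int i = 0; i < 8; i++)
        if (request & (1u << i))
            return i;      /* grant the bus to this device */
    return -1;             /* no device is requesting */
}

In hardware this scan corresponds to a priority encoder; a dynamic-priority scheme would reorder the priorities after each grant to avoid starvation.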

Register, Bus, and Memory Transfer | Processor Organization
A digital system consists of several components that work together to process and transfer
data. Key aspects include registers, buses, memory transfer, and processor organization.

1. Register, Bus, and Memory Transfer

a) Register Transfer
Registers are small, fast storage units within the CPU that temporarily hold data during
execution. Register Transfer refers to moving data between registers using control signals.

Types of Register Transfer Operations:

• Load (LD): Transfers data from memory to a register.

• Store (ST): Transfers data from a register to memory.

• Move (MOV): Transfers data between registers.

• Increment/Decrement (INR/DCR): Modifies the register content.

Example of Register Transfer Notation:

• R1 ← R2 → Contents of R2 are copied into R1.

• R1 ← R1 + R2 → Adds R2 to R1 and stores the result in R1.

b) Bus Transfer

A bus is a set of parallel lines used for communication between different units of a digital
system.

Bus Transfer Operations:

• Data transfer between CPU and memory.

• Data transfer between CPU and I/O devices.

• Register-to-register transfers via an internal bus.

c) Memory Transfer

Memory transfer refers to reading from and writing to memory. It involves three main
operations:

1. Memory Read: Data is transferred from memory to CPU.

o Control Signal: Read (RD)

o Example: MDR ← Memory[MAR]

o (The Memory Address Register (MAR) holds the address; the Memory Data Register (MDR) stores the data.)

2. Memory Write: Data is transferred from CPU to memory.

o Control Signal: Write (WR)


o Example: Memory[MAR] ← MDR

3. Direct Memory Access (DMA): A technique where peripherals access memory directly, bypassing the CPU, for high-speed transfers.
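
As a rough illustration of these transfers, the C sketch below models MAR, MDR, and two registers as plain variables; the word size, memory size, and example address are arbitrary assumptions, and real hardware performs these steps with control signals rather than function calls.

#include <stdint.h>

/* Register and memory transfer sketch using the notation above. */
uint16_t mem[65536];
uint16_t MAR, MDR, R1, R2;

void memory_read(void)  { MDR = mem[MAR]; }   /* MDR <- Memory[MAR] */
void memory_write(void) { mem[MAR] = MDR; }   /* Memory[MAR] <- MDR */

void example(void) {
    R1 = R2;          /* R1 <- R2 (register transfer) */
    R1 = R1 + R2;     /* R1 <- R1 + R2                */
    MAR = 0x1000;     /* place the address, then read */
    memory_read();
    R1 = MDR;         /* load the fetched word into R1 */
}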

2. Processor Organization

Processor organization refers to how the components inside a CPU are structured and
interact to execute instructions efficiently.

a) Single Accumulator Organization

• Uses one Accumulator (AC) as the main register for computations.

• Simple but requires frequent memory access.

• Example: Intel 8085, 8051 Microcontroller

b) General-Purpose Register Organization

• Uses multiple general-purpose registers (GPRs) instead of a single accumulator.

• Reduces memory access and increases speed.

• Used in modern processors like x86, ARM.

c) Stack Organization

• Uses a Last-In-First-Out (LIFO) stack for computations.

• Instructions operate on the top of the stack.

• Example: stack machines, Java Virtual Machine (JVM).

d) Register Stack vs. Memory Stack

• Register Stack: Uses CPU registers for stack operations (fast).

• Memory Stack: Uses RAM for stack operations (slower but expandable).

e) Pipelined Organization

• Breaks instruction execution into multiple stages (Fetch, Decode, Execute, etc.).

• Improves performance by executing multiple instructions in parallel.

• Used in modern superscalar processors.

The processor uses registers, buses, and memory transfer mechanisms to process data efficiently. Different processor organizations optimize performance based on the application's requirements.

Processor Organization, General Register Organization, Stack Organization, and Addressing Modes
A processor (CPU) is the central unit of a digital system responsible for executing
instructions. Its organization determines how instructions are processed, stored, and
executed.

1. Processor Organization

Processor organization refers to the internal structure of the CPU and how it manages data
flow. There are four common types:

a) Single Accumulator Organization

• Uses a single Accumulator (AC) as the main register for arithmetic and logic
operations.

• Example:

o Instruction: ADD B

o Operation: AC ← AC + B

• Simple design but requires frequent memory access.

• Used in early microprocessors like Intel 8085.

b) General-Purpose Register Organization

• Uses multiple General-Purpose Registers (GPRs) for computations.

• Reduces memory access and speeds up execution.

• Example: Modern CPUs (Intel x86, ARM).

c) Stack Organization

• Uses a stack (LIFO structure) for computations instead of registers.

• Operations like PUSH and POP manage data storage.

• Used in stack machines and the Java Virtual Machine (JVM).

d) Pipelined Organization
• Breaks execution into multiple stages (Fetch, Decode, Execute, etc.).

• Allows parallel execution of instructions for higher efficiency.

• Used in modern processors like Intel Core, AMD Ryzen.

2. General Register Organization

In General-Purpose Register (GPR) Organization, multiple registers store data temporarily for processing.

a) Register Types

• General Registers (R1, R2, etc.) – Store temporary values during execution.

• Program Counter (PC) – Holds the address of the next instruction.

• Instruction Register (IR) – Holds the currently executing instruction.

• Memory Address Register (MAR) – Holds memory addresses for read/write operations.

• Memory Data Register (MDR) – Holds data transferred between memory and CPU.

• Stack Pointer (SP) – Points to the top of the stack.

b) Register-Based Instruction Execution

Example of an instruction execution:

LOAD R1, 1000 ; Load value from memory address 1000 into R1

ADD R2, R1 ; Add R1 to R2

STORE R2, 1001 ; Store the result in memory address 1001

Using registers reduces memory access, improving speed and efficiency.

3. Stack Organization

A stack is a data structure that follows the Last In, First Out (LIFO) principle.

a) Stack Operations

• PUSH X → Inserts X into the stack.

• POP X → Removes the top element from the stack and stores it in X.

b) Types of Stacks

• Register Stack → Uses CPU registers for stack operations (faster but limited).
• Memory Stack → Uses RAM for stack operations (slower but expandable).

c) Stack-Based Execution

Example:

PUSH A ; Store A in stack

PUSH B ; Store B in stack

ADD ; Add top two elements (A + B)

POP C ; Store result in C

Used in stack machines, function calls, and recursion handling.

4. Addressing Modes

Addressing modes determine how an instruction accesses data in memory or registers.

a) Immediate Addressing

• Operand is given directly in the instruction.

• Example: MOV R1, #10 (Load 10 into R1).

b) Register Addressing

• Operand is stored in a register.

• Example: ADD R1, R2 (R1 = R1 + R2).

c) Direct Addressing

• Operand is stored in a memory location given in the instruction.

• Example: LOAD R1, 2000 (R1 ← data at memory address 2000).

d) Indirect Addressing

• Instruction gives the address of another memory location that holds the operand.

• Example: LOAD R1, (3000) (R1 ← data at address stored in location 3000).

e) Indexed Addressing

• Uses a base address and an index register for flexible memory access.

• Example: LOAD R1, ARRAY[X] (R1 ← Data from address ARRAY + X).

f) Relative Addressing

• Uses the Program Counter (PC) and an offset for branching.

• Example: JUMP 20 (Jump to PC + 20).
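
The C sketch below summarizes how each mode locates its operand (relative addressing returns the branch target instead). The register file, memory array, and function names are hypothetical and do not correspond to any real ISA.

#include <stdint.h>

/* Effective-address/operand sketch for the six modes above. */
uint8_t mem[256];   /* toy byte-addressable memory */
uint8_t reg[4];     /* toy register file */

int immediate(int value)      { return value; }           /* MOV R1, #10    */
int register_mode(int r)      { return reg[r]; }          /* ADD R1, R2     */
int direct(int addr)          { return mem[addr]; }       /* LOAD R1, 2000  */
int indirect(int addr)        { return mem[mem[addr]]; }  /* LOAD R1, (3000)*/
int indexed(int base, int x)  { return mem[base + x]; }   /* ARRAY[X]       */
int relative(int pc, int off) { return pc + off; }        /* JUMP 20        */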

Conclusion

• Processor organization determines CPU efficiency and execution speed.

• General registers improve performance by reducing memory access.

• Stack organization is essential for function calls and recursion.

• Addressing modes allow flexible and efficient data access.


Mod 2

Detailed Explanation of Arithmetic and Logic Unit (ALU) and Related Concepts
The Arithmetic and Logic Unit (ALU) is a critical
component of the CPU responsible for executing arithmetic
(addition, subtraction, multiplication, division) and logic
(AND, OR, XOR, NOT) operations. Modern ALUs also
handle floating-point operations using IEEE standards.

1. Carry Look-Ahead Adders (Fast Adders)

A carry look-ahead adder (CLA) is a fast adder that overcomes the delay caused by carry propagation in ripple carry adders.
a) Problem with Ripple Carry Adders
• In an n-bit ripple carry adder, the carry output from
one stage must be computed before the next stage starts.
• This creates propagation delay, making large adders
slow.
b) Solution: Carry Look-Ahead Adder
• Uses carry generate (G) and carry propagate (P) functions to calculate carry values in advance.
• Equations:
o Carry Generate: G_i = A_i · B_i → Carry is generated if both inputs are 1.
o Carry Propagate: P_i = A_i + B_i → Carry is propagated if at least one input is 1.
o Carry Output: C_(i+1) = G_i + P_i · C_i
• This allows parallel computation of carries, reducing delay significantly.
• Used in high-speed processors and digital circuits.
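
A small C sketch of one 4-bit carry look-ahead stage follows, with the equations above expanded so that every carry depends only on G, P, and c0 rather than on the previous carry rippling through. The function name and 4-bit width are illustrative assumptions.

#include <stdint.h>

uint8_t cla_add4(uint8_t a, uint8_t b, int c0, int *carry_out) {
    int G[4], P[4], C[5];
    for (int i = 0; i < 4; i++) {
        G[i] = (a >> i) & (b >> i) & 1;     /* Gi = Ai . Bi */
        P[i] = ((a >> i) | (b >> i)) & 1;   /* Pi = Ai + Bi */
    }
    C[0] = c0;
    /* expanded look-ahead equations: all carries computable in parallel */
    C[1] = G[0] | (P[0] & C[0]);
    C[2] = G[1] | (P[1] & G[0]) | (P[1] & P[0] & C[0]);
    C[3] = G[2] | (P[2] & G[1]) | (P[2] & P[1] & G[0])
                | (P[2] & P[1] & P[0] & C[0]);
    C[4] = G[3] | (P[3] & G[2]) | (P[3] & P[2] & G[1])
                | (P[3] & P[2] & P[1] & G[0])
                | (P[3] & P[2] & P[1] & P[0] & C[0]);
    uint8_t sum = 0;
    for (int i = 0; i < 4; i++)
        sum |= (uint8_t)((((a >> i) ^ (b >> i) ^ C[i]) & 1) << i);
    *carry_out = C[4];
    return sum;
}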

2. Multiplication of Signed Operands


Multiplication of signed numbers requires special handling
due to the two’s complement representation. The following
methods are commonly used:
a) Booth’s Algorithm
• Efficient method for signed multiplication using bit-pair recoding.
• Works for both positive and negative numbers.
• Reduces the number of additions/subtractions,
making it faster than normal shift-and-add methods.
Booth’s Algorithm Steps:
1. Append an extra 0 at the rightmost bit of the multiplier (i.e., Q_(-1) = 0).
2. Read two bits at a time (current Q_0 and previous Q_(-1)):
o 00 → No operation (shift right).
o 01 → Add multiplicand to product and shift right.
o 10 → Subtract multiplicand from product and shift right.
o 11 → No operation (shift right).
3. Repeat for n bits.
Advantage: Reduces unnecessary operations, making
multiplication efficient.
Used in: High-performance ALUs and DSP processors.
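
A minimal C sketch of Booth's algorithm for 8-bit signed operands, following the steps above. The function name and widths are assumptions, and signed right shift is assumed to be arithmetic (true on mainstream compilers).

#include <stdint.h>
#include <stdio.h>

int16_t booth_multiply(int8_t m, int8_t r) {
    int32_t A = 0;             /* accumulator (upper half of the product) */
    uint32_t Q = (uint8_t)r;   /* multiplier register */
    int prev = 0;              /* Q(-1), the appended extra bit */

    for (int i = 0; i < 8; i++) {
        int cur = Q & 1;
        if (cur == 0 && prev == 1) A += m;   /* pair 01: add multiplicand */
        if (cur == 1 && prev == 0) A -= m;   /* pair 10: subtract it */
        prev = cur;
        /* arithmetic right shift of the combined A:Q pair */
        Q = (Q >> 1) | ((uint32_t)(A & 1) << 7);
        A >>= 1;
    }
    return (int16_t)(((uint32_t)A << 8) | Q);   /* reassemble A:Q */
}

int main(void) {
    printf("%d\n", booth_multiply(-7, 3));   /* prints -21 */
    return 0;
}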

b) Array Multiplier
• Uses combinational logic to perform multiplication.
• Constructed using AND gates and adders to generate
partial products.
• Steps:
1. Each bit of the multiplier is ANDed with the
multiplicand to form partial products.
2. Partial products are then summed using adders.
3. Final sum gives the result.
Advantage: Simple hardware, fast execution for small
multiplications.
Disadvantage: High hardware cost for large
multiplications.
Used in: FPGA and ASIC circuits.

3. Division Algorithms
a) Restoring Division Algorithm
• Similar to long division in binary.
• Steps:
1. Shift the remainder–quotient pair left by one bit.
2. Subtract the divisor from the remainder (trial subtraction).
3. If the result is negative, restore the previous value (add the divisor back) and set the quotient bit to 0.
4. If the result is positive, keep the new value and set the quotient bit to 1.
5. Repeat for n bits.
Used in: Simple processors, low-speed ALUs.
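
The following C sketch implements restoring division for unsigned 8-bit operands as described above; the function name and widths are illustrative assumptions.

#include <stdint.h>
#include <stdio.h>

void restoring_divide(uint8_t dividend, uint8_t divisor,
                      uint8_t *quotient, uint8_t *remainder) {
    int16_t A = 0;         /* partial remainder (one extra sign bit) */
    uint8_t Q = dividend;  /* quotient register, initially the dividend */

    for (int i = 0; i < 8; i++) {
        A = (int16_t)((A << 1) | ((Q >> 7) & 1));   /* shift A:Q left */
        Q <<= 1;
        A -= divisor;          /* trial subtraction */
        if (A < 0)
            A += divisor;      /* negative: restore, quotient bit stays 0 */
        else
            Q |= 1;            /* positive: quotient bit = 1 */
    }
    *quotient = Q;
    *remainder = (uint8_t)A;
}

int main(void) {
    uint8_t q, r;
    restoring_divide(13, 4, &q, &r);
    printf("q=%u r=%u\n", q, r);   /* prints q=3 r=1 */
    return 0;
}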

b) Non-Restoring Division Algorithm

• Avoids the restoring step, making it faster.
• Steps:
1. Shift the remainder–quotient pair left.
2. If the remainder is positive, subtract the divisor; if it is negative, add the divisor.
3. Set the quotient bit to 1 if the new remainder is positive, otherwise 0.
4. Repeat for n bits; if the final remainder is negative, add the divisor once to correct it.
Advantage: Faster than restoring division.
Used in: High-performance CPUs.

4. Floating-Point Arithmetic Operations

Floating-point numbers represent very large or very small numbers in scientific notation:

(−1)^S × M × 2^E

where:
• S → Sign bit (0 = positive, 1 = negative).
• M → Mantissa (fraction part).
• E → Exponent (power-of-2 adjustment).
a) Floating-Point Addition and Subtraction
• Align the exponents by shifting the smaller number.
• Perform addition/subtraction on the mantissas.
• Normalize the result by adjusting the exponent.
• Round off to fit the representation.
b) Floating-Point Multiplication
• Multiply the mantissas.
• Add the exponents.
• Normalize and round the result.
c) Floating-Point Division
• Divide the mantissas.
• Subtract the exponents.
• Normalize and round.
Used in: Scientific computing, AI, graphics processing.

5. IEEE 754 Standard for Floating-Point Numbers


The IEEE 754 standard defines the representation of
floating-point numbers in computers.
IEEE 754 Single-Precision (32-bit format)
• Sign (1 bit)
• Exponent (8 bits, biased by 127)
• Mantissa (23 bits, normalized)
Example:
3.14 ≈ 1.57 × 2^1
Binary:
Sign | Exponent | Mantissa
0 | 10000000 | 10010001111010111000000
IEEE 754 Double-Precision (64-bit format)
• Sign (1 bit)
• Exponent (11 bits, biased by 1023)
• Mantissa (52 bits, normalized)
Advantages of IEEE 754:
• Standardized representation across different hardware.
• Supports denormalized numbers for underflow cases.
• Handles special cases like NaN (Not a Number) and
Infinity.
Used in:
• Supercomputers
• Machine learning
• Financial computing
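
A short C sketch that extracts the three single-precision fields from a float's bit pattern (memcpy is used to view the bits without aliasing issues); the example value matches the 3.14 illustration above.

#include <stdio.h>
#include <stdint.h>
#include <string.h>

int main(void) {
    float x = 3.14f;
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);   /* reinterpret the float's bits */

    uint32_t sign     = bits >> 31;           /* 1 bit  */
    uint32_t exponent = (bits >> 23) & 0xFF;  /* 8 bits, biased by 127 */
    uint32_t mantissa = bits & 0x7FFFFF;      /* 23 bits */

    printf("sign=%u exponent=%u (unbiased %d) mantissa=0x%06X\n",
           sign, exponent, (int)exponent - 127, mantissa);
    /* for 3.14f the unbiased exponent is 1, matching 1.57 x 2^1 */
    return 0;
}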

6. Arithmetic & Logic Unit (ALU) Design


a) Basic ALU Structure
• Inputs: Two n-bit operands (A, B).
• Control Unit: Determines operation (ADD, SUB, AND,
OR, XOR).
• Arithmetic Circuit: Performs addition/subtraction using
a full-adder circuit.
• Logic Circuit: Performs bitwise logical operations.
• Multiplexer: Selects output based on control signals.
• Flags: Carry, Zero, Overflow, Sign flags indicate results.
Used in: CPUs, GPUs, microcontrollers.

Conclusion
• Look-ahead carry adders improve addition speed.
• Booth’s algorithm optimizes signed multiplication.
• Floating-point arithmetic enables high-precision
calculations.
• IEEE 754 standard ensures compatibility across
systems.
• ALU design is essential for processing arithmetic and
logic operations efficiently.
Mod 3

Detailed Explanation of Control Unit and Related Concepts

The Control Unit (CU) is a crucial part of the CPU that directs the operation of the processor.
It fetches, decodes, and executes instructions by generating appropriate control signals for
different components (ALU, registers, memory, etc.).

1. Instruction Types

An instruction is a command given to the CPU to perform a specific task. Based on functionality, instructions are classified as follows:

a) Data Transfer Instructions

• Move data between registers and memory.

• Examples: MOV, LOAD, STORE

• Example: MOV R1, R2 (copies contents of R2 into R1).

b) Arithmetic Instructions

• Perform mathematical operations.

• Examples: ADD, SUB, MUL, DIV

• Example: ADD R1, R2, R3 (R1 = R2 + R3).

c) Logical Instructions

• Perform bitwise operations like AND, OR, XOR, NOT.

• Example: AND R1, R2 (R1 = R1 & R2).

d) Control Transfer (Branching) Instructions

• Change the sequence of execution.

• Examples: JUMP, CALL, RETURN, HALT

• Example: JMP 1000H (Jump to address 1000H).

e) Input/Output Instructions

• Used for communication with I/O devices.

• Examples: IN, OUT

• Example: IN R1, PORT1 (Reads data from PORT1 into R1).

2. Instruction Formats

An instruction format defines the structure of an instruction in memory. It includes:

1. Opcode (Operation Code) – Specifies the operation to be performed.

2. Operands (Registers, Memory addresses, Constants) – Specify the data used.

3. Mode Bits – Define the addressing mode.

Types of Instruction Formats:

• Zero Address (Stack-based, Implicit operands) – Example: PUSH A

• One Address (Accumulator-based) – Example: ADD B (A = A + B)

• Two Address (General register-based) – Example: ADD R1, R2 (R1 = R1 + R2)

• Three Address (More flexibility, large instruction size) – Example: ADD R1, R2, R3 (R1
= R2 + R3).

3. Instruction Cycle and Sub-Cycles

The Instruction Cycle is the sequence of steps the CPU follows to execute an instruction.

Main Phases of the Instruction Cycle:

1. Fetch Cycle

o Retrieve instruction from memory.

o PC → MAR, Read Memory, MDR → IR, PC++.

2. Decode Cycle

o Decode the instruction in IR (Instruction Register).

3. Execute Cycle

o Perform the operation in ALU, registers, memory, or I/O.

4. Store Cycle (if needed)

o Store result back into memory/register.

Each cycle consists of smaller micro-operations, controlled by control signals.

Main Phases of the Instruction Cycle


The Instruction Cycle is the sequence of steps a CPU follows to execute an instruction. It
consists of the following four main phases:

1. Fetch Phase

2. Decode Phase

3. Execute Phase

4. Store Phase (if needed)

Each phase consists of multiple micro-operations, which are low-level steps performed by
the control unit.

1. Fetch Phase (Fetching the Instruction)

Purpose

The CPU retrieves the instruction from memory into the Instruction Register (IR).

Micro-Operations in Fetch Cycle

1. MAR ← PC (Move the program counter (PC) value to the Memory Address Register
(MAR)).

2. Read Memory (CPU issues a read signal to fetch data).

3. MDR ← Memory[MAR] (Data from memory is loaded into the Memory Data Register
(MDR)).

4. IR ← MDR (Instruction is transferred from MDR to the Instruction Register (IR)).

5. PC ← PC + 1 (PC is incremented to point to the next instruction).

Example

If PC = 2000H, the instruction at 2000H is fetched into the IR, and PC is updated to 2001H.

2. Decode Phase (Understanding the Instruction)

Purpose

The Control Unit (CU) interprets the instruction stored in the IR and prepares the CPU to
execute it.

Steps in Decode Phase

1. IR is decoded to determine the operation (e.g., ADD, LOAD, STORE).


2. Control Unit generates control signals based on instruction type.

3. Identify operands (registers, memory addresses, or constants).

Example

For the instruction:

ADD R1, R2

• The opcode (ADD) is decoded.

• The operands (R1, R2) are identified.

3. Execute Phase (Performing the Operation)

Purpose

The CPU performs the required operation (arithmetic, logic, data transfer, or branch).

Steps in Execute Phase

• Arithmetic/Logic Operations: The ALU performs the computation (e.g., ADD R1, R2
→ R1 = R1 + R2).

• Memory Access: If needed, read/write operations are performed.

• Branching Instructions: The PC may be updated for jumps or loops.

Example

For ADD R1, R2:

1. Fetch the values from R1 and R2.

2. Send them to the ALU.

3. Perform addition.

4. Store the result in R1.

4. Store Phase (Storing the Result)

Purpose

The result of execution is stored in a register or memory.

Steps in Store Phase

1. If needed, move the result to the appropriate location (register/memory).


2. Set status flags (Zero, Carry, Overflow) based on the result.

Example

If the instruction was:

STORE R1, 3000H

• The value in R1 is stored at memory address 3000H.

Complete Instruction Cycle Example

Let’s consider the instruction:

ADD R1, R2 (R1 = R1 + R2)

1. Fetch:

o MAR ← PC
o Read Memory
o MDR ← Memory[MAR]
o IR ← MDR
o PC ← PC + 1

2. Decode:

o Control Unit decodes ADD R1, R2.

3. Execute:

o ALU: R1 ← R1 + R2

4. Store:

o Result is stored in R1.

Conclusion

The Instruction Cycle ensures systematic execution.


Fetch → Decode → Execute → Store is repeated for each instruction.
The Control Unit manages these phases using micro-operations.
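
As a concrete (and heavily simplified) illustration of the cycle, the C sketch below runs a fetch-decode-execute loop for a made-up one-register machine; the opcodes, instruction encoding, and memory layout are all invented for illustration.

#include <stdint.h>
#include <stdio.h>

enum { OP_LOAD = 0, OP_ADD = 1, OP_STORE = 2, OP_HALT = 3 };

int main(void) {
    uint8_t mem[16] = {
        /* instruction = opcode<<4 | operand address */
        (OP_LOAD  << 4) | 10,   /* R0 <- mem[10]      */
        (OP_ADD   << 4) | 11,   /* R0 <- R0 + mem[11] */
        (OP_STORE << 4) | 12,   /* mem[12] <- R0      */
        (OP_HALT  << 4)
    };
    mem[10] = 7; mem[11] = 5;

    uint8_t pc = 0, r0 = 0;
    for (;;) {
        uint8_t ir = mem[pc++];                 /* fetch: IR <- mem[PC], PC++ */
        uint8_t op = ir >> 4, addr = ir & 0x0F; /* decode */
        if (op == OP_HALT) break;               /* execute ... */
        if (op == OP_LOAD)  r0 = mem[addr];
        if (op == OP_ADD)   r0 += mem[addr];
        if (op == OP_STORE) mem[addr] = r0;     /* ... and store */
    }
    printf("mem[12] = %d\n", mem[12]);          /* prints 12 */
    return 0;
}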

4. Micro-Operations

A Micro-operation is a small step performed within an instruction cycle.

Examples of Micro-Operations:

• Fetch: MAR ← PC, Read, MDR ← Memory[MAR], IR ← MDR, PC ← PC + 1

• Execute (Addition Example): R1 ← R1 + R2

• Branch Execution: PC ← Address

Each micro-operation is triggered by control signals generated by the Control Unit.

5. Execution of a Complete Instruction

Example: ADD R1, R2

1. Fetch: Load instruction from memory.

2. Decode: Determine it’s an ADD operation.

3. Execute: Fetch values from R1 and R2, perform addition in ALU.

4. Store: Store the result back in R1.

The CU manages this process using micro-operations and control signals.

6. Program Control

Program control refers to how the CPU sequences instructions. It involves:

• Jumps (Unconditional & Conditional) – Example: JMP 2000H

• Subroutine Calls – Example: CALL FUNCTION

• Interrupt Handling – CPU stops execution to process an external request.

7. Reduced Instruction Set Computer (RISC)

RISC processors use simple instructions that execute in one clock cycle.

Characteristics of RISC:
• Few instruction types (load/store architecture).
• Fixed instruction size (simplifies decoding).
• More registers (reduces memory access).
• Pipelining (increases speed).

Example: ARM, MIPS, PowerPC processors use RISC.
Example: ARM, MIPS, PowerPC processors use RISC.

8. Pipelining

Pipelining improves CPU performance by executing multiple instructions simultaneously.

Stages in a 5-Stage Pipeline:

1. Fetch – Retrieve instruction.

2. Decode – Identify operation.

3. Execute – Perform operation.

4. Memory Access – Read/write data.

5. Write Back – Store result.

Increases throughput but can cause hazards (data, control, and structural hazards).
Used in modern CPUs and GPUs.
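
As a rough worked example: with k = 5 stages and one instruction entering the pipeline per cycle, n instructions finish in about k + (n − 1) cycles, so 100 instructions take roughly 104 cycles pipelined versus 500 cycles unpipelined, a speedup of about 4.8 (ignoring hazards and stalls).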

9. Hardwired and Microprogrammed Control

The Control Unit is implemented using:

a) Hardwired Control

• Uses combinational logic circuits to generate control signals.

• Fast but complex (hard to modify).

• Used in high-performance processors.

b) Microprogrammed Control

• Uses control memory to store a microprogram (set of micro-instructions).

• Easier to modify but slower than hardwired control.

10. Microprogram Sequencing

A Microprogram Sequencer determines the sequence of micro-instructions.


Steps in Microprogram Execution:

1. Fetch micro-instruction from control memory.

2. Decode and execute it.

3. Determine the next micro-instruction.

11. Horizontal and Vertical Microprogramming

a) Horizontal Microprogramming

• Uses long control words (direct control of each signal).

• Fast but complex.

• Example: VLIW (Very Long Instruction Word) processors.

b) Vertical Microprogramming

• Uses compact control words (decoded into control signals).

• Easier to design but slightly slower.

• Example: Microcontrollers and simpler CPUs.

Conclusion

• The Control Unit manages CPU execution using micro-operations.
• The Instruction Cycle involves Fetch, Decode, Execute, and Store.
• RISC improves efficiency by reducing instruction complexity.
• Pipelining boosts performance but requires hazard management.
• Hardwired vs. Microprogrammed Control – a tradeoff between speed and flexibility.
• Horizontal vs. Vertical Microprogramming – affects control unit design.


Mod 4

• 2D & 2.5D memory organization

• Cache memory
▪ Cache Memory: Concept, Design Issues, and Performance
▪ 1. Concept of Cache Memory
▪ What is Cache Memory?
▪ Cache memory is a small, high-speed memory located close to the CPU that
stores frequently accessed data from main memory (RAM). It helps reduce
the time taken to fetch instructions and data, improving overall system
performance.
▪ Why is Cache Memory Needed?
▪ The CPU is much faster than RAM, causing a bottleneck in fetching data.
▪ Cache bridges the gap by storing frequently used data, reducing the access
time.
▪ This is based on the principle of locality:
▪ Temporal Locality: If data is accessed once, it will likely be accessed again
soon.
▪ Spatial Locality: If one memory location is accessed, nearby locations will
likely be accessed soon.
▪ Cache Hierarchy (Levels of Cache)
▪ Modern processors use multi-level caching:
▪ L1 Cache (Level 1) – Fastest, smallest, closest to CPU.
▪ L2 Cache (Level 2) – Larger but slower than L1.
▪ L3 Cache (Level 3) – Shared among CPU cores, slower than L2 but faster than
RAM.

▪ 2. Cache Design Issues
▪ Several key design choices affect cache performance:
▪ a) Cache Mapping Techniques
▪ Cache mapping defines how main memory addresses are mapped to cache
blocks.
▪ Direct Mapped Cache
▪ Each memory block is mapped to a specific cache line.
▪ Simple but causes conflicts when multiple memory blocks map to the same
line.
▪ Example: Cache Line Index = Block Address MOD Number of Cache Lines
▪ Fully Associative Cache
▪ Any memory block can be stored anywhere in the cache.
▪ Requires complex hardware to search for data.
▪ Reduces conflicts but increases cost.
▪ Set-Associative Cache
▪ Compromise between direct-mapped and fully associative.
▪ Memory blocks are mapped to a set of cache lines (e.g., 2-way set-associative
cache means each set has 2 blocks).
▪ Balances performance and complexity.
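
A tiny C sketch of the direct-mapped address breakdown described above; the line size and line count are arbitrary assumptions.

#include <stdint.h>
#include <stdio.h>

#define LINE_SIZE 16   /* bytes per line -> 4 offset bits */
#define NUM_LINES 64   /* lines in cache -> 6 index bits  */

int main(void) {
    uint32_t addr = 0x12345678;
    uint32_t offset = addr % LINE_SIZE;
    uint32_t index  = (addr / LINE_SIZE) % NUM_LINES;  /* block MOD lines */
    uint32_t tag    = addr / (LINE_SIZE * NUM_LINES);  /* rest of the address */
    printf("tag=0x%X index=%u offset=%u\n", tag, index, offset);
    return 0;
}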

▪ b) Cache Replacement Policies
▪ When a new data block needs to be loaded into cache but the cache is full, a
replacement policy decides which block to remove.
▪ Least Recently Used (LRU)
▪ Removes the least recently accessed block.
▪ Good performance but requires extra tracking hardware.
▪ First-In-First-Out (FIFO)
▪ Removes the oldest block in the cache.
▪ Simple but may remove frequently used data.
▪ Random Replacement
▪ Randomly selects a block to replace.
▪ Easy to implement but unpredictable performance.
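
A tiny C sketch of LRU bookkeeping for one 4-way set: each way carries an age counter that is cleared on access, and the victim is the way with the largest age. The counter scheme and names are assumptions for illustration.

#include <stdint.h>

#define WAYS 4

void touch(uint8_t age[WAYS], int way) {
    for (int i = 0; i < WAYS; i++)
        if (i != way && age[i] < 255) age[i]++;  /* other ways grow older */
    age[way] = 0;                                /* most recently used */
}

int lru_victim(const uint8_t age[WAYS]) {
    int victim = 0;
    for (int i = 1; i < WAYS; i++)
        if (age[i] > age[victim]) victim = i;    /* evict the oldest */
    return victim;
}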

▪ c) Cache Write Policies
▪ Defines how data is written to cache and memory.
▪ Write-Through
▪ Every write to cache is also written to main memory.
▪ Slower but ensures consistency between cache and memory.
▪ Write-Back
▪ Writes occur only in cache, and updates to memory happen later.
▪ Faster but requires additional mechanisms to update main memory when
necessary.

▪ 3. Cache Performance
▪ The efficiency of cache memory is measured using cache performance
metrics:
▪ a) Cache Hit and Cache Miss
▪ Cache Hit: The CPU finds the requested data in the cache.
▪ Cache Miss: The CPU does not find the data in the cache and must fetch it
from RAM (causing delays).
▪ b) Cache Hit Ratio
▪ Hit Ratio = Cache Hits / Total Memory Accesses
▪ Higher hit ratio means better performance.
▪ Typical hit ratio: 90%+ in well-optimized systems.
▪ c) Cache Miss Penalty
▪ The extra time taken to fetch data from RAM when a cache miss occurs.
▪ Reducing miss penalty improves overall performance.
▪ d) Average Memory Access Time (AMAT)
▪ AMAT = Hit Time + (Miss Rate × Miss Penalty)
▪ Lower AMAT means faster system performance.
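
A one-line worked example in C with hypothetical numbers (1 ns hit time, 5% miss rate, 100 ns miss penalty):

#include <stdio.h>

int main(void) {
    double hit_time = 1.0, miss_rate = 0.05, miss_penalty = 100.0;
    double amat = hit_time + miss_rate * miss_penalty;
    printf("AMAT = %.1f ns\n", amat);   /* 1 + 0.05 * 100 = 6.0 ns */
    return 0;
}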

▪ 4. Improving Cache Performance
▪ To enhance cache efficiency:
▪ Increase Cache Size – Reduces misses but increases cost.
▪ Use Set-Associative Mapping – Balances performance and cost.
▪ Optimize Replacement Policies – LRU is generally better.
▪ Use Prefetching – Load likely-needed data into cache in advance.
▪ Multi-level Caching – L1, L2, and L3 caches optimize speed and size trade-offs.

▪ Conclusion
▪ Cache memory reduces access time and improves CPU efficiency.
▪ Design choices (mapping, replacement, write policies) affect performance.
▪ Cache performance is measured using hit ratio, miss penalty, and AMAT.
▪ Optimizing cache size, associativity, and prefetching enhances performance.