Unit IV and Unit V Notes

II BSC III Semester

Computer Organization

UNIT – IV
Input-Output Organization: Peripheral Devices, input-output interface, asynchronous
data transfer, modes of transfer- programmed I/O, priority interrupt, direct memory
access, Input – Output Processor (IOP).

Peripheral Devices in Computer Organization


In computer organization, peripheral devices are external hardware components
that communicate with the computer's central processing unit (CPU) through
input/output (I/O) interfaces. These devices either input data into the computer
system or output data from it, and in some cases, they can perform both functions.
Peripherals are essential for extending the functionality of a computer beyond its
core components (CPU, memory, etc.).
Classification of Peripheral Devices:
Peripheral devices are typically classified into three main categories based on their
primary function:
1. Input Devices: These allow the user to provide data and instructions to the
computer.
Examples:
 Keyboard: Inputs text and commands.
 Mouse: Provides positional data and command input via clicks.
 Scanner: Converts physical documents into digital data.
 Microphone: Captures audio as input.
 Touchscreen: Acts as both an input (touch) and output device.
2. Output Devices: These provide data from the computer to the user or
another system.
Examples:
 Monitor: Displays visual output such as text, images, and
videos.
 Printer: Converts digital documents into physical form.
 Speakers: Output audio signals.
 Projector: Displays computer visuals on a larger screen.
3. Storage Devices (Input/Output): These act as both input and output devices
as they store data and can read or write it.
Examples:
 Hard Disk Drive (HDD): Provides long-term storage for data.
 Solid State Drive (SSD): A faster alternative to HDDs.
 USB Flash Drives: Portable storage devices.
 CD/DVD: Optical storage media.

Input-Output Interface

Input-Output (I/O) Interface in Computer Organization


The Input-Output (I/O) interface in computer organization refers to the mechanism
that allows communication between the central processing unit (CPU) and
peripheral devices. Since the CPU and peripherals operate at different speeds and

V.SAI KRISHNA M.SC.,M.TECH., (PHD) Page 1


use different data formats, an I/O interface is required to manage this interaction,
ensuring efficient and accurate data transfer.
Key Functions of an I/O Interface
The main functions of an I/O interface are:
1. Data Conversion: Converting the data format from the peripheral device
(e.g., a keyboard, printer, or disk) to a format understandable by the CPU and vice
versa.
2. Data Buffering: Since peripherals often operate much slower than the CPU,
the interface uses buffering to store data temporarily, allowing the CPU to continue
working while waiting for the peripheral to complete an operation.
3. Synchronization: Ensures that the CPU and the peripheral device are
synchronized during data exchange, preventing data loss or overflow.
4. Error Detection and Handling: Detecting transmission errors and managing
error recovery mechanisms.
5. Address Decoding: Ensures that the data is directed to or from the correct
peripheral device by interpreting the I/O address.
I/O Interface Components
1. I/O Ports: An I/O port is a specific location in memory where data is
transferred to or from a peripheral device. I/O ports can be accessed by the CPU
using special instructions or memory-mapped addresses.
o Data Register: Stores the actual data being transferred.
o Control Register: Used to control the operation of the peripheral
device (e.g., whether to read or write).
o Status Register: Provides the current status of the device (e.g.,
whether it’s ready to send or receive data).
2. I/O Controller: Also known as a device controller, the I/O controller acts as
an intermediary between the CPU and a peripheral device. It manages the
communication process, offloading the CPU from direct management of the device's
low-level details.
Example: A hard drive controller manages reading and writing to a hard disk,
converting the high-level requests from the CPU into physical data operations.
3. Interrupts: Interrupts are signals sent by I/O devices to indicate that they are
ready for data transfer or need attention. The Interrupt Request (IRQ) is sent to the
CPU, which temporarily halts its current operations to handle the I/O operation.
4. Bus System: I/O devices communicate with the CPU through the system bus.
The bus system consists of:
o Address Bus: Determines which device or memory location is being
accessed.
o Data Bus: Transports actual data between the CPU and peripherals.
o Control Bus: Carries control signals, such as read/write operations
and interrupt signals.
Types of I/O Communication
There are different communication methods for managing data transfer between the
CPU and I/O devices:
1. Programmed I/O:
o In this method, the CPU is responsible for executing all I/O operations.
o The CPU checks the status of the peripheral device continuously
(polling) and moves the data when the device is ready.
o It is simple but inefficient because the CPU is engaged in managing
I/O, which slows down the overall system performance.



2. Interrupt-Driven I/O:
o The I/O device sends an interrupt to notify the CPU when it is ready
for data transfer.
o This allows the CPU to perform other tasks while waiting for the I/O
operation to complete, improving efficiency.
o The interrupt handler responds to the interrupt, performs the necessary
data transfer, and then resumes the interrupted task.
3. Direct Memory Access (DMA):
o DMA allows devices to transfer data directly to/from memory without
involving the CPU for each data transaction.
o The DMA controller manages the data transfer, allowing the CPU to
focus on other tasks.
o This method is particularly useful for high-speed devices such as disk
drives, graphics cards, and network interfaces.

Asynchronous Data Transfer

Asynchronous Data Transfer in Computer Organization


Asynchronous data transfer refers to the method of data transmission in which
data is sent between the CPU and peripheral devices without requiring a shared
clock signal to synchronize the sending and receiving operations. In this mode, data
transfer occurs independently, with the sender and receiver managing their timing
asynchronously. Each component signals when it is ready to send or receive data,
allowing devices operating at different speeds to communicate effectively.
Key Concepts in Asynchronous Data Transfer
1. No Shared Clock:
o Unlike synchronous data transfer, where both the sender and
receiver use a common clock signal to coordinate data transfer, asynchronous
transfer occurs without a shared timing mechanism.
o Instead, data is transferred based on control signals or handshaking
between the sender and receiver.
2. Control Signals:
o In asynchronous transfers, control signals are used to indicate when
data is ready to be sent or received. These signals are crucial for ensuring correct
communication between devices that do not operate at the same speed.
o Handshaking is commonly used in asynchronous transfers to
coordinate data transfer.
3. Variable Timing:
o Because there is no strict timing requirement, data can be transmitted
at irregular intervals. The sender can send data when it is ready, and the receiver
can process it when it is available to do so.
Asynchronous Data Transfer Techniques
1. Handshaking:
o Handshaking is the most common mechanism used to coordinate
asynchronous data transfers. It involves an exchange of control signals between the
sender and receiver to indicate readiness.
Example of Handshaking Process:



o Request Signal: The sender device signals the receiver that it is ready
to send data by raising a request line.
o Acknowledge Signal: The receiver acknowledges the request by
sending an acknowledgment signal to the sender, indicating that it is ready to accept
the data.
o Data Transfer: The sender transmits the data once the
acknowledgment is received.
o Completion: Once the data is transferred, the request and
acknowledgment signals are cleared, indicating the transfer is complete.
This handshaking mechanism ensures that the data is transmitted only when both
devices are ready, which is especially important when dealing with peripherals that
operate at different speeds than the CPU.
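The four handshake phases above can be simulated in software. The sketch below is illustrative only: `request` and `acknowledge` are `threading.Event` stand-ins for the hardware request/acknowledge control lines, and `bus` stands in for the shared data bus.

```python
import threading

def handshake_transfer(data_items):
    """Simulate a source-initiated four-phase handshake for a list of items."""
    request = threading.Event()      # sender raises: "data is ready"
    acknowledge = threading.Event()  # receiver raises: "data accepted"
    bus = {}                         # stands in for the shared data bus
    received = []

    def sender():
        for item in data_items:
            bus["data"] = item       # place data on the "bus" first
            request.set()            # 1. raise the request line
            acknowledge.wait()       # 3. wait for the acknowledge line
            request.clear()          # 4. completion: clear both signals
            acknowledge.clear()

    def receiver():
        for _ in data_items:
            request.wait()           # wait until data is valid
            received.append(bus["data"])
            acknowledge.set()        # 2. raise the acknowledge line
            while request.is_set():  # wait for the request line to drop
                pass

    t1 = threading.Thread(target=sender)
    t2 = threading.Thread(target=receiver)
    t1.start(); t2.start()
    t1.join(); t2.join()
    return received

print(handshake_transfer([10, 20, 30]))  # [10, 20, 30]
```

Because each side waits for the other's signal before proceeding, the transfer works correctly even if one side runs much slower than the other — which is exactly the point of handshaking.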
2. Strobe Signals:
o Another technique for asynchronous communication is the use of
strobe signals.
o The sender sends a data strobe signal to inform the receiver that valid
data is on the data bus.
o The receiver uses this signal to capture the data at the right moment.
Two Types of Strobe-based Transfer:
o Source-Initiated Strobe: The sender initiates the data transfer by
sending a strobe signal when data is ready. The receiver captures the data upon
receiving the strobe.
o Destination-Initiated Strobe: The receiver requests data from the
sender using a strobe signal when it is ready to receive the data.

Modes of Transfer - Programmed I/O, Priority Interrupt, Direct Memory Access

Modes of transfer in computer organization


In computer organization, modes of transfer refer to the various methods used to
transfer data between the CPU, memory, and I/O devices. These modes determine
how data moves within a system, based on factors like the speed of the devices and
the amount of data being transferred. The key modes of data transfer include:
1. Programmed I/O
2. Interrupt-Driven I/O
3. Direct Memory Access (DMA)

1. Programmed I/O
In programmed I/O, the CPU is responsible for managing the entire data transfer
process between the I/O device and memory. The CPU actively polls the I/O device
to check whether it is ready to send or receive data, and then it manually moves the
data one piece at a time.
Characteristics:
 Polling: The CPU continuously checks the status of the I/O device, either
waiting for the device to be ready for communication or processing data.
 CPU Involvement: The CPU is fully involved in every step of the data
transfer, which means that it cannot perform other tasks while the data transfer is
happening.
Steps:
1. The CPU issues a command to the I/O device.
2. The CPU repeatedly checks (polls) the status of the device until it's ready.



3. Once the device is ready, the CPU reads or writes the data to/from memory.

Example:
 Keyboards and Mice: Simple devices like keyboards and mice often use
programmed I/O because their data transfer rates are low, and polling the device is
sufficient.
Advantages:
 Simple to implement.
 No additional hardware required.
Disadvantages:
 Inefficient: The CPU spends a significant amount of time polling the device,
leading to wastage of processing power.
 Low Performance: The CPU is kept busy with I/O tasks, reducing the time
available for other processes.
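The polling loop described above can be sketched as follows. This is a toy model: `DeviceStub`, `status_ready`, and `read_data_register` are hypothetical names standing in for a real device's status and data registers.

```python
class DeviceStub:
    """Fake peripheral: reports not-ready for a few polls between items."""
    def __init__(self, data):
        self._data = list(data)
        self._polls_until_ready = 0

    @property
    def status_ready(self):
        if self._polls_until_ready > 0:
            self._polls_until_ready -= 1   # still busy this poll
            return False
        return bool(self._data)

    def read_data_register(self):
        item = self._data.pop(0)
        self._polls_until_ready = 2        # busy again until next item
        return item

def programmed_io_read(device, count):
    """CPU-driven transfer: poll the status register, then move one item."""
    buffer = []
    while len(buffer) < count:
        if device.status_ready:            # CPU busy-waits (polling)
            buffer.append(device.read_data_register())
    return buffer

dev = DeviceStub([0x41, 0x42, 0x43])
print(programmed_io_read(dev, 3))  # [65, 66, 67]
```

Note that the CPU (here, the `while` loop) does nothing useful between ready polls — this wasted time is exactly the inefficiency described above.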

2. Interrupt-Driven I/O
In interrupt-driven I/O, the CPU is not continuously polling the I/O device. Instead,
the I/O device sends an interrupt to the CPU when it is ready for data transfer,
allowing the CPU to perform other tasks while waiting for the device.
Characteristics:
 Interrupt Signals: When the device is ready, it sends an interrupt signal to
the CPU. The CPU then temporarily pauses its current task to service the interrupt.
 Interrupt Handler: When an interrupt occurs, the CPU runs a special routine
called an interrupt handler, which processes the I/O request and transfers data.
Steps:
1. The CPU issues an I/O command to the device and continues executing other
tasks.
2. When the I/O device is ready, it sends an interrupt to the CPU.
3. The CPU halts its current operations, executes the interrupt handler to
transfer data, and then resumes its original task.
Example:
 Disk I/O: When reading data from a disk, the system typically uses interrupts
to signal when a block of data is ready to be processed.
Advantages:
 Efficient Use of CPU: The CPU does not waste time polling; it performs other
tasks and only handles the I/O when necessary.
 Improved Performance: Since the CPU is free to execute other tasks, the
overall system performance improves.
Disadvantages:
 Complexity: The interrupt-driven I/O mechanism is more complex to
implement than programmed I/O.
 Overhead: Handling multiple interrupts in quick succession can lead to
overhead, affecting system performance.
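The interrupt flow above can be modelled with a simple callback registry; a real system would vector through an interrupt controller in hardware. All names here (`InterruptController`, `raise_irq`, IRQ number 5) are illustrative.

```python
class InterruptController:
    def __init__(self):
        self.handlers = {}
        self.pending = []

    def register(self, irq, handler):
        self.handlers[irq] = handler       # install the interrupt handler

    def raise_irq(self, irq, payload):
        self.pending.append((irq, payload))  # device signals readiness

    def service_pending(self):
        # CPU checks for pending interrupts between instructions
        while self.pending:
            irq, payload = self.pending.pop(0)
            self.handlers[irq](payload)    # run the handler, move the data

received = []
ic = InterruptController()
ic.register(irq=5, handler=lambda data: received.append(data))

# 1. CPU issues the I/O command, then keeps computing...
work_done = sum(range(100))                # ...other useful work

# 2. Device signals readiness by raising IRQ 5 with its data.
ic.raise_irq(5, payload=0x7F)
# 3. CPU services the interrupt, then resumes its original task.
ic.service_pending()

print(work_done, received)  # 4950 [127]
```

Unlike the polling version, the CPU here did real work (`work_done`) while waiting for the device.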

3. Direct Memory Access (DMA)


Direct Memory Access (DMA) is a mode of transfer in which a special hardware
component, known as the DMA controller, manages data transfer between memory
and I/O devices directly, without involving the CPU for each byte of data transfer.



This significantly reduces CPU involvement and speeds up the process, especially
for large data transfers.

Characteristics:
 DMA Controller: The DMA controller takes over the data transfer process,
allowing the CPU to perform other tasks without being interrupted.
 Bulk Data Transfer: DMA is highly efficient for transferring large blocks of
data, such as moving data from disk to memory or memory to peripheral devices.
Steps:
1. The CPU initializes the DMA controller by specifying the memory location, the
I/O device, the direction of transfer (read/write), and the amount of data to transfer.
2. The DMA controller handles the data transfer directly between the memory
and the I/O device.
3. Once the transfer is complete, the DMA controller sends an interrupt to the
CPU to inform it that the operation is finished.
Example:
 Hard Disk to Memory Transfer: When reading a large file from a hard disk to
memory, DMA is used to transfer the data quickly without consuming CPU
resources.
Advantages:
 High Efficiency: The CPU is free to perform other tasks while the DMA
controller handles the transfer, resulting in better system performance.
 Best for Bulk Data Transfer: DMA is ideal for transferring large amounts of
data efficiently.
Disadvantages:
 Additional Hardware Required: The system needs a DMA controller, which
increases the complexity and cost of the system.
 Not Suitable for Small Transfers: For small data transfers, the overhead of
setting up DMA may outweigh its benefits.
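The three DMA steps above can be sketched as a toy model: the CPU programs the controller once, the controller copies the whole block, and completion is signalled at the end. `DMAController`, `DiskStub`, and their fields are hypothetical names for illustration.

```python
class DMAController:
    def __init__(self, memory):
        self.memory = memory
        self.done = False

    def program(self, src_device, dest_addr, length):
        # Step 1: one-time CPU setup; after this the CPU is free
        self.src_device = src_device
        self.dest_addr = dest_addr
        self.length = length
        self.done = False

    def run_transfer(self):
        # Step 2: controller moves the whole block without the CPU
        block = self.src_device.read_block(self.length)
        self.memory[self.dest_addr:self.dest_addr + self.length] = block
        self.done = True   # Step 3: would raise a completion interrupt here

class DiskStub:
    def read_block(self, n):
        return list(range(n))   # fake sector contents

memory = [0] * 16
dma = DMAController(memory)
dma.program(DiskStub(), dest_addr=4, length=8)
dma.run_transfer()
print(memory)  # [0, 0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 7, 0, 0, 0, 0]
```

The CPU touches the transfer only twice — at `program` and when `done` is signalled — regardless of how large the block is, which is why DMA pays off for bulk transfers but not tiny ones.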

Comparison of I/O Transfer Modes


Feature            | Programmed I/O                      | Interrupt-Driven I/O                      | DMA
-------------------|-------------------------------------|-------------------------------------------|-----------------------------------------
CPU Involvement    | High (CPU constantly polls devices) | Moderate (CPU handles interrupts)         | Low (DMA controller handles transfers)
Efficiency         | Low (wastes CPU time)               | Moderate (CPU is interrupted but free)    | High (CPU is free during data transfer)
Data Transfer Rate | Low (suitable for small transfers)  | Moderate (suitable for medium transfers)  | High (ideal for large, fast transfers)
Complexity         | Simple (easy to implement)          | Medium (interrupt management)             | Complex (requires a DMA controller)
Use Cases          | Slow devices (e.g., keyboards)      | Medium-speed devices (e.g., printers)     | High-speed devices (e.g., hard drives)

Input-Output Processor (IOP)

Input-Output Processor (IOP) in Computer Organization



An Input-Output Processor (IOP) is a specialized processor in computer systems
that manages all input and output (I/O) operations, relieving the Central Processing
Unit (CPU) from the overhead of handling I/O tasks. The IOP acts as an
intermediary between the CPU and peripheral devices, ensuring efficient data
transfer between them.
The IOP is also referred to as a Peripheral Processor or I/O Controller, and it can
directly communicate with both memory and peripheral devices, allowing the CPU to
focus on computational tasks without being bogged down by I/O management.
Functions of the IOP
The IOP is responsible for several key functions:
1. Data Transfer: Manages the transfer of data between the I/O devices (like
printers, disk drives, and network interfaces) and memory without direct involvement
from the CPU.
2. Interrupt Handling: Handles interrupts from I/O devices and performs
necessary data transfers, allowing the CPU to execute other tasks in parallel.
3. Data Formatting and Conversion: Converts data between the format
required by the I/O devices and the format used by the CPU and memory.
4. Control and Synchronization: Manages the synchronization between I/O
devices and memory, ensuring that data is transferred when both the device and
memory are ready.
5. Error Detection and Recovery: Monitors data transfer for errors, such as
parity errors, and initiates recovery processes when needed.
Interaction between CPU and IOP
In a system with an IOP, the interaction between the CPU, memory, and I/O devices
is structured to maximize efficiency:
1. Command Issuing: The CPU sends I/O commands to the IOP, specifying the
I/O operation to be performed. These commands may include:
o Which device to interact with.
o The location in memory where the data is to be read or written.
o The amount of data to transfer.
2. IOP Operation: Once the IOP receives the command, it takes over the
responsibility of controlling the I/O devices and managing data transfers. The IOP
communicates directly with memory and peripheral devices to perform the required
operation.
3. Independent Operation: The IOP operates independently of the CPU. While
the IOP is handling I/O tasks, the CPU can continue executing other processes
without waiting for the I/O operation to complete.
4. Completion Notification: After completing the I/O operation, the IOP sends
an interrupt to the CPU, indicating that the task is done. The CPU can then process
the transferred data or continue executing other instructions.
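The four-step CPU/IOP interaction above can be sketched with a command block mirroring the fields listed (device, memory address, length, direction). `CommandBlock`, `IOP`, and the placeholder data byte are illustrative, not a real machine's command format.

```python
from dataclasses import dataclass

@dataclass
class CommandBlock:
    device: str
    mem_addr: int
    length: int
    direction: str          # "read" or "write"

class IOP:
    def __init__(self, memory):
        self.memory = memory
        self.completed = []

    def execute(self, cmd: CommandBlock):
        # Steps 2-3: IOP controls the device and memory independently
        if cmd.direction == "read":
            data = [0xAB] * cmd.length   # pretend the device produced bytes
            self.memory[cmd.mem_addr:cmd.mem_addr + cmd.length] = data
        # Step 4: completion notification back to the CPU
        self.completed.append(cmd.device)

memory = [0] * 8
iop = IOP(memory)
# Step 1: CPU issues the command block, then goes on computing.
iop.execute(CommandBlock(device="disk0", mem_addr=2, length=3, direction="read"))
print(memory, iop.completed)  # [0, 0, 171, 171, 171, 0, 0, 0] ['disk0']
```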
Architecture of an IOP
An IOP typically has its own internal architecture, including several key components
similar to a basic CPU. These components enable the IOP to function independently
and manage I/O operations effectively.
1. Control Unit (CU): The control unit in the IOP is responsible for interpreting
I/O commands from the CPU and issuing appropriate control signals to peripheral
devices and memory for data transfer.
2. Arithmetic and Logic Unit (ALU): The IOP may have a simple ALU for
performing basic arithmetic and logical operations during data transfers, such as
error checking, data formatting, or simple computations related to the I/O process.



3. Registers: The IOP contains internal registers to hold data, status
information, and addresses during the transfer of data between the I/O devices and
memory.
4. Memory Access Control: The IOP manages direct access to memory via
techniques like Direct Memory Access (DMA), allowing it to read or write data
to/from memory independently of the CPU.
5. I/O Channels: These are pathways through which the IOP communicates
with various peripheral devices. The I/O channels are used to control the data flow
between the memory and the I/O devices.
Advantages of Using an IOP
1. Reduced CPU Overhead: The IOP takes over the I/O handling, freeing up
the CPU to focus on more important computational tasks, improving overall system
performance.
2. Parallel Processing: Since the IOP operates independently, the system can
handle I/O and processing tasks simultaneously, increasing throughput and
efficiency.
3. Efficient Data Transfer: The IOP can handle complex I/O tasks such as
formatting, error checking, and buffering, making data transfer faster and more
reliable.
4. Scalability: By offloading I/O operations to the IOP, the system can scale to
handle more I/O devices without overwhelming the CPU.



UNIT – V
Computer Arithmetic and Parallel Processing: Data representation- fixed point,
floating point, addition and subtraction, multiplication and division algorithms.
Parallel Processing-Parallel Processing, Pipelining, Arithmetic Pipeline, Instruction
Pipeline.

Computer Arithmetic and Parallel Processing: Data Representation - Fixed Point, Floating Point, Addition and Subtraction, Multiplication and Division Algorithms

In computer organization, arithmetic operations are performed on binary data,
which can be represented in different forms such as fixed point and floating point.
These representations enable efficient processing of both integer and fractional
numbers. Arithmetic operations like addition, subtraction, multiplication, and
division are essential for computation, and several algorithms are used to optimize
their execution, particularly in parallel processing environments.

1. Data Representation
1.1 Fixed-Point Representation
In fixed-point representation, the decimal point is fixed at a specific position in the
binary number. This format is typically used for representing integers, and in some
cases, fractional numbers, where the decimal (or binary point) does not move. Fixed-
point numbers are suitable for applications that require a constant range and
precision (e.g., integer arithmetic, real-time systems).
 Unsigned Fixed Point: Only positive numbers are represented.
o Example: 1010 (binary) = 10 (decimal)
 Signed Fixed Point: Both positive and negative numbers can be represented
using techniques like Two's Complement.
o Example: 1101 (binary) in Two's Complement = -3 (decimal)
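The two fixed-point examples above can be checked in a few lines. The helper below interprets a bit string first as unsigned and then as a two's-complement signed value.

```python
def from_twos_complement(bits: str) -> int:
    """Interpret a binary string as a two's-complement signed integer."""
    n = int(bits, 2)
    if bits[0] == "1":             # sign bit set -> value is negative
        n -= 1 << len(bits)        # subtract 2^width to get the signed value
    return n

print(int("1010", 2))              # 10  (unsigned fixed point)
print(from_twos_complement("1101"))  # -3 (signed, two's complement)
```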
1.2 Floating-Point Representation
Floating-point representation allows the decimal point to "float" relative to the
number. This format is used to represent a much larger range of numbers, including
very large and very small values, as well as fractions. It is widely used in scientific
and financial computing where a high degree of precision is needed.
The IEEE 754 Standard is commonly used for floating-point numbers, with a number
represented in three parts:
1. Sign Bit (S): Indicates if the number is positive or negative (1 for negative, 0
for positive).
2. Exponent (E): Specifies the position of the binary point.
3. Mantissa/Significant (M): Contains the actual binary digits of the number.
A floating-point number is represented as:
N = (-1)^S × M × 2^(E - Bias)
 Example: A 32-bit IEEE 754 floating-point number consists of 1 bit for the
sign, 8 bits for the exponent, and 23 bits for the mantissa.
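The three IEEE 754 fields can be extracted from a real 32-bit float using Python's `struct` module (bias 127 for single precision):

```python
import struct

def decode_float32(x: float):
    """Split a 32-bit IEEE 754 float into (sign, exponent, mantissa, value)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                       # 1 bit
    exponent = (bits >> 23) & 0xFF          # 8-bit biased exponent
    mantissa = bits & 0x7FFFFF              # 23-bit fraction field
    # Reassemble using N = (-1)^S x (1 + M/2^23) x 2^(E - 127)
    value = (-1) ** sign * (1 + mantissa / 2**23) * 2 ** (exponent - 127)
    return sign, exponent, mantissa, value

s, e, m, v = decode_float32(-6.5)
print(s, e, m, v)  # 1 129 5242880 -6.5
```

Here -6.5 = -1.625 × 2^2, so the stored exponent is 127 + 2 = 129 and the fraction field holds 0.625 × 2^23 = 5242880. (This normalized formula ignores special cases like zero, subnormals, infinity, and NaN.)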

2. Arithmetic Algorithms
2.1 Addition and Subtraction Algorithms
Fixed-Point Addition and Subtraction
 Unsigned Binary Addition: Binary numbers are added bit by bit, starting
from the least significant bit (LSB), and carry is propagated to the next bit.
o Example: Adding 1101 (13) and 1010 (10):



 1101 + 1010 = 10111 (23)
 Signed Binary Addition (Two’s Complement): In signed addition using
two’s complement, the process is similar to unsigned addition. Overflow can occur if
the result exceeds the range of the representation.
o Example: Adding 1101 (-3) and 0011 (3) gives 0000 (0).
Floating-Point Addition and Subtraction
1. Align the Exponents: Ensure both numbers have the same exponent by
shifting the smaller number's mantissa to the right.
2. Add or Subtract the Mantissas: Perform the operation on the aligned
mantissas.
3. Normalize the Result: Adjust the result by normalizing the mantissa (if
necessary) and adjusting the exponent.
4. Round and Handle Overflow/Underflow: Apply rounding and check for
overflow or underflow.
Example:
 Adding 1.110 × 2^3 and 1.001 × 2^2:
o Align exponents: 1.110 × 2^3 and 0.1001 × 2^3
o Add mantissas: 1.110 + 0.1001 = 10.0101
o Normalize: 1.00101 × 2^4
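The align-add-normalize steps in the example can be sketched on (mantissa, exponent) pairs; decimal fractions stand in for the bit patterns for readability (1.110b = 1.75, 1.001b = 1.125, 1.00101b = 1.15625).

```python
def fp_add(m1, e1, m2, e2):
    # 1. Align exponents: shift the smaller-exponent mantissa right.
    if e1 < e2:
        m1, e1, m2, e2 = m2, e2, m1, e1
    m2 /= 2 ** (e1 - e2)
    # 2. Add the aligned mantissas.
    m = m1 + m2
    e = e1
    # 3. Normalize the result so the mantissa is in [1, 2).
    while m >= 2:
        m /= 2
        e += 1
    return m, e

# 1.110b x 2^3 + 1.001b x 2^2  ->  1.00101b x 2^4
print(fp_add(1.75, 3, 1.125, 2))  # (1.15625, 4)
```

Step 4 (rounding and overflow/underflow handling) is omitted here, since Python floats carry enough precision for this example.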
2.2 Multiplication Algorithms
Fixed-Point Multiplication
Binary multiplication is performed similarly to decimal multiplication, using bit-wise
shifting and addition:
 Shift-and-Add Algorithm:
1. Multiply the LSB of the multiplier by the multiplicand.
2. Shift the multiplicand left and repeat for the next bit of the multiplier.
3. Add the results.
Example: Multiplying 1011 (11) by 1100 (12):

      1011   (multiplicand)
    × 1100   (multiplier)
  ------------
      0000   (bit 0 is 0: no partial product)
      0000   (bit 1 is 0: no partial product)
    101100   (bit 2 is 1: 1011 shifted left 2)
   1011000   (bit 3 is 1: 1011 shifted left 3)
  ------------
  10000100   (Result: 132)

Booth's Multiplication Algorithm

Booth's Algorithm is used for signed multiplication and reduces the number of
required addition/subtraction operations. It examines pairs of bits and generates
appropriate partial products by shifting and adding/subtracting.

Steps:
1. Analyze the multiplier bits two at a time.
2. Depending on the pattern, either add, subtract, or leave the multiplicand
unchanged.
3. Shift the result and repeat.

Example:
 Multiply -6 (11010) by +3 (0011) using Booth's Algorithm: scan the multiplier
bits in overlapping pairs and add, subtract, or only shift accordingly; the product
is -18.

2.3 Division Algorithms
Fixed-Point Division
Binary division is performed similarly to long division in decimal arithmetic. It
involves repeated subtraction and shifting operations.
1. Restoring Division:
o Subtract the divisor from the dividend's most significant bits.
o If the result is positive, place a 1 in the quotient and continue.
o If negative, restore the previous result by adding the divisor back, place 0
in the quotient, and continue.
2. Non-Restoring Division:
o An optimized version of the restoring method that skips the restoration step
by allowing partial remainders to become negative temporarily.
Example:
 Divide 1100 (12) by 0011 (3): quotient = 0100 (4), remainder = 0.
Floating-Point Division
Floating-point division involves dividing the mantissas and subtracting the
exponents:
1. Subtract the exponents.
2. Divide the mantissas.
3. Normalize the result.
4. Round and handle overflow/underflow.
Example:
 Dividing 1.110 × 2^3 by 1.010 × 2^2 gives:
o Mantissa: 1.110 ÷ 1.010 ≈ 1.011
o Exponent: 3 − 2 = 1
o Result: ≈ 1.011 × 2^1
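The same division example in code, on (mantissa, exponent) pairs with decimal fractions standing in for the bit patterns (1.110b = 1.75, 1.010b = 1.25):

```python
def fp_div(m1, e1, m2, e2):
    m = m1 / m2          # divide the mantissas
    e = e1 - e2          # subtract the exponents
    while m < 1:         # normalize back into [1, 2) if needed
        m *= 2
        e -= 1
    return m, e

# 1.110b x 2^3  /  1.010b x 2^2  ->  1.4 x 2^1  (~ 1.011b x 2^1)
print(fp_div(1.75, 3, 1.25, 2))
```

The exact quotient 1.4 has no finite binary representation, which is why the worked example rounds the mantissa to 1.011.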

3. Parallel Processing in Arithmetic Operations
In parallel processing, multiple arithmetic units can be used simultaneously to
improve the efficiency of these operations. Key techniques include:
 Bit-level Parallelism: Operates on multiple bits of the operands simultaneously.
For example, modern processors use 64-bit architectures to operate on 64 bits in
parallel, speeding up arithmetic operations.
 Instruction-level Parallelism (ILP): Allows the execution of multiple arithmetic
instructions concurrently. This is achieved through techniques like pipelining,
superscalar execution, and out-of-order execution.
 Array Processors and SIMD (Single Instruction, Multiple Data): These
architectures perform the same operation on multiple data points simultaneously,
which is particularly useful in matrix operations, multimedia processing, and
scientific computations.

Conclusion
Understanding data representation and efficient algorithms for arithmetic operations
like addition, subtraction, multiplication, and division is crucial in computer
organization. Fixed-point and floating-point representations offer ways to handle
integers and real numbers, while algorithms such as Booth's multiplication and
restoring division ensure efficient processing. In parallel processing, the use of
multiple arithmetic units enhances performance, making modern computing systems
capable of handling complex computations at high speeds.

Multiplication and Division Algorithms in Computer Organization


Multiplication and division are fundamental arithmetic operations in computer
organization. These operations, especially for binary numbers, can be complex and
require efficient algorithms for fast and accurate computation. Let's explore the key
algorithms used for multiplication and division in binary systems, as well as their
implementations and optimizations.

1. Multiplication Algorithms
Binary multiplication involves multiplying bits and accumulating partial results, much
like decimal multiplication. There are various algorithms for performing binary
multiplication, ranging from simple shift-and-add methods to more optimized
techniques like Booth's Algorithm.
1.1 Shift-and-Add Multiplication (Unsigned Multiplication)
The shift-and-add algorithm is the simplest form of binary multiplication, where
partial products are computed for each bit in the multiplier and added together.
Steps:
1. Initialize the result to 0.
2. For each bit in the multiplier:
o If the bit is 1, add the multiplicand to the result.
o Shift the multiplicand to the left (by 1 bit) after processing each bit of
the multiplier.
3. Shift the result after each addition to adjust for the binary position.
4. The final result is the sum of the shifted multiplicands.
Example: Multiply 13 (1101) by 12 (1100)

Multiplicand: 1101 (13)
Multiplier:   1100 (12)
-----------------------
Bit 0 of multiplier is 0: no partial product.
Bit 1 of multiplier is 0: no partial product.
Bit 2 of multiplier is 1: add 1101 shifted left 2 ->  110100 (52)
Bit 3 of multiplier is 1: add 1101 shifted left 3 -> 1101000 (104)
Final result: 110100 + 1101000 = 10011100 (156)



Complexity:
 Time complexity: O(n), where n is the number of bits in the multiplier.
 Space complexity: O(n).
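The shift-and-add steps above translate directly into a few lines of Python: for each 1-bit of the multiplier, add the correspondingly shifted multiplicand.

```python
def shift_and_add_multiply(multiplicand: int, multiplier: int) -> int:
    """Unsigned binary multiplication by repeated shift and add."""
    result = 0
    shift = 0
    while multiplier:
        if multiplier & 1:                   # current multiplier bit is 1
            result += multiplicand << shift  # add the shifted multiplicand
        multiplier >>= 1                     # examine the next bit
        shift += 1
    return result

print(shift_and_add_multiply(0b1101, 0b1100))       # 156
print(bin(shift_and_add_multiply(0b1101, 0b1100)))  # 0b10011100
```

The loop runs once per multiplier bit, matching the O(n) time complexity noted above.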

1.2 Booth's Multiplication Algorithm


Booth's Algorithm is a more efficient technique for multiplying signed binary
numbers. It reduces the number of addition and subtraction operations by encoding
the multiplier to detect consecutive ones or zeroes, thus optimizing shifts and
additions.
Steps:
1. Analyze pairs of bits in the multiplier:
o 00: No addition, only shift.
o 01: Add the multiplicand to the partial result and shift.
o 10: Subtract the multiplicand from the partial result and shift.
o 11: No addition, only shift.
2. Shift the result after each operation and adjust accordingly.
3. The process continues until all bits in the multiplier are processed.
Example: Multiply -6 (11010 in two's complement) by 3 (0011)
Multiplicand: -6 (11010)
Multiplier:    3 (0011)
----------------------
Scan the multiplier pairs (current bit, previous bit), starting with an appended 0:
- (1, 0): subtract the multiplicand, then shift.
- (1, 1): shift only.
- (0, 1): add the multiplicand, then shift.
- (0, 0): shift only.
Final result: -18 (101110 in 6-bit two's complement)
Advantages of Booth’s Algorithm:
 Efficient for numbers with large sequences of 1s or 0s.
 Reduces the number of additions/subtractions.
Complexity:
 Time complexity: O(n).
 Booth's Algorithm operates directly on signed (two's-complement) operands.
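The steps above can be implemented over n-bit two's-complement operands. This sketch keeps an accumulator A, the multiplier Q, and the extra bit Q-1 in a combined register, and arithmetic-shifts them right after each add/subtract decision.

```python
def booth_multiply(multiplicand: int, multiplier: int, n: int = 8) -> int:
    """Booth's algorithm on n-bit two's-complement operands."""
    mask = (1 << n) - 1
    M = multiplicand & mask
    A = 0
    Q = multiplier & mask
    q_minus1 = 0
    for _ in range(n):
        pair = (Q & 1, q_minus1)
        if pair == (1, 0):
            A = (A - M) & mask          # 10: subtract the multiplicand
        elif pair == (0, 1):
            A = (A + M) & mask          # 01: add the multiplicand
        # arithmetic right shift of A,Q,Q-1 as one (2n+1)-bit register
        combined = (A << (n + 1)) | (Q << 1) | q_minus1
        sign = (A >> (n - 1)) & 1       # replicate A's sign bit
        combined = (combined >> 1) | (sign << (2 * n))
        A = (combined >> (n + 1)) & mask
        Q = (combined >> 1) & mask
        q_minus1 = combined & 1
    product = (A << n) | Q
    if product >> (2 * n - 1):          # interpret as signed 2n-bit value
        product -= 1 << (2 * n)
    return product

print(booth_multiply(-6, 3))   # -18
print(booth_multiply(7, -5))   # -35
```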
1.3 Array Multiplication (Parallel Multiplication)
Array Multiplication is a parallelized approach where multiple partial products are
computed simultaneously. This algorithm is used in hardware implementations
such as multipliers in processors, enabling high-speed multiplication by leveraging
parallelism.
Key Concepts:
• Parallel computation of partial products using an array of adders.
• The result is computed in fewer cycles, making it much faster than serial algorithms.
Application:
• Widely used in VLSI (Very Large Scale Integration) circuits and ASICs (Application-Specific Integrated Circuits) for efficient hardware multiplication.
2. Division Algorithms
Division is more complex than multiplication due to the need for repeated subtraction
and shifting. Two commonly used binary division algorithms are Restoring Division
and Non-Restoring Division.
2.1 Restoring Division Algorithm
The Restoring Division Algorithm is a straightforward approach for binary division,
which performs repeated subtraction of the divisor from the dividend while restoring
the value when the subtraction leads to a negative result.
Steps:
1. Initialize the quotient and remainder to 0.
2. For each bit of the dividend:
o Shift the quotient left.
o Subtract the divisor from the current remainder.
o If the result is positive, set the quotient bit to 1.
o If the result is negative, restore the remainder by adding back the
divisor and set the quotient bit to 0.
3. Continue until all bits of the dividend have been processed.
Example: Divide 1100 (12) by 0011 (3)
Dividend: 1100
Divisor:  0011
---------------------
Bit 3: bring down 1, remainder 1; subtract 3 -> negative, restore; quotient bit = 0
Bit 2: bring down 1, remainder 11 (3); subtract 3 -> 0, non-negative; quotient bit = 1
Bit 1: bring down 0, remainder 0; subtract 3 -> negative, restore; quotient bit = 0
Bit 0: bring down 0, remainder 0; subtract 3 -> negative, restore; quotient bit = 0
Result: Quotient = 0100 (4), Remainder = 0.
Complexity:
• Time complexity: O(n).
• Restoring division requires restoration of the remainder after each negative result, which can be inefficient.
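The same loop can be sketched in Python, assuming an unsigned dividend that fits in `bits` bits (the name `restoring_divide` is ours):

```python
def restoring_divide(dividend: int, divisor: int, bits: int):
    """Unsigned restoring division; returns (quotient, remainder)."""
    r = 0                                  # remainder register
    q = dividend                           # quotient register (reuses dividend bits)
    for i in range(bits - 1, -1, -1):
        r = (r << 1) | ((q >> i) & 1)      # shift the next dividend bit into R
        r -= divisor                       # trial subtraction
        if r < 0:
            r += divisor                   # negative: restore the remainder
            q &= ~(1 << i)                 # quotient bit = 0
        else:
            q |= 1 << i                    # non-negative: quotient bit = 1
    return q & ((1 << bits) - 1), r

print(restoring_divide(12, 3, 4))          # (4, 0)
```

Reusing the dividend register for the quotient mirrors the hardware arrangement, where quotient bits are shifted in as dividend bits are shifted out.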
2.2 Non-Restoring Division Algorithm
The Non-Restoring Division Algorithm optimizes division by eliminating the need
to restore the remainder after negative results. Instead, it handles negative results by
continuing the division process and adjusting the quotient at the end.
Steps:
1. Initialize the quotient and remainder.
2. For each bit of the dividend:
o If the remainder is positive, subtract the divisor.
o If the remainder is negative, add the divisor.
o Shift the quotient left and set the quotient bit based on the result.
3. At the end of the process, adjust the quotient if necessary to account for the
sign of the remainder.
Example: Divide 1100 (12) by 0011 (3)
Dividend: 1100
Divisor:  0011
---------------------
Bit 3: remainder non-negative, shift in 1 and subtract 3 -> -2 (negative); quotient bit = 0
Bit 2: remainder negative, shift in 1 and add 3 -> 0 (non-negative); quotient bit = 1
Bit 1: remainder non-negative, shift in 0 and subtract 3 -> -3 (negative); quotient bit = 0
Bit 0: remainder negative, shift in 0 and add 3 -> -3 (negative); quotient bit = 0
Final correction: remainder negative, add 3 -> 0.
Result: Quotient = 0100 (4), Remainder = 0.
Advantages of Non-Restoring Division:
• More efficient than restoring division as it eliminates unnecessary restoration steps.
• Often used in hardware implementations for faster performance.
Complexity:
• Time complexity: O(n).
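A minimal Python sketch of the same idea, again assuming an unsigned dividend that fits in `bits` bits (`non_restoring_divide` is an illustrative name):

```python
def non_restoring_divide(dividend: int, divisor: int, bits: int):
    """Unsigned non-restoring division; returns (quotient, remainder)."""
    r, q = 0, 0
    for i in range(bits - 1, -1, -1):
        bit = (dividend >> i) & 1
        if r >= 0:
            r = ((r << 1) | bit) - divisor   # non-negative remainder: subtract
        else:
            r = ((r << 1) | bit) + divisor   # negative remainder: add instead
        q = (q << 1) | (1 if r >= 0 else 0)  # quotient bit from the new sign
    if r < 0:                                # single final correction step
        r += divisor
    return q, r

print(non_restoring_divide(12, 3, 4))        # (4, 0)
```

Note that a negative intermediate remainder is never restored inside the loop; at most one corrective addition is needed at the very end.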
2.3 Division by Repeated Subtraction
Repeated Subtraction is the simplest approach for division, where the divisor is
repeatedly subtracted from the dividend until the remainder is less than the divisor.
Steps:
1. Subtract the divisor from the dividend.
2. Increment the quotient for each subtraction.
3. Repeat until the remainder is less than the divisor.
Example:
12 ÷ 3:
12 - 3 = 9 (quotient = 1)
9 - 3 = 6 (quotient = 2)
6 - 3 = 3 (quotient = 3)
3 - 3 = 0 (quotient = 4)
Complexity:
• Time complexity: proportional to the quotient, which for n-bit operands can be as large as O(2^n).
• This method is inefficient for large numbers and is mostly used for manual division or simple cases.
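The steps above translate directly into code; the sketch below (function name ours) assumes a non-negative dividend and a positive divisor:

```python
def divide_by_subtraction(dividend: int, divisor: int):
    """Division by repeated subtraction (non-negative dividend, positive divisor)."""
    quotient = 0
    while dividend >= divisor:
        dividend -= divisor      # take away one more copy of the divisor
        quotient += 1
    return quotient, dividend    # what is left over is the remainder

print(divide_by_subtraction(12, 3))   # (4, 0)
```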
Conclusion
Multiplication and division algorithms in computer organization are essential for
arithmetic operations in CPUs and hardware systems. Algorithms like shift-and-add
multiplication and Booth's multiplication are widely used for binary multiplication,
while restoring and non-restoring division algorithms optimize binary division.
Modern processors often use parallelism, such as in array multipliers, to speed up
these operations. Understanding these algorithms is crucial for designing efficient
arithmetic logic units (ALUs) and for optimizing system performance.
Parallel Processing: Parallel Processing, Pipelining, Arithmetic Pipeline, Instruction Pipeline

Parallel Processing in Computer Organization
Parallel processing is a method used in computer organization to perform multiple
computations simultaneously. It involves the concurrent execution of operations,
either by breaking down tasks into smaller sub-tasks that can be performed
independently or by exploiting hardware resources to execute multiple instructions at
the same time. This leads to higher system throughput and efficient use of
computing resources.
1. Parallel Processing
Parallel processing involves dividing tasks into smaller units that can be executed
simultaneously by multiple processors or cores. It is particularly beneficial in
applications requiring large-scale computations, such as scientific simulations,
graphics rendering, and data analysis.
Types of Parallelism:
• Data Parallelism: Different data sets are processed in parallel, using the same operation. This is useful in vector processing and SIMD (Single Instruction, Multiple Data) architectures.
• Task Parallelism: Different tasks or instructions are executed in parallel, each performing different operations on possibly the same or different data. This is common in MIMD (Multiple Instruction, Multiple Data) systems.
Key Features:
• Speedup: The primary advantage of parallel processing is the speedup gained by distributing work across multiple processing units.
• Scalability: Systems can be scaled by adding more processors to handle larger workloads.
Architectures for Parallel Processing:
• Multi-core Processors: A single chip with multiple processing cores.
• Multiprocessor Systems: Multiple CPUs that can share memory and resources.
• Clusters and Grids: Networks of interconnected computers that work together to solve a problem.
2. Pipelining
Pipelining is a technique where multiple instruction stages are processed
simultaneously in different pipeline stages. It improves the instruction throughput by
overlapping the execution of several instructions.
Pipeline Stages:
A typical pipeline is divided into stages, each performing a part of an instruction’s
execution. Common stages include:
1. Fetch (IF): Retrieve the next instruction from memory.
2. Decode (ID): Interpret the instruction and prepare necessary resources.
3. Execute (EX): Perform the required computation or data movement.
4. Memory Access (MEM): Access data from memory, if required.
5. Write Back (WB): Store the result back into a register or memory.
Example:
For a simple instruction like ADD R1, R2, R3 (R1 = R2 + R3), each pipeline stage
performs one part of the process:
• IF: Fetch the instruction.
• ID: Decode the instruction and read registers R2 and R3.
• EX: Add the contents of R2 and R3.
• MEM: Access memory if needed (not in this case).
• WB: Write the result back to R1.
Benefits of Pipelining:
• Increased Throughput: Multiple instructions are processed simultaneously, increasing the number of instructions executed per unit of time.
• Better Instruction Rate: Each individual instruction still takes multiple cycles, but because stages overlap, the average time per completed instruction approaches one cycle.
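The throughput gain can be quantified with the standard cycle-count formulas: an ideal k-stage pipeline finishes n instructions in k + (n - 1) cycles, versus n × k cycles when instructions run strictly one after another. A quick Python check (function names ours):

```python
def sequential_cycles(n_instructions: int, n_stages: int) -> int:
    """Cycles when each instruction finishes all stages before the next starts."""
    return n_instructions * n_stages

def pipelined_cycles(n_instructions: int, n_stages: int) -> int:
    """Cycles on an ideal pipeline with no stalls."""
    return n_stages + (n_instructions - 1)

n, k = 100, 5
print(sequential_cycles(n, k))   # 500
print(pipelined_cycles(n, k))    # 104
```

For large n the speedup ratio approaches k, the number of stages, which is why deeper pipelines raise peak throughput.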
Hazards in Pipelining:
Pipelining introduces certain challenges, known as pipeline hazards:
• Data Hazards: Occur when instructions depend on the results of previous instructions still in the pipeline.
• Control Hazards: Occur due to branch instructions that may change the flow of execution, requiring pipeline flushing.
• Structural Hazards: Occur when hardware resources are insufficient to support all pipeline stages simultaneously.
3. Arithmetic Pipeline
An Arithmetic Pipeline is a type of pipeline specifically designed to handle the
stages of arithmetic operations such as addition, multiplication, and floating-point
computations. Arithmetic pipelines break down complex arithmetic operations into
smaller stages that can be processed in parallel.
Example of Arithmetic Pipeline:
For a floating-point multiplication, an arithmetic pipeline might be divided into stages:
1. Mantissa Multiplication: Multiply the mantissas of the two floating-point
numbers.
2. Exponent Addition: Add the exponents of the numbers.
3. Normalization: Adjust the result to fit the standard floating-point
representation.
4. Rounding: Round the result to the appropriate precision.
Each of these stages can be executed in parallel for different sets of numbers,
thereby speeding up the overall multiplication operation.
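The four stages can be mimicked in software. The sketch below is a simplification we wrote for illustration: it represents a number as a (mantissa, exponent) pair with 1.0 <= mantissa < 2.0, standing for mantissa × 2^exponent, and omits hardware rounding:

```python
def fp_multiply_stages(a, b):
    """Stage-by-stage floating-point multiply on (mantissa, exponent) pairs."""
    # Stage 1: multiply the mantissas (the product lies in [1.0, 4.0))
    man = a[0] * b[0]
    # Stage 2: add the exponents
    exp = a[1] + b[1]
    # Stage 3: normalize the mantissa back into [1.0, 2.0)
    if man >= 2.0:
        man /= 2.0
        exp += 1
    # Stage 4: rounding to a fixed mantissa width would happen here (omitted)
    return man, exp

# (1.5 * 2**2) * (1.25 * 2**1)  =  6 * 2.5  =  15
man, exp = fp_multiply_stages((1.5, 2), (1.25, 1))
print(man * 2 ** exp)   # 15.0
```

In a pipelined unit each of these stages is separate hardware, so while one pair of numbers is being normalized, the next pair can already be in the mantissa multiplier.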
Use Cases:
• Digital Signal Processing (DSP): In applications like filtering and convolution, arithmetic pipelines can accelerate operations.
• Vector Processors: Arithmetic pipelines are integral to vector processors, which perform the same operation on multiple data points simultaneously.
4. Instruction Pipeline
An Instruction Pipeline refers to the pipelined execution of instructions in a CPU.
Modern processors use instruction pipelines to fetch, decode, execute, and complete
multiple instructions in an overlapping manner.
Steps in Instruction Pipelining:
1. Fetch: Get the next instruction from memory.
2. Decode: Determine what the instruction does and prepare the necessary
operands.
3. Execute: Perform the computation or action specified by the instruction.
4. Memory Access: Read/write to memory if needed.
5. Write Back: Store the result of the instruction.
Example:
Consider executing the following sequence of instructions:
1. ADD R1, R2, R3
2. SUB R4, R5, R6
3. MUL R7, R8, R9
In a non-pipelined processor, each instruction would need to complete before the
next one starts, resulting in a long execution time. In contrast, with pipelining:
• While the first instruction is in the execute stage, the second can be in the decode stage, and the third in the fetch stage. This overlap increases throughput significantly.
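The overlap can be visualised by printing which stage each instruction occupies in each clock cycle. In the ideal schedule sketched below (our own illustration), instruction i enters stage s at cycle i + s:

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]
INSTRS = ["ADD R1,R2,R3", "SUB R4,R5,R6", "MUL R7,R8,R9"]

def schedule(n_instrs: int, n_stages: int) -> dict:
    """Map (instruction, stage) -> clock cycle for an ideal stall-free pipeline."""
    return {(i, s): i + s for i in range(n_instrs) for s in range(n_stages)}

cycles = schedule(len(INSTRS), len(STAGES))
total = len(INSTRS) + len(STAGES) - 1          # 7 cycles for 3 instructions
for i, name in enumerate(INSTRS):
    row = ["   ."] * total
    for s, stage in enumerate(STAGES):
        row[cycles[(i, s)]] = f"{stage:>4}"
    print(f"{name:<14}" + "".join(row))
```

Reading the rows: while ADD is in EX (cycle 2), SUB is in ID and MUL is in IF, which is exactly the overlap described above.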
Challenges in Instruction Pipelines:
• Instruction Dependencies: When one instruction depends on the result of a previous instruction, the pipeline may need to stall until the result is ready.
• Branch Instructions: Branches can disrupt the pipeline flow since the next instruction depends on the outcome of the branch.
Techniques to Improve Pipeline Efficiency:
• Branch Prediction: Predict the outcome of branches to avoid stalls.
• Forwarding (Bypassing): Pass the result of an operation directly to a dependent instruction without waiting for it to be written to a register, reducing data-hazard stalls.
• Superpipelining: Increase the number of pipeline stages so that each stage does less work, improving instruction throughput.
Parallelism in Pipelines
Super-scalar Processing:
Modern processors implement super-scalar architectures where multiple pipelines
operate in parallel, allowing the execution of more than one instruction per clock
cycle. This provides an even higher level of parallelism.
SIMD and MIMD:
• SIMD (Single Instruction, Multiple Data) pipelines operate on multiple data elements in parallel using the same instruction, ideal for vector and matrix operations.
• MIMD (Multiple Instruction, Multiple Data) systems support multiple pipelines executing different instructions on different data, maximizing flexibility and performance in multiprocessor systems.
5. Summary of Key Concepts
Concept               Description
Parallel Processing   Performing multiple computations simultaneously using multiple processors or cores.
Pipelining            Overlapping the execution of instructions by dividing them into stages that execute simultaneously.
Arithmetic Pipeline   A pipeline that processes arithmetic operations in parallel, breaking them into smaller sub-operations (e.g., floating-point operations).
Instruction Pipeline  A pipeline used in processors to fetch, decode, execute, and complete instructions concurrently, improving throughput.
Conclusion
Parallel processing, through techniques like pipelining, arithmetic pipelines, and
instruction pipelines, is a powerful way to enhance the performance of modern
processors. By allowing multiple operations or instructions to be processed
simultaneously, these techniques maximize efficiency, reduce execution time, and
enable high-performance computing in various applications, from general-purpose
processors to specialized systems for scientific computing and multimedia
processing.