17 Computer Architecture and Organization
(CSE2003)
7-Apr-25 1
Multiprocessor Architectures
Module 6
Background
If an application needs a huge amount of computation in a very
short period, the following are required at low cost:
• Multiple functional units
• Large cache
• Separate buses for instructions and data
• Interleaved main memory
• Fast circuit technology
Introduction
• A traditional way to increase system performance is to use multiple
processors that execute in parallel to support a given workload.
• The three most common multiple-processor organizations are
1. Symmetric multiprocessors (SMPs)
2. Clusters
3. Nonuniform memory access (NUMA)
Symmetric multiprocessors (SMPs)
• An SMP consists of multiple similar processors within the same
computer, interconnected by a bus or some sort of switching
arrangement.
• The most critical problem to address in an SMP is that of cache
coherence. Each processor has its own cache and so it is possible for a
given line of data to be present in more than one cache.
• If such a line is altered in one cache, then both main memory and the
other caches hold an invalid version of that line.
• Cache coherence protocols are designed to cope with this problem.
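The problem described above can be reproduced with a toy model. This is a minimal sketch, assuming two private write-back caches over one shared memory; the names `Cache`, `memory`, and the address `"X"` are illustrative and not taken from any real system.

```python
# Toy model: two private write-back caches over one shared main memory.
# All names (Cache, memory, "X") are illustrative assumptions.

memory = {"X": 10}          # shared main memory: address -> value


class Cache:
    """A private cache with no coherence protocol (deliberately broken)."""

    def __init__(self):
        self.lines = {}      # address -> locally cached value

    def read(self, addr):
        if addr not in self.lines:          # miss: fetch the line from memory
            self.lines[addr] = memory[addr]
        return self.lines[addr]             # hit: serve the local copy

    def write(self, addr, value):
        self.lines[addr] = value            # write-back: memory is not updated yet


cache0, cache1 = Cache(), Cache()
cache0.read("X")            # both processors load the same line
cache1.read("X")
cache0.write("X", 99)       # processor 0 alters its copy

stale_in_cache1 = cache1.read("X")   # cache1 still serves its old copy
stale_in_memory = memory["X"]        # main memory also holds the old value
```

After the write, processor 0 sees the new value while the other cache and main memory still hold the old one, which is the inconsistency a coherence protocol must remove.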
Cluster
• A cluster is a group of interconnected, whole computers working
together as a unified computing resource that can create the illusion
of being one machine.
• The term whole computer means a system that can run on its own,
apart from the cluster.
Nonuniform memory access (NUMA)
Generic Block Diagram of a Tightly Coupled
Multiprocessor NUMA
Supercomputer vs Multiprocessor
Supercomputer
• Supercomputers typically combine millions of microprocessors to
interpret and execute instructions.
• They are used for scientific research and commercial purposes, such
as CAD and fluid flow analysis.
• High purchase price, as well as high maintenance and performance
costs.
Multiprocessor
• High performance is achieved by arranging two or more processors on
a single chip, or different processors on separate chips.
• A multiprocessor system has efficient bandwidth and medium processing
power, memory capacity, and I/O devices.
• It is suitable for ordinary workstations at reasonable cost.
Multi Core Architecture
• A multi-core processor is an
integrated circuit containing
two or more processor cores,
allowing faster simultaneous
processing of several tasks,
reduced power consumption,
and greater performance.
Each core independently
reads and executes program
instructions.
Multi-core processor
• In other words, a multi-core processor comprises several
processing units, or "cores", on a single chip, each of
which can perform a distinct task.
• For instance, if you are watching a movie and using
WhatsApp at the same time, one core can handle the movie
playback while another handles WhatsApp.
Parallel Processing
A large computation can be divided into many parts that can be
performed in parallel.
For example, I/O transfer and processor computation can be performed
in parallel.
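Dividing a computation into independently processed parts can be sketched as follows. This is an illustrative example only: the function names, the sum-of-squares workload, and the worker count are assumptions, and in CPython threads mainly show the structure of the decomposition rather than a guaranteed speedup for CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(chunk):
    """Work done on one part of the data: sum of squares of the chunk."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, workers=4):
    """Split data into roughly equal chunks, process them concurrently, combine."""
    size = max(1, (len(data) + workers - 1) // workers)   # ceiling division
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each chunk is handed to a worker; partial results are then added up.
        return sum(pool.map(partial_sum, chunks))


data = list(range(1000))
total = parallel_sum_of_squares(data)
```

The combined result matches the sequential computation because each part is independent of the others; only the final addition needs all partial results.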
Types of Parallel Processor Systems (Flynn's Classification)
• Single instruction, single data (SISD) stream
• Single instruction, multiple data (SIMD) stream
• Multiple instruction, single data (MISD) stream
• Multiple instruction, multiple data (MIMD) stream
Single instruction, single data (SISD) stream
Single instruction, multiple data (SIMD) stream
• A single machine instruction
controls the simultaneous
execution of a number of
processing elements on a
lockstep basis. Each
processing element has an
associated data memory, so
that each instruction is
executed on a different set of
data by the different
processors. Vector and array
processors fall into this
category.
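The SIMD idea of one instruction applied in lockstep to several processing elements, each with its own data memory, can be modelled in a few lines. This is a software sketch under stated assumptions: `ProcessingElement` and `broadcast` are hypothetical names, and the sequential loop only stands in for what hardware would do simultaneously.

```python
class ProcessingElement:
    """One processing element with its own associated data memory."""

    def __init__(self, local_data):
        self.local_data = list(local_data)

    def execute(self, instruction):
        # Apply the broadcast instruction to every item in local memory.
        self.local_data = [instruction(x) for x in self.local_data]


def broadcast(instruction, elements):
    """Control unit: issue the same single instruction to all elements."""
    for pe in elements:      # stands in for simultaneous lockstep execution
        pe.execute(instruction)


elements = [
    ProcessingElement([1, 2]),
    ProcessingElement([10, 20]),
    ProcessingElement([100, 200]),
]
broadcast(lambda x: x + 1, elements)       # single instruction: "add 1"
results = [pe.local_data for pe in elements]
```

Every element executed the same instruction, but each produced a different result because it operated on a different set of data.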
Multiple instruction, single data (MISD) stream
• A sequence of data is
transmitted to a set of
processors, each of which
executes a different
instruction sequence. This
structure is not
commercially implemented.
Multiple instruction, multiple data (MIMD)
stream
• A set of processors
simultaneously execute
different instruction
sequences on different data
sets. SMPs, clusters, and
NUMA systems fit into this
category.
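By contrast with SIMD, MIMD lets each processor run its own instruction sequence on its own data. A minimal sketch, assuming threads stand in for processors; the two task functions and their inputs are invented for illustration.

```python
import threading


def count_words(text, out):
    """Instruction sequence 1: operates on text data."""
    out["words"] = len(text.split())


def sum_numbers(numbers, out):
    """Instruction sequence 2: operates on numeric data."""
    out["total"] = sum(numbers)


results = {}
workers = [
    threading.Thread(target=count_words,
                     args=("multiple instruction multiple data", results)),
    threading.Thread(target=sum_numbers, args=([1, 2, 3, 4], results)),
]
for worker in workers:
    worker.start()           # both instruction streams are now active
for worker in workers:
    worker.join()            # wait for both to finish
```

The two workers share nothing except the result dictionary, and each writes a different key, so no further synchronization is needed in this sketch.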
CACHE COHERENCE
• In contemporary multiprocessor systems, it is customary to have one
or two levels of cache associated with each processor.
• This organization improves performance but creates a problem known
as the cache coherence problem.
• Multiple copies of the same data can exist in different caches
simultaneously, and if processors are allowed to update their own
copies freely, an inconsistent view of memory can result.
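One way to stop processors from updating copies freely is to invalidate every other copy on a write. The following is a simplified write-through, write-invalidate sketch, not a full protocol such as MESI; `Bus`, `CoherentCache`, and the address `"X"` are hypothetical names chosen for illustration.

```python
class Bus:
    """Shared bus connecting all caches to main memory (hypothetical model)."""

    def __init__(self, memory):
        self.memory = memory     # address -> value
        self.caches = []

    def attach(self, cache):
        self.caches.append(cache)

    def broadcast_invalidate(self, addr, source):
        for cache in self.caches:
            if cache is not source:
                cache.lines.pop(addr, None)   # drop the now-stale copy


class CoherentCache:
    """Private cache that announces every write on the shared bus."""

    def __init__(self, bus):
        self.lines = {}
        self.bus = bus
        bus.attach(self)

    def read(self, addr):
        if addr not in self.lines:            # miss: refetch from memory
            self.lines[addr] = self.bus.memory[addr]
        return self.lines[addr]

    def write(self, addr, value):
        self.bus.broadcast_invalidate(addr, source=self)
        self.lines[addr] = value
        self.bus.memory[addr] = value         # write-through keeps memory current


bus = Bus({"X": 10})
c0, c1 = CoherentCache(bus), CoherentCache(bus)
c0.read("X")
c1.read("X")                      # both caches now hold the line
c0.write("X", 99)                 # invalidates the copy held by c1
value_seen_by_c1 = c1.read("X")   # miss, then refetch of the new value
```

Because the stale copy is removed before it can be read again, every processor observes the most recent write, at the cost of extra bus traffic on each write.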
Memory Organization for Multiprocessor
Architecture
Uniform memory access (UMA)
• All processors have access to all parts
of main memory using loads and
stores.
• The memory access time of a
processor to all regions of memory is
the same.
• The access times experienced by
different processors are the same.
• The memory modules are accessed
using a single global address space.
• In such a shared-memory system, all
processors access all memory
modules in the same way.
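The UMA property can be stated as a simple check over a latency table. This is a sketch with made-up numbers: `latency[p][m]` is assumed to be the access time of processor `p` to memory module `m` in arbitrary units, and the non-uniform table is included only for contrast.

```python
def is_uniform(latency):
    """Return True if every processor sees the same access time to every module."""
    times = {t for row in latency for t in row}
    return len(times) <= 1


# Hypothetical latency tables (arbitrary time units), rows = processors.
uma_latency = [
    [100, 100],
    [100, 100],
]
non_uniform_latency = [
    [100, 300],   # processor 0: module 0 is faster to reach than module 1
    [300, 100],   # processor 1: the reverse
]

uma_ok = is_uniform(uma_latency)
non_uniform_ok = is_uniform(non_uniform_latency)
```

The first table satisfies both UMA conditions listed above (same time to all regions, same time for all processors); the second violates them because access time depends on which module is addressed.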
Nonuniform memory access (NUMA)
Interconnection Structures