Basic Communication Operations
Preliminaries
• A big problem is divided into smaller tasks (logical units)
• A process is an entity that executes tasks
• Mapping allocates tasks to processes
• Several processes execute at the same time and perform Inter-Process Communication (interaction)
• Interaction is performed to share data, work, and synchronization information
• There are various patterns for communication
Assumptions for the Operations
• Interconnections support cut-through routing
• The communication time between any pair of nodes in the network is the same (regardless of the number of intermediate nodes)
• Links are bi-directional
• Two directly connected nodes can simultaneously send and receive messages of m words without any congestion
• Single-port communication model
• A node can send on only one of its links at a time
• A node can receive on only one of its links at a time
• However, a node can receive a message while sending another message at the same time, on the same or a different link
Patterns
1. One-to-All Broadcast / All-to-One Reduction
2. All-to-All Broadcast / All-to-All Reduction
3. All-Reduce (All-to-One Reduction + One-to-All Broadcast)
4. Scatter (One-to-All Personalized Communication) / Gather
Topologies
1. Ring / Linear Array (One-Dimensional)
2. Mesh (Two-Dimensional)
3. Hypercube (Three-Dimensional)
One-to-All Broadcast and All-to-One Reduction
One-to-All Broadcast
• A single process sends identical data to all other processes.
• Initially, only one process holds the m-word data.
• After the broadcast operation, every process has its own copy of the m-word data.
All-to-One Reduction
• Dual of one-to-all broadcast
• The m-word data from all processes are combined through an associative operator
• The result is accumulated at a single destination process into one buffer of size m
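These two primitives map directly onto the MPI collectives MPI_Bcast and MPI_Reduce. The slides do not mention MPI; this is a minimal sketch for illustration, assuming rank 0 as the source and destination:

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int m = (rank == 0) ? 42 : 0;   /* only the source holds the data */
    MPI_Bcast(&m, 1, MPI_INT, 0, MPI_COMM_WORLD);  /* one-to-all broadcast */

    int sum = 0;                    /* all-to-one reduction with SUM */
    MPI_Reduce(&m, &sum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0) printf("reduced sum = %d\n", sum);

    MPI_Finalize();
    return 0;
}
```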
One-to-All Broadcast and All-to-One Reduction
• Application: used in many parallel algorithms, including matrix-vector multiplication, shortest paths, and Gaussian elimination.
• Naive approach: the source sequentially sends p-1 messages, one to each of the other p-1 processes.
• Disadvantages:
• The source becomes a bottleneck
• The communication network is underutilized because only the connection between a single pair of nodes is used at a time
• Solution: Recursive Doubling
Recursive doubling (Linear Array or Ring)
Recursive Doubling Broadcast
• The source process sends the message to one other process
• In the next communication phase, both processes can simultaneously propagate the message
• The message “HI” from the source node P0 is passed to all other nodes in the ring in the following three steps:
1. P0 to P4 (Distance:4)
2. P0 to P2, P4 to P6, in parallel (Distance:2)
3. P0 to P1, P2 to P3, P4 to P5, P6 to P7, in parallel (Distance:1)
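A minimal sketch of these three steps in C with MPI point-to-point calls (MPI is not part of the slides, just a convenient vehicle; p is assumed to be a power of two and rank 0 the source):

```c
#include <mpi.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, p;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &p);

    char msg[3] = "";
    if (rank == 0) strcpy(msg, "HI");   /* only the source holds the data */

    /* distances p/2, p/4, ..., 1 mirror the three steps on the slide */
    for (int d = p / 2; d >= 1; d /= 2) {
        if (rank % (2 * d) == 0)        /* already has the message: send  */
            MPI_Send(msg, 3, MPI_CHAR, rank + d, 0, MPI_COMM_WORLD);
        else if (rank % d == 0)         /* partner at distance d: receive */
            MPI_Recv(msg, 3, MPI_CHAR, rank - d, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
    }
    printf("rank %d has \"%s\"\n", rank, msg);
    MPI_Finalize();
    return 0;
}
```

With p = 8 this reproduces exactly the three steps above: 0→4, then 0→2 and 4→6 in parallel, then all distance-1 transfers in parallel.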
Recursive doubling (Linear Array or Ring)
Recursive Doubling Reduction
Example: Sum of all numbers
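A sketch of this reduction, again in C with MPI as an assumed vehicle: the broadcast steps are run in reverse (distances 1, 2, ..., p/2) and partial sums are combined on the way to node 0. p is assumed to be a power of two.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, p;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &p);

    int val = rank + 1;                      /* each node contributes one number */

    for (int d = 1; d < p; d *= 2) {
        if (rank % (2 * d) == d) {           /* hand my partial sum downward */
            MPI_Send(&val, 1, MPI_INT, rank - d, 0, MPI_COMM_WORLD);
            break;                           /* this node is done */
        } else if (rank % (2 * d) == 0) {
            int tmp;
            MPI_Recv(&tmp, 1, MPI_INT, rank + d, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            val += tmp;                      /* combine with SUM */
        }
    }
    if (rank == 0) printf("total = %d\n", val);   /* equals p(p+1)/2 */
    MPI_Finalize();
    return 0;
}
```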
Mesh
• We can regard each row and column of a square mesh of p nodes as a linear array of √p nodes
• Communication algorithms on the mesh are simple extensions of their linear array counterparts
Broadcast and Reduction
• Two-step breakdown (see the sketch below):
i. The operation is performed along one dimension, treating each row as a linear array
ii. Then all the columns are treated similarly
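One way to express this two-phase pattern is with MPI's Cartesian topology helpers; a sketch under assumptions not in the slides (MPI itself, rank 0 as the source, MPI_Dims_create choosing the grid shape, row-major MPI coordinates rather than the column-wise numbering of the figure below):

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, p;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &p);

    /* arrange the p processes as a 2-D grid (run with p = 16 for 4 x 4) */
    int dims[2] = {0, 0}, periods[2] = {0, 0};
    MPI_Dims_create(p, 2, dims);
    MPI_Comm grid, row, col;
    MPI_Cart_create(MPI_COMM_WORLD, 2, dims, periods, 0, &grid);

    int keep_row[2] = {0, 1}, keep_col[2] = {1, 0};
    MPI_Cart_sub(grid, keep_row, &row);   /* communicator for my row    */
    MPI_Cart_sub(grid, keep_col, &col);   /* communicator for my column */

    int coords[2];
    MPI_Cart_coords(grid, rank, 2, coords);

    char msg[3] = "";
    if (rank == 0) { msg[0] = 'H'; msg[1] = 'I'; }

    /* phase i: broadcast along the source's row only */
    if (coords[0] == 0) MPI_Bcast(msg, 3, MPI_CHAR, 0, row);
    /* phase ii: every column broadcasts from its row-0 node */
    MPI_Bcast(msg, 3, MPI_CHAR, 0, col);

    printf("rank %d: \"%s\"\n", rank, msg);
    MPI_Finalize();
    return 0;
}
```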
One-to-All Broadcast on a 16-node mesh
[Figure: a 4 x 4 mesh with nodes numbered column-wise, node 0 at the bottom-left (columns 0-3, 4-7, 8-11, 12-15); the message “HI” starts at node 0.]
• Step 1 (0th-row recursive doubling): 0 → 8
• Step 2 (0th-row recursive doubling): 0 → 4 and 8 → 12; the whole bottom row now holds “HI”
• Step 3 (all-column recursive doubling): 0 → 2, 4 → 6, 8 → 10, 12 → 14, in parallel
• Step 4 (all-column recursive doubling): 0 → 1, 2 → 3, ..., 14 → 15, in parallel; every node now holds “HI”
Reduction
[Figure: the same steps in reverse perform an all-to-one reduction on the 4 x 4 mesh: starting with all 16 nodes holding data, messages are combined within each column (distance 1, then distance 2) until only the bottom row 0, 4, 8, 12 holds partial results, then along that row (12 → 8 and 4 → 0, then 8 → 0) until node 0 holds the accumulated result.]
Mesh (Broadcast and Reduction)
Hypercube
Broadcast
• The source node first sends the data to one node along the highest dimension
• The communication then proceeds successively along lower dimensions in subsequent steps
• The algorithm is the same as the one used for the linear array
• However, in the hypercube, changing the order of dimensions does not congest the network
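A sketch of this dimension-by-dimension broadcast in C with MPI (an assumption; the mask-based formulation below is the classic one for p = 2^d nodes with source 0):

```c
#include <mpi.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, p;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &p);

    char msg[3] = "";
    if (rank == 0) strcpy(msg, "HI");

    int d = 0;
    while ((1 << d) < p) d++;             /* d = log2 p */

    int mask = p - 1;                     /* all d bits set */
    for (int i = d - 1; i >= 0; i--) {    /* highest dimension first */
        mask ^= (1 << i);                 /* clear bit i */
        if ((rank & mask) == 0) {         /* lower i bits of rank are 0 */
            int partner = rank ^ (1 << i);
            if ((rank & (1 << i)) == 0)   /* holder of the data: send   */
                MPI_Send(msg, 3, MPI_CHAR, partner, 0, MPI_COMM_WORLD);
            else                          /* partner across dim i: recv */
                MPI_Recv(msg, 3, MPI_CHAR, partner, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
        }
    }
    printf("rank %d: \"%s\"\n", rank, msg);
    MPI_Finalize();
    return 0;
}
```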
Hypercube (Broadcast)
Matrix-Vector Multiplication (An Application)
All-to-All Broadcast and All-to-All Reduction
• All-to-All Broadcast
• A generalization of one-to-all broadcast.
• Every process broadcasts an m-word message.
• The broadcast message of each process can be different from the others'.
• All-to-All Reduction
• Dual of all-to-all broadcast.
• Each node is the destination of one all-to-one reduction out of p total reductions.
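In MPI terms (an illustration; the slides do not name MPI), all-to-all broadcast is MPI_Allgather and all-to-all reduction is MPI_Reduce_scatter_block. A minimal sketch with one-word blocks, assuming p <= 64:

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, p;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &p);

    /* all-to-all broadcast: every process contributes one word,
       afterwards everyone holds all p words */
    int mine = 10 * rank, all[64];        /* assumes p <= 64 */
    MPI_Allgather(&mine, 1, MPI_INT, all, 1, MPI_INT, MPI_COMM_WORLD);

    /* all-to-all reduction: p concurrent reductions, one result per node;
       vec[k] is this process's contribution to the reduction rooted at k */
    int vec[64], out;
    for (int k = 0; k < p; k++) vec[k] = rank + k;
    MPI_Reduce_scatter_block(vec, &out, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);

    printf("rank %d: out = %d\n", rank, out);
    MPI_Finalize();
    return 0;
}
```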
All-to-All Broadcast and All-to-All Reduction
Linear Ring Broadcast (All to All)
Linear Ring Reduction (All to All)
• Draw an all-to-all broadcast on a p-node linear ring (a sketch of the broadcast half follows)
• Reverse the direction of each step without changing the messages
• After each communication step, combine messages having the same broadcast destination with the associative operator.
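The broadcast half can be sketched in C with MPI point-to-point calls (MPI and the p <= 64 buffer cap are assumptions): in each of p-1 steps, every node forwards the block it received last to its right neighbour. Reversing the directions and combining, per the bullets above, gives the reduction.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, p;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &p);

    int left = (rank - 1 + p) % p, right = (rank + 1) % p;
    int result[64];                  /* result[i] will hold node i's block */
    result[rank] = 10 * rank;        /* my own block (m = 1 word here)     */

    int cur = rank;                  /* index of the block forwarded next  */
    for (int step = 1; step < p; step++) {
        int incoming;
        MPI_Sendrecv(&result[cur], 1, MPI_INT, right, 0,
                     &incoming, 1, MPI_INT, left, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        cur = (cur - 1 + p) % p;     /* the block that just arrived */
        result[cur] = incoming;
    }
    printf("rank %d: result[0] = %d\n", rank, result[0]);
    MPI_Finalize();
    return 0;
}
```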
Task
• Draw an All-to-All Broadcast on a 4-node linear ring
• Reverse the directions and combine the results using ‘SUM’
All-to-All Broadcast on 2D Mesh
• Based on the linear array algorithm, treating rows and columns of the mesh as linear arrays
• Communication takes place in two phases (see the sketch below):
• Row-wise all-to-all broadcast
• Column-wise all-to-all broadcast
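A sketch of the two phases using the same MPI Cartesian helpers as the earlier mesh sketch (all assumptions, not slide content): after the row phase each node holds its whole row's blocks; the column phase then exchanges those bundles.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, p;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &p);

    int dims[2] = {0, 0}, periods[2] = {0, 0};
    MPI_Dims_create(p, 2, dims);
    MPI_Comm grid, row, col;
    MPI_Cart_create(MPI_COMM_WORLD, 2, dims, periods, 0, &grid);
    int keep_row[2] = {0, 1}, keep_col[2] = {1, 0};
    MPI_Cart_sub(grid, keep_row, &row);
    MPI_Cart_sub(grid, keep_col, &col);

    int mine = 10 * rank;
    int rowbuf[64], all[64];              /* assumes p <= 64 */

    /* phase 1: all-to-all broadcast inside each row */
    MPI_Allgather(&mine, 1, MPI_INT, rowbuf, 1, MPI_INT, row);
    /* phase 2: exchange the row bundles inside each column */
    MPI_Allgather(rowbuf, dims[1], MPI_INT, all, dims[1], MPI_INT, col);

    printf("rank %d: gathered all %d blocks\n", rank, p);
    MPI_Finalize();
    return 0;
}
```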
All-to-All Broadcast on HyperCube
• The hypercube algorithm for all-to-all broadcast extends
the mesh algorithm to log p dimensions.
• Procedure: Requires log p steps.
• Communication: Occurs along a different dimension (x, y,
z) of the p-node hypercube in each step.
• Step Process: Pairs of nodes exchange data, doubling the
message size for the next step by concatenating received
messages with current data.
• The figure illustrates these steps for an eight-node hypercube with bidirectional communication channels.
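A sketch of the pairwise-exchange steps in C with MPI (an assumption; p = 2^d and p <= 64 assumed): in step i, partners across dimension i swap everything collected so far and concatenate.

```c
#include <mpi.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, p;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &p);

    int buf[64], tmp[64], n = 1;      /* n = current number of blocks */
    buf[0] = 10 * rank;               /* my own block                  */

    for (int bit = 1; bit < p; bit <<= 1) {
        int partner = rank ^ bit;     /* neighbour across this dimension */
        MPI_Sendrecv(buf, n, MPI_INT, partner, 0,
                     tmp, n, MPI_INT, partner, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        memcpy(buf + n, tmp, n * sizeof(int));   /* concatenate */
        n *= 2;                       /* message size doubles each step */
    }
    printf("rank %d holds %d blocks\n", rank, n);
    MPI_Finalize();
    return 0;
}
```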
All-Reduce
• All-Reduce: All-to-One Reduction + One-to-All Broadcast
• Can be implemented as an all-to-one reduction followed by a one-to-all broadcast
• Equivalently, it can use the all-to-all broadcast pattern with messages combined by the operator instead of concatenated; the result is the same, with less traffic congestion since the message size stays m
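In MPI this whole pattern is the single collective MPI_Allreduce (an illustration; the slides do not name MPI):

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int mine = rank + 1, total;
    /* every process receives the combined result in one call */
    MPI_Allreduce(&mine, &total, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);
    printf("rank %d: total = %d\n", rank, total);
    MPI_Finalize();
    return 0;
}
```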
Example
• All-to-All Broadcast
• All-to-One Reduction
• One-to-All Broadcast
Prefix-Sums
• Prefix-sums are also known as scan operations
• Given p numbers n_0, n_1, ..., n_{p-1} (one on each node), the problem is to compute the sums
• S_k = Σ_{i=0}^{k} n_i
• Here S_k is the prefix sum computed at the k-th node after the operation.
• Example:
• Original sequence: <3, 1, 4, 0, 2>
• Sequence of prefix sums: <3, 4, 8, 8, 10>
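In MPI this is the inclusive scan collective MPI_Scan (an illustration, not slide content); a minimal sketch that reproduces the example when run with five processes:

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* node k holds the k-th number of the sequence <3, 1, 4, 0, 2> */
    int seq[5] = {3, 1, 4, 0, 2};
    int n_k = seq[rank % 5], s_k;

    MPI_Scan(&n_k, &s_k, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);  /* inclusive */
    printf("node %d: prefix sum = %d\n", rank, s_k);  /* 3,4,8,8,10 for p=5 */
    MPI_Finalize();
    return 0;
}
```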
Rules
• Round bracket ( ): the message is sent to the other node in the next step
• Square bracket [ ]: the message is kept at that node
• The lower-index node keeps its square-bracket message as it is
• The higher-index node adds the message it received from the lower-index node to its square-bracket message
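A sketch of how these rules play out on a hypercube, in C with MPI (the slides give only the bracket rules; everything else here is an assumption). msg is the round-bracket value that is passed on, result the square-bracket value that is kept; p = 2^d is assumed.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, p;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &p);

    int n_k = rank + 1;               /* this node's number */
    int result = n_k, msg = n_k;      /* [square] and (round) values */

    for (int bit = 1; bit < p; bit <<= 1) {
        int partner = rank ^ bit, recv;
        MPI_Sendrecv(&msg, 1, MPI_INT, partner, 0,
                     &recv, 1, MPI_INT, partner, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        msg += recv;                  /* running total of this sub-cube   */
        if (partner < rank)           /* higher index adds what it got    */
            result += recv;           /* from the lower-index node        */
    }
    printf("node %d: S_%d = %d\n", rank, rank, result);
    MPI_Finalize();
    return 0;
}
```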
Scatter and Gather
• Scatter (one-to-all personalized communication): the source sends a distinct message to every process
• Gather (concatenation) is different from all-to-one reduction: it does not combine the incoming messages with an associative operator
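These correspond to the MPI collectives MPI_Scatter and MPI_Gather (an illustration; rank 0 as root and p <= 64 are assumptions):

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, p;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &p);

    int blocks[64], mine;                  /* assumes p <= 64 */
    if (rank == 0)                         /* source prepares one block each */
        for (int i = 0; i < p; i++) blocks[i] = 100 + i;

    /* scatter: a personalized message to every process */
    MPI_Scatter(blocks, 1, MPI_INT, &mine, 1, MPI_INT, 0, MPI_COMM_WORLD);

    /* gather: concatenation at the root, no operator applied */
    int collected[64];
    MPI_Gather(&mine, 1, MPI_INT, collected, 1, MPI_INT, 0, MPI_COMM_WORLD);
    if (rank == 0) printf("root re-collected %d blocks\n", p);
    MPI_Finalize();
    return 0;
}
```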
The scatter operation on an eight-node hypercube
All-to-All personalized Communication
• Each node sends a distinct message of size m to every other node.
• Also known as total exchange.
Example (Transpose Matrix)
All-to-all personalized communication in transposing a 4 x 4 matrix using four processes.
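The transpose example maps directly onto the MPI collective MPI_Alltoall (an illustration, not the slides' code): with one process per row, element j of each row is delivered to process j, which therefore ends up holding column j, i.e. row j of the transpose. Run with four processes.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int row[4], col[4];
    for (int j = 0; j < 4; j++) row[j] = 4 * rank + j;   /* my matrix row */

    /* send row[j] to process j; receive one element from every process */
    MPI_Alltoall(row, 1, MPI_INT, col, 1, MPI_INT, MPI_COMM_WORLD);

    printf("process %d holds transpose row: %d %d %d %d\n",
           rank, col[0], col[1], col[2], col[3]);
    MPI_Finalize();
    return 0;
}
```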
All-to-All personalized [Ring]
Cont.
• All-to-all personalized communication on a six-node
ring.
• The label of each message is of the form {x, y},
where x is the label of the node that originally owned
the message, and y is the label of the node that is the
final destination of the message.
• The label ({x1, y1}, {x2, y2}, ..., {xn, yn}) indicates a
message that is formed by concatenating n individual
messages.
All-to-All Personalized [Mesh]
• Two steps:
1. All-to-all personalized communication (row-wise)
2. All-to-all personalized communication (column-wise)
All-to-All Personalized [Hypercube]
• 0th Process
• 1st Step (x-axis) 0<->1
• (0,1), (0,3), (0,5), (0,7)
• 2nd step (y-axis) 0<->2
• (0,2), (0,6), (1,2), (1,6)
• 3rd Step (z-axis) 0<->4
• (0,4), (1,4), (2,4), (3,4)
• 2nd Process
• 1st Step (x-axis) 2<->3
• (2,3), (2,7), (2,5), (2,1)
• 2nd step (y-axis) 2<->0
• (2,0), (2,4), (3,0), (3,4)
• 3rd Step (z-axis) 2<->6
• (2,6), (3,6), (0,6), (1,6)
Circular Shift
• A circular q-shift is the operation in which node i sends a data packet to node (i + q) mod p in a p-node ensemble (0 < q < p).
Circular Shift [Linear/Ring]
• Route each message over the shorter direction: min(q, p - q) hops
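A minimal sketch of the data movement in C with MPI (an assumption; q = 5 is just an example value, and the min(q, p - q) routing concerns the underlying network in this model, not the code):

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, p, q = 5;                  /* hypothetical shift amount */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &p);
    q %= p;                              /* keep 0 <= q < p */

    int packet = rank;                   /* the data that travels */
    int dest = (rank + q) % p;
    int src  = (rank - q + p) % p;
    MPI_Sendrecv_replace(&packet, 1, MPI_INT, dest, 0, src, 0,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    printf("node %d now holds the packet from node %d\n", rank, packet);
    MPI_Finalize();
    return 0;
}
```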
Circular Shift [Mesh]
• A circular shift on the mesh topology is done in the following steps:
1. Communication along rows: a circular (q mod √p)-shift in each row
2. A compensatory column shift for the wrapped-around messages
3. Communication along columns: a circular ⌊q/√p⌋-shift in each column
The communication steps in a circular 5-shift on a 4 x 4 mesh
Circular Shift [Hypercube]
• A q-shift, e.g., a 5-shift
• First convert q to its binary representation (5 = 101)
• Write the powers of 2 for the set bits: 2^2 + 2^0
• i.e., 5 = 4 + 1
• 5-shift = 4-shift + 1-shift
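A sketch of the same decomposition in C with MPI (an assumption; q < p is required, so run the 5-shift with eight processes): every set bit of q contributes one power-of-2 shift.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, p, q = 5;                  /* example value; assumes q < p */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &p);

    int packet = rank;
    for (int bit = 1; bit < p; bit <<= 1)
        if (q & bit) {                   /* set bit => perform a 2^i-shift */
            int dest = (rank + bit) % p;
            int src  = (rank - bit + p) % p;
            MPI_Sendrecv_replace(&packet, 1, MPI_INT, dest, 0, src, 0,
                                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        }
    /* after the 4-shift and the 1-shift, node i holds the packet
       that started at node (i - 5) mod p */
    printf("node %d holds the packet from node %d\n", rank, packet);
    MPI_Finalize();
    return 0;
}
```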
The mapping of an eight-node linear array onto a three-dimensional hypercube to perform a circular 5-shift as a combination of a 4-shift and a 1-shift.