Experiments

The document outlines various experiments demonstrating parallel programming concepts using OpenMP and MPI in Python. It covers topics such as vector addition, dot product, loop work-sharing, matrix multiplication, and non-blocking operations, providing algorithms and code examples for each. The aim is to illustrate the application of parallel processing techniques for efficient computation.


Name :– Rishabh Suri

Reg no.:- 22BCE11028

Exp 1. OpenMP – Basic programs such as Vector addition, Dot Product

Aim:- To demonstrate basic OpenMP-style parallel programming using Python.

Algorithm

1. Vector Addition using Parallelization (Simulating OpenMP)

Algorithm:

1. Input: Two vectors A and B of length n.

2. Output: A new vector C where C[i] = A[i] + B[i] for all i.

3. Steps:

○ Divide the work among multiple processes or threads.

○ Each thread will compute a segment of the resulting vector.

○ Combine the results to form the final vector.

2. Dot Product using Parallelization (Simulating OpenMP)

Algorithm:

1. Input: Two vectors A and B of length n.

2. Output: A scalar value D = A[0]*B[0] + A[1]*B[1] + ... + A[n-1]*B[n-1].

3. Steps:

○ Divide the work of computing the dot product among multiple processes.

○ Each thread computes a segment of the dot product.

○ Combine the results to get the final dot product.

1. Vector Addition using multiprocessing

Code:-
from multiprocessing import Pool

# Function to add two paired elements
def add_elements(pair):
    return pair[0] + pair[1]

if __name__ == "__main__":
    A = [1, 2, 3, 4, 5]
    B = [10, 20, 30, 40, 50]

    # Pair up corresponding elements of A and B
    pairs = list(zip(A, B))

    # Each worker in the pool adds one pair (the loop work is shared)
    with Pool() as pool:
        result = pool.map(add_elements, pairs)

    print("Vector A:", A)
    print("Vector B:", B)
    print("Vector Addition (A + B):", result)

2. Dot Product using multiprocessing

Code:-
from multiprocessing import Pool

# Function to multiply two elements
def multiply_elements(pair):
    return pair[0] * pair[1]

if __name__ == "__main__":
    # Example vectors
    A = [1, 2, 3, 4, 5]
    B = [10, 20, 30, 40, 50]

    # Combine vectors into pairs
    pairs = list(zip(A, B))

    # Create a Pool of workers and compute the element-wise products
    with Pool() as pool:
        products = pool.map(multiply_elements, pairs)

    # Compute dot product by summing the products
    dot_product = sum(products)

    print("Vector A:", A)
    print("Vector B:", B)
    print("Dot Product:", dot_product)

Exp 2. OpenMP – Loop work-sharing and sections work-sharing


Aim:- To demonstrate the concept of OpenMP loop work-sharing and sections work-sharing
using Python's multiprocessing module.

ALGORITHM
1. Loop Work-Sharing Algorithm
1. Define a computational function (e.g., square of a number).
2. Create a list of inputs.
3. Use a process pool to map the function over the inputs in parallel.
4. Collect and display the results.

2. Sections Work-Sharing Algorithm


1. Define multiple independent functions (tasks).
2. Launch each function as a separate process.
3. Use .start() to begin and .join() to wait for completion.
4. Display a message once all tasks have completed.

1. Loop Work-Sharing in Python


Code:-
from multiprocessing import Pool

# Function to compute square
def compute_square(n):
    return n * n

if __name__ == "__main__":
    data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

    # The pool shares the loop iterations among the worker processes
    with Pool() as pool:
        result = pool.map(compute_square, data)

    print("Original data:", data)
    print("Squares:", result)

2. Sections Work-Sharing in Python
Code:-
from multiprocessing import Process
import time

def task1():
    time.sleep(1)
    print("Task 1: Data preprocessing done.")

def task2():
    time.sleep(2)
    print("Task 2: Model training done.")

def task3():
    time.sleep(1.5)
    print("Task 3: Evaluation done.")

if __name__ == "__main__":
    # Creating processes (sections)
    p1 = Process(target=task1)
    p2 = Process(target=task2)
    p3 = Process(target=task3)

    # Start all
    p1.start()
    p2.start()
    p3.start()

    # Wait for all to finish
    p1.join()
    p2.join()
    p3.join()

    print("All tasks (sections) completed.")


Exp 3. OpenMP – Combined parallel loop reduction and Orphaned parallel loop reduction

Aim:- To demonstrate OpenMP – Combined parallel loop reduction and Orphaned parallel
loop reduction

1. Combined Parallel Loop with Reduction in Python

Algorithm

1. Combined Parallel Loop Reduction (in Python)

Algorithm for Combined Parallel Loop Reduction:

1. Input: A list of values (e.g., numbers to sum) and a reduction operation (e.g., sum).

2. Output: A reduced result (e.g., the sum of all values).

3. Steps:

1. Split the input data into chunks that can be processed in parallel.

2. Each chunk is processed in parallel by a worker (thread or process).


3. Each worker computes a partial result (e.g., sum of its chunk).

4. Combine all the partial results from all workers into a final result (the reduction operation).

Example: Sum of Array Elements in Parallel


Code:-
from multiprocessing import Pool

# Function to return the value (identity for sum)
def identity(x):
    return x

if __name__ == "__main__":
    data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

    # Map phase: workers return the individual values in parallel
    with Pool() as pool:
        partial_sums = pool.map(identity, data)

    # Reduction phase: combine the partial results into the final sum
    total_sum = sum(partial_sums)

    print("Sum of elements:", total_sum)

2. Orphaned Parallel Loop with Reduction in Python

Algorithm
Algorithm for Orphaned Parallel Loop Reduction:

1. Input: A list of values and a shared reduction variable (e.g., a global sum).

2. Output: A reduction result (e.g., sum of all values).

3. Steps:

1. Each thread or process computes a partial result independently.

2. Threads or processes then attempt to update a shared variable (e.g., a global sum).

3. If not synchronized properly, multiple threads may write to the shared variable at the same time, causing issues (unsynchronized "orphaned" updates, i.e., a race condition).

4. To avoid errors, we must ensure that proper synchronization or reduction handling is used.

Code:-
from multiprocessing import Pool

# Top-level function (no nesting)
def identity(x):
    return x

# The parallel loop and reduction live inside a called function, outside main
def compute_sum(data):
    with Pool() as pool:
        partial_sums = pool.map(identity, data)
    return sum(partial_sums)

if __name__ == "__main__":
    data = [10, 20, 30, 40, 50]
    result = compute_sum(data)
    print("Sum using orphaned parallel loop:", result)

Exp 4. OpenMP – Matrix multiply (specify run on a GPU card, large-scale data … complexity of the problem needs to be specified)

Aim:- To demonstrate OpenMP – Matrix multiply.
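
Complexity of the problem: multiplying two n x n matrices with the standard algorithm takes O(n^3) multiply-add operations, so the 1000 x 1000 matrices used below require on the order of 10^9 operations. This cubic growth is what makes distributing the work across CPU cores or a GPU worthwhile.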

1. Matrix Multiplication with CPU Parallelism


Example: Matrix Multiplication with multiprocessing

Code:-
import numpy as np
from multiprocessing import Pool

# Function to multiply a single row with the matrix
def row_multiply(args):
    row, B = args
    return np.dot(row, B)

def matrix_multiply(A, B):
    # Initialize result matrix C with appropriate size
    C = np.zeros((A.shape[0], B.shape[1]))

    # We pass one row of A together with matrix B to each worker
    with Pool() as pool:
        result = pool.map(row_multiply, [(A[i], B) for i in range(A.shape[0])])

    for i in range(A.shape[0]):
        C[i] = result[i]
    return C

if __name__ == "__main__":
    # Large matrices
    A = np.random.rand(1000, 1000)
    B = np.random.rand(1000, 1000)

    C = matrix_multiply(A, B)
    print("Matrix multiplication complete. Result shape:", C.shape)

2. Matrix Multiplication with GPU Parallelism using CuPy


Example: Matrix Multiplication with CuPy
Code:-
import numpy as np
import cupy as cp

def matrix_multiply_gpu(A, B):
    # Move data to GPU
    A_gpu = cp.asarray(A)
    B_gpu = cp.asarray(B)

    # Perform matrix multiplication on GPU
    C_gpu = cp.dot(A_gpu, B_gpu)

    # Move the result back to CPU (if needed)
    return cp.asnumpy(C_gpu)

if __name__ == "__main__":
    # Large matrices (for GPU, much larger data can be used)
    A = np.random.rand(1000, 1000)
    B = np.random.rand(1000, 1000)

    C = matrix_multiply_gpu(A, B)
    print("Matrix multiplication complete on GPU. Result shape:", C.shape)

Output:-
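
To make the CPU/GPU comparison concrete, both versions can be timed on the same input. The sketch below assumes the matrix_multiply (multiprocessing) and matrix_multiply_gpu (CuPy) functions defined above are available in the same script, and that a CUDA-capable GPU with CuPy installed is present.

import time
import numpy as np

if __name__ == "__main__":
    A = np.random.rand(1000, 1000)
    B = np.random.rand(1000, 1000)

    start = time.time()
    C_cpu = matrix_multiply(A, B)        # multiprocessing version from above
    print("CPU time:", round(time.time() - start, 3), "s")

    start = time.time()
    C_gpu = matrix_multiply_gpu(A, B)    # CuPy version from above
    print("GPU time:", round(time.time() - start, 3), "s")

    # The two results should agree up to floating-point tolerance
    print("Max difference:", np.max(np.abs(C_cpu - C_gpu)))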

Exp 5:- MPI – Basics of MPI

Aim:- To demonstrate basic MPI functionality in Python using the mpi4py library.

Algorithm:-

Step 1: Start the program.

Step 2: Import the MPI module from the mpi4py library.

Step 3: Initialize the MPI environment (handled automatically in mpi4py).

Step 4: Get the communicator using MPI.COMM_WORLD.

Step 5: Determine:

● Rank of the current process using comm.Get_rank()

● Size (total number of processes) using comm.Get_size()

Step 6: Print a message from each process indicating:

● Its rank

● The total number of processes

Step 7: End the program.

Code:-

from mpi4py import MPI

def mpi_hello_world():
    # Get the MPI communicator and rank/size
    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    size = comm.Get_size()

    # Print a message from each process
    print(f"Hello from process {rank} out of {size} processes")

if __name__ == "__main__":
    mpi_hello_world()

Output:-
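
Note on running: mpi4py programs are started through an MPI launcher rather than plain python. Assuming the code above is saved as, for example, mpi_hello.py (an illustrative filename), a typical invocation is:

mpiexec -n 4 python mpi_hello.py

This launches four copies of the script, and each copy prints its own rank. The same command pattern applies to the remaining MPI experiments.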

Exp 6. MPI – Communication between MPI processes

Aim:- Demonstrate how one MPI process (e.g., rank 0) sends a message and another
process (e.g., rank 1) receives it.

Code:-
from mpi4py import MPI

def point_to_point_mpi():
    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()

    if rank == 0:
        data = "Hello from Process 0"
        comm.send(data, dest=1, tag=11)
        print(f"Process {rank} sent data: '{data}' to Process 1")

    elif rank == 1:
        received_data = comm.recv(source=0, tag=11)
        print(f"Process {rank} received data: '{received_data}' from Process 0")

if __name__ == "__main__":
    point_to_point_mpi()

Output:-
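
The lowercase send()/recv() used above transfer arbitrary Python objects by pickling them. For large numerical data, mpi4py also provides the uppercase Send()/Recv() methods, which operate directly on buffer-like objects such as NumPy arrays. A minimal sketch (run with at least two processes) follows.

from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    data = np.arange(10, dtype='i')   # buffer to send
    comm.Send([data, MPI.INT], dest=1, tag=22)
    print("Process 0 sent a NumPy array to Process 1")
elif rank == 1:
    data = np.empty(10, dtype='i')    # pre-allocated receive buffer
    comm.Recv([data, MPI.INT], source=0, tag=22)
    print("Process 1 received:", data)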

Exp 7. MPI – Collective operation with "synchronization"

Aim:-To demonstrate synchronization using MPI Barrier, which ensures that all processes
reach the same point in the program before proceeding.

Algorithm:

1. Start the program.

2. Import MPI from mpi4py.

3. Get the communicator using MPI.COMM_WORLD.

4. Get the rank and size of the current process.

5. Each process prints a message before synchronization.

6. Call comm.Barrier() to synchronize all processes.

7. After the barrier, each process prints a message.

8. End the program.

Code:-

Python Code: Collective Synchronization with Barrier()

from mpi4py import MPI
import time

def mpi_synchronization_demo():
    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    size = comm.Get_size()

    print(f"Process {rank} reached before the barrier")

    # Synchronization point
    comm.Barrier()

    # Add a slight delay so output order appears clearer
    time.sleep(0.1 * rank)

    print(f"Process {rank} passed the barrier")

if __name__ == "__main__":
    mpi_synchronization_demo()

Output:-

Exp 8. MPI – Collective operation with "data movement"

Aim:- To implement MPI collective operations for data movement (scatter and gather) in Python using the mpi4py library.
Algorithm

1. Initialization:

● Initialize the MPI environment.

● Get the total number of processes (size) and the rank (ID) of the current process.

2. Root Process Data Preparation:

● If the current process is the root process (rank 0):

○ Prepare a data array (e.g., [0, 1, 2, ..., size-1]), where each element represents a data chunk that will be sent to each process.

3. Scatter Operation:

● Use the scatter function to distribute the data from the root process to all other processes.

● Each process (including the root) will receive a portion of the data from the root.

4. Computation:

● Each process (including the root) performs a computation on the received data (e.g., doubling the value).

5. Gather Operation:

● After computation, each process sends its result back to the root process using the gather function.

6. Root Process Collects Results:

● The root process gathers all the processed data from all the other processes.

7. Display Results:

● Each process prints the data it received and the computation result.

● The root process prints the final gathered result.

Example: Scatter and Gather operations

Code:-
from mpi4py import MPI

# Initialize the MPI environment
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

# Root process (rank 0) will send data to all other processes
if rank == 0:
    # Create a list of data
    data = [i for i in range(size)]  # Example: [0, 1, 2, ..., size-1]
else:
    data = None  # Other processes don't need data initially

# Scatter operation: distribute data from root to all processes
local_data = comm.scatter(data, root=0)

# Print local data on each process
print(f"Process {rank} received data: {local_data}")

# Perform some computation (example: doubling the received data)
local_result = local_data * 2

# Gather operation: send the processed data from all processes to root
gathered_data = comm.gather(local_result, root=0)

# Root process gathers all results
if rank == 0:
    print(f"Gathered data at root: {gathered_data}")

Output:-

Exp 9:- MPI – Collective operation with "collective computation"

Aim:- To implement and demonstrate collective computation in MPI using collective operations like Scatter, Gather, and Reduce.

Algorithm:-
1. Initialize the MPI environment.
2. Get the number of processes and the rank of the current process.
3. On the root process:
○ Create a list of numbers to compute (e.g., squares of numbers).
4. Use Scatter to distribute parts of the list to all processes.
5. Each process computes the square of its number (local computation).
6. Use Gather or Reduce to collect results at the root process.
7. Display the final results at the root process.

Code:-
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

# Step 1: Root process prepares the data
if rank == 0:
    data = [i for i in range(size)]
else:
    data = None

# Step 2: Scatter data to all processes
recv_data = comm.scatter(data, root=0)

# Step 3: Each process performs computation (e.g., square)
local_result = recv_data ** 2

# Step 4: Gather the results to root process
results = comm.gather(local_result, root=0)

# Step 5: Root process displays final result
if rank == 0:
    print("Input data:", [i for i in range(size)])
    print("Squared results (gathered):", results)

Output:-
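
The aim also mentions Reduce. A minimal self-contained sketch of the Reduce variant, in which each process contributes one value and only the combined sum arrives at the root, is shown below.

from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

# Each process contributes the square of its own rank
local_result = rank ** 2

# Reduce: combine all local results into a single sum at the root
total = comm.reduce(local_result, op=MPI.SUM, root=0)

if rank == 0:
    print("Sum of squares of ranks (via Reduce):", total)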

Exp 10:- MPI – Non-blocking operation using Python

Aim:- To demonstrate non-blocking communication using Isend() and Irecv() in MPI, which allows processes to continue execution without waiting for communication to complete.

Code:-
from mpi4py import MPI
import time

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

# Make sure there are at least 2 processes
if size < 2:
    print("Run with at least 2 processes.")
else:
    if rank == 0:
        # Buffer-based Isend needs a bytes-like object, not a str
        data = b"Hello from Process 0"
        req = comm.Isend([data, MPI.CHAR], dest=1, tag=10)
        print("Process 0 is doing something else while sending...")
        time.sleep(1)  # Simulate other work
        req.Wait()
        print("Process 0 finished sending.")

    elif rank == 1:
        buf = bytearray(100)  # Buffer to hold the incoming message
        req = comm.Irecv([buf, MPI.CHAR], source=0, tag=10)
        print("Process 1 is doing something else while receiving...")
        time.sleep(1)  # Simulate other work
        req.Wait()
        print("Process 1 received:", buf.decode().rstrip('\x00'))

Output:-
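
mpi4py also offers lowercase isend()/irecv(), which pickle arbitrary Python objects and so avoid manual buffer management. A minimal sketch (run with at least two processes) follows.

from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    # Non-blocking send of a pickled Python object
    req = comm.isend({"msg": "Hello from Process 0"}, dest=1, tag=20)
    req.wait()
elif rank == 1:
    # Non-blocking receive; wait() returns the unpickled object
    req = comm.irecv(source=0, tag=20)
    data = req.wait()
    print("Process 1 received:", data)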
