Concurrency and Parallelism

The document explains multitasking in Python, focusing on concurrency and parallelism. It details how concurrency allows multiple tasks to progress simultaneously while parallelism enables actual simultaneous execution using multiple CPU cores, highlighting Python's threading, asyncio, and multiprocessing modules. It also discusses the Global Interpreter Lock (GIL), thread synchronization, and when to use threading, asyncio, or multiprocessing based on task types.

Uploaded by

prachi dhavale
Copyright
© © All Rights Reserved
Concurrency and Parallelism
Understanding Multitasking in Python
What is Multitasking?
• Multitasking = doing multiple tasks at once

• Two approaches: Concurrency and Parallelism


Introduction
• Concurrency and parallelism are key concepts for improving program
efficiency.
• Concurrency: Tasks appear to run simultaneously but may not be truly
parallel.
• Parallelism: Tasks actually run at the same time using multiple CPU
cores.
• These concepts help optimize performance for different types of
workloads.
• Python provides several modules to achieve concurrency and
parallelism, such as threading, asyncio, and multiprocessing.
Concurrency vs Parallelism

• Concurrency:
  • Multiple tasks making progress over the same period
  • Can happen on a single CPU (context switching)
• Parallelism:
  • Multiple tasks running at the same instant
  • Requires multiple CPUs/cores
• Analogy:
  • Concurrency = Single chef cooking many dishes
  • Parallelism = Multiple chefs cooking in parallel
Key Differences
Concurrency with Threading
• The threading module allows multiple threads to run concurrently within
the same process.

• Threads share the same memory space, making communication easier but
requiring synchronization.

• The Global Interpreter Lock (GIL) prevents multiple threads from executing
Python bytecode simultaneously, limiting true parallel execution for
CPU-bound tasks.

• Best suited for I/O-bound tasks like web requests, file I/O, or database
queries.
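A minimal sketch of why threads help with I/O-bound work. The function fake_download is hypothetical, and time.sleep stands in for a real network or disk wait (assumption: like real I/O waits, it does not hold the GIL while waiting), so the five waits overlap instead of adding up.

```python
import threading
import time

def fake_download(name, results):
    """Simulate an I/O-bound call; the thread just waits, as it would on a socket."""
    time.sleep(0.2)
    results.append(name)

results = []
start = time.perf_counter()
threads = [
    threading.Thread(target=fake_download, args=(f"file-{i}", results))
    for i in range(5)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# Run one after another, the five waits would take about 1 second in total
print(f"{len(results)} downloads in {elapsed:.2f}s")
```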
Python and the Global Interpreter
Lock (GIL)
• CPython, the standard Python interpreter, uses the GIL

• Only one thread can execute Python bytecode at a time

• Impacts parallel execution of threads

• Workaround: Use multiprocessing


Pros and Cons of threading
• Useful for I/O-bound tasks (networking, file I/O)

• Limitations: CPU-bound tasks are still serialized


Thread Synchronization Overview
• Why we need synchronization:
  • Prevent race conditions
  • Coordinate access to shared resources
• Python provides:
  • Lock
  • RLock
  • Semaphore
How to Use Thread Locks in
Python
• Thread locks are implemented by the threading.Lock class.
• To use a lock in your code, first create an instance by calling
threading.Lock()

import threading

lock = threading.Lock()
Lock
• Once you have created a lock instance, you can use it to protect a shared
resource.
• To acquire the lock, you use the acquire() method
lock.acquire()

• This will block the thread until the lock becomes available.
• Once the lock is acquired, the thread can access the shared resource
safely.
• When the thread is done accessing the resource, it must release the lock
using the release() method
lock.release()
Important note
• It’s important to note that when using locks, you must ensure that you release the lock in
all possible code paths.
• If you acquire a lock and then exit the function without releasing the lock, the lock will
remain locked, preventing other threads from accessing the shared resource.
• To avoid this, it’s a good practice to use a try-finally block to ensure that the lock is
released, even if an exception occurs:

lock.acquire()
try:
    pass  # access the shared resource
finally:
    lock.release()
Using Lock in Python
• Lock ensures only one thread accesses a resource at a time
import threading

lock = threading.Lock()

def safe_increment():
    with lock:
        # critical section
        print("Locked section")

t1 = threading.Thread(target=safe_increment)
t2 = threading.Thread(target=safe_increment)
t1.start()
t2.start()
• Use "with lock:" so the lock is released automatically, even if an exception occurs
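The function above is named safe_increment but only prints. As a hedged sketch of what a lock actually protects, the version below increments a shared counter from four threads. The names counter, counter_lock and add_many are illustrative and not from the slides.

```python
import threading

counter = 0
counter_lock = threading.Lock()

def add_many(n):
    """Increment the shared counter n times, one locked step at a time."""
    global counter
    for _ in range(n):
        with counter_lock:   # only one thread at a time runs the read-modify-write
            counter += 1

threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000
```

Without the lock, "counter += 1" is a read, an add and a write that threads can interleave, so some updates may be lost.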
RLock – Reentrant Lock
• RLock allows the same thread to acquire a lock multiple times
• Use case: Recursive functions with locks
import threading

lock = threading.Lock()

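print("Acquiring the lock: ", lock.acquire())  # True

# A plain Lock cannot be acquired twice. Passing blocking=False makes the second
# call return False immediately; a plain acquire() here would wait forever
print("Acquiring the lock again: ", lock.acquire(blocking=False))  # False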
RLock
import threading

lock = threading.RLock()

print("Acquiring the lock: ", lock.acquire())  # True

# An RLock can be re-acquired by the thread that already holds it,
# so this second call returns True immediately instead of blocking
print("Acquiring the lock again: ", lock.acquire())  # True
Difference between Lock and RLock
• A Lock can only be acquired once. It cannot be acquired again until it is
released. (After it has been released, it can be re-acquired by any thread.)

• An RLock, on the other hand, can be acquired multiple times by the
same thread. It needs to be released the same number of times in
order to be "unlocked".

• Another difference is that an acquired Lock can be released by any
thread, while an acquired RLock can only be released by the thread
which acquired it.
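A small sketch of the recursive use case mentioned earlier. The function countdown is hypothetical; it takes the same RLock at every level of recursion, which would deadlock with a plain Lock.

```python
import threading

rlock = threading.RLock()

def countdown(n):
    """Recursive function that acquires the same lock at every level."""
    with rlock:              # re-acquired by the same thread on each recursive call
        if n == 0:
            return 0
        return 1 + countdown(n - 1)

depth = countdown(5)
print(depth)  # 5
```

Each "with rlock:" exit releases one level, so the lock is fully unlocked only when the outermost call returns.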
Semaphores
• A Semaphore controls access to a pool of resources
• Can allow N threads to access concurrently
• When a thread wants to access the shared resource, it must first
acquire a semaphore.
• Each semaphore has a specified limit that determines how many
threads can access the resource simultaneously.
• Once the limit is reached, any additional threads that try to acquire
the semaphore will be blocked until a thread releases the semaphore.
Example
import threading
from time import sleep

items = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# semaphore to limit the number of threads that can access the list simultaneously to 4
semaphore = threading.Semaphore(value=4)

def process_item(item):
    semaphore.acquire()  # acquire the semaphore
    try:
        sleep(3)  # simulate some processing time
        print(f'Processing item {item}')  # process the item
    finally:  # Make sure we always release the semaphore
        semaphore.release()  # release the semaphore

# create a list of threads to process the items
threads = [
    threading.Thread(target=process_item, args=(item,))
    for item in items
]
for thread in threads:  # start all threads
    thread.start()
for thread in threads:  # wait for all threads to finish
    thread.join()
Semaphore with ‘with’ statement
import threading
from time import sleep

items = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# semaphore to limit the number of threads that can access the list simultaneously to 4
semaphore = threading.Semaphore(value=4)

def process_item(item):
    with semaphore:  # acquire the semaphore, release it on exit
        sleep(3)  # simulate some processing time
        print(f'Processing item {item}')  # process the item

# create a list of threads to process the items
threads = [
    threading.Thread(target=process_item, args=(item,))
    for item in items
]
for thread in threads:  # start all threads
    thread.start()
for thread in threads:  # wait for all threads to finish
    thread.join()
BoundedSemaphore
• Like Semaphore, but raises error if release() called too many times
• Prevents bugs in logic
• Best for controlled resource counts
• The difference between a normal semaphore and a bounded semaphore is
subtle.
• A bounded semaphore differs only in that it does not allow more
releases than acquires.
• If release() would push the counter above its initial value, a ValueError is raised
Example
import threading
import time

# Allow only 3 threads to access the resource at the same time
semaphore = threading.BoundedSemaphore(value=3)

def access_resource(thread_id):
    print(f"Thread {thread_id} is waiting to access the resource.")
    with semaphore:  # Acquires semaphore
        print(f"Thread {thread_id} has accessed the resource.")
        time.sleep(2)
        print(f"Thread {thread_id} is releasing the resource.")

# Create and start 5 threads
threads = []
for i in range(5):
    t = threading.Thread(target=access_resource, args=(i,))
    threads.append(t)
    t.start()

# Wait for all threads to complete
for t in threads:
    t.join()
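The example above never releases too often, so it does not show the ValueError. The sketch below forces one extra release on each kind of semaphore. The helper extra_release_raises is a hypothetical name used only for this illustration.

```python
import threading

def extra_release_raises(sem):
    """Acquire once, release twice; report whether the extra release raised ValueError."""
    sem.acquire()
    sem.release()
    try:
        sem.release()  # one release too many
    except ValueError:
        return True
    return False

plain_raised = extra_release_raises(threading.Semaphore(1))
bounded_raised = extra_release_raises(threading.BoundedSemaphore(1))
print(plain_raised, bounded_raised)  # False True
```

The plain Semaphore silently raises its counter to 2, which would let two threads in where only one was intended.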
What’s the Difference Between CPU-Bound and
I/O-Bound Tasks?

• A CPU-bound task spends most of its time doing heavy calculations with
the CPUs.
• In this case, you should use multiprocessing to run your jobs in parallel
and make full use of your CPUs.
• An I/O-bound task spends most of its time waiting for I/O responses,
which can be responses from web pages, databases or disks.
• If you’re developing a web page where a request needs to fetch data
from APIs or databases, it’s an I/O-bound task.
• Concurrency can be achieved for I/O-bound tasks with either asyncio or
threading to minimize the waiting time from external resources.
Concurrency with AsyncIO

• Asyncio is a Python library that allows us to write concurrent code
using the async/await syntax
• The asyncio module provides cooperative multitasking using async
and await.
• Best suited for I/O-bound operations like web requests or file I/O.
• Uses an event loop to handle multiple tasks without blocking the
main thread.
Differences
• Asyncio Uses One Thread, Multithreading Uses Multiple
Threads
• Asyncio Uses Cooperative Multitasking, Multithreading
Uses Preemptive Multitasking
Cooperative Vs Preemptive
Multitasking
• Asyncio achieves concurrency with cooperative multitasking.
• We decide which part of the code can be awaited, which then switches the
control to run other parts of the code.
• The tasks need to cooperate and announce when the control will be switched out.
• All of this happens in a single thread, at the points marked with the await keyword.
• Threading achieves concurrency with preemptive multitasking, which means we
can’t determine when to run which code in which thread.
• It’s the operating system that determines which code should be run in which
thread.
• The operating system can switch the control at any point between threads.
• This is why the order of results often varies from run to run with threading.
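A minimal sketch of cooperative switching. The names worker and log are illustrative. Each task gives up control only at its await, so on the default event loop the two tasks are expected to alternate in a fixed order.

```python
import asyncio

log = []

async def worker(name, steps):
    for i in range(steps):
        log.append(f"{name}-{i}")
        await asyncio.sleep(0)   # explicit switch point: hand control back to the event loop

async def main():
    # Both coroutines run as tasks in the same thread
    await asyncio.gather(worker("a", 2), worker("b", 2))

asyncio.run(main())
print(log)  # ['a-0', 'b-0', 'a-1', 'b-1']
```

With threads, the equivalent code could interleave differently on each run, because the operating system picks the switch points.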
What Is a Coroutine in Asyncio?

• Coroutines are a more generalized form of subroutines.
• Subroutines are entered at one point and exited at another point.
• Coroutines can be entered, exited, and resumed at many different
points.
• They can be implemented with the async def statement.
Example
import asyncio

async def main():
    print("Hello")
    await asyncio.sleep(1)
    print("World")

# In a script, start the event loop with asyncio.run().
# In a Jupyter notebook a loop is already running, so use: await main()
asyncio.run(main())
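One detail that often surprises newcomers: calling a coroutine function does not run it. The sketch below, with a hypothetical coroutine add, shows that the call only creates a coroutine object and that the event loop is what executes it.

```python
import asyncio

async def add(a, b):
    await asyncio.sleep(0)   # a point where the coroutine can be suspended and resumed
    return a + b

coro = add(1, 2)                       # nothing runs yet: this only creates a coroutine object
is_coro = asyncio.iscoroutine(coro)
result = asyncio.run(coro)             # the event loop drives it to completion
print(is_coro, result)  # True 3
```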


Parallelism with Multiprocessing

• The multiprocessing module allows parallel execution using multiple
CPU cores.
• Bypasses the GIL by using separate processes, each with its own interpreter.
• Ideal for CPU-intensive tasks like image processing or mathematical
computations.
Number of CPUs
• The Python multiprocessing module has many classes for
building a parallel program.
• Three basic ones are Process, Queue and Lock.
import multiprocessing

print("Number of cpu : ", multiprocessing.cpu_count())


Python multiprocessing Process
class
• The Process class is an abstraction that sets up another Python process, runs
code in it, and gives the parent application a way to control its execution.
• Two important methods of the Process class are start() and join().
• First, write the function that the process will run.
• Then instantiate a Process object and launch it with start().
• The process runs the function in the background. Its return value is discarded,
so results must be sent back explicitly, for example through a Queue.
• Calling join() makes the parent wait until the process has finished.
• On Unix, a finished process that has not been joined lingers as a zombie until
the parent reaps it.
• So if you create many processes and never join them, you may run short of
resources.
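The steps above can be sketched as follows. The functions describe and worker are hypothetical names. The "if __name__" guard is there because some platforms start child processes by importing the main module again.

```python
import multiprocessing
import os

def describe(name):
    """Build a message that includes the PID of whichever process calls it."""
    return f"{name} running in process {os.getpid()}"

def worker(name):
    # This function is what the child process executes
    print(describe(name))

if __name__ == "__main__":
    print(describe("parent"))
    p = multiprocessing.Process(target=worker, args=("child",))
    p.start()    # launch the child process
    p.join()     # wait until the child has finished
    print("child exit code:", p.exitcode)
```

The parent and child lines should show different process IDs, which is what makes this parallelism rather than threading.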
Python multiprocessing Queue
class
• The multiprocessing module provides a Queue class, a First-In-First-Out
data structure that can be shared between processes.
• Queues can store any picklable Python object (though simple ones are best)
and are extremely useful for sharing data between processes.
• Queues are especially useful when passed as a parameter to a Process's
target function so that the process can consume data.
• Use put() to insert data into the queue and get() to retrieve items
from it.
Queue Example
from multiprocessing import Queue

colors = ['red', 'green', 'blue', 'black']


cnt = 1
# instantiating a queue object
queue = Queue()
print('pushing items to queue:')
for color in colors:
    print('item no: ', cnt, ' ', color)
    queue.put(color)
    cnt += 1

print('\npopping items from queue:')
cnt = 1
# get() exactly as many items as were put; empty() is not reliable
# for a multiprocessing Queue
for _ in colors:
    print('item no: ', cnt, ' ', queue.get())
    cnt += 1
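The example above uses the queue inside a single process. A hedged sketch of the cross-process case follows: a hypothetical consumer function shout receives items through one queue and sends results back through another. The None sentinel is a convention chosen for this sketch, not part of the Queue API.

```python
import multiprocessing

def shout(tasks, results):
    """Consume strings from tasks until the None sentinel; put upper-cased copies on results."""
    while True:
        item = tasks.get()
        if item is None:      # sentinel: no more work
            break
        results.put(item.upper())

if __name__ == "__main__":
    tasks = multiprocessing.Queue()
    results = multiprocessing.Queue()
    consumer = multiprocessing.Process(target=shout, args=(tasks, results))
    consumer.start()

    colors = ["red", "green", "blue"]
    for color in colors:
        tasks.put(color)
    tasks.put(None)

    # Collect the results before join(), so the child is not left holding queued data
    received = [results.get() for _ in colors]
    consumer.join()
    print(received)
```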
Python multiprocessing Lock
Class

• A Lock ensures that no other process can execute the protected code
until the lock has been released.
• It has two main operations: claiming the lock and releasing it.
• Use acquire() to claim the lock and release() to release it.
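A minimal sketch of a multiprocessing Lock protecting shared state. It assumes a shared integer created with multiprocessing.Value. The names deposit and balance are illustrative.

```python
import multiprocessing

def deposit(balance, lock, times):
    """Add 1 to the shared balance `times` times, holding the lock for each update."""
    for _ in range(times):
        with lock:                 # acquire() on entry, release() on exit
            balance.value += 1     # read-modify-write on memory shared across processes

if __name__ == "__main__":
    balance = multiprocessing.Value("i", 0)   # shared C int, initially 0
    lock = multiprocessing.Lock()
    workers = [
        multiprocessing.Process(target=deposit, args=(balance, lock, 1000))
        for _ in range(4)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print(balance.value)  # 4000
```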
When to Use What?

• Use threading for I/O-bound tasks (e.g., network requests, file I/O)
where tasks spend time waiting.
• Use asyncio when you need to handle many asynchronous tasks
efficiently.
• Use multiprocessing for CPU-bound tasks (e.g., heavy computations,
data processing) where actual parallel execution is required.
• Hybrid Approaches: Sometimes, a mix of these techniques is
beneficial, such as combining threading with async operations.
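One way to sketch the hybrid approach is asyncio.to_thread (available from Python 3.9), which runs a blocking function in a worker thread so the event loop stays free. The function blocking_read is a hypothetical stand-in for a library call that has no async version.

```python
import asyncio
import time

def blocking_read(name):
    """Stand-in for a blocking call; time.sleep blocks the thread that runs it."""
    time.sleep(0.1)
    return f"{name}: done"

async def main():
    # Each blocking call runs in its own worker thread; gather keeps argument order
    return await asyncio.gather(
        asyncio.to_thread(blocking_read, "a"),
        asyncio.to_thread(blocking_read, "b"),
    )

results = asyncio.run(main())
print(results)  # ['a: done', 'b: done']
```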
