Unit 1 Advanced Process and Thread Management
1. Multithreading Models
Multithreading models define the relationships between user-level and kernel-level threads, providing
different trade-offs in performance, flexibility, and complexity.
• Kernel Thread: A thread that is directly known and managed by the operating system kernel. The kernel
schedules kernel threads onto CPU cores.
• User-level Thread: A thread that is managed entirely by a thread library at the user level, without kernel
support.
Many-to-One Model - Multiple user-level threads are mapped to a single kernel-level thread.
Advantages:
• Thread management is done entirely in user space, making it efficient and fast.
• This model is portable and can be implemented even on operating systems that provide no kernel thread support, because all thread management happens in user space.
Disadvantages:
• The entire process blocks if one thread makes a blocking system call because only one kernel thread
is active at a time.
• It cannot take advantage of multiprocessor architectures as only one thread can run at a time.
Examples: Some early implementations of user-level thread libraries.
One-to-One Model - Each user-level thread is mapped to a separate kernel-level thread.
Advantages:
• Allows true parallelism on multiprocessor systems.
• If one thread blocks, other threads continue to run.
Disadvantages:
• Creating a user thread requires creating a corresponding kernel thread, which adds creation overhead.
• Because every thread consumes kernel resources, the operating system may limit the number of threads per process.
Examples: Windows, Linux, macOS.
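For example, the POSIX threads (pthreads) API on Linux follows the one-to-one model: each pthread_create() call produces a thread that the kernel schedules directly. A minimal sketch:

#include <pthread.h>
#include <stdio.h>

static void *work(void *arg) {
    printf("kernel-scheduled thread %ld\n", (long)arg);
    return NULL;
}

int main(void) {
    pthread_t t[4];
    for (long i = 0; i < 4; i++)
        pthread_create(&t[i], NULL, work, (void *)i);   /* one kernel thread per call */
    for (int i = 0; i < 4; i++)
        pthread_join(t[i], NULL);
    return 0;
}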
Many-to-Many Model - Multiple user-level threads are mapped to an equal or smaller number of kernel-
level threads.
Advantages:
• Combines benefits of many-to-one and one-to-one models.
• Multiple user threads can run in parallel on multiprocessors.
• Blocking system calls by one thread does not necessarily block the process.
• The number of kernel threads can be dynamically adjusted.
Disadvantages:
• More complex to implement.
Comparison of the three models:
Thread Generation
• Many-to-One: User-level threads are created quickly in user space; only one kernel thread exists per process.
• One-to-One: A user-level thread and its kernel thread are created together via a system call (slower).
• Many-to-Many: User-level threads are created quickly in user space; a pool of kernel threads is created via system calls.
Does the kernel know about the threads?
• Many-to-One: No; it sees only a single-threaded process.
• One-to-One: Fully; it manages every thread individually.
• Many-to-Many: Partially; it manages the pool of kernel-level threads (KLTs), not the user-level threads (ULTs).
Blocking system calls
• Many-to-One: Block the entire process.
• One-to-One: Block only the calling thread.
• Many-to-Many: The thread library can schedule other user-level threads around the blocked kernel thread.
Parallelism
• Many-to-One: None; runs on one core.
• One-to-One: Yes; true parallelism on multiple cores.
• Many-to-Many: Yes; parallelism up to the size of the kernel-thread pool.
Scalability
• Many-to-One: High in the number of user-level threads, but ineffective because there is no parallelism.
• One-to-One: Low; limited by kernel resources.
• Many-to-Many: Very high; best for very large numbers of threads.
2. Thread pools
A thread pool is a collection of pre-created, reusable threads kept ready to execute incoming tasks, reducing the overhead of frequently creating and destroying threads.
Task Queue → A queue storing tasks awaiting execution
Worker Threads → The pre-created, reusable threads that execute tasks
Dispatcher thread → A special thread that monitors the task queue and assigns tasks to idle worker threads
Pool Manager → Schedules tasks and adjusts the pool size
Working:
1. Tasks are enqueued into the task queue.
2. Idle threads fetch tasks from the queue.
3. Threads perform tasks and return to the pool when done.
4. If all threads are busy, tasks wait in the queue.
Implementation:
initialize thread_pool with N worker threads
while True:
    if a task arrives:
        enqueue the task into task_queue
    for thread in thread_pool:
        if thread is idle and task_queue is not empty:
            dequeue a task from task_queue and assign it to thread
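A runnable sketch of this design in C with POSIX threads is shown below, assuming a fixed pool of workers and a bounded circular task queue; the names pool_submit, worker, and hello are illustrative, not a standard API:

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

#define NTHREADS 4
#define QSIZE    16

typedef void (*task_fn)(void *);
typedef struct { task_fn fn; void *arg; } task_t;

static task_t queue[QSIZE];
static int head = 0, tail = 0, count = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  not_empty = PTHREAD_COND_INITIALIZER;
static pthread_cond_t  not_full  = PTHREAD_COND_INITIALIZER;

/* Worker thread: repeatedly dequeue a task and run it. */
static void *worker(void *arg) {
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&lock);
        while (count == 0)                       /* wait until a task arrives */
            pthread_cond_wait(&not_empty, &lock);
        task_t t = queue[head];
        head = (head + 1) % QSIZE;
        count--;
        pthread_cond_signal(&not_full);
        pthread_mutex_unlock(&lock);
        t.fn(t.arg);                             /* run the task outside the lock */
    }
    return NULL;
}

/* Submit a task; blocks if the bounded queue is full. */
static void pool_submit(task_fn fn, void *arg) {
    pthread_mutex_lock(&lock);
    while (count == QSIZE)
        pthread_cond_wait(&not_full, &lock);
    queue[tail] = (task_t){ fn, arg };
    tail = (tail + 1) % QSIZE;
    count++;
    pthread_cond_signal(&not_empty);
    pthread_mutex_unlock(&lock);
}

static void hello(void *arg) { printf("task %ld\n", (long)arg); }

int main(void) {
    pthread_t tids[NTHREADS];
    for (int i = 0; i < NTHREADS; i++)           /* pre-create the worker threads */
        pthread_create(&tids[i], NULL, worker, NULL);
    for (long i = 0; i < 8; i++)
        pool_submit(hello, (void *)i);
    sleep(1);                                    /* crude wait so the demo output appears */
    return 0;
}

Because the queue is bounded, pool_submit blocks when the queue is full, which also provides the backpressure discussed under Task Scheduling below.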
Design issues:
Pool Sizing
• Fixed Size: Predefine max/min number of threads. Well-suited for predictable workloads.
Too small → CPU underutilization.
Too large → Context switching overhead, memory pressure.
• Dynamic Size: Thread pool grows/shrinks based on demand. Complex implementation.
Task Scheduling
• FIFO Queue: Tasks processed in arrival order (commonly used).
• Priority Queues: Higher-priority tasks processed first (used in some OS/runtime systems).
• Bounded vs. Unbounded Queue: An unbounded queue can lead to out-of-memory errors under heavy
load. A bounded queue can force the task submitter to block or reject tasks when full, providing
backpressure.
Use Cases: Web Servers handling multiple client HTTP requests.
Fig. A multithreaded web server
Limitations:
Deadlocks: Tasks that wait on the results of other tasks queued in the same pool can deadlock if no worker is free to run them.
Starvation: Long-running tasks can occupy all worker threads, delaying short tasks.
3. Context switching
Context switching is the process by which the CPU stops running one process/thread, saves its current state,
and loads the saved state of another process/thread.
It is essential in multitasking, multiprogramming, and time-sharing operating systems, and it keeps the CPU from sitting idle while a process waits for I/O.
For Processes:
1. Save current process state to Process Control Block (PCB)
2. Update process state (running → ready/blocked)
3. Select next process to run
4. Restore new process state from its PCB
5. Update memory management structures
For Threads:
1. Save the thread state to the Thread Control Block (TCB)
2. Only CPU registers, stack pointer, and program counter are saved/restored
3. No memory map changes are required, because threads of the same process share the same address space, which makes thread switches faster
Components of a PCB/Context
• Process State: Running, ready, waiting, blocked, etc.
• Program Counter (PC): The address of the next instruction to execute.
• CPU Registers: Contents of all registers.
• CPU Scheduling Information: Process priority, pointers to scheduling queues.
• Memory Management Information: Base/limit register values, page tables, segment tables.
• Accounting Information: CPU time used, real-time clock, time limits.
• I/O Status Information: List of I/O devices allocated to the process, list of open files.
Context-switch time is pure overhead: no useful computation is performed while switching.
When switching processes, the data in the CPU cache and Translation Lookaside Buffer (TLB) is for the old
process. The new process will likely require different data and memory mappings, resulting in cache
misses and TLB flushes, which significantly slow down initial memory access.
Classical Synchronization Problems
1. Producer–Consumer (Bounded-Buffer) Problem
A producer adds items to a shared, fixed-size buffer; a consumer removes them.
• Synchronization Problem:
o Producer must wait if the buffer is full.
o Consumer must wait if the buffer is empty.
• Solution: Semaphores or mutex + condition variables.
2. Readers–Writers Problem
Multiple readers can read a shared resource at the same time.
Writers need exclusive access (no reader/writer allowed simultaneously).
• Synchronization Problem:
o Ensure no conflict between read and write.
o Handle fairness (avoid starvation of writers/readers).
• Solution: Read–Write locks, semaphores.
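POSIX offers read-write locks (pthread_rwlock_t) for exactly this pattern; the minimal sketch below lets several readers hold the lock at once while a writer gets exclusive access (variable names are illustrative):

#include <pthread.h>
#include <stdio.h>

static pthread_rwlock_t rw = PTHREAD_RWLOCK_INITIALIZER;
static int shared_value = 0;

static void *reader(void *arg) {
    pthread_rwlock_rdlock(&rw);              /* shared (read) lock */
    printf("reader %ld sees %d\n", (long)arg, shared_value);
    pthread_rwlock_unlock(&rw);
    return NULL;
}

static void *writer(void *arg) {
    (void)arg;
    pthread_rwlock_wrlock(&rw);              /* exclusive (write) lock */
    shared_value++;
    pthread_rwlock_unlock(&rw);
    return NULL;
}

int main(void) {
    pthread_t r[3], w;
    pthread_create(&w, NULL, writer, NULL);
    for (long i = 0; i < 3; i++) pthread_create(&r[i], NULL, reader, (void *)i);
    pthread_join(w, NULL);
    for (int i = 0; i < 3; i++) pthread_join(r[i], NULL);
    return 0;
}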
3. Dining Philosophers Problem
Five philosophers sit around a table with five forks. Each needs two forks to eat.
• Synchronization Problem: If every philosopher picks up one fork and waits for the second, none can proceed → deadlock.
• Solution: Avoid circular wait by enforcing a global order on fork acquisition, as sketched below.
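A minimal sketch of this ordering idea in C, using one pthread mutex per fork; every philosopher always locks the lower-numbered fork first, so a circular wait cannot form:

#include <pthread.h>
#include <stdio.h>

#define N 5
static pthread_mutex_t fork_lock[N];

static void *philosopher(void *arg) {
    int i = (int)(long)arg;
    int right = (i + 1) % N;
    int first  = i < right ? i : right;        /* lower-numbered fork  */
    int second = i < right ? right : i;        /* higher-numbered fork */
    pthread_mutex_lock(&fork_lock[first]);
    pthread_mutex_lock(&fork_lock[second]);
    printf("philosopher %d is eating\n", i);   /* eat */
    pthread_mutex_unlock(&fork_lock[second]);
    pthread_mutex_unlock(&fork_lock[first]);
    return NULL;
}

int main(void) {
    pthread_t t[N];
    for (int i = 0; i < N; i++) pthread_mutex_init(&fork_lock[i], NULL);
    for (long i = 0; i < N; i++) pthread_create(&t[i], NULL, philosopher, (void *)i);
    for (int i = 0; i < N; i++) pthread_join(t[i], NULL);
    return 0;
}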
Synchronization solutions
Peterson's Algorithm (for two processes, Pi and Pj):
• flag[2]: Boolean array → flag[i] = true means process Pi wants to enter the critical section.
o flag[0] = true means Process 0 wants to enter
o flag[1] = true means Process 1 wants to enter
• turn: Indicates whose turn it is to enter the critical section.
do {
    flag[i] = true;             // Pi wants to enter
    turn = j;                   // politely give Pj the turn first
    while (flag[j] && turn == j)
        ;                       // wait while Pj also wants to enter and it is Pj's turn
    // CRITICAL SECTION - only one process here
    flag[i] = false;            // Pi has finished; the other process may enter
    // REMAINDER SECTION
} while (true);
Example with Process P0 and P1
Scenario 1: Only Process 0 wants to enter
1. Process 0: flag[0] = true, turn = 1
2. Process 0 checks: flag[1] && turn == 1?
• flag[1] = false (Process 1 doesn't want to enter)
• Condition is false, so Process 0 enters immediately
Scenario 2: Both processes want to enter simultaneously
1. Process 0: flag[0] = true, turn = 1
2. Process 1: flag[1] = true, turn = 0
3. Process 0 checks: flag[1] && turn == 1?
• flag[1] = true but turn = 0 (not 1), so the condition is false
• Process 0 enters
4. Process 1 checks: flag[0] && turn == 0?
• flag[0] = true and turn = 0, so the condition is true
• Process 1 waits until Process 0 finishes
Peterson's Algorithm satisfies all requirements for a critical section solution: Mutual Exclusion, Progress and
Bounded Waiting.
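For reference, a runnable two-thread version in C11 is sketched below. On modern hardware, flag and turn must be accessed with sequentially consistent atomics; otherwise compiler and CPU reordering can break mutual exclusion. The worker/counter names are illustrative:

#include <stdatomic.h>
#include <stdbool.h>
#include <pthread.h>
#include <stdio.h>

static atomic_bool flag[2];
static atomic_int  turn;
static long counter = 0;                      /* shared data protected by Peterson's lock */

static void lock(int i) {
    int j = 1 - i;
    atomic_store(&flag[i], true);             /* Pi wants to enter      */
    atomic_store(&turn, j);                   /* give Pj the turn first */
    while (atomic_load(&flag[j]) && atomic_load(&turn) == j)
        ;                                     /* busy wait              */
}
static void unlock(int i) { atomic_store(&flag[i], false); }

static void *worker(void *arg) {
    int i = (int)(long)arg;
    for (int k = 0; k < 100000; k++) {
        lock(i);
        counter++;                            /* critical section       */
        unlock(i);
    }
    return NULL;
}

int main(void) {
    pthread_t t0, t1;
    pthread_create(&t0, NULL, worker, (void *)0L);
    pthread_create(&t1, NULL, worker, (void *)1L);
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);
    printf("counter = %ld (expected 200000)\n", counter);
    return 0;
}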
Semaphores
Semaphores are synchronization solutions that use atomic operations to control access to shared resources
(critical section) by maintaining a counter. Process/threads acquire (decrement) or release (increment) the
semaphore.
A semaphore S is an integer variable accessed only through two atomic operations:
• wait() (also called P or down) → Decrements the counter
o If the counter becomes negative, the calling process/thread blocks until it can proceed
• signal() (also called V or up) → Increments the counter
o If the counter was negative (process/threads waiting), wakes one of them
wait(S) or P(S):
wait(S) {
    while (S <= 0)
        ;        // busy wait
    S--;
}

signal(S) or V(S):
signal(S) {
    S++;
}
Note: Operations must be atomic to prevent race conditions.
Types of Semaphores:
1. Counting Semaphore: The value of S can be any non-negative integer. Used to control access to a resource with a finite number of instances (e.g., a buffer pool with 10 slots in the Producer-Consumer problem).
2. Binary Semaphore (Mutex Lock): The value S can only be 0 or 1. Used to enforce mutual exclusion for
a single instance of a resource protecting a critical section. Acts like a lock: 1 → resource available and 0
→ resource unavailable
semaphore mutex = 1;    // binary semaphore, initially available
do {
    wait(mutex);        // decrement: mutex becomes 0; proceed into the critical section
    // CRITICAL SECTION
    signal(mutex);      // increment: mutex becomes 1; if another thread is waiting, wake it
    // REMAINDER SECTION
} while (true);         // if mutex is 0, a second thread blocks in wait(); when the first signals, it proceeds
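Combining both kinds of semaphore, the sketch below uses POSIX counting semaphores and a mutex (e.g., on Linux) for the bounded-buffer Producer-Consumer problem: empty_slots counts free slots, full_slots counts filled slots, and the mutex protects the buffer indices. The names are illustrative:

#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

#define BUF_SIZE 10
static int buffer[BUF_SIZE];
static int in = 0, out = 0;
static sem_t empty_slots, full_slots;
static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

static void *producer(void *arg) {
    (void)arg;
    for (int item = 0; item < 20; item++) {
        sem_wait(&empty_slots);            /* wait for a free slot   */
        pthread_mutex_lock(&mutex);
        buffer[in] = item;
        in = (in + 1) % BUF_SIZE;
        pthread_mutex_unlock(&mutex);
        sem_post(&full_slots);             /* signal a filled slot   */
    }
    return NULL;
}

static void *consumer(void *arg) {
    (void)arg;
    for (int k = 0; k < 20; k++) {
        sem_wait(&full_slots);             /* wait for a filled slot */
        pthread_mutex_lock(&mutex);
        int item = buffer[out];
        out = (out + 1) % BUF_SIZE;
        pthread_mutex_unlock(&mutex);
        sem_post(&empty_slots);            /* free the slot          */
        printf("consumed %d\n", item);
    }
    return NULL;
}

int main(void) {
    sem_init(&empty_slots, 0, BUF_SIZE);   /* all slots initially free */
    sem_init(&full_slots, 0, 0);
    pthread_t p, c;
    pthread_create(&p, NULL, producer, NULL);
    pthread_create(&c, NULL, consumer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return 0;
}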
Monitors
A high-level synchronization construct that encapsulates shared data together with the procedures that operate on it.
Provides mutual exclusion and condition variables for signaling between threads. A monitor contains:
1. Shared Variables: Data accessed by multiple threads.
2. Procedures (Methods): Operations on shared variables.
3. Condition Variables: Allow processes to wait for certain conditions before proceeding.
Condition Variables: Used within monitors to let threads wait until a specific condition becomes true. A thread blocks on a condition variable c inside the monitor and is awakened by another thread once the condition it is waiting for holds.
• wait(c): Atomically release monitor lock, suspend thread on c
• signal(c): Resume one thread waiting on c
• broadcast(c): Resume all threads waiting on c
Condition variables are not semaphores: signal(c) does not increment a counter; it simply wakes one waiting thread, and it has no effect if no thread is waiting.
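A monitor-style construct can be approximated in C with a pthread mutex (mutual exclusion) and a condition variable; the deposit/withdraw procedures below are illustrative monitor operations on a shared counter:

#include <pthread.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  nonzero = PTHREAD_COND_INITIALIZER;
static int count = 0;                        /* shared variable inside the "monitor" */

void deposit(void) {                         /* monitor procedure */
    pthread_mutex_lock(&m);
    count++;
    pthread_cond_signal(&nonzero);           /* wake one waiter, if any */
    pthread_mutex_unlock(&m);
}

void withdraw(void) {                        /* monitor procedure */
    pthread_mutex_lock(&m);
    while (count == 0)                       /* recheck the condition after every wakeup */
        pthread_cond_wait(&nonzero, &m);     /* atomically releases m while waiting      */
    count--;
    pthread_mutex_unlock(&m);
}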
Semaphores vs. Monitors
• Abstraction Level: Semaphores are a low-level primitive; monitors are a high-level language construct.
• Mutual Exclusion: With semaphores it is programmer-managed and error-prone; with monitors it is compiler-enforced, automatic, and safe.
• Data Encapsulation: Semaphores: no, the data and the synchronization mechanism are separate. Monitors: yes, shared data and its procedures are bundled together.
• Mechanism: Semaphores use an integer value (S) with wait/signal; monitors use condition variables with wait/signal.
Spinlocks
• A spinlock is a type of lock mechanism in which a thread repeatedly checks (or "spins") to see if the lock
is available before acquiring it. This busy-wait loop continues until the lock can be acquired.
• Spinlocks are particularly useful for short-duration critical sections, where waiting threads can quickly
acquire the lock without the overhead of context switching.
• However, if the lock is held for a long time, spinlocks can waste CPU cycles because threads keep spinning
without doing productive work.
• Spinlocks are lightweight and efficient compared to other synchronization methods, such as semaphores
or mutexes, because they avoid blocking and thread suspension overhead.
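A minimal test-and-set spinlock can be sketched with C11 atomics as follows (spin_lock/spin_unlock are illustrative names, not a library API):

#include <stdatomic.h>

static atomic_flag lock_flag = ATOMIC_FLAG_INIT;

void spin_lock(void) {
    while (atomic_flag_test_and_set(&lock_flag))
        ;                                    /* busy-wait until the flag is clear */
}

void spin_unlock(void) {
    atomic_flag_clear(&lock_flag);
}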
Multicore Scheduling
Differences from Single-Core Scheduling:
• Multicore systems introduce cache hierarchies, memory coherence protocols, and Non-Uniform Memory
Access (NUMA) architectures that significantly impact scheduling decisions
• Multiple processes can truly execute simultaneously, enabling better system throughput and
responsiveness
Challenges in Multicore Scheduling
✓ Cache Affinity: A process builds state in a cache and Translation Lookaside Buffer (TLB); moving it
to another CPU forces cache reloads, degrading performance
✓ Cache Coherence: Hardware protocols (e.g., bus snooping) maintain coherence but add latency to
memory operations
✓ Balancing workload across cores often conflicts with maintaining cache affinity. Poor balancing leaves
cores idle; excessive migration destroys cache locality
✓ Scheduler data structures need synchronization, causing bottlenecks as the core count grows
✓ Algorithms effective for a few cores may degrade on many-core systems
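One practical way to preserve the cache affinity described above is to pin work to a particular CPU. On Linux, for example, sched_setaffinity() restricts a process or thread to a chosen CPU set; a minimal sketch:

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void) {
    cpu_set_t mask;
    CPU_ZERO(&mask);
    CPU_SET(0, &mask);                                   /* allow only CPU 0       */
    if (sched_setaffinity(0, sizeof(mask), &mask) != 0)  /* pid 0 = calling thread */
        perror("sched_setaffinity");
    /* work done from here on keeps its cache and TLB state on CPU 0 */
    return 0;
}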
Scheduling in Multiprocessor Systems:
• Processor Affinity: Attempting to keep a process running on the same processor to exploit cache locality.
• Load Balancing: Distributing processes evenly across multiple processors to maximize utilization.
• Gang Scheduling: Scheduling a set of related threads or processes to run simultaneously on different
processors.
• Priority Scheduling with Aging: Dynamically adjusting priorities to prevent starvation of low-priority
processes.
• Multilevel Queue Scheduling: Using multiple queues with different scheduling algorithms for different
process types like interactive and batch processes.
• Multilevel Feedback Queue Scheduling: Allowing processes to move between queues based on their CPU
burst characteristics.
• Fair-Share Scheduling: Ensuring that each user or group of users receives a fair share of CPU time.
• Single-Queue Multiprocessor Scheduling (SQMS): One global ready queue shared by all CPUs.
✓ Affinity mechanisms partially reduce cache penalties but add complexity
• Multi-Queue Multiprocessor Scheduling (MQMS): Each CPU has its own ready queue; processes assigned
on arrival.
• Work Stealing Algorithms: Each CPU keeps a local queue of tasks; an idle CPU steals tasks from a busy one, taking work from the back of the victim's queue while the owner dequeues from the front, which reduces contention.
✓ Variants: Balanced Work Stealing (throttles wasteful thieves); Hierarchical Work Stealing for large
systems.
• NUMA-Aware Scheduling: In NUMA systems, CPUs access local memory faster than remote memory, so the scheduler tries to keep each process on a CPU close to the memory it uses.
• Real-Time Scheduling: Critical in systems with strict timing constraints.
• Rate-Monotonic Scheduling: Assigning priorities based on the inverse of the period of a periodic task.
• Earliest Deadline First (EDF): Assigning priorities based on the earliest deadline of a task.
✓ Variants: Global, Partitioned, Hybrid EDF.
Dr. Chandra Priya Jayabal | © 2025 [Chandra Priya]. All rights reserved. No part of this document may
be reproduced without permission