Process Management
A CPU-bound process needs more CPU time, i.e., it spends more time in the running
state, whereas an I/O-bound process needs more I/O time and less CPU time, i.e., it
spends more time in the waiting state.
Process scheduling is an integral part of process management in an operating system. It refers
to the mechanism used by the operating system to determine which process to run next. The
goal of process scheduling is to improve overall system performance by maximizing CPU
utilization, maximizing throughput, and improving system response time.
Process Management Tasks
Process management is a key part in operating systems with multi-programming or
multitasking.
Process Creation and Termination : Process creation involves creating a Process ID,
setting up Process Control Block, etc. A process can be terminated either by the
operating system or by the parent process. Process termination involves clearing all
resources allocated to it.
CPU Scheduling : In a multiprogramming system, multiple processes need to get the
CPU. It is the job of Operating System to ensure smooth and efficient execution of
multiple processes.
Deadlock Handling : Making sure that the system does not reach a state where two or
more processes cannot proceed due to cyclic dependency on each other.
Inter-Process Communication : Operating System provides facilities such as shared
memory and message passing for cooperating processes to communicate.
Process Synchronization : Process Synchronization is the coordination of execution of
multiple processes in a multiprogramming system to ensure that they access shared
resources (like memory) in a controlled and predictable manner.
Process Operations
A process goes through different states before termination, and these state
changes require different operations on processes by the operating system. These operations
include process creation, process scheduling, execution and killing the process. Here are the
key process operations:
Process Creation
Process creation in an operating system (OS) is the act of generating a new process. This new
process is an instance of a program that can execute independently.
Scheduling
Once a process is ready to run, it enters the "ready queue." The scheduler's job is to pick a
process from this queue and start its execution.
Execution
Execution means the CPU starts working on the process. During this time, the process might:
Move to a waiting queue if it needs to perform an I/O operation.
Be preempted if a higher-priority process needs the CPU.
Killing the Process
After the process finishes its tasks, the operating system ends it and removes its Process
Control Block (PCB).
Context Switching of Process
The process of saving the context of one process and loading the context of another process is
known as Context Switching. In simple terms, it is like loading and unloading the process
from the running state to the ready state.
When Does Context Switching Happen?
Context switching happens when:
When a high-priority process comes to a ready state (i.e. with higher priority than the
running process).
An Interrupt occurs.
A user-to-kernel-mode switch occurs (though a mode switch alone does not always require a context switch).
Preemptive CPU scheduling is used.
Context Switch vs Mode Switch
A mode switch occurs when the CPU privilege level is changed, for example when a system
call is made or a fault occurs. The kernel works in a more privileged mode than a standard
user task. If a user process wants to access things that are only accessible to the kernel, a
mode switch must occur. The currently executing process need not be changed during a mode
switch. A mode switch is typically required before a process context switch can occur. Only
the kernel can cause a context switch.
Process Scheduling Algorithms
The operating system can use different scheduling algorithms to schedule processes. Here are
some commonly used scheduling algorithms:
First-Come, First-Served (FCFS): This is the simplest scheduling algorithm, where
the process is executed on a first-come, first-served basis. FCFS is non-preemptive,
which means that once a process starts executing, it continues until it is finished or
waiting for I/O.
Shortest Job First (SJF): SJF is a scheduling algorithm that selects the
process with the shortest burst time. The burst time is the time a process takes to
complete its execution. SJF minimizes the average waiting time of processes; it can be
non-preemptive or, in its Shortest Remaining Time First variant, preemptive.
Round Robin (RR): Round Robin is a preemptive scheduling algorithm that gives each
process a fixed time quantum in turn. If a process does not complete its
execution within its quantum, it is preempted and added to the end of the queue.
RR ensures fair distribution of CPU time to all processes and avoids starvation.
Priority Scheduling: This scheduling algorithm assigns priority to each process and
the process with the highest priority is executed first. Priority can be set based on
process type, importance, or resource requirements.
Multilevel Queue: This scheduling algorithm divides the ready queue into several
separate queues, each queue having a different priority. Processes are queued based
on their priority, and each queue uses its own scheduling algorithm. This scheduling
algorithm is useful in scenarios where different types of processes have different
priorities.
Advantages of Process Management
Running Multiple Programs: Process management lets you run multiple
applications at the same time, for example, listen to music while browsing the web.
Process Isolation: It ensures that different programs don't interfere with each other,
so a problem in one program won't crash another.
Fair Resource Use: It makes sure resources like CPU time and memory are shared
fairly among programs, so even lower-priority programs get a chance to run.
Smooth Switching: It efficiently handles switching between programs, saving and
loading their states quickly to keep the system responsive and minimize delays.
Disadvantages of Process Management
Overhead: Process management uses system resources because the OS needs to keep
track of various data structures and scheduling queues. This requires CPU time and
memory, which can affect the system's performance.
Complexity: Designing and maintaining an OS is complicated due to the need for
complex scheduling algorithms and resource allocation methods.
Deadlocks: To keep processes running smoothly together, the OS uses mechanisms
like semaphores and mutex locks. However, these can lead to deadlocks, where
processes get stuck waiting for each other indefinitely.
Increased Context Switching: In multitasking systems, the OS frequently switches
between processes. Storing and loading the state of each process (context switching)
takes time and computing power, which can slow down the system.
Sections of a Process
Stack
Temporary data like method or function parameters, return address, and local variables are
stored in the process stack.
Heap
This is the memory that is dynamically allocated to a process during its execution.
Text
This comprises the compiled program code, together with the current activity as represented
by the value of the program counter and the contents of the processor's registers.
Data
The global as well as static variables are included in this section.
Process Life Cycle
When a process runs, it goes through many states. Different operating systems define
different stages, and the names of these states are not standardised. In general, a process can be in one
of the five states listed below at any given time.
Start
When a process is started/created first, it is in this state.
Ready
Here, the process is waiting for a processor to be assigned to it. Ready processes are waiting
for the operating system to assign them a processor so that they can run. The process may
enter this state after starting or while running, but the scheduler may interrupt it to assign the
CPU to another process.
Running
When the OS scheduler assigns a processor to a process, the process state gets set to running,
and the processor executes the process instructions.
Waiting
If a process needs to wait for any resource, such as for user input or for a file to become
available, it enters the waiting state.
Terminated or Exit
The process is relocated to the terminated state, where it waits for removal from the main
memory once it has completed its execution or been terminated by the operating system.
The PCB is kept for the duration of a process and then removed once the process is
finished.
The Different Process States
The operating system’s processes can be in one of the following states:
NEW – The process is being created.
READY – The process is waiting to be assigned to a processor.
RUNNING – The process's instructions are being executed.
WAITING – The process is waiting for some event to occur (such as an
I/O completion or a signal reception).
TERMINATED – The process has completed execution.
Process vs Program
A program is a piece of code that can be as simple as a single line or as complex as millions
of lines. A computer program is usually developed in a programming language by a
programmer. The process, on the other hand, is essentially a representation of the computer
program that is now running. It has a comparatively shorter lifetime.
Here is a basic program created in the C programming language as an example:
#include <stdio.h>

int main() {
    printf("Hi, Subhadip! \n");
    return 0;
}
A computer program refers to a set of instructions that, when executed by a computer,
perform a certain purpose. We can deduce that a process refers to a dynamic instance of a
computer program when we compare a program to a process. An algorithm is an element of a
computer program that performs a certain task. A software package is a collection of
computer programs, libraries, and related data.
Process Scheduling
When there are several or more runnable processes, the operating system chooses which one
to run first; this is known as process scheduling.
A scheduler is a program that uses a scheduling algorithm to make choices. The following are
characteristics of a good scheduling algorithm:
For users, response time should be kept to a bare minimum.
The total number of jobs processed every hour should be as high as possible, implying
that a good scheduling system should provide the highest possible throughput.
The CPU should be used to its full potential.
Each process should be given an equal amount of CPU time.
4.1 Introduction
A process is the unit of work in modern time-sharing systems. A system has a collection of
processes – user processes as well as system processes. All these processes can execute
concurrently with the CPU multiplexed among them. This module describes what processes
are, gives an introduction to process scheduling and explains the various operations that can
be done on processes.
4.2 Processes
A process is a program in execution. The execution of a process progresses in a sequential
fashion. A program is a passive entity while a process is an active entity. A process includes
much more than just the program code. A process includes the text section, stack, data
section, program counter, register contents and so on. The text section consists of the set of
instructions to be executed for the process. The data section contains the values of initialized
and uninitialized global variables in the program. The stack is used whenever there is a
function call in the program. A layer is pushed into the stack when a function is called. The
arguments to the function and the local variables used in the function are put into the layer of
the stack. Once the function call returns to the calling program, the layer of the stack is
popped. The text, data and stack sections comprise the address space of the process. The
program counter has the address of the next instruction to be executed in the process.
It is possible to have two processes associated with the same program. For example, consider
an editor program, say Microsoft Word. The program has the same text section. But, the data
section will be different for each file that is opened in Microsoft Word, that is, each file has a
different data section.
Figure 4.1 shows the state transition diagram of a process. The process is in the new state
when it is being created. Then the process is moved to the ready state, where it waits till it is
taken for execution. There can be many such processes in the ready state. One of these
processes will be selected and will be given the processor, and the selected process moves to
the running state. A process, while running, may have to wait for I/O or wait for any other
event to take place. That process is now moved to the waiting state. After the event for which
the process was waiting gets completed, the process is moved back to the ready state.
Similarly, if the time-slice of a process ends while still running, the process is moved back to
the ready state. Once the process completes execution, it moves to the terminated state.
In addition to the ready queue, there are other queues in the system in which a process may
be kept during its lifetime. When a process has to wait for I/O, the PCB of the process is
removed from the ready queue and is placed in a device queue. The device queue corresponds
to the I/O device from/to which the process is waiting for I/O. Hence, there are a number of
device queues in the system corresponding to the devices present in the system.
Figure 4.4 shows the ready queue and the various device queues in the system. Any process
during its lifetime will migrate between the various queues.
Fig. 4.4 Ready queue and various I/O device queues (Source: [2])
Figure 4.5 shows a common representation of process scheduling using a queuing diagram.
The rectangular boxes represent the various queues. The circles denote the resources that
serve the queues and the arrows show the flow of processes in the system.
A new process is initially placed in the ready queue. When the process is given the CPU and
is running one of the following may occur:
The process may request for I/O and may be placed in an I/O queue. After I/O gets
completed, the process is moved back to the ready queue.
The time slice allotted to the process may get over. The process is now forcibly
removed from the CPU and is placed back in the ready queue.
The process may create (fork) a new process and may wait for the created child
process to finish completion. Once the child completes execution, the process moves
back to the ready queue.
While the process executes, an interrupt may occur. The process is now removed from
the CPU, the process waits till the interrupt is serviced and is then moved to the ready
state.
4.5.2 Schedulers
A process moves between different queues during its lifetime. The OS should select the
process that should move to the next queue and the queue to which the selected process
should move, in some fashion. This selection is done by schedulers. The different schedulers
available are long-term scheduler, short-term scheduler and medium-term scheduler.
In batch systems, more processes are submitted than those that can be executed. These
processes are placed in a job pool. The long-term scheduler (or job scheduler) selects those
processes that should be brought into the ready queue from the job pool, and brings them
from the job pool to main memory. The short-term scheduler (or CPU scheduler) selects the
process that should be executed next and allocates the CPU.
The main difference between the job scheduler and the CPU scheduler is the frequency of
execution. The long-term scheduler controls the degree of multiprogramming (number
of processes in memory). It is invoked only when a process leaves the system. The short-term
scheduler is invoked whenever the CPU switches from one process to another. Hence, the
short-term scheduler is run more frequently than the long-term scheduler.
As the long-term scheduler controls the degree of multiprogramming, it should select a good
mix of I/O-bound and CPU-bound processes and bring them into the main memory. An I/O-
bound process spends most of its time performing I/O than computation. A CPU-bound
process spends most of its time performing computations than I/O. If all processes are I/O-
bound, then the CPU will be under-utilized. If all processes are CPU-bound, the I/O devices
will not be used fully. Hence, proper selection of jobs by the job scheduler will ensure that
the system is stable.
fork
The fork() system call creates a new process by duplicating the calling one. This means that all reasonable uses of fork() look essentially like this:
#include <stdio.h>
#include <unistd.h>

int main() {
    pid_t pid = fork();
    if (pid == 0) { // Child.
        printf("Hello from the child process!\n");
    } else if (pid > 0) { // Parent.
        printf("Hello from the parent process!\n");
    } else {
        perror("fork");
    }
    return 0;
}
In other words, after your program calls fork(), it should immediately check which universe it
is living in: are we now in the child process or the parent process? Otherwise, the processes
have the same variable values, memory contents, and everything else—so they’ll behave
exactly the same way, aside from this check.
Another way of putting this strange property of fork() is this: most functions return once. fork
returns twice!
exec
The exec function call “morphs” the current process, which is currently executing program A,
so that it instead starts executing program B. You can think of it swapping out the contents of
memory to contain the instructions and data from executable file B and then jumping to the
first instruction in B’s main.
There are many variations on the exec function; check out the manual page to see them all.
Let’s look at a fairly simple one, execl. Here’s the function signature, copied from the
manual:
int execl(const char *path, const char *arg, ...);
You need to provide the executable you want to run (a path on the filesystem) and a list of
command-line arguments (which will be passed as argv in the target program's main).
#include <stdio.h>
#include <unistd.h>

int main() {
    if (execl("/bin/ls", "ls", "-l", NULL) == -1) {
        perror("error in exec call");
    }
    return 0;
}
That transforms the current process into an execution of ls -l. There’s one tricky thing in the
argument list: by convention, the first argument is always the name of the executable. (This is
also true when you look at argv[0] in your own main function.) So the first argument to the
execl call here is the path to the ls executable file, and the second argument to execl is the
first argument to pass to the executable, which is the name ls. We also terminate the variadic
argument list with NULL.
fork + exec = spawn a new command
To summarize exec() and fork():
The exec() family of functions replaces the current process image with a new process
image. The exec() functions return only if an error has occurred. The return value is -1,
and errno is set to indicate the error.
fork() creates a new process by duplicating the calling process. The new process is
referred to as the child process. The calling process is referred to as the parent
process. The process id of the child process is returned on calling fork().
The fork and exec functions seem kind of weird by themselves. Who wants an identical copy
of a process, or to completely erase and overwrite the current execution with a new program?
In practice, fork and exec are almost always used together. If you pair them up, you can do
something much more useful: spawn a new child process that runs a new command. You first
fork the parent process, and then you exec in the child (and only the child) to transform that
process to execute a new program.
The recipe looks like this:
fork()
Check if you’re the child. If so, exec the new program.
Otherwise, you’re the parent. Wait for the child to exit (see below).
Here that is in code:
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

int main() {
    pid_t pid = fork();
    if (pid == 0) { // Child.
        if (execl("/bin/ls", "ls", "-l", NULL) == -1) {
            perror("error in exec call");
        }
    } else if (pid > 0) { // Parent.
        printf("Hello from the parent!\n");
        waitpid(pid, NULL, 0);
    } else {
        perror("error in fork call");
    }
    return 0;
}
This code spawns a new execution of ls -l in a child process. This is a useful pattern for
programs that want to delegate some work to some other command. (Don’t worry about the
waitpid call; we’ll cover that next.)
waitpid
Finally, when you write code that creates new processes, you will also want to wait for them
to finish. The waitpid function does this. You supply it with a pid of the process you want to
wait for (and, optionally, an out-parameter for some status information about it and some
options), and the call blocks until the process somehow finishes.
It’s usually important to waitpid all the child processes you fork. Try deleting the waitpid call
from the example above, and then compile and run it. What happens? Can you explain what
went wrong when you didn’t wait for the child process to finish?
Orphan Process
An Orphan Process is a process that is still running after its parent process has finished or
terminated. In Unix-like OSs, an orphan process is adopted by the init process. In some OSs,
an orphan process is immediately terminated.
Zombie Process
A Zombie Process is a process that has terminated but still has an entry in the process table.
The entry remains because the parent process has not yet read the child's exit status; once
the parent does so (via wait() or waitpid()), the entry is removed from the table.
Threads
A thread of execution is the smallest sequence of instructions that can be independently
managed by a scheduler. Threads are small components of a process and multiple threads can
run concurrently and share the same code, memory, variables, etc.
Each thread shares the same code, data, and heap blocks but has its own stack. Threads
are often called lightweight processes because they share most of their process's resources
and carry only a small amount of private state, such as their own stack and register set.
Multiple threads are run together through a CPU functionality known as multithreading.
Process vs Threads
The primary difference between a process and a thread is that different processes
cannot share the same memory space (code, variables, etc) whereas different threads
in the same process share the same memory space.
Threads are lightweight whereas Processes are heavyweight.
Threads switching does not need to interact with the OS whereas process switching
requires interaction with OS.
When a thread makes changes to any variable, all other threads in the same process
can see it. The same is not true for different processes.
Multithreading
Multithreading is the ability of a CPU to allow multiple threads of execution to run
concurrently. Multithreading is different from multiprocessing as multithreading aims to
increase the utilization of a single core. In most modern system architectures, multithreading
and multiprocessing are combined.
The following program illustrates fork(): both the parent and the child run the same code, and each prints its own process ID with getpid() and its parent's with getppid().

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main() {
    pid_t pid;
    pid = fork();
    if (pid == 0) {
        printf("\n I am the child");
        printf("\n My parent is :%d", getppid());
        printf("\n I am the child :%d", getpid());
    } else {
        printf("\n I am the parent ");
        printf("\n I am the parent's parent :%d", getppid());
        printf("\n I am the parent :%d\n", getpid());
    }
    return 0;
}