Module 4 (UNIX)
Process Control
Introduction
• Process control in UNIX refers to the set of mechanisms and
system calls provided by the operating system to manage
processes throughout their lifecycle including their creation,
execution and termination.
• The system tracks several kinds of IDs for each process:
• Real and effective user IDs and group IDs
• These IDs are properties of each process, and this module looks at how
they are affected by the process control functions.
Process Identifiers
• Process identifiers in UNIX are special IDs assigned to every process
for tracking and control purposes.
• Every process has a unique process ID, a non-negative integer.
• Some process IDs are reserved for special processes:
• Process ID 0 is usually the scheduler process, known as the swapper.
• Process ID 1 is usually the init process, which is invoked by the kernel at
the end of the bootstrap procedure.
• On some systems, process ID 2 is the pagedaemon, which is responsible for
supporting the paging of the virtual memory system.
• In addition to the process ID, there are other identifiers for every
process.
#include<unistd.h>
pid_t getpid(void);
returns: process ID of calling process
pid_t getppid(void);
returns: parent process ID of calling process
uid_t getuid(void);
returns: real user ID of calling process
uid_t geteuid(void);
returns: effective user ID of calling process
gid_t getgid(void);
returns: real group ID of calling process
gid_t getegid(void);
returns: effective group ID of calling process
fork Function
• An existing process can create a new one by calling the fork function
#include<unistd.h>
pid_t fork(void);
returns: 0 in child, process ID of child in parent, -1 on error
• The new process created by fork is called the child process.
• This function is called once but returns twice.
• The only difference in the returns is that the return value in the child is 0,
whereas the return value in the parent is the process ID of the new child.
• The reason the child’s process ID is returned to the parent is that a
process can have more than one child, and there is no function that
allows a process to obtain the process IDs of its children
• The reason fork returns 0 to the child is that a process can have only a
single parent and the child can always call getppid to obtain the process ID
of its parent. (Process ID 0 is reserved for use by the kernel, so it’s not
possible for 0 to be the process ID of a child.)
• Both the child and the parent continue executing with the instruction that
follows the call to fork.
• The child is a copy of the parent.
• For example, the child gets a copy of the parent’s data space, heap and
stack.
• Note that this is a copy for the child; the parent and the child do not share
these portions of memory.
• The parent and the child share the text segment.
Example programs
File sharing
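• In the context of an operating system, file sharing refers to the ability of
multiple processes to access the same open file.
• Consider a process that has three different files opened for standard input,
standard output and standard error. On return from fork, each descriptor in the
child refers to the same open file table entry as the corresponding descriptor in
the parent, as if dup had been called on it.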
• It is important that the parent and the child share the same file offset.
• Consider a process that forks a child, then waits for the child to complete.
• Assume that both processes write to standard output as part of their normal
processing.
• If the parent has its standard output redirected (by a shell) it is essential that
the parent’s file offset be updated by the child when the child writes to
standard output.
• In this case, the child can write to standard output while the parent is waiting
for it; on completion of the child, the parent can continue writing to standard
output, knowing that its output will be appended to whatever the child wrote.
• If the parent and the child did not share the same file offset (the position within
an open file that indicates where the next read or write operation will start), this type
of interaction would be more difficult to accomplish and would require
explicit actions by the parent.
There are two normal cases for handling the descriptors
after a fork.
1. The parent waits for the child to complete. In this case, the
parent does not need to do anything with its descriptors.
When the child terminates, any of the shared descriptors that
the child read from or wrote to will have their file offsets
updated accordingly.
2. Both the parent and the child go their own ways. Here, after
the fork, the parent closes the descriptors that it doesn’t need
and the child does the same thing. This way, neither
interferes with the other’s open descriptors.
There are numerous other properties of the parent that are inherited by the child:
• Real user ID, real group ID, effective user ID, effective group ID
• Supplementary group IDs
• Process group ID
• Session ID
• Controlling terminal
• The set-user-ID and set-group-ID flags
• Current working directory
• Root directory
• File mode creation mask
• Signal mask and dispositions
• The close-on-exec flag for any open file descriptors
• Environment
• Attached shared memory segments
• Memory mappings
• Resource limits
The differences between the parent and child are:
• The return value from fork
• The process IDs are different
• The two processes have different parent process IDs: the parent
process ID of the child is the parent; the parent process ID of the
parent doesn’t change
• The child’s tms_utime (user CPU time), tms_stime (system CPU time),
tms_cutime (user CPU time of terminated children) and tms_cstime (system
CPU time of terminated children) values are set to 0
• File locks set by the parent are not inherited by the child
• Pending alarms are cleared for the child
• The set of pending signals for the child is set to the empty set.
The two main reasons for fork to fail are
a. If too many processes are already in the system, which usually
means that something else is wrong or
b. If the total number of processes for this real user ID exceeds the
system’s limit.
There are two uses for fork:
• Fork is used when a process wants to make a copy of itself. Both the
parent and child can then do different tasks at the same time. For
example, servers use fork to let the parent wait for new requests while
the child handles each request.
• Fork is also used when a process wants to run a completely new
program. This is common for shell programs, where the child process
runs a new program after the fork is done.
vfork function
• The function vfork has the same calling sequence and same return values as
fork.
• The vfork function is intended to create a new process when the purpose of
the new process is to exec a new program.
• The vfork function creates the new process just like fork, but without copying
the address space of the parent into the child, because the child won’t reference
that address space; the child simply calls exec (or exit) right after the vfork.
• Instead, while the child is running and until it calls either exec (which replaces its
current program with a new one) or exit, the child runs in the address space
of the parent.
• Another difference between the two functions is that vfork guarantees that
the child runs first, until the child calls exec or exit. When the child calls
either of these functions, the parent resumes.
#include "apue.h"

int glob = 6;                       /* external variable in initialized data */

int main(void)
{
    int var;                        /* automatic variable on the stack */
    pid_t pid;

    var = 88;
    printf("before vfork\n");
    if ((pid = vfork()) < 0) {      /* process creation failed */
        err_sys("vfork error");
    } else if (pid == 0) {          /* child process */
        glob++;                     /* modifies the parent's variables */
        var++;
        _exit(0);                   /* child terminates */
    }
    /* parent continues here */
    printf("pid=%d, glob=%d, var=%d\n", getpid(), glob, var);
    exit(0);
}

Output
$ ./a.out
before vfork
pid=29039, glob=7, var=89
exit Functions
A process can terminate normally in five ways:
• Executing a return from the main function
• Calling the exit function, which ends the program with a status code after
performing cleanup (running exit handlers and flushing standard I/O streams)
• Calling the _exit or _Exit function, which terminates immediately without
that cleanup.
In most UNIX system implementations, exit(3) is a function in the
standard C library, whereas _exit(2) is a system call.
• Executing a return from the start routine of the last thread in the
process. When the last thread returns from its start routine, the process
exits with a termination status of 0.
• Calling the pthread_exit function from the last thread in the process.
The three forms of abnormal termination are as follows:
1. Calling abort(): the program itself calls the abort() function,
which sends the SIGABRT signal to the process and ends it
abnormally.
2. Receiving certain signals: the signal can be generated by the
process itself, by another process or the user (for example
SIGKILL, an immediate kill, or SIGINT, the interrupt signal), or
by the kernel when the program does something wrong (for
example SIGSEGV, a segmentation fault).
3. Thread cancellation request: in multi-threaded programs,
one thread can request that another thread be cancelled; the
process ends when its last thread responds to such a request.
• When any process ends, the kernel runs the same closing operations for it:
closing files, freeing its memory, and cleaning up its resources.
• The terminating child passes an exit status as the argument to one of the
exit functions (exit, _exit, _Exit); the parent can find out how its child
ended by calling wait or waitpid.
• Exit status: the value the child process gives when it ends.
• Termination status: what the parent sees; the kernel converts the exit
status into a termination status that shows exactly how the child finished,
especially in case of errors.
• If a parent process ends before its child has finished, a special process
called init (process ID 1) becomes the new parent. This ensures every
process always has a parent, and the system can always track them.
wait and waitpid functions
• When a process terminates, either normally or abnormally, the kernel notifies
the parent by sending the SIGCHLD signal to the parent. Because the
termination of a child is an asynchronous event- it can happen at any time
while the parent is running-this signal is the asynchronous notification from
the kernel to the parent.
• The parent can choose to ignore this signal or it can provide a function that is
called when the signal occurs: a signal handler
• A process that calls wait or waitpid can:
• Block, if all of its children are still running
• Return immediately with the termination status of a child, if at least one
child has ended and its status is available
• Return immediately with an error, if it has no child processes (for example,
all children have ended and their statuses have already been collected)
#include<sys/wait.h>
pid_t wait(int *statloc);
pid_t waitpid(pid_t pid, int *statloc, int options);
Both return: process ID if OK, 0 (only waitpid called with the WNOHANG option), or -1 on error
The differences between these two functions are:
• The wait function can block the caller until a child process
terminates, whereas waitpid has an option that prevents it from
blocking
• The waitpid function doesn’t wait for the child that terminates
first; it has a number of options that control which process it
waits for.
Macros to examine the termination status returned by wait and waitpid
Macro                 Description
WIFEXITED(status)     (Wait If Exited) Returns true if the child process terminated
                      normally (i.e., called exit() or returned from main).
WIFSIGNALED(status)   (Wait If Signaled) Returns true if the child process was killed
                      by a signal (such as SIGKILL or SIGSEGV), i.e., not a normal exit.
                      Use it to check whether the child was terminated by an error or
                      an external kill.
WIFSTOPPED(status)    (Wait If Stopped) Returns true if the child was stopped (not
                      ended) by a signal. Use it to detect whether the child is paused.
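exec Functions
• The six exec functions (execl, execv, execle, execve, execlp and execvp) differ in
whether the program is named by a pathname or by a filename searched for along
PATH, whether the arguments are passed as a list or as a vector, and whether the
environment is inherited or passed explicitly.
• The process ID does not change after an exec, and
the new program inherits additional properties from the calling process: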
• Process ID and parent process ID
• Real user ID and real group ID
• Supplementary group IDs
• Process group ID
• Session ID
• Controlling terminal
• Time left until alarm clock
• Current working directory
• Root directory
• File mode creation mask
• File locks
• Process signal mask
• Pending signals
• Resource limits
• Values for tms_utime, tms_stime, tms_cutime and tms_cstime
Example of exec functions
#include "apue.h"
#include <sys/wait.h>

char *env_init[] = { "USER=unknown", "PATH=/tmp", NULL };

int main(void)
{
    pid_t pid;

    if ((pid = fork()) < 0) {
        err_sys("fork error");
    } else if (pid == 0) {      /* specify pathname, specify environment */
        if (execle("/home/sar/bin/echoall", "echoall", "myarg1", "MY ARG2",
                   (char *)0, env_init) < 0)
            err_sys("execle error");
    }
    if (waitpid(pid, NULL, 0) < 0)
        err_sys("wait error");

    if ((pid = fork()) < 0) {
        err_sys("fork error");
    } else if (pid == 0) {      /* specify filename, inherit environment */
        if (execlp("echoall", "echoall", "only 1 arg", (char *)0) < 0)
            err_sys("execlp error");
    }
    exit(0);
}
Output:
$ ./a.out
argv[0]: echoall
argv[1]: myarg1
argv[2]: MY ARG2
USER=unknown
PATH=/tmp
$ argv[0]: echoall
argv[1]: only 1 arg
USER=sar
LOGNAME=sar
SHELL=/bin/bash
47 more lines that aren't shown
HOME=/home/sar
Chapter 2
Interprocess communication
Overview of IPC methods:
• Pipes
• popen and pclose functions
• Coprocesses
• FIFOs
• System V IPC
• Message Queues
• Semaphores
Introduction
• Interprocess communication(IPC) is a set of techniques that allow processes
to exchange data and information with each other during execution. This is
essential in modern operating systems where multiple processes need to
coordinate or share resources.
• Common IPC methods are:
1) Half duplex pipes
2) FIFOs
3) Full duplex pipes
4) Named full duplex pipes
5) Message queues
6) Shared memory
7) Semaphores
8) Sockets
9) STREAMS
• The first seven forms of IPC are usually restricted to IPC between processes
on the same host.
• The final two i.e., Sockets and STREAMS are the only two that are generally
supported for IPC between processes on different hosts.
Pipes
• Pipes are the oldest form of UNIX system IPC and are provided by all
UNIX systems. Pipes have two limitations:
• Historically, they have been half duplex (i.e., data flows in only one
direction). Some systems now provide full-duplex pipes.
• Pipes can be used only between processes that have a common ancestor.
• Normally, a pipe is created by a process, that process calls fork, and the
pipe is used between the parent and the child.
• A pipe is created by calling the pipe function.
#include<unistd.h>
int pipe(int filedes[2]);
Returns:0 if ok, -1 on error
• Two file descriptors are returned through the filedes argument: filedes[0] is
open for reading and filedes[1] is open for writing.
• The output of filedes[1] is the input for filedes[0].
• The left half of the figure shows the two ends of the pipe connected in a single
process. The right half of the figure emphasizes that the data in the pipe flows
through the kernel.
• A pipe in a single process is next to useless
• Normally, the process that calls pipe then calls fork, creating an IPC
channel from the parent to the child or vice versa.
• What happens after the fork depends on which direction of data flow we
want
• For a pipe from the parent to the child, the parent closes the read end of
the pipe (fd[0]), and the child closes the write end (fd[1]).
• When using a pipe for communication from the child to the
parent:
• The parent closes the writing end (fd[1])
• The child closes the reading end (fd[0]).
• When one end of a pipe is closed, the following rules apply:
• If we try to read from a pipe and the other side (the writing end) is
closed, the read operation will return 0.
• This means “end of file” (EOF): there is no more data to read.
• If we try to write to a pipe and the other side (the reading end) is
closed, the system sends a signal called SIGPIPE to the process.
• If we either ignore the signal or catch it and return from the handler, the
write function returns -1 and sets errno to EPIPE.
Program to create a pipe between a parent and its
child and to send data down the pipe
#include "apue.h"

int main(void)
{
    int n;
    int fd[2];
    pid_t pid;
    char line[MAXLINE];

    if (pipe(fd) < 0)
        err_sys("pipe error");
    if ((pid = fork()) < 0) {
        err_sys("fork error");
    } else if (pid > 0) {       /* parent */
        close(fd[0]);
        write(fd[1], "hello world\n", 12);
    } else {                    /* child */
        close(fd[1]);
        n = read(fd[0], line, MAXLINE);
        write(STDOUT_FILENO, line, n);
    }
    exit(0);
}
popen and pclose functions
• popen and pclose are functions in C that make it easy to run another
program and read its output or send input to it. They automatically
take care of pipes, child process creation and cleanup.
popen:
• popen starts a new process (like running a shell command).
• You can choose to read what the process prints or write to this process
as its input.
pclose:
• It closes the pipe and waits for the process to finish
• pclose should be called after using popen.
#include<stdio.h>
FILE *popen(const char *cmdstring, const char *type);
returns: file pointer if ok, NULL on error
int pclose(FILE *fp);
returns: termination status of cmdstring, or -1 on
error
• cmdstring is the command to run
• type is “r” (read from the process) or “w” (write to the process)
• The function popen does a fork and exec to execute the
cmdstring, and returns a standard I/O file pointer.
• If type is “r”, the file pointer is connected to the
standard output of cmdstring
result of fp=popen(cmdstring, “r”)
• If type is “w”, the file pointer is connected to the standard input of
cmdstring
result of fp=popen(cmdstring, “w”)
Coprocesses
• A Unix system filter is a program that reads from standard input
and writes to the standard output.
• Normally, filters are connected linearly in shell pipelines.
• A filter becomes a coprocess when the same program generates the
filter’s input and reads the filter’s output.
• A coprocess normally runs in the background from a shell, and its
standard input and standard output are connected to another
program using a pipe.
• The process creates two pipes: one is the standard input of the
coprocess and the other is its standard output.
Driving a coprocess by writing its standard
input and reading its standard output
Program of simple filter to add two numbers
#include "apue.h"

int main(void)
{
    int n, int1, int2;
    char line[MAXLINE];

    while ((n = read(STDIN_FILENO, line, MAXLINE)) > 0) {
        line[n] = 0;                        /* null terminate */
        if (sscanf(line, "%d%d", &int1, &int2) == 2) {
            sprintf(line, "%d\n", int1 + int2);
            n = strlen(line);
            if (write(STDOUT_FILENO, line, n) != n)
                err_sys("write error");
        } else {
            if (write(STDOUT_FILENO, "invalid args\n", 13) != 13)
                err_sys("write error");
        }
    }
    exit(0);
}
FIFOs
• FIFOs are sometimes called named pipes. Pipes can be used only
between related processes when a common ancestor has created the pipe;
a FIFO has a pathname, so unrelated processes can also use it to exchange data.
#include<sys/stat.h>
int mkfifo(const char *pathname, mode_t mode);
returns: 0 if ok, -1 on error
• mkfifo is a function used to create a FIFO (named pipe) in UNIX
systems. After creating a FIFO with mkfifo, it should be opened with the
open function.
• const char *pathname: This is a string pathname that specifies the
name(path) of the FIFO file
• mode_t mode: it sets the permission for the FIFO file(read, write or
execute)
Once we have used mkfifo to create a FIFO, we open it using open. When
we open a FIFO, the non-blocking flag (O_NONBLOCK) affects what happens:
Normal case(O_NONBLOCK not set):
• If you open the FIFO for reading, it blocks (waits) until another process
opens it for writing.
• If you open it for writing, it blocks until another process opens it for
reading.
Nonblocking case(O_NONBLOCK set):
• If you open for reading, it returns immediately even if no one opened it for
writing.
• If you open for writing, it returns immediately with an error (-1, with errno
set to ENXIO) if no one has opened it for reading.
There are two uses for FIFOs
• FIFOs are used by shell commands to pass data from one
shell pipeline to another without creating intermediate
temporary files.
• FIFOs act like meeting places where a client and a server can
safely exchange messages or data.
Example using FIFOs to duplicate output streams
• FIFOs can be used to duplicate an output stream in a series of shell
commands.
• This prevents writing the data to an intermediate disk file.
mkfifo fifo1
prog3 <fifo1 &
prog1 < infile | tee fifo1 | prog2
• We create the FIFO and then start prog3 in the background, reading from
the FIFO.
• We then start prog1 and use tee to send its input to both the FIFO and
prog2.