
Explain the architecture of the UNIX System.

Introduction to UNIX System





Unix is an Operating System that is truly the base of many operating systems like Ubuntu and Solaris, and of standards such as POSIX. It was developed in the 1970s by Ken Thompson, Dennis Ritchie, and others at the AT&T Bell Laboratories. It was originally meant for programmers developing software rather than non-programmers.
Unix and the C language were developed at AT&T and distributed to government and academic institutions, which led to both being ported to a wider variety of machine families than any other operating system. The main focus of the developers of this operating system was the kernel, which was considered to be the heart of the operating system. The system structure of Unix OS is as follows:
UNIX is a family of multitasking, multiuser computer operating systems whose development began in the late 1960s at Bell Labs. It was originally developed for minicomputers and has since been ported to
various hardware platforms. UNIX has a reputation for stability, security, and scalability, making
it a popular choice for enterprise-level computing.
The basic design philosophy of UNIX is to provide simple, powerful tools that can be combined
to perform complex tasks. It features a command-line interface that allows users to interact with
the system through a series of commands, rather than through a graphical user interface (GUI).
Some of the key features of UNIX include:
1. Multiuser support: UNIX allows multiple users to simultaneously access the same
system and share resources.
2. Multitasking: UNIX is capable of running multiple processes at the same time.
3. Shell scripting: UNIX provides a powerful scripting language that allows users to
automate tasks.
4. Security: UNIX has a robust security model that includes file permissions, user
accounts, and network security features.
5. Portability: UNIX can run on a wide variety of hardware platforms, from small
embedded systems to large mainframe computers.
6. Communication: UNIX supports communication methods using the write command,
mail command, etc.
7. Process Tracking: UNIX maintains a record of the jobs that the user creates. This
function improves system performance by monitoring CPU usage. It also allows you
to keep track of how much disk space each user uses, and use that information to
regulate disk space.
Today, UNIX is widely used in enterprise-level computing, scientific research, and web servers.
Many modern operating systems, including Linux and macOS, are based on UNIX or its
variants.
Figure – system structure
 Layer-1: Hardware: It consists of the physical hardware resources of the system.
 Layer-2: Kernel: This is the core of the Operating System. It is a software that acts
as the interface between the hardware and the software. Most of the tasks like
memory management, file management, network management, process management,
etc., are done by the kernel.
 Layer-3: Shell commands: This is the interface between the user and the kernel.
Shell is the utility that processes your requests. When you type in a command at the
terminal, the shell interprets the command and calls the program that you want. There
are various commands like cp, mv, cat, grep, id, wc, nroff, a.out and more.
 Layer-4: Application Layer: It is the outermost layer that executes the given
external applications.
Figure – kernel and its block diagram
This diagram shows three levels: user, kernel, and hardware.

 The system call and library interface represent the border between user programs and
the kernel. System calls look like ordinary function calls in C programs. Assembly
language programs may invoke system calls directly without a system call library.
The libraries are linked with the programs at compile time.
 The set of system calls is partitioned into those that interact with the file subsystem and
those that interact with the process control subsystem. The file subsystem manages
files, allocating file space, administering free space, controlling access to files, and
retrieving data for users.
 Processes interact with the file subsystem via a specific set of system calls, such as
open (to open a file for reading or writing), close, read, write, stat (query the
attributes of a file), chown (change the record of who owns the file), and chmod
(change the access permissions of a file); a small user-level example follows this list.
 The file subsystem accesses file data using a buffering mechanism that regulates data
flow between the kernel and secondary storage devices. The buffering mechanism
interacts with block I/O device drivers to initiate data transfer to and from the kernel.
 Device drivers are the kernel modules that control the operation of peripheral devices.
The file subsystem also interacts directly with “raw” I/O device drivers without the
intervention of the buffering mechanism. Finally, the hardware control is responsible
for handling interrupts and for communicating with the machine. Devices such as
disks or terminals may interrupt the CPU while a process is executing. If so, the
kernel may resume execution of the interrupted process after servicing the interrupt.
 Interrupts are not serviced by special processes but by special functions in the kernel,
called in the context of the currently running process.
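
A small user-level sketch of these file-subsystem calls (the file name and sizes are arbitrary examples, not anything fixed by the system):

#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    struct stat sb;
    char buf[64];

    int fd = open("demo.txt", O_RDWR | O_CREAT, 0644);  /* create/open for read-write */
    if (fd < 0) { perror("open"); return 1; }

    write(fd, "hello\n", 6);               /* write data through the file subsystem     */
    lseek(fd, 0, SEEK_SET);                /* rewind the offset kept in the file table  */
    ssize_t n = read(fd, buf, sizeof buf); /* read the data back                        */

    stat("demo.txt", &sb);                 /* query the file's inode attributes         */
    printf("read %zd bytes, size=%lld, mode=%o\n",
           n, (long long)sb.st_size, (unsigned)(sb.st_mode & 0777));

    chmod("demo.txt", 0600);               /* change the access permissions             */
    close(fd);
    return 0;
}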
Difference between Unix and Linux
Linux is essentially a clone of Unix, but the basic differences are shown below:

Linux: The source code of Linux is freely available to its users.
Unix: The source code of Unix is not freely available to the general public.

Linux: It has a graphical user interface along with a command line interface.
Unix: It only has a command line interface.

Linux: Linux OS is portable, flexible, and can be executed from different hard drives.
Unix: Unix OS is not portable.

Linux: Different versions of Linux OS are Ubuntu, Linux Mint, RedHat Enterprise Linux, etc.
Unix: Different versions of Unix are AIX, HP-UX, BSD, IRIX, Solaris, etc.

Linux: The file systems supported by Linux include xfs, ramfs, vfat, cramfs, ext3, ext4, ext2, ext1, ufs, autofs, devpts, ntfs.
Unix: The file systems supported by Unix include zfs, jfs, hfs, gpfs, xfs, vxfs.

Linux: Linux is an open-source operating system that was first released in 1991 by Linus Torvalds.
Unix: Unix is a proprietary operating system that was originally developed at AT&T Bell Labs in the late 1960s.

Linux: The Linux kernel is monolithic, meaning that all of its services are provided by a single kernel.
Unix: The Unix kernel is modular, meaning that it is made up of a collection of independent modules that can be loaded and unloaded dynamically.

Linux: Linux has much broader hardware support than Unix; it was designed to run on commodity hardware like PCs and servers.
Unix: Unix was originally designed to run on large, expensive mainframe computers.

Linux: The user interface of Linux is graphical or text-based.
Unix: The user interface of Unix is text-based.

Linux: The command line interfaces of Linux are Bash, Zsh, Tcsh.
Unix: The command line interfaces of Unix are Bourne, Korn, C shell, Zsh.

Advantages of UNIX:
1. Stability: UNIX is known for its stability and reliability. It can run for long periods of
time without requiring a reboot, which makes it ideal for critical systems that need to
run continuously.
2. Security: UNIX has a robust security model that includes file permissions, user
accounts, and network security features. This makes it a popular choice for systems
that require high levels of security.
3. Scalability: UNIX can be scaled up to handle large workloads and can be used on a
variety of hardware platforms.
4. Flexibility: UNIX is highly customizable and can be configured to suit a wide range
of needs. It can be used for everything from simple desktop systems to complex
server environments.
5. Command-line interface: UNIX’s command-line interface allows for powerful and
efficient interaction with the system.
Disadvantages of UNIX:
1. Complexity: UNIX can be complex and difficult to learn for users who are used to
graphical user interfaces (GUIs).
2. Cost: Some UNIX systems can be expensive, especially when compared to open-
source alternatives like Linux.
3. Lack of standardization: There are many different versions of UNIX, which can make
it difficult to ensure compatibility between different systems.
4. Limited software availability: Some specialized software may not be available for
UNIX systems.
5. Steep learning curve: UNIX requires a certain level of technical knowledge and
expertise, which can make it challenging for novice users.

Draw and explain block diagram of kernel


With a neat diagram explain block diagram of system kernel.

Figure 2.1 gives a block diagram of the kernel, showing various modules and their relationships
to each other. In particular, it shows the file subsystem on the left and the process control
subsystem on the right, the two major components of the kernel.

The diagram serves as a useful logical view of the kernel, although in practice the kernel deviates
from the model because some modules interact with the internal operations of others.

Figure 2.1 shows three levels: user, kernel, and hardware. The system call and library interface
represent the border between user programs and the kernel depicted in Figure 1.1.

System calls look like ordinary function calls in C programs, and libraries map these function
calls to the primitives needed to enter the operating system.

Assembly language programs may invoke system calls directly without a system call library,
however. Programs frequently use other libraries such as the standard I/O library to provide a
more sophisticated use of the system calls. The libraries are linked with the programs at compile
time.
The figure partitions the set of system calls into those that interact with the file subsystem and
those that interact with the process control subsystem. The file subsystem manages files,
allocating file space, administering free space, controlling access to files, and retrieving data for
users.

Processes interact with the file subsystem via a specific set of system calls, such as open (to open
a file for reading or writing), close, read, write, stat (query the attributes of a file).

The file subsystem accesses file data using a buffering mechanism that regulates data flow
between the kernel and secondary storage devices.

Device drivers are the kernel modules that control the operation of peripheral devices. Block I/O
devices are random access storage devices; alternatively, their device drivers make them appear
to be random access storage devices to the rest of the system.

The process control subsystem is responsible for process synchronization, interprocess
communication, memory management, and process scheduling.

The memory management module controls the allocation of memory.

The scheduler module allocates the CPU to processes. It schedules them to run in turn until they
voluntarily relinquish the CPU while awaiting a resource or until the kernel preempts them when
their recent run time exceeds a time quantum.
Explain with example: Building block primitives.

List the building block primitives of UNIX with examples.

Unix enables users to write small modular programs that can be used as building blocks to build more
complex programs. There are 2 primitives:

1. I/O redirection
2. Pipe(|)

I/O REDIRECTION

Most UNIX commands read input, such as a file or another attribute for the command, and write
output. By default, input is given from the keyboard, and output is displayed on your
screen. Your keyboard is your standard input (stdin) device, and the screen or a particular
terminal window is the standard output (stdout) device.

Output redirection with > and | :

Sometimes you will want to put output of a command in a file, or you may want to issue another
command on the output of one command. This is known as redirecting output. Redirection is
done using either the “>” (greater-than symbol), or using the “|” (pipe) operator which sends the
standard output of one command to another command as standard input. The file descriptor 1 is
assigned to the standard output file.

Eg: cat test1 > test2

the cat command concatenates files and puts them all together to the standard output. By
redirecting this output to a file, this file name will be created - or overwritten if it already exists,
so take care.

Input redirection

In another case, you may want a file to be the input for a command that normally wouldn't accept
a file as an option. This redirecting of input is done using the “<” (less-than symbol) operator.
The file descriptor 0 is assigned to the standard input file.

Eg: cat <test1

In this the cat will take each line of file test1 as input and display it on the monitor.

Error Redirection

When an invalid command is typed at the $ prompt, the shell displays an error message on the
monitor. Thus the monitor is referred to as the standard error file. The file descriptor 2 is assigned to the
standard error file.

PIPES

Unix has a feature by which filters and other commands can be combined in such a way that the
standard output of 1 filter command can be sent as standard input to another filter or command.

$ ls -l | wc -l

It is the command to display the total number of lines produced by ls -l, i.e. the number of entries in the current directory.

$ grep -n 'and' test

Command to display the lines (with their line numbers) where the word 'and' occurs in the file test.
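
The same plumbing is available to C programs through the pipe, fork, dup2 and exec system calls; the sketch below wires ls -l into wc -l, mirroring the shell pipeline above (error handling kept to a minimum):

#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    int fd[2];                        /* fd[0]: read end, fd[1]: write end        */
    if (pipe(fd) < 0) { perror("pipe"); exit(1); }

    if (fork() == 0) {                /* first child runs "ls -l"                 */
        dup2(fd[1], 1);               /* redirect stdout (descriptor 1) into pipe */
        close(fd[0]); close(fd[1]);
        execlp("ls", "ls", "-l", (char *)NULL);
        _exit(127);
    }
    if (fork() == 0) {                /* second child runs "wc -l"                */
        dup2(fd[0], 0);               /* redirect stdin (descriptor 0) from pipe  */
        close(fd[0]); close(fd[1]);
        execlp("wc", "wc", "-l", (char *)NULL);
        _exit(127);
    }
    close(fd[0]); close(fd[1]);       /* parent closes both ends                  */
    wait(NULL); wait(NULL);           /* wait for both children                   */
    return 0;
}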

What is a buffer? Explain the structure of the buffer header.

During system initialization, the kernel allocates space for a number of buffers, configurable
according to memory size and system performance constraints. A buffer consists of two parts: a
memory array that contains data from the disk and a buffer header that identifies the buffer.
Because there is a one to one mapping of buffer headers to data arrays, the ensuing text will
frequently refer to both parts as a "buffer," and the context should make clear which part is being
discussed.

The data in a buffer corresponds to the data in a logical disk block on a file system, and the
kernel identifies the buffer contents by examining identifier fields in the buffer header. The
buffer is the in-memory copy of the disk block; the contents of the disk block map into the
buffer, but the mapping is temporary until the kernel decides to map another disk block into the
buffer. A disk block can never map into more than one buffer at a time. If two buffers were to
contain data for one disk block, the kernel would not know which buffer contained the current
data and could write incorrect data back to disk. For example, suppose a disk block maps into
two buffers, A and B. If the kernel writes data first into buffer A and then into buffer B, the disk
block should contain the contents of buffer B if all write operations completely fill the buffer.
However, if the kernel reverses the order when it copies the buffers to disk, the disk block will
contain incorrect data.

Figure 3.1. Buffer Header

(Note: the buffer cache is a software structure that should not be confused with hardware caches that
speed memory references.)

The buffer header (Figure 3.1) contains a device number field and a block number field that
specify the file system and block number of the data on disk and uniquely identify the buffer.
The device number is the logical file system number (see Section 2.2.1), not a physical device
(disk) unit number. The buffer header also contains a pointer to a data array for the buffer, whose
size must be at least as big as the size of a disk block, and a status field that summarizes the
current status of the buffer. The status of a buffer is a combination of the following conditions:

• The buffer is currently locked (the terms "locked" and "busy" will be used interchangeably, as
will "free" and "unlocked"),

• The buffer contains valid data,

• The kernel must write the buffer contents to disk before reassigning the buffer; this condition is
known as "delayed write,"

• The kernel is currently reading or writing the contents of the buffer to disk,

• A process is currently waiting for the buffer to become free.

The buffer header also contains two sets of pointers, used by the buffer allocation algorithms to
maintain the overall structure of the buffer pool, as explained in the next section.
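
The header can be pictured as a C structure like the one below (an illustrative sketch; field names and flag values are invented here, not taken from any particular kernel):

struct buf {
    int   b_dev;               /* logical device (file system) number          */
    int   b_blkno;             /* disk block number                            */
    int   b_status;            /* combination of the status flags listed above */
    char *b_data;              /* pointer to the data array for the buffer     */

    struct buf *b_hash_next;   /* pointers that maintain the hash queue        */
    struct buf *b_hash_prev;
    struct buf *b_free_next;   /* pointers that maintain the free list         */
    struct buf *b_free_prev;
};

/* possible status flag bits */
#define B_BUSY    0x01         /* buffer is locked ("busy")                    */
#define B_VALID   0x02         /* buffer contains valid data                   */
#define B_DELWRI  0x04         /* delayed write: flush before reassigning      */
#define B_DOINGIO 0x08         /* kernel is reading/writing the buffer to disk */
#define B_WANTED  0x10         /* a process is waiting for the buffer          */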

Draw and explain data structure of file subsystem.

An Overview of the File Subsystem

Internal representation of a file is given by an inode, which contains a description of the disk
layout of the file data and other information such as the file owner, access permissions, and
access times. The term inode is a contraction of the term index node. Every file has one inode,
but it may have several names, all of which map into the inode. Each name is called a link. When
a process creates a new file, the kernel assigns it an unused inode.

Inodes are stored in the file system, but the kernel reads them into an in-core (in-memory) inode
table when manipulating files. The kernel contains two other data structures, the file table and
the user file descriptor table. The file table is a global kernel structure, but the user file
descriptor table is allocated per process. When a process opens or creates a file, the kernel
allocates an entry from each table, corresponding to the file's inode. Entries in the three
structures -- user file descriptor table, file table, and inode table -- maintain the state of the file
and the user's access to it. The file table keeps track of the byte offset in the file where the user's
next read or write will start, and the access rights allowed to the opening process. The user file
descriptor table identifies all open files for a process.
The kernel returns a file descriptor for the open and creat system calls, which is an index into the
user file descriptor table. These three tables also enable sharing of files.
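
The division of labour between the per-process user file descriptor table and the global file table can be observed from user level: two independent open calls on the same file get two file table entries with separate offsets, while dup returns a second descriptor that shares one entry and therefore one offset. A small sketch (it assumes a readable file named "data.txt" exists):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fd1 = open("data.txt", O_RDONLY);  /* first file table entry          */
    int fd2 = open("data.txt", O_RDONLY);  /* second, independent entry       */
    int fd3 = dup(fd1);                    /* shares fd1's file table entry   */

    char c;
    read(fd1, &c, 1);                      /* advances the offset shared by fd1 and fd3 */

    printf("offset via fd1: %ld\n", (long)lseek(fd1, 0, SEEK_CUR));  /* 1            */
    printf("offset via fd3: %ld\n", (long)lseek(fd3, 0, SEEK_CUR));  /* 1 (shared)   */
    printf("offset via fd2: %ld\n", (long)lseek(fd2, 0, SEEK_CUR));  /* 0 (separate) */

    close(fd1); close(fd2); close(fd3);
    return 0;
}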

On a logical level, the kernel deals with file systems rather than with disks. It treats each file
system as a logical device identified by a logical device number.

A file system consists of a sequence of logical blocks, each containing 512, 1024, 2048, or any
convenient multiple of 512 bytes. The size of a logical block is homogeneous within a file system
but may vary between different file systems in a system configuration.

In this text, the size of a "block" is assumed to be 1K, unless stated otherwise.

A file system has the following structure:

 The boot block occupies the beginning of a file system, typically the first sector, and may
contain the bootstrap code that is read into the machine to boot , or initialize, the
operating system. Although only one boot block is needed to boot the system, every file
system has a (possibly empty) boot block.
 The super block describes the state of a file system -- how large it is, how many files it
can store, where to find free space on the file system, and other information.
 The inode list is a list of inodes that follows the super block in the file system.
Administrators specify the size of the inode list when configuring a file system. The
kernel references inodes by index into the inode list. One inode is the root inode of the
file system: it is the inode by which the directory structure of the file system is accessible
after execution of the mount system call.
 The data blocks start at the end of the inode list and contain file data and administrative
data. An allocated data block can belong to one and only one file in the file system.

Processes

A process is the execution of a program and consists of a pattern of bytes that the CPU interprets
as machine instructions (called text), data, and stack. Processes communicate with other
processes and with the rest of the world via system calls.

A process on a UNIX system is the entity that is created by the fork system call. Every process
except process 0 is created when another process executes the fork system call. The process
which invoked fork system call is called the parent process and the newly created process is
called the child process. A process can have only one parent process but it can have many child
processes. The kernel identifies each process by its process number, called the process ID (PID).
Process 0 is a special process that is created "by hand" when the system boots; after forking a
child process (process 1), process 0 becomes the swapper process. Process 1, known as init is the
ancestor of every other process.

An executable file consists of the following parts:

 a set of "headers" that describe the attributes of the file


 the program text
 a machine language representation of the data that has initial values when the program
starts execution, and an indication of how much space the kernel should allocate for
uninitialized data, called bss (block started by symbol).
 other sections, such as symbol table information.

#include <stdio.h>

char buffer[2048];    /* uninitialized data: goes into bss */
int version = 1;      /* initialized data                  */

int main(void)
{
    printf("Hello, world!\n");
    return 0;
}

In the code given above, the initialized data is the variable version and the uninitialized data
(i.e bss) is the array buffer.
The kernel loads an executable file into memory during an exec system call, and the loaded
process consists of at least three parts, called regions: text, data, and stack. The text and data
regions correspond to the text and data-bss sections of the executable file, but the stack region is
automatically created and its size is dynamically adjusted by the kernel at runtime. The stack
consists of logical stack frames that are pushed when calling a function and popped when
returning; a special register called the stack pointer indicates the current stack depth. A stack
frame consists of parameters to a function, its local variables and the data necessary to recover
the previous stack frame, including the value of the program counter and stack pointer at the time
of the function call.
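
A tiny C example makes the frame contents concrete (a conceptual illustration only; the exact layout is machine dependent):

#include <stdio.h>

/* Calling square() pushes a new stack frame containing the parameter x,
 * the local variable result, and the saved program counter / stack pointer
 * needed to return to main(). The frame is popped when square() returns.
 */
static int square(int x)
{
    int result = x * x;    /* local variable lives in square()'s frame */
    return result;
}

int main(void)
{
    int n = 5;             /* local variable in main()'s frame */
    printf("%d\n", square(n));
    return 0;
}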

Because a process in the UNIX system can execute in two modes, kernel or user, it uses a
separate stack for each mode. When a system call is made, a trap instruction is executed which
causes an interrupt which makes the hardware switch to kernel mode. The kernel stack of a
process is null when the process executes in user mode.

Every process has an entry in the kernel process table, and each process is allocated a u area ("u"
stands for "user") that contains private data manipulated only by the kernel. The process table
contains (or points to) a per process region table, whose entries point to entries in a region table.
A region is a contiguous area of a process's address space, such as text, data, and stack. Region
table entries describe the attributes of the region, such as whether it contains text or data, whether
it is shared or private, and where the "data" of the region is located in memory. The extra level of
indirection (from the per process region table to the region table) allows independent processes
to share regions.

Important fields in the process table are:

 a state field
 identifiers indicating the user who owns the process (user IDs, or UIDs)
 an event descriptor set when a process is suspended (in the sleep state)

The u area contains information that needs to be accessible only when the process is executing.
Important fields in the u area are:

 a pointer to the process table slot of the currently executing process


 parameters of the current system call, return values and error codes
 file descriptors for all open files
 internal I/O parameters
 current directory and current root
 process and file size limits

The kernel internally uses a structure variable u which points to the u area of the currently
executing process. When another process executes, the kernel rearranges its virtual address space
so that u refers to the u area of the new process.
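
Put together as a C sketch (purely illustrative field names and sizes, not the declarations used by any real kernel), the two structures might look like this:

struct pregion;     /* per-process region table entry (details omitted) */
struct inode;       /* in-core inode (details omitted)                  */

struct proc {                      /* one entry per process in the process table   */
    int  p_stat;                   /* state field                                  */
    int  p_uid;                    /* user ID of the owner of the process          */
    int  p_pid;                    /* process ID                                   */
    void *p_wchan;                 /* event descriptor set while sleeping          */
    struct pregion *p_region;      /* pointer to the per process region table      */
};

struct user {                      /* u area: accessible only while the process runs */
    struct proc *u_procp;          /* pointer to this process's process table slot */
    int   u_arg[6];                /* parameters of the current system call        */
    int   u_rval, u_error;         /* return value and error code of the call      */
    int   u_ofile[20];             /* file descriptors for all open files          */
    struct inode *u_cdir;          /* current directory                            */
    struct inode *u_rdir;          /* current root                                 */
    long  u_limit;                 /* process and file size limits                 */
};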
Context of a process

Context of a process consists of the following:

 text region
 values of global variables and data structures
 values of machine registers
 values stored in its process table slot
 u area
 contents of user and kernel stacks

The text of the operating system and its global data structures are shared by all processes but do
not constitute part of the context of a process.

When the kernel decides that it should execute another process, it does a context switch, so that
the system executes in the context of the other process.

The kernel services the interrupts in the context of the interrupted process even though it may not
have caused the interrupt. Interrupts are served in kernel mode.

Explain different scenarios for retrieval of buffer.

/* fragment of algorithm getblk: scenarios 4, 3 and 2 */

    if (there are no buffers on free list)        /* scenario 4 */
    {
        sleep (event: any buffer becomes free);
        continue;        /* back to while loop */
    }
    remove buffer from free list;
    if (buffer marked for delayed write)          /* scenario 3 */
    {
        asynchronous write buffer to disk;
        continue;        /* back to while loop */
    }
    /* scenario 2 -- found a free buffer */
    remove buffer from old hash queue;
    put buffer onto new hash queue;
    return buffer;

Figure 3.4. Algorithm for Buffer Allocation


Figure 3.5. Scenario 1 in Finding a Buffer: Buffer on Hash Queue

/* Algorithm: brelse
 * Input: locked buffer
 * Output: none
 */
{
    wakeup all procs (event: waiting for any buffer to become free);
    wakeup all procs (event: waiting for this buffer to become free);
    raise processor execution level to block interrupts;
    if (buffer contents valid and buffer not old)
        enqueue buffer at end of free list;
    else
        enqueue buffer at beginning of free list;
    lower processor execution level to allow interrupts;
    unlock (buffer);
}

Figure 3.6. Algorithm for Releasing a Buffer

Before continuing to the other scenarios, let us consider what happens to a buffer after it is
allocated. The kernel may read data from the disk to the buffer and manipulate it or write data to
the buffer and possibly to the disk. The kernel leaves the buffer marked busy; no other process
can access it and change its contents while it is busy, thus preserving the integrity of the data in
the buffer. When the kernel finishes using the buffer, it releases the buffer according to algorithm
brelse (Figure 3.6). It wakes up processes that had fallen asleep because the buffer was busy and
processes that had fallen asleep because no buffers remained on the free list. In both cases,
release of a buffer means that the buffer is available for use by the sleeping processes, although
the first process that gets the buffer locks it and prevents the other processes from getting it
(recall Section 2.2.2.4). The kernel places the buffer at the end of the free list, unless an I/O error
occurred or unless it specifically marked the buffer "old," as will be seen later in this chapter; in
the latter cases, it places the buffer at the beginning of the free list. The buffer is now free for
another process to claim it.

Just as the kernel invokes algorithm brelse when a process has no more need for a buffer, it also
invokes the algorithm when handling a disk interrupt to release buffers used for asynchronous
I/O to and from the disk, as will be seen in Section 3.4. The kernel raises the processor execution
level to prevent disk interrupts while manipulating the free list, thereby preventing corruption of
the buffer pointers that could result from a nested call to brelse. Similar bad effects could happen
if an interrupt handler invoked brelse while a process was executing getblk, so the kernel raises
the processor execution level at strategic places in getblk, too. The exercises explore these cases
in greater detail.

In the second scenario in algorithm getblk, the kernel searches the hash queue that should contain
the block but fails to find it there. Since the block cannot be on another hash queue because it
cannot "hash" elsewhere, it is not in the buffer cache. So the kernel removes the first buffer from
the free list instead; that buffer had been allocated to another disk block and is also on a hash
queue. If the buffer has not been marked for a delayed write (as will be described later), the
kernel marks the buffer busy, removes it from the hash queue where it currently resides,
reassigns the buffer header's device and block number to that of the disk block for which the
process is searching, and places the buffer on the correct hash queue. The kernel uses the buffer
but has no record that the buffer formerly contained data for another disk block. A process
searching for the old disk block will not find it in the pool and will have to allocate a new buffer
for it from the free list, exactly as outlined here. When the kernel finishes with the buffer, it
releases it as described above. In Figure 3.7, for example, the kernel searches for block 18 but
does not find it on the hash queue marked "blkno 2 mod 4." It therefore removes the first buffer
from the free list (block 3), assigns it to block 18, and places it on the appropriate hash queue.

Figure 3.7. Second Scenario for Buffer Allocation

In the third scenario in algorithm getblk, the kernel also has to allocate a buffer from the free list.
However, it discovers that the buffer it removes from the free list has been marked for "delayed
write," so it must write the contents of the buffer to disk before using the buffer. The kernel starts
an asynchronous write to disk and tries to allocate another buffer from the free list. When the
asynchronous write completes, the kernel releases the buffer and places it at the head of the free
list. The buffer had started at the end of the free list and had traveled to the head of the free list.
If, after the asynchronous write, the kernel were to place the buffer at the end of the free list, the
buffer would get a free trip through the free list, working against the least recently used
algorithm. For example, in Figure 3.8, the kernel cannot find block 18, but when it attempts to
allocate the first two buffers (one at a time) on the free list, it finds them marked for delayed
write. The kernel removes them from the free list, starts write operations to disk for the blocks,
and allocates the third buffer on the free list, block 4. It reassigns the buffer's device and block
number fields appropriately and places the buffer, now marked block 18, on its new hash queue.

In the fourth scenario (Figure 3.9), the kernel, acting for process A, cannot find the disk block on
its hash queue, so it attempts to allocate a new buffer from the free list, as in the second scenario.
However, no buffers are available on the free list, so process A goes to sleep until another
process executes algorithm brelse, freeing a buffer. When the kernel schedules process A, it must
search the hash queue again for the block. It cannot allocate a buffer immediately from the free
list, because it is possible that several processes were waiting for a free buffer and that one of
them allocated a newly freed buffer for the target block sought by process A. Thus, searching for
the block again insures that only one buffer contains the disk block. Figure 3.10 depicts the
contention between two processes for a free buffer.

The final scenario (Figure 3.11) is complicated, because it involves complex relationships
between several processes. Suppose the kernel, acting for process A, searches for a disk block
and allocates a buffer but goes to sleep before freeing the buffer.

Figure 3.8. Third Scenario for Buffer Allocation

Write and explain algorithms: 1. Buffer allocation (getblk), 2. Releasing a buffer (brelse), 3.
Reading disk block (bread), 4. Block read ahead (breada)
1. Buffer Allocation (getblk):
 Purpose: The getblk algorithm is used to allocate a buffer for a disk block in the
buffer cache.
 Algorithm:
1. Check if the requested block is already in the buffer cache. If so, return the
corresponding buffer.
2. If the requested block is not in the buffer cache, search for a free buffer.
3. If a free buffer is found, allocate it and mark it as busy.
4. If no free buffer is available, use a buffer replacement algorithm (e.g.,
LRU) to select a victim buffer for replacement.
5. Write back the contents of the victim buffer if it is dirty.
6. Re-initialize the victim buffer with the new block number and mark it as
busy.
7. Return the allocated buffer.
2. Releasing a Buffer (brelse):
 Purpose: The brelse algorithm is used to release a buffer after its use.
 Algorithm:
1. If the buffer is marked as busy, wait until it becomes free.
2. Mark the buffer as not busy.
3. If the buffer is marked for delayed write, initiate the write operation to
disk asynchronously.
4. If the buffer is marked for synchronous write, wait for the write operation
to complete.
5. Clear the buffer's dirty flag.
6. Release the buffer for reuse.
3. Reading Disk Block (bread):
 Purpose: The bread algorithm is used to read a disk block into a buffer.
 Algorithm:
1. Allocate a buffer using the getblk algorithm for the specified disk block.
2. Check if the buffer contains the required data. If so, return the buffer.
3. If the buffer does not contain the required data, read the block from disk
into the buffer.
4. Update the buffer's metadata to reflect the new block number and mark it
as not dirty.
5. Return the buffer.
4. Block Read Ahead (breada):
 Purpose: The breada algorithm is used to read multiple disk blocks ahead of time
to improve I/O performance.
 Algorithm:
1. Determine the number of blocks to read ahead (e.g., based on the disk
access pattern or heuristic).
2. For each block to be read ahead: a. Allocate a buffer using the getblk
algorithm for the specified disk block. b. Check if the buffer contains the
required data. If so, skip to the next block. c. If the buffer does not contain
the required data, read the block from disk into the buffer. d. Update the
buffer's metadata to reflect the new block number and mark it as not dirty.
3. Return the buffers containing the read-ahead blocks.

These algorithms are essential components of the buffer cache management in the UNIX kernel,
responsible for efficiently handling disk I/O operations and improving overall system
performance.
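
As a rough illustration of how the hash queues and the free list cooperate, the following self-contained user-space sketch models scenarios 1 and 2 of getblk together with brelse. It is a toy model, not kernel code: locking, sleeping, delayed writes and actual disk I/O are all omitted, and all names and sizes are invented for this example.

#include <stdio.h>

#define NBUF  4
#define NHASH 4

struct buf {
    int blkno;                  /* disk block held by this buffer (-1 = none) */
    int busy;                   /* "locked"/busy flag                         */
    struct buf *hnext;          /* next buffer on the same hash queue         */
    struct buf *fnext, *fprev;  /* links on the doubly linked free list       */
};

static struct buf pool[NBUF];
static struct buf *hashq[NHASH];   /* hash queue headers (blkno mod NHASH) */
static struct buf freehead;        /* sentinel node of the free list       */

static void free_insert_tail(struct buf *b)
{
    b->fprev = freehead.fprev; b->fnext = &freehead;
    freehead.fprev->fnext = b; freehead.fprev = b;
}

static void free_remove(struct buf *b)
{
    b->fprev->fnext = b->fnext; b->fnext->fprev = b->fprev;
    b->fnext = b->fprev = NULL;
}

static void hash_remove(struct buf *b)
{
    if (b->blkno < 0) return;
    struct buf **pp = &hashq[b->blkno % NHASH];
    while (*pp && *pp != b) pp = &(*pp)->hnext;
    if (*pp) *pp = b->hnext;
}

static struct buf *getblk(int blkno)
{
    for (struct buf *p = hashq[blkno % NHASH]; p; p = p->hnext)
        if (p->blkno == blkno) {                /* scenario 1: block cached   */
            if (p->fnext) free_remove(p);       /* take it off the free list  */
            p->busy = 1;
            printf("getblk(%d): hit\n", blkno);
            return p;
        }
    if (freehead.fnext == &freehead)            /* scenario 4 would sleep     */
        return NULL;
    struct buf *b = freehead.fnext;             /* scenario 2: oldest free buf */
    free_remove(b);
    hash_remove(b);                             /* leave the old hash queue   */
    b->blkno = blkno;
    b->hnext = hashq[blkno % NHASH];            /* enter the new hash queue   */
    hashq[blkno % NHASH] = b;
    b->busy = 1;
    printf("getblk(%d): miss, buffer reassigned\n", blkno);
    return b;
}

static void brelse(struct buf *b)
{
    b->busy = 0;
    free_insert_tail(b);        /* back at the end of the free list (LRU order) */
}

int main(void)
{
    freehead.fnext = freehead.fprev = &freehead;
    for (int i = 0; i < NBUF; i++) { pool[i].blkno = -1; free_insert_tail(&pool[i]); }

    struct buf *b = getblk(18); /* miss: a free buffer is reassigned to block 18 */
    brelse(b);
    b = getblk(18);             /* hit: found on its hash queue                  */
    brelse(b);
    getblk(22);                 /* miss: 22 hashes to the same queue as 18       */
    return 0;
}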

Explain advantages and disadvantages of buffer cache.


Buffer cache is a mechanism used by the operating system to cache frequently accessed data in
memory for faster access. The buffer cache acts as a middle layer between the application and
the disk, buffering read and write operations to and from the disk.

Advantages of Buffer Cache:

1. Improved Performance: The buffer cache improves system performance by reducing the
number of disk accesses required for read and write operations. This reduces the overall disk
access time and improves the system’s response time.

2. Reduced I/O Operations: The buffer cache reduces the number of I/O operations required to
access frequently accessed data. This reduces the load on the disk and improves its lifespan.

3. Caching Mechanism: The buffer cache uses a caching mechanism that stores frequently
accessed data in memory. This reduces the need to read data from the disk, which can be slow
and time-consuming.

4. Sharing Data Between Processes: The buffer cache can be shared between multiple
processes, reducing the need for each process to have its own copy of frequently accessed data.

Disadvantages of Buffer Cache:

1. Memory Usage: The buffer cache requires a significant amount of memory to store the
cached data. This can reduce the amount of available memory for other processes and
applications.

2. Cache Coherency Issues: The buffer cache can cause cache coherency issues, where multiple
caches containing the same data become inconsistent. This can lead to data corruption and other
issues.

3. Stale Data: The buffer cache can contain stale data that has not been updated on the disk. This
can cause issues if the data is accessed and modified by another process.

4. Cache Thrashing: The buffer cache can suffer from cache thrashing, where the cache is
continually filled and flushed with data. This can reduce the effectiveness of the cache and cause
performance issues.

Ch 2 Internal Representation of Files


What is inode? Types of inode? Summarise the fields from disk inode.
In UNIX-based operating systems each file is indexed by an inode. Inodes are special disk blocks;
they are created when the file system is created. The number of inodes limits the total number
of files/directories that can be stored in the file system.

An Inode is a data structure in UNIX operating system that contains important information
pertaining to files within a file system. When a file system is created in UNIX, a set amount of
inodes is created as well. Usually, about 1 percent of the file system disk space is allocated to
the inode table.
The Inode contains the following information:

File name (14 bytes)    Inode number (2 bytes)
File name 1             i-node 1
File name 2             i-node 2
Directory name 1        i-node 3

1. Numeric UID of the owner.
2. Numeric GID of the owner's group.
3. Size of the file.
4. File type: regular, directory, device, etc.
5. Date and time of last modification of the file data.
6. Date and time of last access of the file data.
7. Date and time of last change of the inode.
 Administrative information (permissions, timestamps, etc).
 A number of direct blocks (typically 12) that contain the addresses of the first 12 blocks of the file.
 A single indirect pointer that points to a disk block which in turn is used as an index
block, if the file is too big to be indexed entirely by the direct blocks.
 A double indirect pointer that points to a disk block which is a collection of pointers
to disk blocks which are index blocks, used if the file is too big to be indexed by the
direct and single indirect blocks.
 A triple indirect pointer that points to an index block of index blocks of index
blocks.
Inode Total Size:

 Number of disk block addresses that can be stored in one disk block = (disk block size /
size of a disk block address).
 Small files need only direct blocks, so there is little waste in space or extra disk
reads in those cases. Medium-sized files may use indirect blocks. Only large files
use double or triple indirect blocks, which is reasonable since those files are large
anyway. The disk is now broken into two different types of blocks: Inode and Data
Blocks.
 There must be some way to determine where the Inodes are, and to keep track of
free Inodes and disk blocks. This is done by a Superblock. Superblock is located
at a fixed position in the file system. The Superblock is usually replicated on the
disk to avoid catastrophic failure in case of corruption of the main Superblock.
 Index allocation schemes suffer from some of the same performance problems as
linked allocation does. For example, the index blocks can be cached in memory, but
the data blocks may be spread all over a partition.
Inode Structure :

Explain the following algorithms- 1. namei (convert pathname to inode),


2. iget, 3. iput, 4. Bmap
1. namei (convert pathname to inode):
 Purpose: This algorithm is used to translate a pathname (such as
"/home/user/file.txt") into an inode.
 Description: The namei algorithm typically traverses the directory hierarchy
starting from the root directory ("/"). It parses each component of the pathname
(directory names and file name) and searches for corresponding directory entries
within each directory's inode. Once the algorithm successfully resolves the entire
pathname, it returns the inode corresponding to the file or directory specified in
the pathname.
2. iget (get an inode):
 Purpose: This algorithm is used to obtain an inode from the inode table given its
inode number.
 Description: When a file or directory is accessed, the file system needs to load its
inode into memory to perform operations on it efficiently. The iget algorithm
searches the inode table maintained by the file system for the specified inode
number. If the inode is not already in memory, it is read from disk into memory.
The algorithm returns a reference to the in-memory inode.
3. iput (put back an inode):
 Purpose: This algorithm is used to release an inode that is no longer needed,
allowing it to be reused or written back to disk.
 Description: When an inode is no longer actively used by a process, the iput
algorithm is called to release it. If the inode is marked as dirty (i.e., its contents
have been modified), iput writes the changes back to disk before releasing the
inode. It then marks the inode as available for reuse. This helps in efficiently
managing system resources and ensuring data consistency.
4. Bmap (block mapping):
 Purpose: This algorithm is used to map logical block numbers to physical disk
block numbers.
 Description: File systems typically organize data into blocks, and each file is
composed of multiple blocks. Bmap is used to translate logical block numbers
(block numbers within a file) into physical block numbers on the disk. It involves
traversing the file system's block allocation structures, such as block allocation
bitmaps or block pointers within inodes, to determine the physical location of the
data blocks associated with the file. Bmap is essential for reading and writing data
to disk efficiently while maintaining file system integrity.

Explain fields of In core inode.


In the context of operating systems and file systems, an "inode" (index node) is a data structure
used to represent a file or a directory. The in-core inode refers to the inode that resides in the
main memory (RAM) of the computer system while the file or directory is being accessed or
modified.

The fields of an in-core inode can vary depending on the specific file system implementation and
the features it supports. However, typically, an in-core inode contains metadata about the file or
directory it represents. Here are some common fields found in an in-core inode:

1. File Type: Indicates whether the inode represents a regular file, a directory, a symbolic
link, a device file, or some other type of file.
2. File Permissions: Stores information about who can read, write, or execute the file,
typically represented as a set of permission bits.
3. Owner and Group: Stores the user identifier (UID) and group identifier (GID) of the
file's owner and group.
4. File Size: Records the size of the file in bytes.
5. Pointers to Data Blocks: Contains pointers or references to the data blocks that store the
actual content of the file. For large files, these pointers may point to indirect blocks,
double indirect blocks, or even triple indirect blocks to accommodate the file's size.
6. Timestamps: Keeps track of various timestamps associated with the file, such as the time
of creation, last access, and last modification.
7. Link Count: Tracks the number of hard links pointing to the inode. When this count
drops to zero, the file system knows that the file or directory can be safely removed.
8. File System Identifier and Inode Number: Identifies the file system to which the inode
belongs and its unique inode number within that file system.
9. File Attributes: Stores additional attributes or metadata associated with the file, such as
extended file attributes or Access Control Lists (ACLs).

In-core inode refers to the in-memory representation of an inode, which is a data structure used
by file systems to store metadata about files and directories. The in-core inode contains various
fields that hold information about the associated file or directory. Below are some common fields
typically found in an in-core inode:

1. File Mode: This field specifies the permissions and type of the file. It includes
information about whether the inode represents a regular file, directory, symbolic link,
device file, or other types. It also includes permission bits indicating read, write, and
execute permissions for the owner, group, and others.
2. Owner and Group IDs: These fields store the numeric identifiers (UID and GID) of the
file's owner and group.
3. File Size: This field indicates the size of the file in bytes.
4. Timestamps: In-core inodes typically store several timestamps indicating important
times related to the file, such as the time of creation (ctime), last access (atime), and last
modification (mtime). These timestamps help track changes to the file over time.
5. Link Count: This field keeps track of the number of hard links pointing to the inode.
When the link count drops to zero, it indicates that the file or directory can be safely
removed.
6. Pointers to Data Blocks: Inodes contain pointers or references to the data blocks that
hold the actual contents of the file. Depending on the file system, these pointers may
include direct pointers, indirect pointers, double indirect pointers, and triple indirect
pointers to accommodate files of varying sizes.
7. File System Identifier and Inode Number: These fields uniquely identify the file
system to which the inode belongs and the inode number assigned to it within that file
system.
8. Extended Attributes: Some file systems support extended attributes that allow
additional metadata to be associated with files or directories. These attributes may
include things like file capabilities, security information, or user-defined metadata.
9. File System-specific Information: Depending on the file system implementation, there
may be additional fields in the in-core inode that hold file system-specific information or
optimizations.

These are some of the key fields typically found in an in-core inode; the exact structure and
fields vary between file system implementations.
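
For illustration, the fields discussed above can be collected into one C structure; the sketch below uses made-up names and sizes, not the declaration of any particular kernel:

/* Illustrative in-core inode; real file systems differ in names and detail. */
struct in_core_inode {
    /* copy of the disk inode */
    unsigned short i_mode;                    /* file type and permission bits               */
    unsigned short i_uid, i_gid;              /* owner and group IDs                         */
    unsigned long  i_size;                    /* file size in bytes                          */
    long           i_atime, i_mtime, i_ctime; /* access / modification / inode-change times  */
    short          i_nlink;                   /* link count (number of hard links)           */
    unsigned long  i_addr[13];                /* direct, single, double, triple indirect blocks */

    /* fields that exist only while the inode is in memory */
    int i_status;                             /* locked, waited-on, dirty, mount point...    */
    int i_dev;                                /* logical device of the file system           */
    int i_number;                             /* inode number (index into the inode list)    */
    int i_count;                              /* reference count of active uses              */
    struct in_core_inode *i_hash_next, *i_hash_prev;   /* hash queue links                   */
    struct in_core_inode *i_free_next, *i_free_prev;   /* free list links                    */
};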

Explain the structure of the Regular file. (Draw dig and explain)
Internal Representation of Files

Every file in a UNIX system has a unique inode. Processes interact with files using well defined
system calls. The user specifies a file with a character string which is the file's path name, and then the
system gets the inode which is mapped to the file corresponding to that path.

The algorithms described below are above the layer of buffer cache. Diagrammatically, it can be
shown like this:

Inodes

Inodes exist in a static form on the disk. The kernel reads them into in-core inodes and modifies
them.

Disk inodes consists of the following fields:

 Owner information: ownership is divided into a user and a group of users. Root user has
access to all the files.
 File type: it states whether a file is a normal file, a directory, a block or character special
file, or a device file.
 File access permissions: there are 3 types of access permissions: owner, group and others.
There are separate permissions for reading, writing and executing. Since execute
permission is not applicable to a directory, execute permission for a directory gives the
right to search inside the directory.
 Access times: the times at which the file was last accessed and last modified, and the time
at which the inode was last modified.
 Number of links: number of places from which the file is being referred.
 Array of disk blocks: even if the users get a logically sequential representation of data in
files, the actual data is scattered across the disk. This array keeps the addresses of the disk
blocks on which the data is scattered.
 File size: addressing of file data begins at offset 0 relative to the start of the file, and the
size of the file is the maximum offset of the file + 1. For example, if a
user creates a file and writes a byte at offset 999, the size of the file is 1000.

The inode does not specify the pathname/pathnames that access the file.

The in-core inodes contain the following fields in additional to the fields of the disk inode:

 Status of the inode


i. Locked.
ii. A process is (or many processes are) waiting for it to be unlocked.
iii. The data in the inode differs from the disk inode due to change in the inode data.
iv. The data in the inode differs from the disk inode due to change in the file data.
v. The file is a mount point (discussed later).
 The device number of the logical device of the file system on which the file resides.
 Inode number: the disk inodes are placed in an array. So the number of the inode is
nothing but the index of the inode in the array. That is why the disk copy does not need to
store the inode number.
 Pointers to other inodes: just like the buffer cache, the in-core inodes are nothing but a cache for
disk inodes (but with some extra information which is deterministic). In-core inodes also
have hash queues and a free list and the lists behave in a very similar way to the buffer
lists. The inode has next and previous pointers to the inodes in the hash queue and free
lists. The hash queues have a hash function based on the device number and inode
number.
 Reference count: it gives the number of instances of files that are active currently.

The most striking difference between an in-core inode and a buffer header is the reference count.
The reference count increases when a process allocates that inode, for example, by opening a
file. An inode is on the free list if and only if the reference count of the inode is 0. If it is greater,
some process is still accessing the inode. This means that an inode can be on a hash queue and
not on the free list even if it is not locked. That is not the case with a buffer: a buffer will always
be on the free list if it is not locked.

Accessing Inodes

The algorithm iget allocates an in-core copy of an inode. If the inode is not found on a hash
queue, it allocates an inode from the free list and reads the disk copy into the in-core inode. It
already knows the inode number and device number. It calculates the logical block number on
which the disk inode resides according to how many inodes fit into one disk block. The formula
for calculating the logical block number is:

block number = ((inode number - 1) / number of inodes per block) + start block of inode list
where the division operation returns the integer part of the quotient.

To find the byte offset of the inode in that block, this following formula is used:

byte offset = ((inode number - 1) % number of inodes per block) * size of disk inode
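
For instance, a short C calculation (assuming, for this example only, 1K blocks, 64-byte disk inodes -- i.e. 16 inodes per block -- and an inode list starting at block 2) locates inode number 41 in block 4 at byte offset 512:

#include <stdio.h>

int main(void)
{
    /* assumptions for this example only */
    const int block_size       = 1024;   /* 1K logical blocks             */
    const int disk_inode_size  = 64;     /* size of one disk inode        */
    const int inode_list_start = 2;      /* first block of the inode list */
    const int inodes_per_block = block_size / disk_inode_size;   /* 16    */

    int inumber = 41;                    /* inode we want to locate       */

    int blkno  = (inumber - 1) / inodes_per_block + inode_list_start;
    int offset = ((inumber - 1) % inodes_per_block) * disk_inode_size;

    printf("inode %d: block %d, byte offset %d\n", inumber, blkno, offset);
    /* prints: inode 41: block 4, byte offset 512 */
    return 0;
}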
The algorithm iget is given below:

/* Algorithm: iget
* Input: file system inode number
* Output: locked inode
*/

{
while (not done)
{
if (inode in inode cache)
{
if (inode locked)
{
sleep (event: inode becomes unlocked);
continue;
}
// special processing for mount points, covered later
if (inode on free list)
remove inode from free list;
increment reference count of the inode;
return inode;
}
// inode not in the cache
if (free list is empty)
return error;
remove inode from free list;
reset inode number and file system;
remove inode from old hash queue and place it on the new hash queue;
read inode from disk (algorithm: bread);
initialize inode;
return inode;
}
}
The kernel manipulates inode lock and reference count independently. It locks an inode
when it is being accessed or modified. An inode is never locked across system calls. But the
reference count remains set across system calls. The kernel can lock and unlock an inode
independent of the value of the reference count.

The algorithm iget is very similar to getblk but there is one big difference. In getblk, a process
sleeps if the free list is empty, but in iget, an error is returned. The reason behind this difference is
that processes have control over inodes at user level with the open and close system calls. If a
process opens a file and never closes it, the inode will have a reference count of at least 1 and it
will never be on the free list unless the process closes the file. This is not the case with buffers:
processes don't have user level control over buffers. Therefore, buffers are guaranteed to become
free eventually, but inodes are not.

Releasing Inodes

The algorithm iput is used to release an inode:

/* Algorithm: iput
* Input: in-core inode
* Output: none
*/
{
lock inode if not already locked;
decrement inode reference count;
if (reference count == 0)
{
if (inode link count == 0)
{
free disk blocks for file (algorithm: free); // free is described later
set file type to 0;
free inode (algorithm: ifree); // ifree is described later
}
if (file accessed or inode changed or file changed)
update disk inode;
put inode on free list;
}
release inode lock;
}

Structure of a Regular File


In UNIX, the data in files is not stored sequentially on disk. If it was to be stored sequentially,
the file size would not be flexible without large fragmentation. In case of sequential storage, the
inode would only need to store the starting address and size. Instead, the inode stores the disk
block numbers on which the data is present. But for such strategy, if a file had data across 1000
blocks, the inode would need to store the numbers of 1000 blocks and the size of the inode
would differ according to the size of the file.

To be able to have a constant inode size and yet allow large files, indirect addressing is used. The
inode has an array of 13 entries for storing block numbers, although the number of entries in the
array is independent of the storage strategy. The first 10 members of the array are "direct
addresses", meaning that they store the block numbers of actual data. The 11th member is "single
indirect", it stores the block number of the block which has "direct addresses". The 12th member
is "double indirect", it stores block number of a "single indirect" block. And the 13th member is
"triple indirect", it stores block number of a "double indirect" block. This strategy can be
extended to "quadruple" or "quintuple" indirect addressing.

If a logical block on the file system holds 1K bytes and that a block number is addressable by a
32 bit integer, then a block can hold up to 256 block numbers. The maximum file size with 13
member data array is:

10 direct blocks with 1K bytes each = 10K bytes


1 indirect block with 256 direct blocks = 256K bytes
1 double indirect block with 256 indirect blocks = 64M bytes
1 triple indirect block with 256 double indirect blocks = 16G bytes
But since the file size field in the inode is 32 bits, the size of a file is effectively limited to 4 gigabytes (2^32 bytes).
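
The same arithmetic can be checked with a few lines of C (assuming 1K blocks and 4-byte block numbers, i.e. 256 addresses per block):

#include <stdio.h>

int main(void)
{
    const long long block = 1024;        /* 1K logical block                 */
    const long long nptr  = block / 4;   /* 256 block numbers per block      */

    long long direct = 10 * block;                   /* 10K bytes            */
    long long single = nptr * block;                 /* 256K bytes           */
    long long dbl    = nptr * nptr * block;          /* 64M bytes            */
    long long triple = nptr * nptr * nptr * block;   /* 16G bytes            */

    printf("maximum file size = %lld bytes\n", direct + single + dbl + triple);
    return 0;
}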

The algorithm bmap is used to convert logical byte offset of a file to a physical disk block:

/* Algorithm: bmap
* Input: inode
* byte offset
* Output: block number in file system
* byte offset into block
* bytes of I/O in block
* read ahead block number
*/
{
calculate logical block number in file from byte offset;
calculate start byte in block for I/O; // output 2
calculate number of bytes to copy to user; // output 3
check if read-ahead applicable, mark inode; // output 4
determine level of indirection;
while (not at necessary level of indirection)
{
calculate index into inode or indirect block from logical block number in file;
get disk block number from inode or indirect block;
release buffer from previous disk read, if any (algorithm: brelse);
if (no more levels of indirection)
return (block number);
read indirect disk block (algorithm: bread);
adjust logical block number in file according to level of indirection;
}
}
Some examples of conversions (refer to the figure):
1. To access byte offset 9000: The first 10 direct blocks contain 10K bytes, so offset 9000
lies in the direct blocks. 9000 / 1024 = 8, so it is in the 8th block (starting from 0). And
9000 % 1024 = 808, so the byte offset into the 8th block is 808 bytes (starting from 0).
(Block 367 in the figure.)
2. To access byte offset 350000: The first 10 direct blocks contain 10K bytes (350000 - 10240 =
339760). A single indirect block contains 256K bytes (339760 - (256 * 1024) = 77616),
so a double indirect block (block 9156 in the figure) must be used. Every single indirect
block in the double indirect block covers 256K bytes, so the data must be in the 0th single
indirect block (block 331 in the figure). Each entry of a single indirect block addresses
1K bytes (77616 / 1024 = 75), so the data must be in the 75th (starting from 0) direct
block entry (block number 3333 in the figure). And the byte offset into that block will be
77616 % 1024 = 816.

Directories

Directory files have entries of sub directories and files that reside inside them. Directory files
have the mapping of a file name and its inode number. One directory entry takes 16 bytes. 14
bytes are given for the name of the file and 2 bytes for inode number. For example:
The entry . has the inode number of the directory file itself, and .. has the inode number of
the parent directory. For the root directory, . and .. are nothing but the inode number of the root
directory itself. Entries that have the inode number 0 are empty (i.e. deleted files).
The kernel has exclusive rights to write to a directory. The access permissions for directories
mean different things: read permission allows reading the contents of the directory, write
permission allows creating files and directories in that directory, and execute
permission gives the right to search in that directory.
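A 16-byte directory entry of this kind can be pictured as a C structure. This is only a sketch of the classic layout described above; the type and field names are illustrative:

#define DIRSIZ 14                   /* bytes reserved for the file name                  */

struct direct {
    unsigned short d_ino;           /* inode number (2 bytes); 0 means the entry is free */
    char           d_name[DIRSIZ];  /* file name, not necessarily NUL-terminated         */
};                                  /* 2 + 14 = 16 bytes per entry                       */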

Conversion of Path Name to an Inode

Algorithm namei (name to inode) is used for converting a path to an inode number. The kernel
parses the path by accessing each inode in the path and finally returning the inode of the required
file. Every process has a current directory. The current directory of process 0 is the root
directory. For every other process, it is the current directory of its parent process. Later the
process can change the current directory with the system call chdir. The inode of the current
directory is stored in the u-area. Searches start from the current directory unless the
path begins with the / component, indicating that the search should start from the root directory.
The inode number of the root is available as a global variable. If the current root is changed by
using chroot (more on this later), the changed root inode number is stored in the u-area.

Algorithm namei is given below:

/* Algorithm: namei
* Input: pathname
* Output: locked inode
*/

{
    if (path name starts from root)
        working inode = root inode (algorithm: iget);
    else
        working inode = current directory inode (algorithm: iget);

    while (there is more path name)
    {
        read next path name component from input;
        verify that working inode is of a directory and access permissions are OK;
        if (working inode is of root and component is "..")
            continue;
        read directory (working inode) by repeated use of algorithms: bmap, bread, brelse;
        if (component matches an entry in the directory (working inode))
        {
            get inode number for matched component;
            release working inode (algorithm: iput);
            working inode = inode of matched component (algorithm: iget);
        }
        else
            return (no inode);    // component not in the directory
    }
    return (working inode);
}
Here, working inode is used for the intermediate inodes in the path.

Superblock

The contents of the super block are:

 size of the file system.


 number of free blocks in the file system.
 list of free blocks in the file system.
 pointer to the next free block in the free blocks list
 size of the inodes list.
 number of free inodes in the file system.
 list of free inodes in the file system.
 pointer to the next free inode in the free inodes list.
 lock fields for the free blocks and free inodes list.
 a field indicating whether the super block has changed.

The kernel periodically writes the super block to disk if it has been modified, so that it stays
consistent with the data on the disk.
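The fields listed above can be pictured as a C structure. The following is only a sketch modeled on that list; the names and the array capacities NICFREE and NICINOD are illustrative, not taken from a particular implementation:

#define NICFREE 50      /* assumed capacity of the cached free block list */
#define NICINOD 100     /* assumed capacity of the cached free inode list */

struct superblock {
    unsigned long s_fsize;           /* size of the file system (in blocks)     */
    unsigned long s_nfree;           /* number of free blocks in s_free         */
    unsigned long s_free[NICFREE];   /* cached list of free block numbers       */
    unsigned long s_freeidx;         /* index of the next free block in s_free  */
    unsigned long s_isize;           /* size of the inode list                  */
    unsigned long s_ninode;          /* number of free inodes in s_inode        */
    unsigned long s_inode[NICINOD];  /* cached list of free inode numbers       */
    unsigned long s_inodeidx;        /* index of the next free inode in s_inode */
    char          s_flock;           /* lock field for the free block list      */
    char          s_ilock;           /* lock field for the free inode list      */
    char          s_fmod;            /* set when the super block is modified    */
};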

Inode Assignment to a New File

Algorithm ialloc is used to assign an inode to a newly created file:

/* Algorithm: ialloc
* Input: file system
* Output: locked inode
*/

{
while (not done)
{
if (super block locked)
{
sleep (event: super block becomes free);
continue;
}
if (inode list in super block is empty)
{
lock super block;
get remembered inode for free inode search;
search disk for free inodes until super block full, or no more free
inodes (algorithm: bread and brelse);
unlock super block;
wake up (event: super block becomes free);
if (no free inodes found on disk)
return (no inode);
set remembered inode for next free inode search;
}
// there are inodes in super block inode list
get inode number from super block inode list;
get inode (algorithm: iget);
if (inode not free after all)
{
write inode to disk;
release inode (algorithm: iput);
continue;
}
// inode is free
initialize inode;
write inode to disk;
decrement file system free inode count;
return inode;
}
}
If the list of inodes numbers in the super block is not empty, the kernel assigns the next inode
number, allocates a free in-core inode for the newly assigned disk inode using algorithm iget,
copies the disk inode to the in-core copy, initializes the fields in the inode and returns the locked
inode. It updates the inode on disk to indicate that the inode is in use.

If the super block list of free inodes is empty, the kernel searches the disk and places as many
free inode numbers as possible into the super block. The kernel reads the inode list on disk, block
by block, and fills the super block list of inode numbers to capacity, remembering the highest-
numbered inode that it finds ("remembered" inode). It is the last one saved in the super block.
The next time the kernel searches the disk for free inodes, it uses the remembered inode as its
starting point, thereby assuring that it wastes no time reading disk blocks where no free inodes
should exist.

An example:
Super block lists are maintained such that the last inode it dispenses from the list is the
remembered inode.

Algorithm for freeing inodes (ifree) is a simple one:

/* Algorithm: ifree
* Input: file system inode number
* Output: none
*/

{
increment file system free inode count;
if (super block locked)
return;
if (inode list full)
{
if (inode number less than remembered inode for search)
set remembered inode for search = input inode number;
}
else
store inode number in inode list;
return;
}
If the super block is locked, the algorithm simply returns; the inode number is not added to the free list
of inode numbers, but the inode can still be found later when the disk blocks are searched for free
inodes. If the inode list is full and the freed inode number is less than the remembered inode number,
the remembered inode number is replaced with the input inode number, so that the next search will
start from that inode.

Ideally, there should never be free inodes whose inode number is less than the remembered inode
number, but exceptions are possible. If an inode is freed while the super block is locked, it is
possible to end up with a free inode whose number is less than the remembered inode number.

An example:

In ialloc, there is a race condition where an inode can be in use even if we get it from the list of
free inode numbers in the super block. Following is an example where such condition is possible.

Assume that there are 3 processes, A, B, and C. Process A assigns inode I but goes to sleep
before it copies the disk inode into the in-core copy. Algorithms iget (invoked by ialloc)
and bread (invoked by iget) give ample opportunity to go to sleep. If, while process A sleeps,
process B tries to acquire an inode but finds that the list of free inode numbers is empty. So it
starts searching (assume that it starts searching at a lower inode number than inode I) the disk
blocks for free inodes and finds inode I, as inode I is still marked free. Process B completes the
search, fills up the list and departs taking an inode number. Process A wakes up and completes
the assignment of inode I. Now suppose process C later requests an inode and happens to pick
inode I from the super block free list. When it gets the in-core copy of the inode, it will find its
file type set, implying that the inode was already assigned. That is why that check is required
in ialloc.

Allocation of Disk Blocks


When data is written on a file, the kernel must allocate disk blocks from the file system (for
direct data blocks or sometimes, indirect data blocks). The file system super block contains an
array that is used to cache the numbers of free disk blocks in the file system. The utility
program mkfs (make file system) organizes the data blocks of a file system in a linked list, such
that each link of the list is a disk block that contains an array of free disk block numbers, and one
array entry is the number of the next block of the linked list. For example:

Algorithm for allocation of disk blocks (alloc) is given below:

/* Algorithm: alloc
* Input: file system number
* Output: buffer for new block
*/

{
while (super block locked)
sleep (event: super block not locked);
remove block from super block free list;
if (removed last block from free list)
{
lock super block;
read block just taken from free list (algorithm: bread);
copy block numbers in block into super block;
release block buffer (algorithm: brelse);
unlock super block;
wakeup processes (event: super block not locked);
}
get buffer for block removed from super block list (algorithm: getblk);
zero buffer contents;
decrement total count of free blocks;
mark super block modified;
return buffer;
}
The program mkfs tries to organize the original linked list of free block numbers so that block
numbers dispensed to a file are near each other. This helps performance, because it reduces disk
seek time and latency when a process reads a file sequentially. The kernel makes no attempt to
sort block numbers on the free list.

The algorithm free for freeing a block is the reverse of the one for allocating a block. If the super
block list is not full, the block number of the newly freed block is placed on the super block list.
If, however, the super block list is full, the newly freed block becomes a link block; the kernel
writes the super block list into the block and writes the block to disk. It then places the block
number of the newly freed block in the super block list: That block number is the only member
of the list.

An example of how the super block free data block list works:
There is a difference in how free inode numbers and free disk block numbers are managed. All the
free disk block numbers are stored as a linked list of arrays, but not all free inode numbers are
stored: once the free inode list in the super block is full, the remaining free inode numbers are not
recorded anywhere. There are 3 reasons for this different treatment:

1. The kernel can determine whether an inode is free by inspecting its file type. However,
there is no way to know whether a disk block is free by looking at the data in it.
2. Disk blocks lend themselves to the use of linked lists: a disk block easily holds a large list of
free block numbers. But inodes have no convenient place for bulk storage of large lists of
free inode numbers.
3. Users tend to use free disk blocks more than free inodes, so the performance lag in
searching would be more noticeable in case of disk blocks than inodes.

Other File Types

The UNIX system supports two other file types: pipes and special files. A pipe, sometimes called
a fifo (first-in-first-out), differs from a regular file in that its data is transient; once data is read
from a pipe, it cannot be read again. Also, the data is read in the order that it was written to the
pipe, and the system allows no deviation from that order. The kernel stores data in a pipe the
same way it stores data in an ordinary file, except that it uses only the direct blocks, not the
indirect blocks. Pipes are described later.

The other file types in the UNIX system are special files, including block device special files and character
device special files. Both types specify devices, and therefore the file inodes do not reference any
data. Instead, the inode contains two numbers known as the major and minor device numbers.
The major number indicates a device type such as terminal or disk, and the minor number
indicates the unit number of the device. Device files are covered later.

Explain Directories with layout example.


In Linux operating systems, directories are special types of files that serve as containers for other
files and directories. They organize the file system hierarchy, providing a structured way to store
and access data. Each directory contains a list of entries, where each entry maps a filename to an
inode number. The inode number represents the metadata associated with the file or directory.

Let's discuss the layout of directories in Linux and provide an example:

Directory Layout in Linux:

1. Root Directory ("/"):


 The root directory is the top-level directory in the file system hierarchy.
 It serves as the parent directory for all other directories and files.
 In the root directory, you can find essential system directories such as /bin, /etc,
/home, /usr, etc.
2. Subdirectories:
 Directories can contain other directories, creating a hierarchical structure.
 For example, the /home directory typically contains user home directories
(/home/user1, /home/user2, etc.).
 Directories can be nested arbitrarily deep to organize files and directories
logically.
3. Directory Entries:
 Each directory entry consists of a filename and its corresponding inode number.
 The filename is the name by which the file or directory is known within its parent
directory.
 The inode number points to the metadata of the file or directory, including its
type, permissions, size, and pointers to data blocks.

Example:

Let's consider a simplified example to illustrate the layout of directories in Linux:

/ (Root Directory)
├── home
│   ├── user1
│   │   ├── file1.txt
│   │   ├── file2.txt
│   │   └── Documents
│   │       ├── doc1.pdf
│   │       └── doc2.txt
│   └── user2
│       ├── image.jpg
│       └── Music
│           ├── song1.mp3
│           └── song2.mp3
└── etc
    ├── passwd
    └── hosts

In this example:

 /home is a directory containing user home directories (user1, user2).


 Each user directory (user1, user2) contains files and possibly subdirectories.
 The Documents directory within user1 contains files doc1.pdf and doc2.txt.
 The Music directory within user2 contains files song1.mp3 and song2.mp3.
 The /etc directory contains system configuration files such as passwd and hosts.

Each file and directory in the example above is associated with an inode, which stores its
metadata. The directory entries map filenames to inode numbers, allowing the operating system
to locate and access files efficiently.

This directory layout provides a structured organization of files and directories, making it easier
for users and applications to navigate and manage data in the file system.

What is a Super block? List and explain fields of super block.


The superblock is essentially file system metadata and defines the file system type, size, status,
and information about other metadata structures (metadata of metadata). The superblock is very
critical to the file system and therefore is stored in multiple redundant copies for each file
system. The superblock is a very "high level" metadata structure for the file system. For
example, if the superblock of a partition, /var, becomes corrupt then the file system in question
(/var) cannot be mounted by the operating system. Commonly in this event, you need to
run fsck which will automatically select an alternate, backup copy of the superblock and attempt
to recover the file system. The backup copies themselves are stored in block groups spread
through the file system with the first stored at a 1 block offset from the start of the partition. This
is important in the event that a manual recovery is necessary. You may view information about
ext2/ext3/ext4 superblock backups with the command dumpe2fs /dev/foo | grep -i
superblock which is useful in the event of a manual recovery attempt. Let us suppose that the
dumpe2fs command outputs the line Backup superblock at 163840, Group descriptors at 163841-
163841. We can use this information, and additional knowledge about the file system structure,
to attempt to use this superblock backup: /sbin/fsck.ext3 -b 163840 -B 1024 /dev/foo. Please note
that I have assumed a block size of 1024 bytes for this example.
An inode exists in, or on, a file system and represents metadata about a file. For clarity, all
objects in a Linux or UNIX system are files; actual files, directories, devices, and so on. Please
note that, among the metadata contained in an inode, there is no file name as humans think of it,
this will be important later. An inode essentially contains information about ownership (user,
group), access mode (read, write, execute permissions), file type, and pointers to the data blocks
that hold the file's content.
A dentry is the glue that holds inodes and files together by relating inode numbers to file names.
Dentries also play a role in directory caching which, ideally, keeps the most frequently used files
on-hand for faster access. File system traversal is another aspect of the dentry as it maintains a
relationship between directories and their files.
A file, in addition to being what humans typically think of when presented with the word, is
really just a block of logically related arbitrary data. Comparatively very dull considering all of
the work done (above) to keep track of them.

Explain following algorithms- 1. ialloc, 2. alloc, 3. Ifree


1. ialloc (Inode Allocation):
 Purpose: The ialloc algorithm is responsible for allocating an inode (index node)
for a new file or directory within the file system.
 Algorithm:
1. Search the inode bitmap to find a free inode. The inode bitmap is a data
structure that keeps track of which inodes are allocated and which are free.
2. Once a free inode is found, mark it as allocated in the inode bitmap.
3. Initialize the inode with default metadata values, such as file type,
permissions, timestamps, and pointers to data blocks.
4. Return the allocated inode number for use by the file or directory.
2. alloc (Data Block Allocation):
 Purpose: The alloc algorithm is responsible for allocating a data block for storing
file data.
 Algorithm:
1. Search the data block bitmap to find a free data block. The data block
bitmap tracks the allocation status of data blocks in the file system.
2. Once a free data block is found, mark it as allocated in the data block
bitmap.
3. Return the block number of the allocated data block for use by the file
system.
3. Ifree (Inode Deallocation):
 Purpose: The Ifree algorithm is responsible for deallocating an inode that is no
longer in use.
 Algorithm:
1. Mark the inode as free in the inode bitmap.
2. Release any resources associated with the inode, such as data blocks and
file metadata.
3. Update any directory entries or indirect pointers that reference the
deallocated inode to reflect its deallocation.
4. Return the inode to the pool of available inodes for future allocation.

These algorithms are essential for managing the allocation and deallocation of resources within a
file system. They ensure efficient utilization of storage space while maintaining the integrity and
consistency of the file system structure. Proper implementation of these algorithms is crucial for
the overall performance and reliability of the file system.

System Calls for File System


Explain dup() system call with example
The dup() system call in Unix-like operating systems is used to duplicate an open file descriptor.
It creates a new file descriptor that refers to the same open file description as the original file
descriptor. This allows multiple file descriptors to refer to the same file or resource, enabling
more flexible I/O operations. Here's an explanation of dup() with an example:

Syntax:

#include <unistd.h>
int dup(int oldfd);

Parameters:

 oldfd: The file descriptor to be duplicated.

Return Value:

 If successful, dup() returns a new file descriptor that is a duplicate of oldfd. If an error
occurs, it returns -1, and errno is set to indicate the error.
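Example: a minimal sketch that duplicates a descriptor and writes through both copies; the file name example.txt is only illustrative. Because both descriptors share one open file description, the second write continues at the offset left by the first:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fd = open("example.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1) {
        perror("open");
        return 1;
    }

    int copy = dup(fd);                      /* shares the same open file description */
    if (copy == -1) {
        perror("dup");
        return 1;
    }

    write(fd,   "written via fd\n",   15);
    write(copy, "written via copy\n", 17);   /* continues at the shared file offset   */

    close(fd);
    close(copy);
    return 0;
}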

Explain difference between Named pipe and Unnamed pipe


Named pipes (also known as FIFOs) and unnamed pipes are both mechanisms for inter-process
communication (IPC) in Unix-like operating systems. However, they have some fundamental
differences:

1. Naming:
 Named pipes: As the name suggests, named pipes have a name associated with
them and can be created using the mkfifo command or by simply creating a
special type of file in the filesystem. These named pipes exist persistently in the
file system, and multiple processes can access them using their name.
 Unnamed pipes: Unnamed pipes, also known as anonymous pipes, do not have a
name. They are created using the pipe() system call and exist only within the
context of the processes that created them. They are typically used for
communication between a parent process and its child processes.
2. Persistence:
 Named pipes: Named pipes persist even after the processes using them have
finished execution. They are akin to regular files in the filesystem.
 Unnamed pipes: Unnamed pipes exist only as long as the processes using them
are alive. Once the processes close their ends of the pipe, the pipe is destroyed
and its resources are deallocated by the operating system.
3. Access:
 Named pipes: Named pipes are accessed by their name, much like regular files.
Any process with appropriate permissions can open a named pipe using its name
and communicate through it.
 Unnamed pipes: Unnamed pipes are typically used for communication between
related processes, such as a parent process and its child processes. They are
created by one process and the file descriptors referring to the ends of the pipe are
passed to the related processes through inheritance during forking.
4. Use cases:
 Named pipes: Named pipes are useful for communication between unrelated
processes that may not share a parent-child relationship. They are often used in
scenarios where multiple processes need to communicate with each other
indirectly.
 Unnamed pipes: Unnamed pipes are typically used for communication between a
parent process and its child processes, where the pipe is set up before forking to
establish communication channels between the parent and child processes.
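A minimal sketch of an unnamed pipe between a parent and its child (assuming a POSIX system): the pipe is created before fork() so that both processes inherit the two descriptors, then each process closes the end it does not use:

#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void)
{
    int fds[2];                      /* fds[0] = read end, fds[1] = write end */
    char buf[64];

    if (pipe(fds) == -1) {
        perror("pipe");
        return 1;
    }

    if (fork() == 0) {               /* child: reads from the pipe            */
        close(fds[1]);
        ssize_t n = read(fds[0], buf, sizeof(buf) - 1);
        if (n > 0) {
            buf[n] = '\0';
            printf("child read: %s\n", buf);
        }
        close(fds[0]);
        return 0;
    }

    close(fds[0]);                   /* parent: writes into the pipe          */
    write(fds[1], "hello from parent", 17);
    close(fds[1]);
    wait(NULL);
    return 0;
}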

Explain read() system call with example


The read() system call in Unix-like operating systems is used to read data from an open file
descriptor into a buffer. It allows processes to retrieve data from files, devices, pipes, sockets, or
other file-like objects. Here's an explanation of read() with an example:

Syntax:
#include <unistd.h>
ssize_t read(int fd, void *buf, size_t count);

Parameters:

 fd: The file descriptor from which to read data.


 buf: A pointer to the buffer where the read data will be stored.
 count: The maximum number of bytes to read.

Return Value:

 If successful, read() returns the number of bytes read. If the end of the file (EOF) is
reached, it returns 0. If an error occurs, it returns -1, and errno is set to indicate the error.

Example: Suppose we have a file named example.txt with the following contents:

Hello, world!
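A sketch of reading that file with read() (the file name example.txt comes from the text above; error handling is kept minimal):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    char buf[64];
    int fd = open("example.txt", O_RDONLY);
    if (fd == -1) {
        perror("open");
        return 1;
    }

    ssize_t n = read(fd, buf, sizeof(buf) - 1);   /* returns bytes read, 0 at EOF */
    if (n == -1) {
        perror("read");
        close(fd);
        return 1;
    }

    buf[n] = '\0';
    printf("read %zd bytes: %s\n", n, buf);       /* e.g. "Hello, world!"         */
    close(fd);
    return 0;
}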

Write short note change directory and change root.


Change Directory (cd):

In Unix-like operating systems, the cd command is used to change the current working directory.
When you open a terminal or command prompt, you typically start in a specific directory known
as the "home directory" of the user. Using the cd command, you can navigate to different
directories in the file system hierarchy.

Syntax: cd [directory_path]

 If you provide a directory path as an argument to cd, it changes the current working
directory to the specified directory.
 If you don't provide any arguments, cd changes the current working directory to the
user's home directory.

Examples:

1. cd /home/user/Documents: Changes the current directory to /home/user/Documents.


2. cd ..: Moves up one directory level (parent directory).
3. cd: Changes the current directory to the user's home directory.

Change Root (chroot):


The chroot command is used to change the apparent root directory for the current running
process and its children. It creates a restricted environment where the specified directory
becomes the new root directory (/) for the process, effectively isolating it from the rest of the
system's file hierarchy.

Syntax: chroot new_root [command]

 new_root: Specifies the directory to be set as the new root directory.


 command: Optionally, you can provide a command to execute within the new root
environment.

Example:

chroot /mnt/newroot /bin/bash

In this example, /mnt/newroot is set as the new root directory, and /bin/bash is executed within
this new environment.
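The same effect is available to programs through the chroot() system call. A minimal sketch (it normally requires superuser privileges; the path /mnt/newroot is the one used in the example above):

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* Change the apparent root, then move into it so "." lies inside the new root. */
    if (chroot("/mnt/newroot") == -1) {
        perror("chroot");            /* typically fails without superuser privileges */
        return 1;
    }
    if (chdir("/") == -1) {
        perror("chdir");
        return 1;
    }

    /* From here on, absolute path lookups for this process start at /mnt/newroot. */
    return 0;
}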

Short Note:

 Change Directory (cd): cd is a command used to navigate the file system hierarchy in
Unix-like operating systems. It allows users to switch between directories conveniently.
 Change Root (chroot): chroot is a command used to create a restricted environment by
changing the root directory for a process. It isolates the process and its children from the
rest of the file system, providing a controlled environment for running applications or
performing system maintenance tasks.

Explain algorithm creat for creating file.


The creat system call in Unix-like operating systems is used to create a new file or open an
existing file for writing, with a specific file mode or permissions. Here's an explanation of the
algorithm typically used by the creat system call to create a file:

1. Input Parameters:
 File Path: The path to the file to be created.
 File Mode: The permissions to be set on the newly created file. This typically
includes read, write, and execute permissions for the owner, group, and others.
2. Permission Checking:
 Before creating the file, the operating system checks the permissions of the
calling process to ensure it has the necessary permissions to create a file in the
specified directory.
3. File Creation:
 The operating system performs the following steps to create the file:
 It searches the directory specified in the file path to ensure that a file with
the same name does not already exist. If a file with the same name exists
and the O_CREAT flag is not specified, the creat system call returns an
error.
 If the file does not exist or the O_CREAT flag is specified, the operating
system creates a new file entry in the directory's inode table.
 It allocates an inode (index node) for the file and initializes its metadata,
such as file type, permissions, owner, group, and timestamps.
 The file's data blocks are allocated on the disk to store the file's content.
For small files, direct blocks may be used, while larger files may require
indirect or doubly indirect blocks.
 If the O_TRUNC flag is specified, the operating system truncates the file,
discarding its previous contents. Otherwise, the file is created with zero
size.
4. File Descriptor:
 After successfully creating the file, the creat system call returns a file descriptor,
which is an integer representing the newly created file. This file descriptor can be
used for subsequent read and write operations on the file.
5. Error Handling:
 If any errors occur during the file creation process, such as insufficient
permissions, disk full, or file system errors, the creat system call returns an error
code to the calling process, indicating the nature of the error.
6. Permissions Setting:
 The operating system sets the permissions of the newly created file according to
the specified file mode or permissions. These permissions determine who can
read, write, or execute the file.
7. Completion:
 Finally, the creat system call returns control to the calling process, indicating
whether the file creation was successful or if an error occurred.

In summary, the creat system call creates a new file in the file system by allocating resources,
initializing metadata, setting permissions, and returning a file descriptor to the calling process. It
provides a fundamental mechanism for file creation in Unix-like operating systems.
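A small usage sketch of the call described above; the file name and mode are illustrative:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* Equivalent to open("newfile.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644). */
    int fd = creat("newfile.txt", 0644);
    if (fd == -1) {
        perror("creat");
        return 1;
    }

    write(fd, "created with creat()\n", 21);
    close(fd);
    return 0;
}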

Draw the file system before and after executing following mount system call-
Mount("/dev/dsk1/","/user",0)
Explain algorithm for mounting the file.
Mounting a file system involves making the contents of a storage device or another file system
available at a specific directory location within an existing file system hierarchy. Here's a
simplified explanation of the algorithm for mounting a file system:
1. Check Permissions and Privileges:
 Ensure that the user performing the mount operation has the necessary
permissions and privileges to mount file systems. Typically, this requires
superuser (root) privileges.
2. Specify Mount Point:
 Determine the directory where the file system will be mounted, known as the
mount point. This directory should exist within the existing file system hierarchy.
3. Identify File System:
 Identify the file system to be mounted, which could be on a physical disk
partition, a logical volume, a network share, or another storage device. The file
system type (e.g., ext4, NTFS, NFS) must be known.
4. Check File System Integrity:
 Optionally, perform a file system check (e.g., fsck command) to ensure the
integrity of the file system structure. This step is particularly important if the file
system is coming from an unreliable source or has not been cleanly unmounted in
the past.
5. Mount File System:
 Use the mount command to mount the file system onto the specified mount point.
The general syntax is:
mount -t <file_system_type> <source> <mount_point>
 <file_system_type>: The type of file system being mounted (e.g., ext4,
NTFS, NFS).
 <source>: The device or file representing the file system to be mounted
(e.g., /dev/sdb1, /mnt/network_share).
 <mount_point>: The directory where the file system will be mounted
within the existing file system hierarchy.
6. Update Mount Table:
 Update the system's mount table (e.g., /etc/fstab on Linux) to ensure that the file
system is automatically mounted upon system boot, if desired. This step involves
adding an entry specifying the file system type, source, mount point, and mount
options.
7. Verify Mount:
 Verify that the file system has been successfully mounted by listing the contents
of the mount point directory (ls <mount_point>) or checking the output of the
mount command to see if the file system appears in the list of mounted file
systems.
8. Unmounting (Optional):
 To unmount the file system and detach it from the mount point, use the umount
command:
umount <mount_point>
This step is necessary when the file system is no longer needed or when
performing system maintenance.
By following this algorithm, administrators can mount file systems effectively, ensuring that the
desired data becomes accessible within the file system hierarchy of the operating system.
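On Linux, the mount command is backed by the mount(2) system call. A hedged sketch mirroring the Mount("/dev/dsk1/","/user",0) call from the question is shown below; the device, mount point, and file system type are assumptions, and the call needs superuser privileges:

#include <stdio.h>
#include <sys/mount.h>

int main(void)
{
    /* source device, mount point, file system type, mount flags, fs-specific data */
    if (mount("/dev/dsk1", "/user", "ext4", 0, NULL) == -1) {
        perror("mount");             /* fails without superuser privileges          */
        return 1;
    }
    printf("/dev/dsk1 mounted on /user\n");
    return 0;
}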

Explain Read and Write operation in the file.


Read and write operations are fundamental operations performed on files in a file system. Here's
an explanation of each operation:

Read Operation:

1. Opening the File:


 The first step in reading from a file is to open the file. This is typically done using
system calls like open() in Unix-like systems or fopen() in C programming.
 The file is identified by its pathname, and the system allocates resources to
manage the file's access.
2. Seeking the Desired Position:
 If the read operation is not meant to start from the beginning of the file, the file
pointer must be positioned at the desired location within the file.
 This can be achieved using system calls like lseek() in Unix-like systems.
3. Reading Data:
 Once the file is open and the file pointer is set to the desired position (if
necessary), the actual reading of data from the file occurs.
 Data can be read from the file in various ways, such as reading a single byte,
reading a line, or reading a block of data of a specified size.
 System calls like read() in Unix-like systems or functions like fread() in C
programming are typically used for reading data from files.
4. Processing Read Data:
 The data read from the file can then be processed or used by the program as
required.
 This may involve parsing the data, performing calculations, or any other relevant
operations based on the content of the file.
5. Closing the File:
 After reading is complete, it is essential to close the file using system calls like
close() in Unix-like systems or fclose() in C programming.
 Closing the file releases the associated resources and ensures that the file is
properly closed, preventing resource leaks and potential data corruption.

Write Operation:

1. Opening the File:


 Similar to the read operation, the first step in writing to a file is to open the file
using system calls like open() in Unix-like systems or fopen() in C programming.
 The file is identified by its pathname, and the system allocates resources to
manage the file's access.
2. Seeking the Desired Position:
 If the write operation is not meant to append data at the end of the file, the file
pointer must be positioned at the desired location within the file.
 This can be achieved using system calls like lseek() in Unix-like systems.
3. Writing Data:
 Once the file is open and the file pointer is set to the desired position (if
necessary), the actual writing of data to the file occurs.
 Data can be written to the file in various ways, such as writing a single byte,
writing a line, or writing a block of data of a specified size.
 System calls like write() in Unix-like systems or functions like fwrite() in C
programming are typically used for writing data to files.
4. Closing the File:
 After writing is complete, it is essential to close the file using system calls like
close() in Unix-like systems or fclose() in C programming.
 Closing the file ensures that any pending data is flushed to the file, releases the
associated resources, and properly closes the file, preventing resource leaks and
potential data corruption.
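Putting the two operations together, here is a sketch that copies one file to another using the calls named above; the file names are illustrative and error handling is minimal:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    char buf[4096];
    ssize_t n;

    int in  = open("input.txt", O_RDONLY);
    int out = open("output.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (in == -1 || out == -1) {
        perror("open");
        return 1;
    }

    /* read() returns 0 at end of file, which ends the loop */
    while ((n = read(in, buf, sizeof(buf))) > 0)
        write(out, buf, (size_t)n);

    close(in);
    close(out);
    return 0;
}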

The Structure of Processes


Draw and explain complete Process state transition diagram.
The Structure of Processes

The kernel has a process table where it stores the state of the process and other information about
the process. The information of the entry and the u-area of the process combined is the context of
the process.

Process States And Transitions

The complete set of process states:

1. Executing in user mode.


2. Executing in kernel mode.
3. Ready to run.
4. Sleeping in memory.
5. Ready to run, but in swap space (covered later).
6. Sleeping in swap space.
7. Preempted. (the process is returning from kernel to user mode, but the kernel preempts it
and does a context switch to schedule another process. Very similar to state 3)
8. Newly created. Not ready to run, nor sleeping. This is the start state for all processes except
process 0.
9. The process executed exit system call and is in the zombie state. The process no longer
exists, but it leaves a record containing an exit code and some timing statistics for its
parent process to collect. The zombie state is the final state of a process.

The process enters the created state when the parent process executes the fork system call
and eventually moves into a state where it is ready to run (3 or 5). The scheduler will eventually
pick the process and the process enters the state kernel running, where it completes its part
of the fork system call. After the completion of the system call, it may move to user running. When
an interrupt occurs (or when it makes a system call), it again enters the state kernel running. After the servicing
of the interrupt the kernel may decide to schedule another process to execute, so the first process
enters the state preempted. The state preempted is really same as the state ready to run in
memory, but they are depicted separately to stress that a process executing in kernel mode can be
preempted only when it is about to return to user mode. Consequently, the kernel could swap a
process from the state preempted if necessary. Eventually, it will return to user running again.

When a system call is executed, it leaves the state user running and enters the state kernel
running. If in kernel mode, the process needs to sleep for some reason (such as waiting for I/O),
it enters the state asleep in memory. When the event on which it has slept occurs, the interrupt
handler awakens the process, and it enters the state ready to run in memory.

Suppose the system is executing many processes that do not fit simultaneously into main
memory, then the swapper (process 0) swaps out a process to make room for another process that
is in the state ready to run swapped. When evicted from main memory, the process enters the
state ready to run swapped. Eventually, swapper chooses the process as most eligible to run and
it re-enters the state ready to run in memory. And then when it is scheduled, it will enter the
state kernel running. When a process completes, it invokes the exit system call, thus entering the
state kernel running and, finally, the zombie state.

Some state transitions can be controlled by the users, but not all. A user can create a process, but
the user has no control over when a process transitions from sleeping in memory to sleeping in
swap, or from ready to run in memory to ready to run in swap, etc. A process can make a system call
to transition itself to kernel running state. But it has no control over when it will return from
kernel mode. Finally, a process can exit whenever it wants, but that is not the only reason
for exit to be called.

The process transitions follow a rigid model encoded in the kernel, reacting to events in a
predictable way according to formulated rules (studied later). For example, no process can
preempt another process executing in the kernel.

Two kernel data structures describe the state of a process: the process table entry and the u-area.
The process table contains information that should be accessible to the kernel and the u-area
contains the information that should be accessible to the process only when it is running. The kernel
allocates space for the u-area only when creating a process. It does not need a u-area for process table
entries that do not have processes. For example, the process table entries that contain the
information about the kernel context, do not have processes.

The fields in the process table are the following:

 State of the process


 Fields that allow the kernel to locate the process and its u-area in main memory or in
secondary storage. This information is used to do a context switch to the process when the
process moves from state ready to run in memory to the state kernel running or from the
state preempted to the state user running or when swapping the process. It contains a
field that gives the size of the process so that the kernel knows how much space to
allocate for the process.
 Several user identifiers (user IDs or PIDs) specify the relationship of processes to each
other. These ID fields are set up when the process enters the state created in
the fork system call.
 Event descriptor when the process is sleeping.
 Scheduling parameters allow the kernel to determine the order in which processes move
to the states kernel running and user running.
 A signal fields enumerates the signals sent to a process but not yet handled.
 Various timers give process execution time and kernel resource utilization. These are used
for calculation of process scheduling priority. One field is a user-set timer used to send an
alarm signal to a process.

The u-area contains these fields (some are covered previously as well) :

 A pointer in the process table identifies the entry that corresponds to the u-area.
 The real and effective user IDs determine various privileges allowed the process, such as
file access rights.
 Timer fields record the time the process spent executing in user mode and in kernel
mode.
 An array indicates how the process wishes to react to signals.
 The control terminal field identifies the "login terminal" associated with the process, if
one exists.
 An error field records errors encountered during a system call.
 A return value field contains the result of system calls.
 I/O parameters describe the amount of data to transfer, the address of the source (or
target) data array in user space, file offsets for I/O, and so on.
 The current directory and current root describe the file system environment of the
process.
 The user file descriptor table records the files the process has open.
 Limit fields restrict the size of a process and the size of a file it can write.
 A permission modes field masks mode settings on files the process creates.

Layout of System Memory

The physical memory is addressable, starting from offset 0 and going up to the amount of physical
memory installed. A process in UNIX contains three sections: text, data, and stack. The text section contains
the instructions. Those instructions could refer to other addresses, for example, addresses of
different subroutines, addresses of global variables in the data section, or the addresses of local
data structures on the stack. If the addresses generated by the compiler were to be treated as
physical addresses, it would be impossible to run more than one process at a time, because the
addresses could overlap. Even if the compiler tries to address this problem by using heuristics, it
would be difficult and impractical.

To solve this problem, the kernel treats the addresses given by the compiler as virtual addresses.
And when the program starts executing, the memory management unit translates the virtual
addresses to physical addresses. The compiler doesn't need to know which physical addresses the
process will get. For example, two instances of a same program could be executing in memory
using the same virtual addresses but different physical addresses. The subsystems of the kernel
and the hardware that cooperate to translate virtual to physical addresses comprise the memory
management subsystem.

Regions

The UNIX system divides its virtual address space into logically separated regions. A region is a
contiguous area of virtual address space and a logically distinct object which can be shared. The
text, data, and stack are usually separate regions. It is common to share the text region among
processes executing the same program.

The region table entries contain the physical locations at which the region is spread. Each
process contains a private per-process region table, called a pregion. A pregion entry contains
a pointer to an entry in the region table and the starting virtual address of the
region. Pregions are stored in the process table, the u-area, or a separately allocated memory space,
according to the implementation. The pregion entries also contain the access permissions: read-only,
read-write, or read-execute. The pregion and region structures are analogous to the file table and
the in-core inode table: since pregions are specific to a process, the pregion table is private to a
process, whereas the file table is global. Regions can be shared amongst processes.

An example of regions:

The concept of regions is independent of the memory management policies.

Pages and Page Tables

In a memory model based on pages, the physical memory is divided into equal-sized blocks
called pages. Page sizes usually range from 512 bytes to 4K bytes and are defined by the
hardware. Every memory location can be addressed by a "page number" and a "byte offset" within the
page. For example, if a machine with 2^32 bytes of memory has pages of size 1K bytes (2^10),
then it has 2^22 pages.
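For instance, with the 1K-byte pages assumed above, a virtual address splits into a page number and a byte offset as in this sketch (illustrative user-level code, not kernel code):

#include <stdio.h>

#define PAGE_SIZE  1024UL             /* assumed page size: 1K bytes (2^10) */
#define PAGE_SHIFT 10

int main(void)
{
    unsigned long vaddr  = 20000;                    /* an example virtual address */
    unsigned long page   = vaddr >> PAGE_SHIFT;      /* logical page number        */
    unsigned long offset = vaddr & (PAGE_SIZE - 1);  /* byte offset within page    */

    printf("address %lu -> page %lu, offset %lu\n", vaddr, page, offset);
    return 0;
}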

When the kernel assigns physical pages of memory to a region, it need not assign the pages
contiguously or in any particular order, just as disk blocks are not assigned contiguously, to
avoid fragmentation.

The kernel maintains a mapping of logical to physical page numbers in a table which looks like
this:

These tables are called page tables. A region table entry has pointers to its page tables. Since the logical
address space is contiguous, the logical page number is just an index into an array of physical page
numbers. The page tables also contain hardware dependent information such as permissions for pages.
Modern machines have special hardware for address translation, because a software implementation of
such translation would be too slow. Hence, when a process starts executing, the kernel tells the
hardware where its page tables reside.

For being hardware independent, let us assume that the hardware has register triples (in
abundance) which the kernel uses for memory management. The first register in the triple
contains the address of the page table, the second register contains the first virtual address
mapped by the page table, and the third register contains control information such as number of
pages in page tables and page access permissions. When executing a process, the kernel loads
such register triples with the data in the pregion entries.

If a process accesses an address outside its virtual address space, an exception is generated.
Suppose a process has 0 to 16K bytes of address space and it accesses the virtual
address 20K; an exception will be generated and caught by the operating system. Similarly, if a
process tries to access a page without sufficient permission, an exception will be generated. In
such cases, the process normally exits.

Layout of the Kernel

Even if the kernel executes in the context of a process, its virtual address space is independent of
processes. When the system boots up, the kernel is loaded into memory and necessary page
tables and registers are loaded with appropriate data. Many hardware systems slice a process'
address space into many sections, user and kernel being two of them. When in user mode, access
to kernel page tables is prohibited. When the process switches to kernel mode, only then it can
access kernel page tables. Some system implementations try to allocate the same physical pages
to the kernel, keeping the translation function an identity function.

Example:

The Context of a Process

The context of a process consists of:

 Contents of its (user) address space, called as user level context


 Contents of hardware registers, called as register context
 Kernel data structures that relate to the process, called as system context

User level context consists of the process text, data, user stack and shared memory that is in the
virtual address space of the process. The part which resides on swap space is also part of the user
level context.

The register context consists of the following components:

 Program counter specifies the next instruction to be executed. It is an address in kernel or


in user address space.
 The processor status register (PS) specifies hardware status relating the process. It has
subfields which specify if last instruction overflowed, or resulted in 0, positive or
negative value, etc. It also specifies the current processor execution level and current and
most recent modes of execution (such as kernel, user).
 The stack pointer points to the current address of the next entry in the kernel or user
stack. Whether it points to the next free entry or the last used entry depends on the machine
architecture. The direction of growth of the stack (toward numerically higher or lower
addresses) also depends on the machine architecture.
 The general purpose registers contain data generated by the process during its execution.

The system level context has a "static part" and a "dynamic part". A process has one static part
throughout its lifetime. But it can have a variable number of dynamic parts.

The static part consists of the following components:

 The process table entry


 The u-area
 Pregion entries, region tables and page tables.

The dynamic part consists of the following components:

 The kernel stack contains the stack frames the kernel functions. Even if all processes
share the kernel text and data, kernel stack needs to be different for all processes as every
process might be in a different state depending on the system calls it executes. The
pointer to the kernel stack is usually stored in the u-area but it differs according to system
implementations. The kernel stack is empty when the process executes in user mode.
 The dynamic part of the system level context consists of a set of layers, visualized as a
last-in-first-out stack. Each system-level context layer contains information necessary to
recover the previous layer, including register context of the previous layer.

The kernel pushes a context layer when an interrupt occurs, when a process makes a system call,
or when a process does a context switch. It pops a context layer when it returns from an interrupt
handler, returns from a system call, or when a context switch happens. Thus, a context switch
entails a push-pop operation: The kernel pushes the context of the old process and pops the
context layer of the new process. The process table entry stores the necessary information to
recover the current context layer.
The following figure shows the components that form the context of a process:

The right side of the figure shows the dynamic portion of the context. It consists of several stack
frames where each stack frame contains saved register context of the previous layer and the
kernel stack as it executes in that layer. System context layer 0 is a dummy layer that represents
the user-level context; growth of the stack here is in the user address space and the kernel stack is
null. The process table entry contains the information needed to recover the current context layer of the
process (shown by the arrow).

The maximum number of context layers depends on the number of levels of interrupts the system
supports. Suppose the system supports 5 levels of interrupts (software, terminal, disk, peripheral
devices, clock); then 7 context layers are enough: 1 for the user-level context, 1 for the system call and 5
for the different interrupt levels. As the kernel blocks all lower level interrupts when a higher
level interrupt occurs, there can be at most 1 interrupt of each level. Therefore, in the worst
case (interrupts occurring in the sequence of level 1 to 5), 7 context layers will be used. Although
the kernel executes in the context of a process, the logical functions it performs do not necessarily
pertain to the process. For instance, interrupt handlers do not generally modify the static parts of
the process.

Saving the Context of a Process

Interrupts and Exceptions

The system is responsible for handling interrupts and exceptions. If the system is executing at a
lower processor execution level when an interrupt occurs, the kernel accepts the interrupt
before decoding the next instruction and then raises the processor execution level to block other
interrupts of that or lower level. It handles the interrupt by performing following sequence of
operations:

1. It saves the current register context of the executing process and creates (pushes) a new
context layer.
2. The kernel determines the source (cause) of the interrupt, and if applicable, unit number
(such as which drive caused the interrupt). When the system receives an interrupt, it gets
a number. It uses that number as an index into the interrupt vector, which stores the
actions to be taken (interrupt handlers) when interrupts occur. Example of interrupt
vector:

3. The kernel invokes the interrupt handler. The kernel stack of the new context layer is
logically distinct from the kernel stack of the previous context layer. Some
implementations use the process's kernel stack to store the stack frame of an interrupt
handler, while others use a global interrupt stack for interrupt
handlers which are guaranteed to return without a context switch.
4. The kernel returns from the interrupt handler and executes a set of hardware instructions
which restore the previous context. The interrupt handler may affect the behavior of the
process as it might modify the kernel data structures. But usually, the process resumes
execution as if the interrupt never occurred.

The algorithm for interrupt handling is given below:

/* Algorithm: inthand
* Input: none
* Output: none
*/

{
save (push) current context layer;
determine interrupt source;
find interrupt vector;
call interrupt handler;
restore (pop) previous context layer;
}
Example of state of the context layer as stack as system call is executed and interrupts occur:

System Call Interface

The library functions such as open, read, etc. in the standard C library are not actually system
calls. Those are normal functions and normal functions cannot change the mode of execution of
the system. These functions invoke a special instruction which makes the system change its
execution mode to kernel mode and start executing the system call code. The instruction is called
an operating system trap. System calls are a special case of interrupt handling. The library
routines pass a number, unique to each system call, as a parameter to the operating system
trap through a specific register or on the stack. Through that number, the kernel determines
which system call to execute.

The algorithm for executing system calls is given below:

/* Algorithm: syscall
* Input: system call number
* Output: result of system call
*/

{
find entry in the system call table corresponding to the system call number;
determine number of parameters to the system call;
copy parameters from the user address space to u-area;
save current context for abortive return; // studied later
invoke system call code in kernel;
if (error during execution of system call)
{
set register 0 in user saved register context to error number;
turn on carry bit in PS register in user saved register context;
}
else
set register 0, 1 in user saved register context to return values from system
call;
}
Registers 0 and 1 are used to exchange information between user mode and kernel mode. For
being machine independent, the terms register 0 and 1 are used. The kernel calculates the address
of the parameters by adding (or subtracting, according to the direction of growth of the stack) an
offset to the user stack pointer, corresponding to the number of parameters to the system call.
The setting of carry bit indicates that there was an error in the system call execution. After
execution of this algorithm, the library function determines the return value from registers 0 and
1 and returns it to the user.

Consider the following code which calls the creat function of the C library. And the assembly
code generated by the compiler (on a Motorola 68000) :
Consider the stack configuration for the above program:
The code for main begins at address 58. It copies the parameters 0666 and the variable "name"
onto the user stack. The order in which the parameters are copied is machine dependent. Then it
calls the C library function creat at address 64. The function is at address 7a. The return address
from the function call is 6a, and the process pushes this number on the stack. The function places
the value 8 (number of the creat system call) in register 0 and then executes the trap instruction
which makes the system switch to kernel mode. The kernel checks the number of the system call
and determines that the routine for creat is to be executed. It determines the number of
parameters to the system call and copies them to the u-area from the user stack. When the system
call returns, the interrupt handler checks if the u-area field for error is set, and if it is, it sets the
carry bit and places the error code in register 0, and returns. The library routine then checks if
carry bit is set, if it is, it jumps to address 13c. It moves the error code from register 0 to a global
variable errno at address 20e and moves -1 in the data register 0, and returns to address 86. If the
carry bit was not set, the creat library function will return and further execution will start after
the call at address 64. Register 0 contains the return value from the system call.

Context Switch

As seen previously, the kernel permits a context switch under 4 situations:

1. When a process sleeps


2. When a process exits
3. When a process returns from a system call to user mode but is not the most eligible
process to run.
4. When a process returns from an interrupt handler to user mode but is not the most eligible
process to run.

The procedure followed for a context switch is similar to that for system calls and interrupts,
except that the context layer of a different process is restored instead of the previous layer of the
same process.
of the mechanism of the context switch.

The code that implements the context switch on UNIX systems is usually the most difficult to
understand in the operating system, because function calls give the appearance of not returning
on some occasions and materializing from nowhere on the others. This is because the kernel, in
many implementations, saves the process context at one point in the code but proceeds to execute
the context switch and scheduling algorithms in the context of the "old" process. When it later
restores the context of the process, it resumes execution according to the previously saved
context. To differentiate between the case where the kernel resumes the context of a new process
and the case where it continues to execute in the old context after having saved it, the return
values of critical functions may vary, or the program counter where the kernel executes may be
set artificially.

The pseudo code for a context switch is given below:

if (save_context()) // save context of executing process


{
// pick another process to run
.
.
.
resume_context (new_process);
// never gets here
}
// resuming process executes from here
Suppose process A is doing a context switch. The function save_context saves the context of process A and returns the value 1. The current program counter (somewhere inside save_context) is saved as part of that context, along with the value 0, which save_context will appear to return when the process is later resumed. Despite having saved the context, the kernel continues to execute in the context of process A, picks another process to schedule, say process B, and then calls resume_context for process B. The context of process B is then restored and B starts executing; hence the comment "never gets here". When process A is eventually selected for scheduling again, its context is restored; it resumes execution inside save_context, which now returns the value 0, so the if block is skipped and execution continues below it; hence the comment "resuming process executes from here".

Saving Context for Abortive Returns

Situations arise when the kernel needs to abort the current execution sequence and restore a previously saved context. Such situations are discussed later in the sections where sleep and signals are described; the mechanism for such a context restore is discussed here. For saving the context, the algorithm setjmp is used, and the algorithm longjmp is used to restore the previous context. setjmp is similar to save_context, except that save_context pushes a new context layer whereas setjmp saves the current context in the u-area and continues to execute in the old context layer. When the kernel wishes to restore the context saved by setjmp, it does a longjmp, restoring the context saved in the u-area and returning 1 from setjmp.
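
The kernel's setjmp and longjmp are internal kernel algorithms, but they behave much like the standard C library functions of the same name. The following user-space program (an analogy only, not kernel code) shows the save-then-restore pattern, including the way setjmp appears to return a second time with the value passed to longjmp:

#include <setjmp.h>
#include <stdio.h>

static jmp_buf saved_context;

static void abort_current_work(void)
{
    /* restore the previously saved context; setjmp will appear to return 1 */
    longjmp(saved_context, 1);
}

int main(void)
{
    if (setjmp(saved_context) == 0) {
        printf("context saved, still executing in the old context\n");
        abort_current_work();
        printf("never reached\n");
    } else {
        printf("resumed from the saved context after longjmp\n");
    }
    return 0;
}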

Copying Data Between System and User Address Space

A process executes in kernel mode or in user mode, but system calls need to copy data from user address space to kernel address space and vice versa. Addresses in kernel address space are, of course, not accessible from user address space, so the kernel does the copying in both directions. When accessing addresses in user address space, the kernel must first ascertain that the address really lies in user space; otherwise it could inadvertently write over some kernel data structures and corrupt the system. Therefore, copying data between kernel and user address space is an expensive operation, requiring more than one instruction per transfer.

This is the VAX code for moving one character from user to kernel address space:
The prober instruction checks if one byte at the address (argument pointer register + 4, indicated by 4(ap)) could be read in user mode (mode 3). If not, the kernel branches to address eret, stores -1 in register 0, and returns; the move failed. Otherwise, the kernel copies one byte from the user-given address to register 0 and returns that value to the caller. The procedure requires 5 instructions (with the call to fubyte) to move 1 character; the operation is expensive.
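
A hedged C sketch of this kind of checked, byte-at-a-time copy is shown below; user_address_readable is a hypothetical stand-in for the prober-style validity check, and the function is illustrative rather than actual kernel source:

/* Illustrative sketch only: copy bytes from user space into kernel space,
 * validating every user address before touching it. */
static int user_address_readable(const char *addr)
{
    /* In a real kernel this would verify that addr falls inside one of the
     * process's user-space regions; stubbed out here for illustration. */
    return addr != 0;
}

int copy_from_user_space(char *kernel_dst, const char *user_src, int count)
{
    int i;
    for (i = 0; i < count; i++) {
        if (!user_address_readable(user_src + i))
            return -1;              /* caller converts this into an error return */
        kernel_dst[i] = user_src[i];
    }
    return 0;
}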

Manipulation of the Process Address Space

The region table entry contains the following information:

 The inode of the file from which the region was initially loaded.
 The type of the region (text, shared memory, private data, or stack).
 The size of the region.
 The location of the region in physical memory.
 The state of the region:
o locked
o in demand
o being loaded into memory
o valid, loaded into memory
 The reference count, giving the number of processes that reference the region
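
For illustration, these fields could be collected into a C structure like the minimal sketch below; the structure and field names are assumptions made for illustration, not the definition used by any particular UNIX implementation.

struct inode;                      /* defined elsewhere in the kernel */

struct region {
    struct inode  *r_inode;        /* inode of the file the region was loaded from */
    int            r_type;         /* text, shared memory, private data, or stack */
    int            r_size;         /* size of the region */
    void          *r_physloc;      /* location of the region in physical memory */
    int            r_state;        /* locked / in demand / being loaded / valid */
    int            r_refcount;     /* number of processes referencing the region */
    struct region *r_next;         /* link on the free or active region list */
};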

The operations that manipulate regions are:

 lock a region
 unlock a region
 allocate a region
 attach a region to the memory space of a process
 change the size of a region
 load a region from a file into the memory space of a process
 free a region
 detach a region from the memory space of a process, and duplicate the contents of a
region

Locking and Unlocking a Region

The kernel can lock and unlock a region independently of the operations to allocate and free a region, just as the locking-unlocking mechanism of inodes is independent of allocating and releasing inodes (iget and iput).

Allocating a Region

The kernel allocates a region in the fork, exec, and shmget (shared memory get) system calls. Just as for inodes, the kernel maintains a free list of regions; when a region is to be allocated, the kernel removes the first region from the free list and places it on the active list. The inode is used by the kernel to enable other processes to share the region. The kernel increments the inode reference count to prevent other processes from removing its contents when unlinking it.

The algorithm allocreg is given below:

/* Algorithm: allocreg
* Input: inode pointer
* region type
* Output: locked region
*/

{
remove region from linked list of free regions;
assign region type;
assign region inode pointer;
if (inode pointer not null)
increment inode reference count;
place region on linked list of active regions;
return (locked region);
}
We check if the inode pointer is not null because there are a few exceptions where a region is not
associated with an inode.

Attaching a Region to a Process

The kernel attaches a region to a process's address space with the algorithm attachreg, which is used in the fork, exec, and shmat system calls. The region being attached might be newly allocated or might be an already allocated region that is to be shared.

The algorithm is given below:

/* Algorithm: attachreg
* Input: pointer to (locked) region being attached
* process to which the region is being attached
* virtual address in process where region will be attached
* region type
* Output: per process region table entry
*/

{
allocate per process region table entry for process;
initialize per process region table entry;
set pointer to region being attached;
set type field;
set virtual address field;
check legality of virtual address, region size;
increment region reference count;
increment process size according to attached region;
initialize new hardware register triple for process;
return (per process region table entry);
}

Changing the Size of a Region

The stack region of a process grows and shrinks automatically according to the depth of nesting of calls; the kernel uses the algorithm growreg for this. The data region of a process can be extended with the sbrk system call, which also calls growreg internally. Both of these regions are private to a process. Shared regions cannot be extended, so growreg does not have to worry about side effects on other processes sharing the region. Text regions likewise cannot grow.

The algorithm growreg is given below:

/* Algorithm: growreg
* Input: pointer to per process region table entry
* change in size of region (positive or negative)
* Output: none
*/

{
if (region size increasing)
{
check legality of new region size;
allocate auxiliary tables (page tables);
if (not system supporting demand paging)
{
allocate physical memory;
initialize auxiliary tables, as necessary;
}
}
else // region size decreasing
{
free physical memory, as appropriate;
free auxiliary tables, as appropriate;
}

do (other) initialization of auxiliary tables, as necessary;


set size field in process table;
}

Loading a Region

In a system supporting demand paging, the kernel can "map" a file into process address space
during the exec system call, arranging to read individual physical pages later on demand (studied
later). If the kernel does not support demand paging, it must copy the executable file into
memory, loading the process regions at virtual addresses specified in the executable file. It may
attach a region at a different virtual address from where it loads the contents of the file, creating
a gap in the page table. For example, this feature is used to cause memory faults when user
programs access address 0 illegally. Programs with pointer variables sometimes use them
erroneously without checking that their value is 0 and, hence, that they are illegal for use as a
pointer reference. By protecting the page containing address 0 appropriately, processes that
errantly access address 0 incur a fault and abort, allowing programmers to discover such bugs
more quickly.
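
For illustration, the following tiny program contains exactly the kind of bug this protection catches; because the page containing address 0 is protected, the erroneous access faults and the process aborts instead of silently reading garbage:

int main(void)
{
    char *p = 0;    /* pointer that was never assigned a valid address */
    return *p;      /* access to address 0: the process faults and aborts */
}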

The algorithm loadreg is given below:

/* Algorithm: loadreg
* Input: pointer to per process region table entry
* virtual address to load region
* inode pointer of file for loading region
* byte offset in file for start of region
* byte count for amount of data to load
* Output: none
*/

{
increase region size according to eventual size of region (algorithm: growreg);
mark region state: being loaded into memory;
unlock region;
set up u-area parameters for reading file:
target virtual address where data is read to,
start offset value for reading file,
count of bytes to read from file;
read file into region (internal variant of read algorithm);
lock region;
mark region state: completely loaded into memory;
awaken all processes waiting for region to be loaded;
}
For example, suppose the kernel wants to load text of size 7K bytes into a region that is attached at virtual address 0 of a process, but wants to leave a gap of 1K bytes at the beginning of the region. By this time, the kernel will have allocated a region table entry and will have attached the region at address 0 using the algorithms allocreg and attachreg. Now it invokes loadreg, which invokes growreg twice -- first, to account for the 1K byte gap at the beginning of the region, and second, to allocate storage for the contents of the region -- and growreg allocates a page table for the region. The kernel then sets up fields in the u-area to read the file: it reads 7K bytes from a specified byte offset in the file (supplied as a parameter by the kernel) into virtual address 1K of the process. This is shown in the following diagram:

Freeing a Region

When the kernel no longer needs a region, it frees the region and places it on the free list again.
The algorithm freereg is given below:

/* Algorithm: freereg
* Input: pointer to a (locked) region
* Output: none
*/

{
if (region reference count non zero)
{
// some process still using region
release region lock;
if (region has an associated inode)
release inode lock;
return;
}
if (region has associated inode)
release inode (algorithm: iput);
free physical memory still associated with region;
free auxiliary tables associated with region;
clear region fields;
place region on region free list;
unlock region;
}

Detaching a Region from a Process

The kernel detaches regions in the exec, exit, and shmdt system calls. It updates the pregion entry
and cuts the connection to physical memory by invalidating the associated memory management
register triple. The address translation mechanisms thus invalidated apply specifically to
the process, not to the region (as in the algorithm freereg). The kernel decrements the region
reference count. If the region reference count drops to 0 and there is no reason to keep the
region in memory (studied later), the kernel frees the region with algorithm freereg. Otherwise, it
only releases the region and inode locks.

The algorithm detachreg is given below:

/* Algorithm: detachreg
* Input: pointer to per process region table entry
* Output: none
*/

{
get auxiliary memory management tables for process, release as appropriate;
decrement process size;
decrement region reference count;
if (region reference count is 0 and region sticky bit not set) // studied later
free region (algorithm: freereg);
else // either reference count non-0 or region sticky bit on
{
free inode lock, if applicable (inode associated with region);
free region lock;
}
}

Duplicating a Region
In the fork system call, the kernel needs to duplicate the regions of a process. If the region is
shared, the kernel just increments the reference count of the region. If it is not shared, the kernel
has to physically copy it, so it allocates a new region table entry, page table, and physical
memory for the region. The algorithm dupreg is given below:

/* Algorithm: dupreg
* Input: pointer to region table entry
* Output: pointer to a region that looks identical to input region
*/

{
if (region type shared)
// caller will increment region reference count with subsequent attachreg call
return (input region pointer);
allocate new region (algorithm: allocreg);
set up auxiliary memory management structures, as currently exist in input region;
allocate physical memory region contents;
"copy" region contents from input region to newly allocated region;
return (pointer to allocated region);
}
The state of regions when process A forks process B:

Sleep

Processes sleep inside system calls while awaiting a particular resource, or they sleep when a page fault occurs. In such cases, they push a context layer and do a context switch. The context layers of a sleeping process are shown below:
Sleep Event and Addresses

As seen previously, processes sleep on a particular event (for example, the unlocking of a buffer). When it is said that processes sleep on an event, they actually sleep on an address. This mechanism has a couple of disadvantages. If there are many processes sleeping on a buffer to become unlocked, the kernel wakes up all of those processes when the buffer gets unlocked, even though many of them will go back to sleep after entering kernel mode.

Another disadvantage is that multiple events can map to a single address, as shown in the figure below:
Many processes are waiting on address addr A: some are waiting for the buffer to become unlocked, while another is awaiting I/O completion. When either of these two events occurs, all of the processes are woken up, because they are sleeping on the same address. It would be better if there were a one-to-one mapping between events and addresses, but in practice such clashes are rare and system performance is not affected.

Algorithms for Sleep and Wakeup

Algorithm for sleep is given below:

/* Algorithm: sleep
* Input: sleep address
* priority
* Output: 1 if a process awakened as a result of a signal that process catches,
* longjmp if the process is awakened as a result of a signal it does not catch,
* 0 otherwise
*/

{
raise processor execution level to block all the interrupts;
set process state to sleep;
put process on sleep hash queue, based on sleep address;
save sleep address in process table slot;
set process priority level to input priority;
if (process sleep is not interruptible)
{
do context switch;
// process resumes execution here when it wakes up
reset processor priority level to allow interrupts as when process went to
sleep;
return (0);
}

// here, process is interruptible by signals


if (no signal pending against process)
{
do context switch;
// process resumes execution here when it wakes up
if (no signal pending against process)
{
reset processor priority level to allow interrupts as when process
went to sleep;
return (0);
}
}
remove process from sleep hash queue, if still there;
reset processor priority level to what it was when the process went to sleep;
if (process sleep priority set to catch signals)
return 1;
do longjump algorithm;
}
When the kernel raises the processor execution level, it stores the old level so that it can be restored when the process wakes up. The kernel saves the sleep address and the sleep priority in the process table.

The algorithm for wakeup is given below:

/* Algorithm: wakeup
* Input: sleep address
* Output: none
*/

{
raise processor execution level to block all the interrupts;
find sleep hash queue for sleep address;
for (every process asleep on sleep address)
{
remove process from hash queue;
mark process as "ready to run";
put process on scheduler list of processes ready to run;
clear field in process table entry for sleep address;
if (process not loaded in memory)
wake up swapper process (0);
else if (awakened process is more eligible to run than currently running
process)
set scheduler flag;
}
restore processor execution level to original level;
}
If a process that is woken up is not loaded in memory, the kernel wakes up the swapper process to swap the process into memory. Otherwise, if the awakened process is more eligible to run than the currently executing process, the kernel sets a scheduler flag so that the scheduling algorithm is run when the current process returns to user mode.

wakeup does not cause the process to be scheduled immediately; it only makes the process
eligible to get scheduled.

Processes usually sleep on events that are guaranteed to occur, such as the completion of I/O on a buffer, or an inode or buffer becoming unlocked. But processes might also sleep on events that are not guaranteed to happen. In such cases, there must be a mechanism through which processes can wake up and regain control. The kernel can send a signal to such processes (signals are studied later) and wake them up. A signal can be sent selectively to a process; as a result, the process wakes up and can recognize that a signal has been received.
If the sleep priority is above a threshold value, a process will not wake up on receiving a signal, but will sleep until the event occurs. But if the priority is below the threshold value, it will wake up immediately on receipt of the signal.

If a signal has already arrived when a process enters the sleep algorithm, the process will go to sleep if the sleep priority is above the threshold value; but if the priority is below the threshold, it will not sleep at all and will respond to the signal as if it had arrived while it was sleeping. If it had gone to sleep, the signal might never arrive again and the process might never wake up.

If a process wakes up (or never sleeps as described above) on reception of a signal, the process
may do a longjmp depending on the event that it had slept on. The kernel does a longjmp if, after
reception of a signal, there is no way to complete the system call. For instance, if a process
is reading from a terminal and that terminal is switched off, there is no way for read to complete
and it should return with an error. This holds for all system calls that can be interrupted while
they are in sleep. In the algorithm syscall, the kernel saves the previous context using setjmp in
anticipation of the need for a later longjmp.

There are situations where the kernel wants to wake up on receipt of a signal but does not want to do a longjmp. The kernel invokes the sleep algorithm with a special priority parameter that suppresses execution of the longjmp and causes the sleep algorithm to return the value 1. The purpose is to allow the kernel to clean up local data structures. For example, a device driver may allocate private data structures and then sleep at an interruptible priority; if it is awakened by a signal, it can free the allocated data structures before, if necessary, doing the longjmp itself.

Write and explain algorithm for allocating a Region

Algorithm for Allocating a Region:

1. Input:
 The algorithm takes as input the size of the region to be allocated (size) and any
additional parameters or constraints relevant to the memory management system.
2. Search for Free Space:
 The first step is to search for a suitable region of memory that can accommodate
the requested size.
 This can involve traversing through memory data structures such as free lists,
memory maps, or other data structures maintained by the memory manager.
 The algorithm should consider fragmentation, memory alignment requirements,
and any other constraints specified by the memory management policy.
3. Select a Region:
 Once a suitable free space is identified, the algorithm selects the region for
allocation.
 The selection criteria may vary depending on the memory management policy.
For example, it may prioritize minimizing fragmentation, maximizing contiguous
allocation, or considering other factors such as memory locality.
4. Allocate Region:
 The selected region is then marked as allocated in the memory management data
structures (see the sketch after this list).
 This typically involves updating metadata associated with the memory region to
indicate that it is now in use.
 If necessary, adjustments may be made to account for alignment requirements or
other constraints.
5. Return Pointer to Allocated Region:
 Finally, the algorithm returns a pointer or reference to the allocated region, which
can be used by the requesting process to access the allocated memory.
6. Error Handling:
 If a suitable region cannot be found (e.g., due to insufficient free space), the
algorithm may return an error code or NULL pointer to indicate failure.
 Error handling mechanisms should be in place to handle such scenarios gracefully
and prevent system instability or crashes.
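
The steps above can be illustrated with a minimal first-fit sketch in C. The free_block structure, free_list, and allocate_region are assumptions made for illustration only; the sketch just demonstrates the search, select, mark, and return flow of steps 2 to 6.

#include <stddef.h>

struct free_block {
    size_t             size;       /* size of this block of memory */
    int                allocated;  /* 0 = free, 1 = in use */
    struct free_block *next;       /* next block on the list */
};

static struct free_block *free_list;   /* head of the memory manager's list */

struct free_block *allocate_region(size_t size)
{
    struct free_block *blk;

    /* steps 2-3: search the list for the first free block that is large enough */
    for (blk = free_list; blk != NULL; blk = blk->next) {
        if (!blk->allocated && blk->size >= size) {
            blk->allocated = 1;    /* step 4: mark the region as allocated */
            return blk;            /* step 5: return it to the requester */
        }
    }
    return NULL;                   /* step 6: no suitable region was found */
}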

Explanation:

 Efficiency: The algorithm aims to efficiently allocate memory regions while adhering to
the constraints and policies defined by the memory management system.
 Optimization: Depending on the system requirements, the algorithm may employ
various optimizations such as best-fit, first-fit, or next-fit strategies to minimize
fragmentation and improve memory utilization.
 Concurrency: In multi-threaded or multi-process environments, the algorithm should
ensure thread safety and prevent race conditions by employing appropriate
synchronization mechanisms.
 Scalability: The algorithm should be scalable to handle a large number of allocation
requests efficiently without significant performance degradation.
 Flexibility: The algorithm should be flexible enough to accommodate different memory
management policies and adapt to varying workload characteristics.

Explain with diagram concept of Process in detail.

The main duty of the Operating System is to complete the given processes within the stipulated time. The term process is therefore central to the study of Operating Systems. Let us now look at the term process in detail.

Definition of Process
Basically, a process is a program in execution.

An active program currently running on the Operating System is known as a process. The process is the basis of all computing activity. Although a process is closely related to the program code, it is not the same as the code itself. A process is an "active" entity, in contrast to the program, which is sometimes thought of as a "passive" entity. The properties that a process holds include the state of the hardware, the RAM, the CPU, and other attributes.

Process in an Operating System


A process is actively running software or computer code. It must be carried out in a specific order. A process is the entity that represents the fundamental unit of work to be implemented in any system.

In other words, we create computer programs as text files that, when executed, create
processes that carry out all of the tasks listed in the program.

When a program is loaded into memory, it may be divided into the four components
stack, heap, text, and data to form a process. The simplified depiction of a process in the
main memory is shown in the diagram below.

Stack
The process stack stores temporary information such as method or function arguments,
the return address, and local variables.

Heap

This is the memory where a process is dynamically allotted while it is running.

Text
This section contains the compiled program code. The current activity is represented by the value of the program counter and the contents of the processor's registers.

Data

This section contains the global and static variables.

Program
A program is a set of instructions which, when executed, complete a certain task. Programs are usually written in a programming language like C, C++, Python, Java, R, C# (C sharp), etc.

A computer program is a set of instructions that, when carried out by a computer, accomplishes a certain task.

Difference between Process and Program

1. Process: A process is actively running software or computer code; it must be carried out in a specific order and represents the fundamental unit of work that must be implemented in any system.
   Program: A program is a set of instructions which, when executed, complete a certain task.

2. Process: A process is dynamic in nature.
   Program: A program is static in nature.

3. Process: A process is active in nature.
   Program: A program is passive in nature.

4. Process: A process is created during execution and is loaded directly into main memory.
   Program: A program already exists and resides in secondary memory.

5. Process: A process has its own control block, known as the Process Control Block.
   Program: A program does not have any control block; it is simply invoked when specified and executes as a whole when called.

6. Process: A process changes from time to time by itself.
   Program: A program cannot change on its own; it must be changed by the programmer.

7. Process: A process needs extra data, in addition to the program data, for its management and execution.
   Program: A program is basically divided into two parts: a code part and a data part.

8. Process: Processes have significant resource demands; they require resources like memory addresses, the CPU, and I/O devices throughout their existence in the Operating System.
   Program: A program just needs memory space to store its instructions; no further resources are needed.

Process Control Block


An Operating System performs process creation, scheduling, and termination with the help of the Process Control Block. The Process Control Block (PCB), which is maintained by the Operating System, aids in managing how processes operate. Every process in the OS has a Process Control Block associated with it. A PCB keeps track of a process by storing data about it, including its state, I/O status, and CPU scheduling information.

Now, let us understand the Process Control Block with the help of the components
present in the Process Control Block.

A Process Control Block consists of :

1. Process ID
2. Process State
3. Program Counter
4. CPU Registers
5. CPU Scheduling Information
6. Accounting and Business Information
7. Memory Management Information
8. Input Output Status Information

Now, let us understand about each and every component in detail now.

1) Process ID

It is an identification number assigned to each process. It is used to uniquely identify and locate the process.

2) Process State

This field records the current state of the process. The possible states are explained below.

i) New State

A process that has just been created and is about to be taken up by the Operating System into main memory is in the New state.

ii) Ready State

The ready state, in which a process waits for the CPU to be assigned, is the first state a process enters after being created. The operating system pulls new processes from secondary memory and places them in main memory.

The term "ready state processes" refers to processes that are in main memory and are prepared for execution. Numerous processes can be in the ready state at the same time.

iii) Running State


The Operating System will select one of the processes from the ready state based on the
scheduling mechanism. As a result, if our system only has one CPU, there will only ever
be one process operating at any given moment. We can execute n processes
concurrently in the system if there are n processors.

iv) Waiting or Blocking State

Depending on the scheduling mechanism or the inherent behavior of the process, a process can go from the Running state to the Blocked or Wait state.

The OS switches a process to the block or wait state and allots the CPU to the other
processes while it waits for a specific resource to be allocated or for user input.

v) Terminated State

A process enters the termination state once it has completed its execution. The
operating system will end the process and erase the whole context of the process
(Process Control Block).

3) Program Counter

The address of the next instruction to be executed is stored in a CPU register called the program counter (PC). It is a digital counter needed both to execute tasks quickly and to keep track of the current stage of execution.

An instruction counter, instruction pointer, instruction address register, or sequence control register are other names for the program counter.

4) CPU Registers

This field holds the contents of the processor registers while the process is in the running state. Accumulators, index registers, general-purpose registers, instruction registers, and condition code registers are the main categories of CPU registers.

5) CPU Scheduling Information

A process must be scheduled for execution; this schedule determines when it transitions from the ready state to the running state. Process priority, scheduling queue pointers (which indicate the order of execution), and several other scheduling parameters are included in the CPU scheduling information.
6) Accounting and Business Information

This field includes accounting information such as CPU usage, the amount of real time used by the process, time limits, and job or process numbers.

7) Memory Management Information

The Memory Management Information section contains information such as the page tables, segment tables, and the values of the base and limit registers. The exact contents depend on the memory system used by the operating system.

8) Input Output Status Information

This section contains input/output-related information, such as the list of open files and I/O devices allocated to the process and their status.
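
Taken together, these fields can be pictured as a simplified C structure; the names and sizes below are illustrative assumptions rather than the definition used by any real operating system.

struct pcb {
    int           pid;              /* 1) process ID */
    int           state;            /* 2) new, ready, running, waiting, terminated */
    unsigned long program_counter;  /* 3) address of the next instruction */
    unsigned long registers[16];    /* 4) saved CPU registers */
    int           priority;         /* 5) CPU scheduling information */
    long          cpu_time_used;    /* 6) accounting information */
    void         *page_table;       /* 7) memory management information */
    int           open_files[20];   /* 8) I/O status information (open files) */
};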

Explain with example mapping of process virtual address to physical memory address.

Memory is one of the most important host resources. For workloads to access global system memory, virtual memory addresses must be mapped to physical addresses. Several components work together to perform these translations as efficiently as possible. The following covers the basics of how virtual memory addresses are translated.

Memory Translations
The physical address space is your system RAM, the memory modules
inside your ESXi hosts, also referred to as the global system memory.
When talking about virtual memory, we are talking about the memory
that is controlled by an operating system, or a hypervisor like vSphere
ESXi. Whenever workloads access data in memory, the system needs
to look up the physical memory address that matches the virtual
address. This is what we refer to as memory translations or mappings.

To map virtual memory addresses to physical memory addresses, page tables are used. A page table consists of numerous page table entries (PTEs).
A memory page referenced by a PTE contains data structures consisting of different sizes of 'words'; each type of word contains multiple bytes of data (WORD: 16 bits/2 bytes, DWORD: 32 bits/4 bytes, QWORD: 64 bits/8 bytes). Executing a memory translation for every possible word, or virtual memory page, into a physical memory address is not very efficient, as this could require billions of PTEs. We need PTEs to find the physical address space in the system's global memory, so there is no way around them.

To make memory translations more efficient, page tables group chunks of memory addresses in one mapping. Looking at an example of a DWORD entry of 4 bytes: a page maps 4 kilobytes instead of just the 4 bytes of data in a single entry. For example, using a page table, we can translate virtual address space 0 to 4095 and say it is found in physical address space 4096 to 8191. We no longer need to map every word separately, which makes translation far more efficient.

MMU and TLB


The page tables are managed by a Memory Management Unit (MMU).
All the physical memory references are passed through the MMU. The
MMU is responsible for the translation between virtual memory
addresses and physical memory addresses. With vSphere ESXi, a
virtual machine’s vCPU will call out to MMU functionality by the Virtual
Machine Monitor (VMM) process, or a hardware MMU supported by a
vendor specific CPU offloading instruction.
The Memory Management Unit (MMU) works with the Translation
Lookaside Buffer (TLB) to map the virtual memory addresses to the
physical memory layer. The page table always resides in physical
memory, and having to look up the memory pages directly in physical
memory, can be a costly exercise for the MMU as it introduces
latency. That is where the TLB comes into play.

TLB in Detail
The TLB acts as a cache for the MMU that is used to reduce the time
taken to access physical memory. The TLB is a part of the MMU.
Depending on the make and model of the CPU, there is more than one TLB, or even multiple levels of TLB (as with memory caches), to avoid TLB misses and keep memory latency as low as possible.

In essence, the TLB stores recent memory translations of virtual to


physical. It is a cache for page tables. Because it is part of the MMU,
the TLB lives inside the CPU package. This is why the TLB is faster
than main memory, which is where the page tables exist. Typical access times for a TLB are around 10 ns, whereas main memory access times are around 100 ns.

Now that we covered the basics on memory translation, let’s take a


look at some example scenarios for the TLB.

TLB hit

A virtual memory address comes in, and needs to be translated to the


physical address. The first step is always to dissect the virtual
address into a virtual page number, and the page offset. The offset
consists of the last bits of the virtual address. The offset bits are not
translated and passed through to the physical memory address. The
offset contains enough bits to represent every memory address within a page.

So, the offset is directly mapped to the physical memory layer, and the
virtual page number matches a tag already in the TLB. The MMU now
immediately knows what physical memory page to access without the
need to look into the global memory.

In the example provided in the above diagram, the virtual page number
is found in the TLB, and immediately translated to the physical page
number.

1. The virtual address is dissected in the virtual page number and


the page offset.
2. The page offset is passed through as it is not translated.
3. The virtual page number is looked up in the TLB, looking for a tag
with the corresponding number.
4. There is an entry in the TLB (hit), meaning we immediately can
translate the virtual to the physical address.

TLB miss

What happens when a virtual page number is not found in the TLB, also
referred to as a TLB miss? The TLB needs to consult the system’s
global memory to understand what physical page number is used.
Reaching out to physical memory means higher latency compared to a
TLB hit. If the TLB is full and a TLB miss occurs, the least recent TLB
entry is flushed, and the new entry is placed instead of it. In the
following example, the virtual page number is not found in the TLB,
and the TLB needs to look into memory to get the page number.

1. The virtual address is dissected in the virtual page number and


the page offset.
2. The page offset is passed through as it is not translated.
3. The virtual page number is looked up in the TLB, looking for a tag
with a corresponding number. In this example, the TLB does not
yet have a valid entry.
4. TLB reaches out to memory to find page number 3 (because of
the tag, derived from the virtual page number). Page number 3 is
retrieved in memory with value 0x0006.
5. The memory translation is done and the entry is now cached in
the TLB.
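
The dissection and lookup steps above can be shown with a toy sketch, assuming 4 KB pages and a tiny software-managed TLB. Real TLBs are hardware structures inside the MMU, so the code is only a conceptual model; the array sizes and the replacement policy are arbitrary assumptions.

#include <stdint.h>

#define PAGE_SHIFT       12                       /* 4 KB pages */
#define PAGE_OFFSET_MASK ((1u << PAGE_SHIFT) - 1)
#define TLB_ENTRIES      8

struct tlb_entry { uint32_t vpn; uint32_t pfn; int valid; };

static struct tlb_entry tlb[TLB_ENTRIES];
static uint32_t page_table[1024];                 /* vpn -> pfn, simplified */

uint32_t translate(uint32_t vaddr)
{
    uint32_t vpn    = vaddr >> PAGE_SHIFT;        /* 1) virtual page number */
    uint32_t offset = vaddr & PAGE_OFFSET_MASK;   /* 2) offset passes through */
    int i;

    for (i = 0; i < TLB_ENTRIES; i++)             /* 3) look the tag up in the TLB */
        if (tlb[i].valid && tlb[i].vpn == vpn)
            return (tlb[i].pfn << PAGE_SHIFT) | offset;   /* TLB hit */

    /* TLB miss: 4) walk the page table in memory, 5) cache the translation */
    uint32_t pfn = page_table[vpn & 1023];        /* only 1024 pages modeled here */
    tlb[0].vpn = vpn;
    tlb[0].pfn = pfn;
    tlb[0].valid = 1;                             /* naive replacement policy */
    return (pfn << PAGE_SHIFT) | offset;
}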

Retrieve from storage

A TLB miss is not ideal, but the worst-case scenario is data that is not
residing in memory but on storage media (flash or disk). Where we are
talking nanoseconds to retrieve data in caches or global memory,
getting data from storage media will quickly run into milliseconds or
seconds depending on the media used.
1. The virtual address is dissected in the virtual page number and
the page offset.
2. The page offset is passed through as it is not translated.
3. The virtual page number is looked up in the TLB, looking for a tag
with a corresponding number. In this example, the TLB does not
yet have a valid entry.
4. TLB reaches out to memory to find page number 0 (because of
the tag, derived from the virtual page number). Page number 0 is
retrieved in memory, but it turns out that the data does not reside in memory, but on storage. A page fault is triggered, because we
cannot translate memory pages for data that is not in memory.
We need to wait for the data from storage.

List and explain fields of process table.


What is U area? List and explain fields from U area.

The U Area

Although every process has a u-area, the kernel accesses u-areas through its u variable, and it needs to access only one u-area at a time: that of the currently executing process. The kernel knows where the page table entry of the u-area is located; therefore, when a process is scheduled, the physical address of its u-area is loaded into the kernel page tables.

Example:
The first two register triples point to text and data, and the third triple refers to the u-area of the currently executing process (in this case, process D). When a context switch happens, the entry in this field changes and points to the u-area of the newly scheduled process. Entries 1 and 2 do not change, as all the processes share the kernel text and data.

In Unix-like operating systems, the U area, also known as the "User Area" or "User Structure," is
a data structure maintained by the kernel for each running process. The U area contains various
fields and information related to the process, its execution context, and its interaction with the
operating system. While the exact structure of the U area can vary between different Unix
variants, here are some common fields found in the U area:

1. Process ID (PID):
 The unique identifier assigned to the process by the kernel. It is used to
distinguish between different processes in the system.
2. Process State:
 Indicates the current state of the process, such as "Running," "Sleeping,"
"Waiting," or "Zombie." This field helps the kernel manage process scheduling
and resource allocation.
3. User ID (UID) and Group ID (GID):
 The user and group identifiers associated with the process. These fields determine
the permissions and access rights granted to the process when interacting with
system resources.
4. File Descriptor Table:
 An array or table containing entries for open files, sockets, and other I/O
resources associated with the process. Each entry in the table contains metadata
about the opened resource, such as its file descriptor number, file mode, and file
position.
5. Signal Handling Information:
 Information about the process's signal handlers, including registered signal
handlers, pending signals, and signal mask (which signals are currently blocked
from delivery).
6. Resource Usage Information:
 Statistics and accounting information related to the process's resource usage, such
as CPU time, memory usage, number of system calls executed, and number of
page faults.
7. Process Priority and Scheduling Information:
 Parameters and attributes used by the kernel for scheduling the process, such as
scheduling priority, scheduling policy (e.g., real-time or time-sharing), and
scheduling class.
8. Kernel Stack:
 A region of memory allocated by the kernel to store the process's execution stack.
This stack is used for storing local variables, function call frames, and other
execution context during the execution of system calls and kernel routines.
9. Process Environment:
 Environment variables and settings inherited by the process from its parent
process. These variables influence the behavior of the process and the execution
of user-level programs.
10. Process Credentials:
 Information about the process's credentials, including its effective user ID, real
user ID, effective group ID, and supplementary group IDs. These credentials
determine the process's permissions and access rights when interacting with
system resources.

These are some of the common fields found in the U area of a Unix-like operating system. The U
area provides the kernel with essential information about each process, allowing it to manage
process execution, resource allocation, and system interactions effectively.
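
A simplified sketch covering several of the fields listed above is shown below; the field names and sizes are illustrative assumptions and do not correspond to the u-area definition of any specific UNIX variant.

struct file;                          /* open-file structure, defined elsewhere */

struct user {
    int          u_error;             /* error code of the current system call */
    int          u_uid, u_gid;        /* user and group credentials */
    struct file *u_ofile[20];         /* user file descriptor table */
    void       (*u_signal[32])(int);  /* signal handlers and dispositions */
    long         u_utime, u_stime;    /* resource usage: user and system CPU time */
    int          u_priority;          /* scheduling information */
    char        *u_environment;       /* inherited environment settings */
    char         u_stack[4096];       /* kernel stack for the process */
};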

Discuss mapping between per process region table and page table.

In operating systems that use virtual memory, both per-process region tables and page
tables play crucial roles in managing memory. Let's discuss each of them separately and
then explore how they are mapped to each other.

1. Per-Process Region Table:


 The per-process region table (also known as the process control block or
PCB) is a data structure used by the operating system to manage
information about each individual process.
 It typically contains information such as process ID, program counter, stack
pointer, CPU scheduling information, memory allocation details, and more.
 One important aspect of the per-process region table is its role in
managing the virtual memory space of a process. It holds information
about the different regions of memory allocated to the process, such as
code segment, data segment, heap, and stack.
 Each entry in the per-process region table describes a contiguous region
of the process's virtual address space and its corresponding attributes
(e.g., permissions, size, location).
2. Page Table:
 The page table is a data structure used by the memory management unit
(MMU) to translate virtual addresses generated by the CPU into physical
addresses in main memory.
 It provides the mapping between virtual pages (chunks of virtual memory)
and physical frames (blocks of physical memory).
 The page table is typically implemented as a hierarchical data structure to
efficiently manage large address spaces. Common implementations
include multilevel page tables or inverted page tables.
 Each entry in the page table contains information about a virtual page,
such as its corresponding physical frame number and any associated
metadata (e.g., page permissions, dirty bit).

Now, let's discuss the mapping between the per-process region table and the page
table:

 The per-process region table describes the entire virtual address space of a
process and the attributes of each region within that address space.
 The page table, on the other hand, provides the mapping between virtual pages
and physical frames.
 The mapping between these two tables is established during the process of
address translation, which occurs when the CPU generates a virtual address and
needs to map it to a physical address.
 When a process attempts to access memory, the MMU uses the virtual address to
look up the corresponding entry in the page table to find the physical frame
where the desired memory resides.
 However, before the MMU can perform this translation, it needs to know which
page table to use for the current process. This information is typically obtained
from the per-process region table, which stores a reference to the page table
associated with the process.
 In essence, the per-process region table points to the page table(s) that are
relevant for translating virtual addresses of that particular process.
 Therefore, the per-process region table and the page table are linked indirectly,
with the former guiding the MMU to the appropriate page table for address
translation.
Overall, the per-process region table and the page table are essential components of
memory management in modern operating systems, and their coordination ensures
efficient and secure memory access for processes.
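
A minimal sketch of this indirection is shown below, assuming 4 KB pages. The pregion and page_table structures and the translate_vaddr function are illustrative assumptions meant only to show how a per-process region entry leads the kernel (or MMU) to the page table used for translating addresses that fall inside that region.

#include <stdint.h>

struct page_table {
    uint32_t pfn[1024];                /* page number within region -> physical frame */
};

struct pregion {
    uint32_t           base;           /* starting virtual address of the region */
    uint32_t           size;           /* size of the region in bytes */
    struct page_table *pt;             /* page table describing this region */
};

/* Translate a virtual address by first finding the region that contains it in
 * the per-process region table, then indexing that region's page table. */
int translate_vaddr(struct pregion *rt, int nregions, uint32_t vaddr, uint32_t *paddr)
{
    int i;
    for (i = 0; i < nregions; i++) {
        if (vaddr >= rt[i].base && vaddr < rt[i].base + rt[i].size) {
            uint32_t vpn = (vaddr - rt[i].base) >> 12;   /* page within the region */
            uint32_t off = vaddr & 0xFFF;                /* offset within the page */
            *paddr = (rt[i].pt->pfn[vpn] << 12) | off;
            return 0;                                    /* translation succeeded */
        }
    }
    return -1;   /* address lies in no region: a memory fault would be raised */
}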

Process Control and Scheduling


Explain the algorithm for exit() system call.
Explain different functions of clock interrupt handler.
The clock interrupt handler, also known as the timer interrupt handler, is a crucial component of
an operating system's kernel responsible for managing the system's timer hardware and
responding to periodic timer interrupts generated by the system's hardware clock. The clock
interrupt handler performs various functions essential for system operation and process
scheduling. Here are some of the key functions of the clock interrupt handler:

1. Maintaining System Time:


 One of the primary functions of the clock interrupt handler is to maintain the
system's current time. It does this by incrementing a system-wide timekeeping
variable or counter each time a timer interrupt occurs.
2. Process Scheduling:
 The clock interrupt handler plays a vital role in process scheduling by periodically
preempting the currently running process and allowing the kernel scheduler to
select a new process for execution. When a timer interrupt occurs, the currently
running process is interrupted, and the kernel's scheduler is invoked to determine
which process should run next.
3. Time Quantum Management:
 Many operating systems use time-sharing scheduling algorithms, such as round-
robin or priority-based scheduling. In these systems, the clock interrupt handler is
responsible for enforcing time quantum limits for each process. When a process
exhausts its allotted time quantum, the clock interrupt handler preemptively
switches to another process.
4. Timer Management:
 The clock interrupt handler interacts with the system's timer hardware to
configure and manage timer interrupts. It sets the timer's interval and handles
interrupt requests generated by the hardware clock at regular intervals.
5. Timeout and Alarm Mechanisms:
 Operating systems often provide mechanisms for setting timeouts and alarms,
allowing processes to perform actions after a specified period or at a specific time.
The clock interrupt handler is responsible for managing these timeouts and
generating appropriate signals or events when they expire.
6. Performance Monitoring:
 Some operating systems use timer interrupts for performance monitoring and
profiling purposes. The clock interrupt handler may collect performance statistics,
such as CPU usage, interrupt frequency, and context switch counts, for use in
system monitoring and analysis tools.
7. Real-Time Task Scheduling:
 In real-time operating systems (RTOS), where deterministic response times are
critical, the clock interrupt handler ensures timely execution of real-time tasks by
preempting non-real-time tasks and ensuring that high-priority tasks are scheduled
promptly.
8. Power Management:
 The clock interrupt handler may also play a role in power management by
periodically waking the system from low-power states (e.g., sleep or idle) to
perform essential tasks or by adjusting system clock frequencies to save power.

Overall, the clock interrupt handler is a fundamental component of the operating system's kernel,
responsible for managing timekeeping, process scheduling, and system responsiveness. Its
efficient operation is crucial for the overall performance and stability of the system.
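
As a rough illustration of several of these duties, a highly simplified clock interrupt handler might look like the sketch below. The proc structure, current, need_resched, and run_callouts are illustrative assumptions; a real handler does much more and runs with interrupts carefully managed.

struct proc {
    long cpu_time;        /* accumulated CPU time, for accounting */
    int  time_quantum;    /* remaining ticks in the current time slice */
};

volatile unsigned long ticks;     /* system time, in clock ticks since boot */
struct proc *current;             /* the currently running process */
int need_resched;                 /* flag asking the scheduler to run */

static void run_callouts(void)
{
    /* scan the callout/timeout table and fire expired entries (omitted) */
}

void clock_interrupt_handler(void)
{
    ticks++;                            /* 1) maintain the system time */
    current->cpu_time++;                /* 6) per-process accounting */
    if (--current->time_quantum <= 0)   /* 3) enforce the time quantum */
        need_resched = 1;               /* 2) request a reschedule on return */
    run_callouts();                     /* 5) handle expired timeouts and alarms */
}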

Explain system calls for time.

What is the use of fork() system call? Explain the sequence of operations the kernel executes for fork.

What is the use of signal? Explain types of signals.

In the Linux operating system, signals are a fundamental mechanism for inter-process
communication and for handling various asynchronous events. They allow the kernel, other
processes, or even the terminal to interrupt a running process and notify it of specific events or
conditions. Signals play a vital role in process management, error handling, and user interaction.
Here's an explanation of their uses and types in Linux:

Uses of Signals in Linux:

1. Process Termination:
 Signals can be used to request the termination of a process. For example, the
SIGTERM signal is often sent to gracefully terminate a process, allowing it to
perform cleanup operations before exiting.
 Additionally, the SIGKILL signal can be used to forcefully terminate a process
without giving it a chance to clean up.
2. Handling Errors:
 Signals are used to indicate errors or exceptional conditions that occur during
program execution. For instance, the SIGSEGV signal indicates a segmentation
fault, usually caused by invalid memory access.
 Other signals like SIGFPE (floating-point exception) and SIGILL (illegal
instruction) signify arithmetic errors and invalid machine instructions,
respectively.
3. User Interaction:
 Signals facilitate communication between the user and running processes. For
example, the SIGINT signal (typically generated by pressing Ctrl+C) is sent to
interrupt a process and request its termination.
 Similarly, the SIGTSTP and SIGCONT signals are used to suspend and resume
processes, respectively, allowing users to pause and resume background
processes.
4. Asynchronous Events:
 Signals are used to handle various asynchronous events, such as hardware
interrupts, timer expirations, or I/O events. For instance, the SIGCHLD signal is
sent to a parent process when a child process terminates, allowing the parent to
perform cleanup tasks.
 Signals like SIGALRM (alarm) can be used to set timers and receive
notifications when the timer expires.

Types of Signals in Linux:

1. Standard Signals:
 Linux defines a set of standard signals (denoted by numbers) with specific default
actions associated with them.
 Examples include SIGTERM (terminate), SIGINT (interrupt), SIGSEGV
(segmentation fault), SIGILL (illegal instruction), and SIGALRM (alarm).
2. Real-time Signals:
 Linux also supports real-time signals, which are used for high-priority
asynchronous event notification.
 Real-time signals have numbers in the range of SIGRTMIN to SIGRTMAX,
allowing for greater flexibility and customization compared to standard signals.
3. User-defined Signals:
 Applications can define and use their own signals for specific purposes.
 User-defined signals typically use numbers in the range SIGUSR1 to SIGUSR2
and provide a means for applications to define custom signal handlers.

In summary, signals in Linux are a versatile mechanism for process management, error handling,
and user interaction. Understanding how signals work and how to handle them is essential for
developing robust and responsive software applications in Linux.
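
As a small user-level example, the standard POSIX sigaction interface can be used to install a handler and then wait for a signal such as SIGINT (Ctrl+C) or SIGALRM (requested with alarm):

#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t got_signal = 0;

static void handler(int signo)
{
    got_signal = signo;                /* async-signal-safe: just record the signal */
}

int main(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = handler;

    sigaction(SIGINT,  &sa, NULL);     /* catch Ctrl+C from the terminal */
    sigaction(SIGALRM, &sa, NULL);     /* catch the alarm signal */

    alarm(2);                          /* ask the kernel for SIGALRM in 2 seconds */
    pause();                           /* sleep until any signal is delivered */

    printf("caught signal %d\n", (int) got_signal);
    return 0;
}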

Explain system boot and the init process.

Two related things are involved here: systemd (a modern system and service manager) and the traditional init process. Each is explained below:

1. systemd: Systemd is a system and service manager for Unix-like operating systems. It is a
replacement for the traditional SysV init system and Upstart. Systemd is responsible for
managing the system's services, daemons, devices, and other aspects of the system's operation.
Key features of systemd include:

 Service management: Systemd manages system services, allowing administrators to start,


stop, enable, disable, and monitor services using simple commands (systemctl).
 Dependency management: Systemd handles service dependencies, ensuring that services
start and stop in the correct order.
 Parallel startup: Systemd can start services in parallel, improving boot times and system
responsiveness.
 Logging: Systemd includes a centralized logging system (systemd-journald) for
collecting, storing, and analyzing system logs.
 Resource management: Systemd can control resource limits, cgroups, and other system
resources to ensure fair allocation and prevent resource exhaustion.
 Socket activation: Systemd supports socket-based activation, allowing services to be
started on-demand when incoming connections are received.

Systemd is widely adopted in modern Linux distributions, such as Fedora, Red Hat Enterprise
Linux, CentOS, Ubuntu, and Debian.

2. init process: The init process (short for initialization) is the first process started by the kernel
during the boot process of Unix-like operating systems. Its primary role is to initialize the system
and start other essential processes and services. The init process typically has process ID (PID) 1
and acts as the ancestor of all other processes in the system.

Traditionally, Unix systems used System V init (SysV init) as the init system. SysV init follows a
sequential boot process, where it executes a series of startup scripts (init scripts) located in
specific directories (e.g., /etc/init.d/) to bring the system to a functional state.

While SysV init served as the standard init system for many years, it has limitations, such as
slow boot times and lack of parallelism. As a result, newer init systems like systemd have been
developed to address these shortcomings.

In summary, systemd is a modern system and service manager widely used in Unix-like
operating systems, while the init process (usually managed by an init system like SysV init or
systemd) is responsible for initializing the system and starting essential processes during the boot
process.

Draw and explain user level and kernel level priority.


User-level threads and kernel-level threads are two different approaches to implementing
thread management in an operating system.
User-level threads are managed entirely by the application, without any involvement from the
operating system kernel. The application manages the creation, scheduling, and
synchronization of threads, using libraries such as pthreads or WinThreads. User-level threads
are generally faster to create and switch between than kernel-level threads, as there is no need
to switch to kernel mode or perform costly context switches.
However, user-level threads have some limitations. For example, if one thread blocks, the
entire process may block, as the operating system is unaware of the thread’s status.
Additionally, user-level threads cannot take advantage of multiple CPUs or CPU cores, as only
one user-level thread can be executing at a time.
Kernel-level threads, on the other hand, are managed by the operating system kernel. Each
thread is a separate entity with its own stack, register set, and execution context. The kernel
handles thread scheduling, synchronization, and management, allowing multiple threads to
execute simultaneously on different CPUs or cores.
Kernel-level threads are more heavyweight than user-level threads, as they require a context
switch to kernel mode for thread management operations. However, kernel-level threads
provide better isolation and control over thread execution, and can take advantage of multiple
CPUs or cores to improve performance.
In summary, user-level threads provide fast and efficient thread management, but lack some of
the benefits of kernel-level threads, such as support for multiple CPUs and better thread
isolation. Kernel-level threads provide more robust thread management, but are more
heavyweight and may be slower to create and switch between.
A task is accomplished by executing a program, which results in a process. Every task incorporates one or more subtasks, and these subtasks are carried out as functions within a program by threads. The operating system (kernel) is unaware of threads in the user space.
There are two types of threads, User level threads (ULT) and Kernel level threads (KLT).
1. User Level Threads (ULT) :
Threads in user space, created by the application developer using a thread library, each performing a unique subtask.

2. Kernel Level Threads (KLT) :
Threads in kernel space, created by the OS developer to perform unique functions of the OS, similar to an interrupt handler.

There exists a strong relationship between user level threads and kernel level threads.
Dependencies between ULT and KLT :

1. Use of Thread Library :
The thread library acts as an interface for the application developer to create a number of threads (according to the number of subtasks) and to manage those threads. This API can be implemented in kernel space or in user space. In a real-time application, the necessary thread library is implemented in user space. This reduces system calls to the kernel whenever the application needs thread creation, scheduling, or thread management activities; thus, thread creation is faster, as it requires only function calls within the process. The user address space for each thread is allocated at run time. Overall, it reduces various interface and architectural overheads, as all these functions are independent of kernel support.
2. Synchronization :
The subtasks (functions) within each task (process) can be executed concurrently or in parallel depending on the application. In that case, a single-threaded process is not suitable, and a multithreaded process is needed. A unique subtask is allocated to every thread within the process. These threads may use the same data section or different data sections. Typically, threads within the same process share the code section, data section, address space, open files, etc.

When subtasks are performed concurrently by sharing the code section, the result may be data inconsistency. This ultimately requires suitable synchronization techniques to control access to the shared data (the critical section).
In a multithreaded process, synchronization is adopted using four different models :

1. Mutex Locks – allow only one thread at a time to access the shared resource.

2. Read/Write Locks – allow exclusive writes and concurrent reads of a shared resource.

3. Counting Semaphores – the count refers to the number of units of a shared resource that can be accessed simultaneously; once the count limit is reached, the remaining threads are blocked.

4. Condition Variables – block a thread until a given condition is satisfied, without busy waiting.
All these synchronization models are carried out within each process using the thread library. The memory space for the lock variables is allocated in the user address space, so no kernel intervention is required.
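For illustration, here is a minimal sketch of the mutex model using the POSIX threads (pthreads) API; the shared counter, thread count and loop bound are arbitrary example values, not part of the original text.

/* Minimal mutex sketch: only one thread at a time updates the shared counter.
   Build (typically): cc mutex_demo.c -o mutex_demo -lpthread                  */
#include <pthread.h>
#include <stdio.h>

static int shared_counter = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);    /* enter the critical section  */
        shared_counter++;             /* access shared data safely   */
        pthread_mutex_unlock(&lock);  /* leave the critical section  */
    }
    return NULL;
}

int main(void)
{
    pthread_t t[4];
    for (int i = 0; i < 4; i++)
        pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < 4; i++)
        pthread_join(t[i], NULL);
    printf("counter = %d\n", shared_counter);   /* expected: 400000 */
    return 0;
}

Without the lock, the increments from the four threads could interleave and the final count would be unpredictable.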
3. Scheduling :
During thread creation, the application developer sets the priority and scheduling policy of each ULT using the thread library. When the program executes, the thread library schedules the threads based on the defined attributes. In this case the system scheduler has no control over thread scheduling, as the kernel is unaware of the ULTs.

4. Context Switching :
Switching from one ULT to another ULT within the same process is fast, because each thread has its own thread control block, registers and stack: only the registers are saved and restored, and no change of address space is required. The entire switch takes place within the user address space under the control of the thread library.
5. Asynchronous I/O :
After issuing an I/O request, a ULT remains in the blocked state until it receives an acknowledgment (ack). Although the I/O itself is asynchronous, this creates a synchronous environment for the application user, because the thread library schedules another ULT to execute until the blocked thread receives a signal such as sigpoll as an ack; only then does the thread library reschedule the blocked thread.
For example, consider a program that copies (reads) the content of one file and pastes (writes) it into another file, together with a pop-up that displays the percentage of progress completed. This process contains three subtasks, each allocated to a ULT:

 Thread A – reads the content from the source file and stores it in a global variable X within the process address space.

 Thread B – reads the global variable X and writes it into the destination file.

 Thread C – displays the percentage of progress done in a graphical representation.

Here, the application developer schedules the multiple flows of control within the program using the thread library. The order of execution begins with Thread A, then Thread B, then Thread C.
Thread A and Thread B share the global variable X. Thread B may read X only after Thread A has written to it, so synchronization must be adopted on the shared variable to prevent Thread B from reading stale data (a sketch follows below). Context switching from Thread A to Thread B and then Thread C takes place within the process address space; each thread saves and restores its registers in its own thread control block (TCB). Thread C remains in the blocked state until Thread B starts its first write operation on the destination file, which is why the graphical indication of 100% pops up a few seconds after the process has actually completed.
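A minimal sketch of how Threads A and B could synchronize on X with a mutex and a condition variable is shown below; the buffer size, variable names and single-slot design are assumptions made for illustration, not part of the original example.

/* Sketch: Thread A writes the shared variable X, Thread B reads it. A condition
   variable lets B sleep (no busy waiting) until A signals that X is ready.
   Build (typically): cc xsync_demo.c -o xsync_demo -lpthread                   */
#include <pthread.h>
#include <stdio.h>
#include <string.h>

static char X[64];
static int x_ready = 0;
static pthread_mutex_t m  = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cv = PTHREAD_COND_INITIALIZER;

static void *thread_a(void *arg)             /* stands in for "read source file" */
{
    (void)arg;
    pthread_mutex_lock(&m);
    strcpy(X, "data read from source file"); /* write the shared variable X      */
    x_ready = 1;
    pthread_cond_signal(&cv);                /* wake Thread B                    */
    pthread_mutex_unlock(&m);
    return NULL;
}

static void *thread_b(void *arg)             /* stands in for "write destination" */
{
    (void)arg;
    pthread_mutex_lock(&m);
    while (!x_ready)                         /* sleep until A has written X      */
        pthread_cond_wait(&cv, &m);
    printf("Thread B writes: %s\n", X);      /* read the shared variable X       */
    pthread_mutex_unlock(&m);
    return NULL;
}

int main(void)
{
    pthread_t a, b;
    pthread_create(&b, NULL, thread_b, NULL);    /* B waits for A's signal       */
    pthread_create(&a, NULL, thread_a, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return 0;
}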
Dependency between ULT and KLT :
The one major dependency between KLT and ULT arises when a ULT needs kernel resources. Every ULT is associated with a virtual processor called a lightweight process (LWP), which is created and bound to the ULT by the thread library according to the application's need. Whenever a system call is invoked, a kernel level thread is created and scheduled to the LWP by the system scheduler. These KLTs are scheduled to access the kernel resources by the system scheduler, which is unaware of the ULTs, whereas each KLT is aware of the ULTs associated with it via the LWPs.

What if the relationship does not exist?

If there is no association between KLT and ULT, then as far as the kernel is concerned every process is a single-threaded process. In that case:

1. The system scheduler may schedule a process whose threads are of low priority or idle. This leads to starvation of high-priority threads, which in turn reduces the efficiency of the system.

2. When a single thread blocks, the entire process blocks. CPU utilization, even on a multicore system, then becomes much lower: although runnable threads may exist, the kernel considers every process to be single-threaded and allocates only one core at a time.

3. The system scheduler may provide a single time slice irrespective of the number of threads within a process. Giving a single-threaded process and a process with 1000 threads the same time slice makes the system even more inefficient.

Here are some advantages and disadvantages of user-level threads and kernel-level threads:

Advantages of user-level threads:

1. Lightweight: User-level threads are lightweight and fast to create and switch between, as they do not require kernel intervention.
2. Flexible: User-level threads allow for greater flexibility in thread management, as the application can implement custom scheduling and synchronization strategies.
3. Portable: User-level threads are portable across different operating systems and hardware architectures, as they are implemented entirely in user space.

Disadvantages of user-level threads:

1. Lack of concurrency: User-level threads cannot take advantage of multiple CPUs or CPU cores, as only one thread can be executing at a time.
2. Limited system awareness: User-level threads are not visible to the operating system, which can limit their ability to interact with system resources.
3. Poor scalability: User-level threads can suffer from poor scalability when there are many threads, as the application may not have enough information to effectively manage them.

Advantages of kernel-level threads:

1. Concurrency: Kernel-level threads can take advantage of multiple CPUs or CPU cores, allowing for greater concurrency and improved performance.
2. Better resource management: Kernel-level threads are visible to the operating system, allowing for better resource management and scheduling.
3. Robustness: Kernel-level threads are more robust than user-level threads, as they are not affected by application-level issues such as blocking or crashes.

Disadvantages of kernel-level threads:

1. Overhead: Kernel-level threads are more heavyweight than user-level threads, as they require kernel intervention for management operations.
2. Complexity: Kernel-level threads can be more complex to implement and manage, as they require more coordination between the operating system and the application.
3. Less flexible: Kernel-level threads may be less flexible than user-level threads, as they are subject to the scheduling and management policies of the operating system.

Explain simple process scheduling algorithm with example.

A Process Scheduler schedules different processes to be assigned to the CPU based on particular scheduling algorithms. There are six popular process scheduling algorithms:

 First-Come, First-Served (FCFS) Scheduling
 Shortest-Job-Next (SJN) Scheduling
 Priority Scheduling
 Shortest Remaining Time
 Round Robin (RR) Scheduling
 Multiple-Level Queues Scheduling

These algorithms are either non-preemptive or preemptive. Non-preemptive algorithms are designed so that once a process enters the running state, it cannot be preempted until it completes its execution, whereas preemptive scheduling is based on priority: the scheduler may preempt a low-priority running process at any time when a high-priority process enters the ready state.

First Come First Serve (FCFS)

 Jobs are executed on a first come, first served basis.
 It is a non-preemptive scheduling algorithm.
 Easy to understand and implement.
 Its implementation is based on a FIFO queue.
 Poor in performance, as the average wait time is high.

Using the same set of processes as in the SJN example below (arrival times 0, 1, 2, 3 and execution times 5, 3, 8, 6), the wait time of each process is as follows:

Process    Wait Time : Service Time - Arrival Time
P0         0 - 0 = 0
P1         5 - 1 = 4
P2         8 - 2 = 6
P3         16 - 3 = 13

Average Wait Time: (0 + 4 + 6 + 13) / 4 = 5.75
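As a rough sketch, the following C program recomputes these FCFS waiting times; the arrival and burst arrays are taken from the process table of this example, and everything else is illustrative.

/* FCFS waiting-time calculation for the example above.
   Assumes the processes are listed in arrival order.      */
#include <stdio.h>

int main(void)
{
    int n = 4;
    int arrival[] = {0, 1, 2, 3};   /* arrival times of P0..P3     */
    int burst[]   = {5, 3, 8, 6};   /* execution (burst) times     */
    int service = 0, total_wait = 0;

    for (int i = 0; i < n; i++) {
        int wait = service - arrival[i];   /* wait = service - arrival       */
        printf("P%d wait = %d\n", i, wait);
        total_wait += wait;
        service += burst[i];               /* next job starts when this ends */
    }
    printf("Average wait = %.2f\n", (double)total_wait / n);   /* 5.75 */
    return 0;
}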

Shortest Job Next (SJN)

 This is also known as shortest job first, or SJF.
 This is a non-preemptive scheduling algorithm.
 Best approach to minimize waiting time.
 Easy to implement in batch systems where the required CPU time is known in advance.
 Impossible to implement in interactive systems where the required CPU time is not known.
 The processor should know in advance how much time the process will take.

Given: table of processes, with their arrival time and execution time.

Process    Arrival Time    Execution Time    Service Time
P0         0               5                 0
P1         1               3                 5
P2         2               8                 14
P3         3               6                 8

Waiting time of each process is as follows:

Process    Waiting Time
P0         0 - 0 = 0
P1         5 - 1 = 4
P2         14 - 2 = 12
P3         8 - 3 = 5

Average Wait Time: (0 + 4 + 12 + 5) / 4 = 21 / 4 = 5.25
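A minimal sketch of non-preemptive SJN for this table is given below; it reproduces the service and waiting times shown above (the selection loop is only illustrative, not an OS implementation).

/* Non-preemptive Shortest Job Next: at each decision point run the
   shortest job that has already arrived.                              */
#include <stdio.h>

int main(void)
{
    int n = 4;
    int arrival[] = {0, 1, 2, 3};
    int burst[]   = {5, 3, 8, 6};
    int done[4] = {0};
    int clock = 0, finished = 0, total_wait = 0;

    while (finished < n) {
        int pick = -1;
        for (int i = 0; i < n; i++)            /* shortest ready, not yet run */
            if (!done[i] && arrival[i] <= clock &&
                (pick == -1 || burst[i] < burst[pick]))
                pick = i;
        if (pick == -1) { clock++; continue; }  /* nothing ready: CPU idles   */
        int wait = clock - arrival[pick];       /* service time - arrival     */
        printf("P%d: service=%d wait=%d\n", pick, clock, wait);
        total_wait += wait;
        clock += burst[pick];
        done[pick] = 1;
        finished++;
    }
    printf("Average wait = %.2f\n", (double)total_wait / n);   /* 5.25 */
    return 0;
}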


Priority Based Scheduling

 Priority scheduling is a non-preemptive algorithm and one of the most common scheduling algorithms in batch systems.
 Each process is assigned a priority. The process with the highest priority is executed first, and so on.
 Processes with the same priority are executed on a first come, first served basis.
 Priority can be decided based on memory requirements, time requirements or any other resource requirement.

Given: table of processes, with their arrival time, execution time and priority. Here we consider 1 to be the lowest priority.

Process    Arrival Time    Execution Time    Priority    Service Time
P0         0               5                 1           0
P1         1               3                 2           11
P2         2               8                 1           14
P3         3               6                 3           5

Waiting time of each process is as follows:

Process    Waiting Time
P0         0 - 0 = 0
P1         11 - 1 = 10
P2         14 - 2 = 12
P3         5 - 3 = 2

Average Wait Time: (0 + 10 + 12 + 2) / 4 = 24 / 4 = 6
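For illustration, only the selection rule changes compared with the SJN sketch above: pick the ready process with the highest priority. For this example the priority array would be {1, 2, 1, 3}, with a larger number meaning a higher priority. The helper below is a sketch meant to be plugged into the same loop; the name and signature are assumptions.

/* Pick the ready, not-yet-finished process with the highest priority value
   (larger number = higher priority, so 1 is the lowest). Because the scan is
   in index order, ties fall back to first come, first served.               */
int pick_highest_priority(int n, const int arrival[], const int priority[],
                          const int done[], int clock)
{
    int pick = -1;
    for (int i = 0; i < n; i++)
        if (!done[i] && arrival[i] <= clock &&
            (pick == -1 || priority[i] > priority[pick]))
            pick = i;
    return pick;   /* -1 means no process has arrived yet */
}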


Shortest Remaining Time

 Shortest remaining time (SRT) is the preemptive version of the SJN algorithm.
 The processor is allocated to the job closest to completion, but it can be preempted by a newly ready job with a shorter time to completion.
 Impossible to implement in interactive systems where the required CPU time is not known.
 It is often used in batch environments where short jobs need to be given preference.

Round Robin Scheduling

 Round Robin is a preemptive process scheduling algorithm.
 Each process is provided a fixed time to execute, called a quantum.
 Once a process has executed for the given time period, it is preempted and another process executes for its time period.
 Context switching is used to save the states of preempted processes.

Using the same set of processes with a time quantum of 3, the wait time of each process is as follows:

Process    Wait Time : Service Time - Arrival Time
P0         (0 - 0) + (12 - 3) = 9
P1         (3 - 1) = 2
P2         (6 - 2) + (14 - 9) + (20 - 17) = 12
P3         (9 - 3) + (17 - 12) = 11

Average Wait Time: (9 + 2 + 12 + 11) / 4 = 8.5
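A rough simulation of this schedule is sketched below. It uses a simple circular scan of the processes with a quantum of 3 (the quantum value is inferred from the arithmetic above); a full implementation would keep a FIFO ready queue, but for this example the result is the same.

/* Round Robin simulation with quantum 3; wait = turnaround - burst. */
#include <stdio.h>

int main(void)
{
    int n = 4, quantum = 3;
    int arrival[] = {0, 1, 2, 3};
    int burst[]   = {5, 3, 8, 6};
    int remaining[4], finish[4];
    for (int i = 0; i < n; i++) remaining[i] = burst[i];

    int clock = 0, left = n;
    while (left > 0) {
        int progressed = 0;
        for (int i = 0; i < n; i++) {             /* simple circular scan */
            if (remaining[i] > 0 && arrival[i] <= clock) {
                int run = remaining[i] < quantum ? remaining[i] : quantum;
                clock += run;
                remaining[i] -= run;
                progressed = 1;
                if (remaining[i] == 0) { finish[i] = clock; left--; }
            }
        }
        if (!progressed) clock++;                 /* nothing ready: idle tick */
    }

    double total = 0;
    for (int i = 0; i < n; i++) {
        int wait = finish[i] - arrival[i] - burst[i];
        printf("P%d wait = %d\n", i, wait);
        total += wait;
    }
    printf("Average wait = %.2f\n", total / n);   /* 8.50 */
    return 0;
}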

Multiple-Level Queues Scheduling

Multiple-level queues are not an independent scheduling algorithm. They make use of other existing algorithms to group and schedule jobs with common characteristics.

 Multiple queues are maintained for processes with common characteristics.
 Each queue can have its own scheduling algorithm.
 Priorities are assigned to each queue.

For example, CPU-bound jobs can be scheduled in one queue and all I/O-bound jobs in another queue. The Process Scheduler then alternately selects jobs from each queue and assigns them to the CPU based on the algorithm assigned to the queue.

Explain how kernel prevents a process from monopolizing the use of CPU in Unix System V
In Unix System V and similar operating systems, preventing a process from monopolizing the
CPU involves implementing various scheduling policies and mechanisms to ensure fair
allocation of CPU resources among competing processes. Here's how the kernel achieves this:

1. Time Sharing:
 Unix System V implements time-sharing scheduling, where CPU time is divided
among active processes in small time slices or quantum.
 Each process is allocated a time slice during which it can execute on the CPU.
Once the time slice expires, the kernel preemptively switches to another process,
ensuring fairness among competing processes.
2. Priority-based Scheduling:
 System V Unix employs priority-based scheduling to assign priorities to processes
based on various factors such as CPU usage, I/O activity, and process
characteristics.
 Higher-priority processes are given preferential treatment in CPU allocation, but
the system ensures that lower-priority processes are not starved of CPU time.
3. Round-robin Scheduling:
 Unix System V may use round-robin scheduling as a component of its scheduling
policy. In round-robin scheduling, processes are scheduled in a circular queue,
and each process is allowed to execute for a fixed time slice before being
preempted.
 This mechanism ensures that each process gets a fair share of CPU time and
prevents any single process from monopolizing the CPU indefinitely.
4. Quantum and Preemption:
 The kernel in Unix System V sets a predefined time quantum for each process,
after which the process is preempted and placed back in the ready queue.
 This ensures that no process can monopolize the CPU for an extended period.
Even if a process is CPU-bound, it will be periodically interrupted to allow other
processes to execute.
5. Dynamic Prioritization:
 System V Unix may adjust the priority of processes dynamically based on their
behavior and resource usage. For example, processes that consume excessive
CPU time may have their priorities lowered to prevent them from monopolizing
the CPU.
6. Fairness and Scheduling Policies:
 The kernel in Unix System V implements fairness in CPU allocation by adhering to its time-sharing and priority-based scheduling policies. (For comparison, modern Linux kernels achieve the same goal with the Completely Fair Scheduler, CFS.)
 These policies ensure that all processes, regardless of their characteristics or resource demands, are treated fairly in CPU allocation.

In summary, Unix System V prevents processes from monopolizing the CPU by implementing
time-sharing scheduling, priority-based scheduling, round-robin scheduling, preemption, and
dynamic prioritization. These mechanisms ensure fair allocation of CPU resources among
competing processes and maintain system responsiveness and stability.
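As a rough illustration only: classic System V-style schedulers periodically decay each process's recent CPU usage and recompute its priority from that value, so a CPU-bound process gradually receives a worse priority and cannot monopolize the CPU. The field names and constants below are assumptions made for the sketch, not actual kernel code.

/* Sketch of System V-style priority recalculation (done periodically by the
   clock handler). Lower numeric priority means better scheduling treatment. */
#define PUSER 60                    /* assumed base priority for user mode   */

struct proc {
    int p_cpu;    /* recent CPU usage, incremented on each clock tick        */
    int p_nice;   /* user-supplied nice value                                */
    int p_pri;    /* scheduling priority derived from the fields above       */
};

void recompute_priority(struct proc *p)
{
    p->p_cpu = p->p_cpu / 2;                      /* decay recent CPU usage  */
    p->p_pri = PUSER + p->p_cpu / 2 + p->p_nice;  /* heavy CPU users end up  */
                                                  /* with a worse priority   */
}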

Explain profiling in detail.

Profiling is a performance analysis technique used in software development to measure and analyze various aspects of a program's execution behavior, such as CPU usage, memory usage, function call frequency, and execution time. Profiling provides insights into how a program behaves during runtime, helping developers identify performance bottlenecks, optimize code, and improve overall system performance. Here's an in-depth explanation of profiling:

1. Types of Profiling:

 CPU Profiling: Measures the amount of CPU time consumed by different functions or
sections of code during program execution.
 Memory Profiling: Analyzes the memory usage patterns of a program, including
memory allocation, deallocation, and memory leaks.
 Function-Level Profiling: Tracks the frequency of function calls, execution time of each
function, and call stack information.
 I/O Profiling: Examines the input/output operations performed by the program,
including disk reads/writes, network operations, and file accesses.
 Concurrency Profiling: Investigates how a program utilizes multiple threads or
processes, identifying potential concurrency issues such as race conditions or deadlocks.

2. Profiling Techniques:

 Instrumentation-Based Profiling: Involves modifying the program's source code or binary to insert profiling code that collects runtime data. This data is then analyzed to generate profiling reports.
 Sampling-Based Profiling: Works by periodically sampling the program's execution
state, recording information about the program's behavior at each sample point. Sampling
is less intrusive than instrumentation but may miss short-lived events.
 Statistical Profiling: Uses statistical techniques to estimate the frequency and duration
of various program events based on sampled data. This approach provides a lightweight
profiling solution with minimal overhead.
 Dynamic Analysis: Involves analyzing the program's behavior during runtime,
monitoring function calls, memory allocations, and other events as they occur. Dynamic
analysis tools provide real-time insights into program behavior.
3. Profiling Tools:

 Profiling Libraries: Libraries such as gperftools (Google Performance Tools) and Intel
VTune provide APIs for instrumenting code and collecting runtime data.
 Profiler Commands: Tools like gprof, perf, valgrind, and oprofile offer command-line
interfaces for CPU, memory, and I/O profiling.
 Integrated Development Environments (IDEs): Many IDEs, including Visual Studio,
Eclipse, and JetBrains IntelliJ IDEA, offer built-in profiling tools for analyzing and
optimizing code.
 Specialized Profiling Tools: Tools like strace, ltrace, and dtrace provide system-level
tracing capabilities for analyzing system calls, library calls, and kernel-level events.
 Web Profilers: For web applications, tools like Chrome DevTools, Firefox Developer
Tools, and New Relic offer web-based profilers for analyzing client-side and server-side
performance.
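As a small illustration, gprof (one of the profiler commands listed above) works by compiling with the -pg flag, running the program to produce gmon.out, and then post-processing that file. The program below is just an arbitrary CPU-bound example; the function names are made up for this sketch.

/* Example program to profile with gprof:
 *   cc -pg demo.c -o demo      (build with profiling instrumentation)
 *   ./demo                     (run; writes gmon.out in the current directory)
 *   gprof demo gmon.out        (print the flat profile and call graph)
 * gprof will report how much CPU time each function consumed and how often
 * it was called; busy_loop() should show up as the hotspot.                  */
#include <stdio.h>

static double busy_loop(long iterations)
{
    double x = 0.0;
    for (long i = 0; i < iterations; i++)
        x += (double)i / (i + 1);          /* deliberately CPU-bound work */
    return x;
}

static double cheap_work(void)
{
    return busy_loop(1000);                /* called often, cheap each time */
}

int main(void)
{
    double total = busy_loop(50000000);    /* the expected hotspot */
    for (int i = 0; i < 1000; i++)
        total += cheap_work();
    printf("%f\n", total);
    return 0;
}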

4. Profiling Workflow:

 Identify Performance Goals: Define specific performance metrics and goals for the
profiling process, such as reducing CPU usage, optimizing memory usage, or improving
response time.
 Select Profiling Tools: Choose the appropriate profiling tools based on the type of
analysis needed (e.g., CPU profiling, memory profiling) and the programming language
and platform being used.
 Instrument or Sample: Depending on the chosen profiling technique, instrument the
code with profiling hooks or run the program under a profiler to collect runtime data.
 Analyze Results: Analyze the profiling data to identify performance bottlenecks,
resource usage patterns, and areas for optimization. This may involve examining call
graphs, memory allocation traces, or CPU utilization reports.
 Optimize and Iterate: Based on the profiling results, make optimizations to the code,
data structures, algorithms, or system configuration to improve performance. Iterate the
profiling process as needed to validate optimizations and further improve performance.

5. Best Practices:

 Focus on Hotspots: Prioritize optimization efforts on the most significant performance bottlenecks identified by profiling.
 Profile Representative Workloads: Use realistic input data and workload scenarios for
profiling to ensure that optimizations address real-world usage patterns.
 Profile Early and Often: Incorporate profiling into the development process from the
beginning and profile regularly to detect performance regressions early.
 Use Multiple Profiling Techniques: Combine multiple profiling techniques to gain a
comprehensive understanding of the program's behavior and performance characteristics.
 Validate Optimizations: After making optimizations based on profiling results, re-
profile the code to validate the effectiveness of the optimizations and identify any new
performance issues.
In summary, profiling is a powerful technique for analyzing and optimizing software
performance. By identifying performance bottlenecks, memory leaks, and other issues, profiling
helps developers create faster, more efficient software with improved resource utilization and
responsiveness.

Memory management and I/O Subsystem

What is demand paging? Explain data structures used for it.

Demand paging is a memory management technique used in operating systems to improve memory usage and system performance. It is used in virtual memory systems, where pages are brought into main memory only when they are requested or needed by the CPU.
In demand paging, the operating system loads only the necessary pages of a program into memory at runtime, instead of loading the entire program into memory at the start. A page fault occurs when the program needs to access a page that is not currently in memory. The operating system then loads the required page from the disk into memory and updates the page tables accordingly. This process is transparent to the running program, which continues to run as if the page had always been in memory. The key data structure behind demand paging is the per-process page table, whose entries carry a valid/invalid (present) bit; a sketch of such an entry is given after the working-process list below.
What is Page Fault?
The term “page miss” or “page fault” refers to a situation where a referenced page is not found
in the main memory.
When a program tries to access a page, or fixed-size block of memory, that isn’t currently
loaded in physical memory (RAM), an exception known as a page fault happens. Before
enabling the program to access a page that is required, the operating system must bring it into
memory from secondary storage (such a hard drive) in order to handle a page fault.
In modern operating systems, page faults are a common component of virtual memory
management. By enabling programs to operate with more data than can fit in physical memory
at once, they enable the efficient use of physical memory. The operating system is responsible
for coordinating the transfer of data between physical memory and secondary storage as
needed.
What is Thrashing?
Thrashing is the term used to describe a state in which excessive paging activity takes place in computer systems, especially in operating systems that use virtual memory, severely impairing system performance. Thrashing occurs when a system's high memory demand and low physical memory capacity cause it to spend a large amount of time swapping pages between main memory (RAM) and secondary storage, which is typically a hard disk.
It is caused by insufficient physical memory, overloading, and poor memory management. The operating system may use a variety of techniques to lessen thrashing, including lowering the number of running processes, adjusting paging parameters, and improving memory allocation algorithms. Increasing the system's physical memory (RAM) capacity can also lessen thrashing by lowering the frequency of page swaps between RAM and the disk.
Pure Demand Paging
Pure demand paging is a specific implementation of demand paging in which the operating system loads a page into memory only when the program needs it. No pages are loaded into memory when the program starts, and all pages are initially marked as being on disk.
Operating systems that use pure demand paging as a memory management strategy do not preload any pages into physical memory before a task begins. A process's address space is brought into memory one step at a time: only the parts of the process that are actively being used are brought in from disk as needed.
It is useful for executing huge programs that might not fit entirely in memory, or for computers with limited physical memory. If the program accesses many pages that are not currently in memory, it can also result in a rise in page faults and possible performance overhead. Operating systems frequently use caching techniques and improved page replacement algorithms to lessen the negative effects of page faults on overall system performance.
Working Process of Demand Paging
So, let us understand this with the help of an example. Suppose we want to run a process P which has four pages P0, P1, P2, and P3. Currently, the page table contains pages P1 and P3.

Therefore, the operating system's demand paging mechanism follows a few steps in its operation.
 Program Execution: Upon launching a program, the operating system allocates a
certain amount of memory to the program and establishes a process for it.
 Creating page tables: To keep track of which program pages are currently in
memory and which are on disk, the operating system makes page tables for each
process.
 Handling Page Fault: When a program tries to access a page that isn’t in memory
at the moment, a page fault happens. In order to determine whether the necessary
page is on disk, the operating system pauses the application and consults the page
tables.
 Page Fetch: The operating system loads the necessary page into memory by
retrieving it from the disk if it is there.
 The page’s new location in memory is then reflected in the page table.
 Resuming the program: The operating system picks up where it left off when the
necessary pages are loaded into memory.
 Page replacement: If there is not enough free memory to hold all the pages a program needs, the operating system may need to replace one or more pages currently in memory with pages that reside on the disk. The page replacement algorithm used by the operating system determines which pages are selected for replacement.
 Page cleanup: When a process terminates, the operating system frees the memory
allocated to the process and cleans up the corresponding entries in the page tables.
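The central data structure behind these steps is the per-process page table, whose entries record whether a page is in memory and, if so, in which frame. A minimal sketch of one entry is shown below; the exact bit layout is dictated by the MMU hardware, so the field names and widths here are illustrative only.

/* Illustrative page table entry for demand paging (layout is assumed). */
struct pte {
    unsigned int frame_number : 20;  /* physical frame holding the page, if valid */
    unsigned int valid        : 1;   /* 1 = page is in main memory, 0 = on disk   */
    unsigned int referenced   : 1;   /* set by hardware on access (for replacement)*/
    unsigned int modified     : 1;   /* "dirty" bit: page must be written back     */
    unsigned int protection   : 3;   /* read/write/execute permission bits         */
};

/* Other structures typically involved are the per-process page table itself
   (an array of struct pte), a free-frame list, and a swap map recording where
   each paged-out page lives on the swap device.                               */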
Advantages of Demand Paging
So in the Demand Paging technique, there are some benefits that provide efficiency of the
operating system.
 Efficient use of physical memory: Demand paging allows more efficient use of physical memory, because only the necessary pages are loaded into memory at any given time.
 Support for larger programs: Programs can be larger than the physical memory
available on the system because only the necessary pages will be loaded into
memory.
 Faster program start: Because only part of a program is initially loaded into
memory, programs can start faster than if the entire program were loaded at once.
 Reduced memory usage: Demand paging can help reduce the amount of memory a program needs, which can improve system performance by reducing the amount of disk I/O required.
Disadvantages of Demand Paging
 Page Fault Overload: The process of swapping pages between memory and disk
can cause a performance overhead, especially if the program frequently accesses
pages that are not currently in memory.
 Degraded performance: If a program frequently accesses pages that are not
currently in memory, the system spends a lot of time swapping out pages, which
degrades performance.
 Fragmentation: Demand paging can cause fragmentation of physical memory, degrading system performance over time.
 Complexity: Implementing demand paging in an operating system can be complex, requiring sophisticated algorithms and data structures to manage page tables and swap space.

Explain working of page stealer process.
The page stealer process, also known as the page-out daemon or pager, is a component of the
virtual memory management system in operating systems. Its primary function is to reclaim
physical memory by transferring less frequently used pages from RAM (physical memory) to the
swap space (disk storage). This process is crucial for maintaining system performance and
preventing excessive memory pressure.

Here's how the page stealer process typically works:

1. Monitoring Memory Usage:
 The page stealer process continuously monitors the system's memory usage and
available free memory.
 It tracks the usage patterns of pages in physical memory, such as how frequently
they are accessed and modified.
2. Page Replacement Policy:
 The page stealer process adheres to a page replacement policy, such as the Least
Recently Used (LRU) algorithm or variants like Clock or Aging.
 This policy determines which pages should be selected for eviction when there is
a need to free up physical memory.
3. Thresholds and Triggers:
 The page stealer process sets thresholds or triggers based on system conditions,
such as the amount of free memory available, the rate of page faults, or other
performance metrics.
 When these thresholds are exceeded or triggers are activated, indicating a
shortage of free memory, the page stealer process initiates page-out operations.
4. Page-Out Operations:
 The page stealer process selects candidate pages for eviction based on the chosen
page replacement policy.
 It identifies pages that are less frequently accessed or considered less critical for
current system operations.
 Once candidate pages are selected, the page stealer process initiates page-out
operations to transfer these pages from physical memory to the swap space (disk).
 This involves writing the contents of the selected pages to the swap area on disk,
freeing up the corresponding physical memory frames.
5. Updating Page Tables:
 After the page-out operation is completed, the page tables and other data
structures are updated to reflect the new location of the paged-out pages.
 The page stealer process may also update page table entries to mark the evicted
pages as not present in physical memory, indicating that they are now residing in
the swap space.
6. Handling Page Faults:
 When a process attempts to access a page that has been paged out to disk, a page
fault occurs.
 The operating system's memory management subsystem handles page faults by
loading the required page back into physical memory from the swap space, if
necessary.
 The page stealer process may also be involved in servicing page faults,
coordinating the retrieval of paged-out pages and bringing them back into
memory.
7. Dynamic Operation:
 The page stealer process operates dynamically, adjusting its behavior based on
changing system conditions and workload characteristics.
 It balances the need for reclaiming memory with the performance impact of page-
out operations, striving to maintain system responsiveness and efficiency.

Overall, the page stealer process is a vital component of virtual memory management, ensuring
efficient utilization of physical memory resources while transparently managing the movement
of data between RAM and disk to accommodate the demands of running processes.
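As an illustration of the kind of page replacement policy a page stealer might use, here is a rough sketch of the second-chance (clock) algorithm; the frame table, sizes, and helper names are assumptions for this example, not an actual kernel interface.

/* Second-chance (clock) victim selection, as a page-out daemon might use it.
   Assumes the frame table is full, which is when a page stealer runs.        */
#define NFRAMES 1024

struct frame {
    int referenced;            /* set when the page in this frame is accessed */
    int in_use;                /* frame currently holds a page                */
};

static struct frame frames[NFRAMES];
static int hand = 0;           /* the "clock hand" sweeps the frame table     */

int choose_victim_frame(void)
{
    for (;;) {
        struct frame *f = &frames[hand];
        if (f->in_use && !f->referenced) {
            int victim = hand;             /* not recently used: evict it     */
            hand = (hand + 1) % NFRAMES;
            return victim;
        }
        if (f->in_use)
            f->referenced = 0;             /* give the page a second chance   */
        hand = (hand + 1) % NFRAMES;       /* advance the clock hand          */
    }
}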
Explain page fault. Explain handling of validity page fault.

A page fault behaves much like an error. A page fault happens when a program tries to access a piece of memory that does not exist in physical memory (main memory). The fault tells the operating system to locate the data through its virtual memory management and relocate it from secondary memory, such as a hard disk, into primary memory.
A page fault trap occurs if the requested page is not loaded into memory. The page fault raises an exception, which notifies the operating system that it must retrieve the "pages" from the backing store so that operation can continue. Once all of the data has been placed into physical memory, the program resumes normal operation. The page fault process occurs in the background, and thus the user is unaware of it.

1. The computer's hardware traps to the kernel, and the program counter is saved on the stack. The CPU registers hold information about the current state of the instruction.
2. An assembly routine is started, which saves the general registers and other volatile data to prevent the operating system from destroying them.

Page Fault Handling

A Page Fault happens when you access a page that has been marked as invalid. The paging
hardware would notice that the invalid bit is set while translating the address across the page
table, which will cause an operating system trap. The trap is caused primarily by the OS's failure
to load the needed page into memory.

Now, let's understand the procedure of page fault handling in the OS (a small runnable sketch follows this list):

1. Firstly, an internal table for this process is checked to determine whether the reference was a valid or an invalid memory access.
2. If the reference was invalid, the process is terminated. Otherwise, the page will be paged in.
3. After that, a free frame is found from the free-frame list.
4. Now, a disk operation is scheduled to bring the required page from the disk into the free frame.
5. When the I/O operation is completed, the process's page table is updated with the new frame number, and the invalid bit is changed. It is now a valid page reference.
6. The instruction that caused the page fault is then restarted, and the process continues as if the page had always been in memory.
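The same steps can be illustrated with a tiny, self-contained simulation: a 4-page process, a 2-frame "memory", and a fault handler that pages in from a simulated disk. All names, sizes, and the FIFO replacement choice are assumptions for this sketch.

/* Toy demand-paging simulation illustrating the steps above. */
#include <stdio.h>
#include <string.h>

#define NPAGES  4
#define NFRAMES 2

struct pte { int valid; int frame; };

static struct pte page_table[NPAGES];
static int  frame_to_page[NFRAMES] = { -1, -1 };
static char disk[NPAGES][16] = { "page0", "page1", "page2", "page3" };
static char memory[NFRAMES][16];
static int  next_victim = 0;                 /* trivial FIFO replacement      */

static void page_fault(int page)
{
    int frame = next_victim;                 /* step 3: pick a frame (FIFO)   */
    next_victim = (next_victim + 1) % NFRAMES;
    if (frame_to_page[frame] != -1)          /* evict whatever was there      */
        page_table[frame_to_page[frame]].valid = 0;

    memcpy(memory[frame], disk[page], 16);   /* step 4: simulated disk read   */
    page_table[page].frame = frame;          /* step 5: update the page table */
    page_table[page].valid = 1;
    frame_to_page[frame] = page;
}

static const char *access_page(int page)     /* step 6: access retried transparently */
{
    if (page < 0 || page >= NPAGES)          /* steps 1-2: invalid reference  */
        return NULL;
    if (!page_table[page].valid) {
        printf("page fault on page %d\n", page);
        page_fault(page);
    }
    return memory[page_table[page].frame];
}

int main(void)
{
    int refs[] = { 0, 1, 0, 2, 3, 1 };       /* a small reference string      */
    for (int i = 0; i < 6; i++)
        printf("access %d -> %s\n", refs[i], access_page(refs[i]));
    return 0;
}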

Page Fault Terminology

There are various page fault terminologies in the operating system. Some terminologies of page
fault are as follows:

1. Page Hit

When the CPU attempts to obtain a needed page from main memory and the page exists in main
memory (RAM), it is referred to as a "PAGE HIT".

2. Page Miss

If the needed page does not exist in the main memory (RAM), it is known as a "PAGE MISS".

3. Page Fault Time

The time it takes to service a page fault, that is, to fetch the required page from secondary memory into main memory and resume the access, is known as the "PAGE FAULT TIME".

4. Page Fault Rate

The rate at which page faults occur is referred to as the "PAGE FAULT RATE". The page fault rate is measured per second.

5. Hard Page Fault

If a required page exists in the hard disk's page file, it is referred to as a "HARD PAGE
FAULT".

6. Soft Page Fault

If a required page is not located on the hard disk but is found somewhere else in memory, it is
referred to as a "SOFT PAGE FAULT".

7. Minor Page Fault

If a process needs data and that data exists in memory but is being allotted to another process at
the same moment, it is referred to as a "MINOR PAGE FAULT".
Explain in detail allocation of space on swap device.
Swapping is a memory management technique used in multi-programming to increase the
number of processes sharing the CPU. It is a technique of removing a process from the main
memory and storing it into secondary memory, and then bringing it back into the main memory
for continued execution. This action of moving a process out from main memory to secondary
memory is called Swap Out and the action of moving a process out from secondary memory to
main memory is called Swap In.
Swap-space management is a technique used by operating systems to optimize memory usage
and improve system performance. Here are some advantages and disadvantages of swap-space
management:

Advantages:

1. Increased memory capacity: Swap-space management allows the operating system to use hard disk space as virtual memory, effectively increasing the available memory capacity.
2. Improved system performance: By using virtual memory, the operating system can
swap out less frequently used data from physical memory to disk, freeing up space
for more frequently used data and improving system performance.
3. Flexibility: Swap-space management allows the operating system to dynamically
allocate and deallocate memory as needed, depending on the demands of running
applications.

Disadvantages:

Slower access times: Accessing data from disk is slower than accessing data from physical
memory, which can result in slower system performance if too much swapping is required.
Increased disk usage: Swap-space management requires disk space to be reserved for use as
virtual memory, which can reduce the amount of available space for other data storage
purposes.
Risk of data loss: In some cases, if there is a problem with the swap file, such as a disk error or
corruption, data may be lost or corrupted.
Overall, swap-space management is a useful technique for optimizing memory usage and
improving system performance. However, it is important to carefully manage swap space
allocation and monitor system performance to ensure that excessive swapping does not
negatively impact system performance.
Swap-Space :
The area on the disk where the swapped-out processes are stored is called swap space.
Swap-Space Management :
Swap-space management is another low-level task of the operating system. Disk space is used by virtual memory as an extension of main memory. Since disk access is much slower than memory access, relying heavily on swap space can significantly decrease system performance. Because the best possible throughput is required, the goal of the swap-space implementation is to provide the best throughput for the virtual memory system. In this section, we discuss how swap space is used, where swap space is located on disk, and how swap space is managed.
Swap-Space Use :
Swap space is used by different operating systems in various ways. Systems that implement swapping may use swap space to hold an entire process image, including its code and data segments. Paging systems may simply store pages that have been pushed out of main memory. The amount of swap space needed on a system can vary from a few megabytes to gigabytes, depending on the amount of physical memory, the amount of virtual memory it is backing, and the way in which the virtual memory is used.
It is safer to overestimate than to underestimate the amount of swap space required, because if a system runs out of swap space it may be forced to abort processes or may crash entirely. Overestimation wastes disk space that could otherwise be used for files, but it does no other harm.
The following figure shows the amount of swap space used by different systems:

Figure – Different systems using amount of swap-space

Explain the swapping of processes between swap space and main memory.
To increase CPU utilization in multiprogramming, a memory management scheme known as
swapping can be used. Swapping is the process of bringing a process into memory and then
temporarily copying it to the disc after it has run for a while. The purpose of swapping in an
operating system is to access data on a hard disc and move it to RAM so that application
programs can use it. It’s important to remember that swapping is only used when data isn’t
available in RAM. Although the swapping process degrades system performance, it allows larger and multiple processes to run concurrently. Because of this, swapping is also known as a technique for memory compaction. The CPU scheduler determines which processes are swapped in and which are swapped out. Consider a multiprogramming environment that employs a priority-based scheduling algorithm: when a high-priority process enters the input queue, a low-priority process is swapped out so the high-priority process can be loaded and executed. When this process terminates, the low-priority process is swapped back into memory to continue its execution. The figure below shows the swapping process in an operating system:
Swapping has been subdivided into two concepts: swap-in and swap-out.
 Swap-out is a technique for moving a process from RAM to the hard disc.
 Swap-in is a method of transferring a program from a hard disc to main memory, or
RAM.
Advantages
 If main memory is low, some processes might otherwise have to wait a long time; with swapping, processes do not have to wait as long for execution on the CPU.
 It makes better utilization of main memory.
 Using only a single main memory, multiple processes can be run by the CPU using a swap partition.
 The concept of virtual memory starts here and puts main memory to better use.
 This concept can be useful in priority-based scheduling to optimize the swapping process.
Disadvantages
 If main memory is low, the user is executing too many processes, and the system power suddenly goes off, the data of processes taking part in swapping may be erased.
 The number of page faults may increase.
 Processing performance is reduced.

Write a short note on: Streams.

In computing, a stream refers to a sequence of data elements that are made available over time.
Streams are fundamental abstractions used for input and output operations in programming
languages and operating systems. They provide a flexible and efficient way to handle data
communication between processes, files, devices, and networks.

Key Characteristics of Streams:
1. Sequential Access: Streams typically support sequential access, allowing data to be read
or written sequentially from start to end. This makes streams well-suited for tasks such as
reading from or writing to files, where data is processed sequentially.
2. Abstraction: Streams provide an abstraction layer that hides the underlying complexity
of data sources and destinations. They allow programmers to interact with different types
of data sources (e.g., files, network sockets) using a consistent interface, regardless of the
specific implementation details.
3. Buffering: Many stream implementations include buffering mechanisms to improve
performance. Buffering involves temporarily storing data in memory before it is read
from or written to the underlying data source or destination. This can reduce the
frequency of I/O operations and improve overall throughput.
4. Bidirectional Communication: Some streams support bidirectional communication,
allowing data to be both read from and written to the same stream. This is useful for tasks
such as inter-process communication or network communication, where data needs to be
exchanged in both directions.
5. Standardization: Streams are often standardized within programming languages and
operating systems, providing a common interface for performing input and output
operations. This simplifies the development of applications by providing consistent I/O
APIs that can be used across different platforms.

Common Operations on Streams:

1. Reading: Reading from a stream involves retrieving data elements from the stream
sequentially. This can include reading bytes from a file, characters from a network
socket, or structured data from a data source.
2. Writing: Writing to a stream involves sending data elements to the stream sequentially.
This can include writing bytes to a file, sending messages over a network socket, or
storing structured data in a data sink.
3. Flushing: Flushing a stream involves forcing any buffered data to be written to the
underlying data destination immediately. This is typically done to ensure that data is not
held in memory for an extended period and is instead written out promptly.
4. Closing: Closing a stream involves releasing any associated system resources and
terminating communication with the underlying data source or destination. This is
important for resource management and ensuring that system resources are not leaked.
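As a small illustration of these operations using C's standard I/O streams (the file names are arbitrary examples), the sketch below reads one file sequentially, writes the data to another, flushes the output buffer, and closes both streams:

/* Reading, writing, flushing, and closing streams with C's stdio library. */
#include <stdio.h>

int main(void)
{
    FILE *in  = fopen("input.txt", "r");   /* open a stream for reading  */
    FILE *out = fopen("copy.txt", "w");    /* open a stream for writing  */
    if (in == NULL || out == NULL) {
        perror("fopen");
        return 1;
    }

    char buf[4096];
    size_t n;
    while ((n = fread(buf, 1, sizeof buf, in)) > 0)   /* sequential read  */
        fwrite(buf, 1, n, out);                       /* sequential write */

    fflush(out);   /* flush: force buffered data out to the file          */
    fclose(in);    /* close: release the underlying resources             */
    fclose(out);
    return 0;
}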

In summary, streams are versatile and powerful abstractions for handling input and output
operations in software development. They provide a unified interface for interacting with various
types of data sources and destinations, offering flexibility, efficiency, and ease of use in
managing data communication tasks.
