CISC -> Complex instructions; a single instruction can take more than one clock cycle.
Easier to write code for CISC.
Intel and AMD processors use the CISC architecture.
RISC -> Reduced instructions; each instruction takes one clock cycle.
Harder to write code for RISC, e.g. only register-to-register instructions
work (no memory access during the execution step).
ARM (used in Android phones), M series (used in Macs) and A series (used
in iPhones) are RISC.
Operating System -> The software that automates human effort and acts as a
bridge between software (written by humans) and the computer's hardware.
Unix -> A powerful, multiuser, multitasking OS developed in the 1970s at Bell Labs that
became the foundation of Linux, macOS and Android.
Trusted computing -> A method of using special hardware (like a TPM chip) and
software rules to ensure that a system can be trusted to behave as expected and not
be tampered with.
TPM (Trusted Platform Module) -> A hardware chip that holds cryptographic
algorithms, keys and digital certificates and checks whether the data
received is secure and trusted.
What runs first in a computer?
1. Pressing the power button sends a signal to the power supply unit,
which delivers electricity to the motherboard and other components.
2. The CPU starts executing instructions, but it doesn't yet know what to run, so it
jumps to a special memory address hardcoded in its hardware.
3. The CPU then executes code from the BIOS or UEFI, which is stored in flash memory
on the motherboard.
This firmware:
Initializes hardware devices (keyboard, mouse, RAM, disk, etc.)
Checks whether these devices are working
Detects connected storage (HDD, SSD, USB)
Looks for a bootable device
4. BIOS/UEFI locates the bootloader on the selected boot device (SSD, HDD, USB); with BIOS
this is done through the MBR (Master Boot Record).
MBR is the first sector (512 bytes) of a storage device and contains
information about:
Where the bootloader is
How the disk is partitioned
The MBR points to a VBR (Volume Boot Record), which BIOS then loads.
The VBR is the first sector (512 bytes) of a partition and
contains the code and data needed to boot an OS from that specific partition.
It allows each partition to have its own bootloader logic, which
enables multi-boot systems (Windows and Linux on separate partitions).
Unlike the MBR, it is filesystem aware (knows how to read different
filesystems like FAT32, NTFS, etc.).
It also contains bootloader code.
The bootloader (GRUB, Windows Boot Manager, etc.) is a small program that
loads the OS kernel.
UEFI reads the GPT (GUID Partition Table) directly, just as BIOS reads the MBR; the GPT
has information about the bootloader, which UEFI then loads.
5. The OS kernel (like the Linux kernel or Windows NT kernel) is loaded into memory,
then control is passed from the bootloader to the kernel.
6. The OS takes over the computer:
initializes system drivers,
mounts file systems,
starts background services,
launches the user interface
Sequence: CPU -> BIOS/UEFI -> Bootloader -> OS kernel
All of this is known as Booting.
Kernel -> the core or most important part of something
Bootloaders -> programs that load the OS into memory and hand control to the OS; they act
as a bridge between the firmware (BIOS/UEFI) and the OS kernel.
BIOS bootloaders include GRUB for Linux and Windows Boot Manager for
Windows.
UEFI bootloaders are special bootloaders (.efi files) and come in
two types:
1. OS independent -> can load any OS
2. OS specific -> can load a specific OS
Why Do We Need a Bootloader?
BIOS/UEFI doesn't know how to load complex OS kernels from file systems (like NTFS,
ext4).
The bootloader understands file systems, knows where the kernel is, and loads it
into RAM.
It lets you choose between multiple OSs (dual booting).
It passes boot parameters to the kernel (like memory size, debug flags, etc.)
File System -> the combination of algorithms and data structures used to organize,
store, manage and retrieve data on a storage device such as an HDD, SSD or USB drive.
It defines:
How files are stored
How directories and paths work
How space is allocated
What permissions and metadata a file can have
The bootloader loads the OS kernel from the file system, then gives control to the OS.
MBR -> first sector of entire disk
VBR -> first sector of partition of a disk
Operating system -> A program that provides controlled access to resources (I/O
devices, CPU, memory, storage, network).
Mechanisms and Policies:
Mechanisms are low-level implementations and procedures that carry out
fundamental operations in the OS.
E.g. interrupt handling, memory management, mode and context
switching, file system management, etc.
Mechanism = How something is implemented
Policies are strategies that define what should be done or which action to
take in a given situation.
E.g. security -> access control; file access -> read, read/write,
execute; memory allocation; CPU scheduling
Policy = What decision is made or rule is applied
They are separated because:
Flexibility -> you can change a policy without changing the mechanism.
Customization -> different systems can have different policies on the same
mechanism.
OS Structure:
The structure of an OS refers to how its components are organized both logically
(how the software modules are divided) and physically (how each software module is
defined).
Components are software modules, such as:
Process manager, memory manager, file system, I/O system
Types of OS structure:
1. Monolithic structure:
All OS components are in a single kernel that can directly
interact with the hardware and with each other.
Pros -> very fast
Cons -> hard to maintain, security issues, a single bug can crash
the whole system
2. Modular structure:
The OS components are divided into multiple independent software
modules, where each module interacts with the others through defined interfaces.
Here the kernel layer contains the interfaces.
Pros -> easy to debug, easy to maintain and update
Cons -> performance overhead due to the extra interfaces; complex because of
module dependency management
Linux, Windows and many other OSes are based on this structure.
3. Microkernel structure:
OS components run as separate services in user space,
and only core modules (communication, scheduling, memory) stay in the kernel.
Pros -> more secure and stable, easy to update
Cons -> large performance overhead because of frequent context switching
User mode and Kernel mode:
Modern CPUs have privilege levels; the OS uses these levels to protect the
system's critical functions (software modules).
These privilege levels are divided into different modes, a.k.a. rings:
1. Kernel Mode (Ring 0):
Full access to all hardware and software resources
Can execute any CPU instruction
2. User Mode (Ring 3):
Restricted environment
Runs only user applications
Cannot directly access hardware or kernel memory
We need system calls (or interrupts) to switch between these modes, and
this process is known as mode switching.
System calls (requests made from user mode by applications) -> file operations,
memory management, process control, device I/O.
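A minimal sketch (my own, not from the course material) of a user-mode program that triggers a mode switch: the write() wrapper issues the write system call, the CPU switches to kernel mode, the kernel does the I/O, and control returns to user mode.

#include <unistd.h>   /* write() system call wrapper */

int main(void) {
    /* The write() below enters the kernel (mode switch) and returns to user mode. */
    write(1, "hello from user mode\n", 21);  /* fd 1 = stdout */
    return 0;
}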
Violations:
If an application makes a system call that could harm the system, the OS traps it and
stops the request from executing.
Older CPUs handed control to a specific program until that
application stopped processing, but a malicious application might never return control to
kernel mode and the OS would get stuck.
Context switching:
To solve that problem the OS sets a timer, known as the timer
interrupt, which gives control to a specific application for a specific time slice and then
switches control to another program. It is a slow process but it keeps the OS safe from
that kind of malicious application.
This process is known as a context switch, as it switches from one
application to another: the OS first stores the CPU state (register values and flag states)
of one application into memory, then loads the CPU state of another application from
memory into the CPU; because of this it is slow.
The int instruction has since been replaced by the sysenter instruction because of its faster
processing.
Process vs Program:
Program -> code and static data stored in a file
Process -> A running program
Memory map -> created when a program becomes a process (is loaded into memory);
it contains:
Text -> compiled program code
Data -> initialized static data
BSS (block started by symbol) -> uninitialized static data
Heap -> dynamically allocated memory for objects
Stack -> function parameters, return addresses and local variables
These are loaded into memory from bottom to top, first text, then
data, and so on up to the stack;
because of this we say the heap grows upward and the stack grows downward.
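A small illustrative sketch (my own, with assumed variable names) of where different kinds of data land in the memory map described above:

#include <stdlib.h>

int initialized_global = 42;    /* data segment (initialized static data) */
int uninitialized_global;       /* BSS (uninitialized static data) */

int main(void) {                /* the compiled code itself lives in the text segment */
    int local = 7;              /* stack (local variable) */
    int *p = malloc(sizeof *p); /* heap (dynamically allocated) */
    *p = local;
    free(p);
    return 0;
}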
System state -> tells the OS how to manage a process; the OS uses this info to
decide which process the CPU should execute next.
E.g. Open files, Process ID, Scheduling, memory and I/O statistics
Processor state -> refers to what the CPU is currently doing, the current state of the
CPU.
It includes register values, flag states, pointer values.
Processes in a Multitasking Environment:
Multiple concurrent processes -> each process has a unique identifier known
as its Process ID (PID).
Asynchronous events (interrupts) -> the OS needs to take care of them.
If a process requests an operation that takes a long time, the OS needs to
context switch.
Goal of the OS -> always keep the CPU running something.
Process states -> describe the current state of a process (a program loaded
into memory). There are three states:
1. Running -> the process's context is loaded into the CPU and it is currently running.
2. Blocked (suspended) -> because the process requested an operation that takes a long time
(I/O interrupt, timer interrupt), the OS context-switched it out and the process's
context is stored in memory.
3. Ready -> the process is ready to execute (loaded in memory, no pending
interrupts) but the OS has stopped it from executing due to past interrupts or scheduling;
this is called preemption.
Process Control Block (PCB):
The OS needs to keep track of each process for context switching in a multitasking
environment, so it stores a PCB for each process in kernel space (a specific area of
memory that is not part of user space); the PCB is later restored to execute the process.
PCB contains -> process ID, machine state (CPU register values, program
counter, stack pointer), process state, memory map, etc.
Process states in detail:
1. Created -> the process is created (either by the user running an executable or by the OS,
e.g. init on Linux, launchd on macOS); its PCB and memory map are being loaded into
memory.
2. Ready -> the process is in the ready state (its PCB and memory map are
loaded).
3. Running -> subdivided into three states:
User running -> the CPU state of the process is loaded into the CPU.
Kernel running -> due to a syscall or interrupt a mode switch occurred and
the OS took control; the OS can now context switch to give time to another process or
return to the same process.
Zombie -> while kernel running the process may have exited, so there is no need to go
back to it, but its PCB and memory map are still loaded in memory and need
to be cleaned up by the OS.
4. Blocked (suspended) -> after a mode switch from user mode to kernel mode via
a syscall or interrupt that takes some time, the OS suspends the process.
Creating a process under POSIX (Portable Operating System Interface):
POSIX is an IEEE standard that defines functions for creating processes and
makes programs portable across different Unix-based OSes (Linux, macOS, etc.).
It has the following functions:
1. fork() -> system call to create a new process.
A process calls fork(); the OS then creates a copy of that process with a
different PID, copying the PCB, which refers to the same memory map as the
parent's.
The memory maps of both processes (parent and child) are shared (no
copy is made) unless one of them performs a write; when that happens the OS copies
only the section of the memory map that the process wants to change.
fork() returns the child's PID (> 0) in the parent, 0 in the child, and a value < 0 on error.
What happens in fork?
• Check for available resources
• Allocate a new PCB
• Assign a unique PID
• Check process limits for user
• Set child state to “created”
• Copy data from parent PCB slot to child (with different PID)
• Increment counts on current directory & open files
• Copy parent context in memory, or set copy-on-write (the copy
happens only when either process writes)
• Set child state to “ready to run”
• Wait for the scheduler to run the process
2. exec(program, args[]) -> family of system calls used to replace the current
process image with a new one based on the given program.
The process keeps the same PID as before, but the contents of the PCB and
the memory map are replaced (not all of the PCB).
Members of the exec() family include execl(), execv(),
execle(), etc.
The OS uses the fork() and exec() system calls together to create a new process.
Say we are in a terminal and want to open a calculator; we type the command to
open the calculator, then the OS:
uses the fork() syscall -> duplicates the terminal process
if (pid == 0) -> exec(calculator)  # pid == 0 means the child, which
becomes the calculator after exec
else wait for the child to exit, then continue the terminal  # parent branch, which
is the terminal as before
To do the above we need 2 more system calls:
3. exit() -> system call used to tell the OS "I am done" by returning a
status to the OS;
the OS then puts the process into the zombie state in kernel mode.
4. wait() -> system call that tells the OS to put me in the blocked state until
a child exits;
the OS keeps the process blocked until it receives an exit status
from one of its children.
Sequence for creating another process: fork() -> exec() in the child
process -> exit() in the child process -> wait() in the parent process (see the sketch below).
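A minimal sketch of the fork()/exec()/wait() sequence above, assuming /bin/ls as the program to run (any program path would do):

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>   /* wait() */
#include <unistd.h>     /* fork(), execl() */

int main(void) {
    pid_t pid = fork();          /* duplicate the current process */
    if (pid < 0) {
        perror("fork failed");
        exit(1);
    } else if (pid == 0) {
        /* child: replace this process image with /bin/ls */
        execl("/bin/ls", "ls", "-l", (char *)NULL);
        perror("exec failed");   /* only reached if exec fails */
        exit(1);
    } else {
        int status;
        wait(&status);           /* parent: block until the child exits */
        printf("child finished with status %d\n", WEXITSTATUS(status));
    }
    return 0;
}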
Inter-Process Communication (IPC) -> the mechanism that lets multiple processes
communicate with each other and exchange data. The client-server model is one example, but
a good illustration is terminating a specific process: if we
terminate a process, the OS tells the process to exit, but the process needs its
children to exit first,
so the parent process communicates with its children to make them exit.
We use methods like pipes to feed the output of one process into the input of another
process, e.g. in Windows CMD: type os_notes.txt | find "booting"
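A small sketch (my own, on the Unix side rather than CMD) of a pipe between a parent and a child process:

#include <stdio.h>
#include <string.h>
#include <unistd.h>   /* pipe(), fork(), read(), write() */

int main(void) {
    int fds[2];
    pipe(fds);                      /* fds[0] = read end, fds[1] = write end */
    if (fork() == 0) {
        /* child: writes a message into the pipe */
        close(fds[0]);
        const char *msg = "hello through the pipe";
        write(fds[1], msg, strlen(msg));
        close(fds[1]);
    } else {
        /* parent: reads the child's message from the pipe */
        char buf[64] = {0};
        close(fds[1]);
        read(fds[0], buf, sizeof(buf) - 1);
        close(fds[0]);
        printf("parent received: %s\n", buf);
    }
    return 0;
}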
Signals -> the messages exchanged between the OS and processes (and between processes)
to communicate with each other are known as signals. The OS communicates with processes using
them.
In Linux we use kill(pid, signal_number) to send a signal (not necessarily to kill the
process; to kill or terminate a process we use signal_number = 9) to the process
identified by its PID, and the signal_number says what we want.
E.g. kill -9 1234  # kill the process with pid = 1234
We can also use kill for communication with another process.
We use signal(signal_number, function) to register a handler for a signal that another process
or the OS sends to our process; here signal_number is the signal we expect to receive and function is
the signal handler, which decides how the process responds.
Example:
#include <stdio.h>
#include <signal.h>
#include <unistd.h>   /* sleep() */

void handle_sigint(int sig) {
    printf("Caught SIGINT (Ctrl+C), but not exiting!\n");
}

int main() {
    signal(SIGINT, handle_sigint); // Handle Ctrl+C
    while (1) {
        printf("Running...\n");
        sleep(1);
    }
    return 0;
}
Thread -> a lightweight execution unit inside a process.
It shares the memory map with the other threads inside the same process:
it shares the text, data and heap parts of the memory map but has its own stack, because
each thread executes different functions in different time frames,
which means each thread must also have its own CPU state.
It is lightweight in the sense that context switching between threads is very cheap
(in terms of time) for the OS, since the threads share the same memory map and PCB.
IPC is very easy between threads as they run within the same memory map.
Unlike with processes, if one thread crashes the whole process crashes.
A thread is a subset of a process; a process contains one or more threads.
The OS also preempts and schedules threads, which is why every thread has its own
thread ID; a multi-core CPU can run multiple threads at once (no context
switching needed). A minimal pthread sketch follows below.
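A minimal sketch using POSIX threads (compile with gcc -pthread); the function and variable names are my own:

#include <stdio.h>
#include <pthread.h>   /* pthread_create(), pthread_join() */

void *worker(void *arg) {
    int id = *(int *)arg;                /* each thread runs this function on its own stack */
    printf("thread %d running\n", id);
    return NULL;
}

int main(void) {
    pthread_t threads[2];
    int ids[2] = {1, 2};                 /* shared data visible to both threads */
    for (int i = 0; i < 2; i++)
        pthread_create(&threads[i], NULL, worker, &ids[i]);
    for (int i = 0; i < 2; i++)
        pthread_join(threads[i], NULL);  /* wait for both threads to finish */
    return 0;
}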
Thread Control Block (TCB):
A process's single PCB contains one or more Thread Control Blocks (TCBs), one
per thread.
The threads share signals with each other.
Thread-unaware OS:
Each process must manage its own threads; the operating system isn't
aware of any of the process's threads.
If any thread makes a blocking system call, the whole process is blocked by
the OS.
The scheduler has to realize that:
Context switching is expensive; we have to:
flush caches,
store the state of the CPU in memory,
reload the state of the CPU.
This problem arises on multi-core CPUs; to deal with it the
scheduler is designed to:
try to reschedule a thread on the core on which it was last scheduled,
so we don't need to flush and reload caches (less interaction with memory).
This mechanism is called CPU affinity.
Concurrent vs simultaneous processes:
Truly parallel execution of processes is possible on multicore
CPUs;
on a single-core CPU, processes only appear simultaneous because the OS uses context
switching between them.
Multi-threaded programming patterns:
1. Single-thread task -> create the thread, have it do a specific job, then
release (destroy) the thread.
2. Worker threads -> create the thread and assign it a specific job; when
the job is completed, put the thread into a wait state.
3. Thread pools -> create many threads in advance (a thread pool); when a job has to be
done, take a thread from the pool and assign the job to it.
Kernel-level vs user-level threads:
1. Kernel-level threads -> created, managed and scheduled by the OS
kernel; their TCBs are stored in kernel space and the OS kernel knows about each individual
thread.
Pros:
The OS knows about each thread, so true parallelism is possible on a
multi-core processor.
One blocked thread (due to I/O) does not affect the others.
Cons:
More overhead due to system calls.
Requires context switching between threads in the kernel, which is slower than
user-level thread context switching.
Example: Linux pthreads (POSIX threads)
2. User-level threads -> created, managed and scheduled by a user-level
library; the OS kernel sees only the process, not the threads, so thread scheduling
is done by the user library.
Pros:
Faster to create and context switch, as no kernel
involvement is needed.
Can run on OSes that don't support threads natively.
Cons:
If one thread is blocked by the kernel due to I/O, the whole
process is blocked (all threads of that process).
Cannot take advantage of multiple cores without special
scheduling tricks.
Example: green threads in Java, Python's threading module, etc.
Kernel threads = managed by the OS, better concurrency, slower context switching.
User threads = managed by a library, faster switching, but blocking affects
all.
How does a user-level thread library map threads onto the kernel?
1. 1:1 (kernel threads only) -> 1 user thread = 1 kernel thread
2. N:1 (user threads only) -> N user threads on one kernel thread/process
3. N:M (hybrid threading) -> N user threads on M kernel threads
clone(func, ...) is a system call similar to fork(), but it can create a thread and lets you
specify the function the thread should run; it is used by pthreads (the POSIX thread library on Linux).
Concurrent threads/processes -> two processes/threads are concurrent if they run at
the same time (on multiple CPU cores) or their execution is interleaved in some
order (the context switching is so fast that they appear to run at the same time, a.k.a.
simultaneously).
Synchronous -> the processes/threads execute in a specific order (the order of execution is
guaranteed).
One must wait until the other completes its task.
Often uses one thread doing tasks in sequence.
Asynchronous -> the processes/threads execute without any fixed order.
Each process/thread just works to finish its own task (no wait cycles).
Often uses multiple threads or event loops so tasks can overlap in time.
Independent -> processes/threads that do not share any resources with each other.
Two independent processes are always asynchronous, but the reverse is not true.
Parallel -> processes/threads run at the same time on separate processors.
Race condition -> if two concurrent threads depend on shared data, a race
condition can arise as a bug.
Synchronization -> the technique that avoids race conditions.
Sometimes even simple statements like x = x + 1 can cause race conditions, because
compiled to assembly they look like:
mov ax, [data]
inc ax
mov [data], ax
If some other thread accesses this variable after line 1 executes, a stale
value causes a race condition.
Critical section -> the region of a program where a race condition could arise, e.g. x = x
+ 1 is a critical section.
Mutual exclusion -> the restriction that allows only one thread at a time to access a critical
section, to avoid race conditions.
Avoid race conditions with locks:
- Grab and release a lock around the critical section.
- Wait if you cannot get the lock.
This is one possible way to get mutual exclusion.
- If one thread wants to enter it should be permitted; if multiple threads
want to enter then exactly one should be allowed in.
Bounded waiting -> no thread should wait forever to enter the critical section. (A pthread
mutex sketch of the grab/release idea follows below.)
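A minimal sketch of the grab/release idea using a pthread mutex (one common lock implementation; the counter and names are my own):

#include <pthread.h>
#include <stdio.h>

long counter = 0;                                     /* shared data */
pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;     /* the lock */

void *increment(void *arg) {
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);    /* grab the lock before the critical section */
        counter = counter + 1;        /* critical section */
        pthread_mutex_unlock(&lock);  /* release the lock after the critical section */
    }
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, increment, NULL);
    pthread_create(&t2, NULL, increment, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("counter = %ld\n", counter);   /* always 200000 with the lock */
    return 0;
}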
Solution 1 -> disable all interrupts before entering the critical section, then enable
them after leaving it.
Pros:
Simple, guaranteed to work.
Often used on uniprocessors.
Cons:
Threads get too much control over the system.
Won't work on multi-core processors.
Solution 2 -> software test-and-set lock.
while (locked) ;
locked = 1;
/* do critical section */
locked = 0;
But it is buggy: it still has a race condition (between the test and the set) and uses busy
waiting (a spin lock).
Solution 3 -> lock-step synchronization
Thread 0                    Thread 1
while (turn != 0);          while (turn != 1);
critical_section();         critical_section();
turn = 1;                   turn = 0;
But it forces strict alternation (execution in a specific order predefined
by the programmer, turning asynchronous threads into synchronous ones) and still uses busy
waiting.
Solution from hardware:
Atomic instructions -> instructions that cannot be interrupted by a context switch
before all of them have executed.
Say:
int test_and_set(int *x) {
    int last_value = *x;
    *x = 1;
    return last_value;
}
If this is atomic, then once a thread calls test_and_set it is guaranteed to return
its value; no context switch is possible before the value is returned.
It is hardware based, meaning the instruction is built into the processor.
Lock and unlock mechanism:
while (test_and_set(&lock) == 1) ; /* spin */
/* do critical section */
lock = 0; /* release the lock */
But it still contains busy waiting (a spin lock): while one thread is in the critical
section, the other threads keep calling test_and_set in a loop.
Compare-and-swap and fetch-and-increment are other atomic
instructions. All of these solutions require busy waiting (a sketch with C11 atomics follows below).
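A sketch of the same spin-lock idea using C11's standard atomic flag (the hardware test-and-set exposed through <stdatomic.h>); names are my own:

#include <stdatomic.h>
#include <stdio.h>

atomic_flag lock = ATOMIC_FLAG_INIT;   /* starts cleared (unlocked) */
long counter = 0;

void increment(void) {
    /* spin until the flag was previously clear, i.e. we now own the lock */
    while (atomic_flag_test_and_set(&lock))
        ;                              /* busy waiting */
    counter = counter + 1;             /* critical section */
    atomic_flag_clear(&lock);          /* release the lock */
}

int main(void) {
    increment();
    printf("counter = %ld\n", counter);
    return 0;
}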
So far we have seen that spin locks aren't great. Can we block until we can get
the critical section?
Yes; the solution is semaphores, which avoid busy waiting and provide mutual exclusion.
Semaphores:
Contain two atomic operations -> up and down
down(sem s) {
    if (s > 0)
        s = s - 1;
    else
        sleep on event s
}
up(sem s) {
    if (someone is waiting on s)
        wake up one of the threads
    else
        s = s + 1;
}
Think of it as a counter with rules:
- The counter represents the number of threads that could still enter the critical
section.
- Threads can decrement the counter (wait/acquire) or increment it (signal/release).
- If a thread wants to decrement the counter when the counter is zero,
the thread goes to sleep until the counter increases.
- The counter will never be negative.
Two types of semaphores:
1. Binary semaphore -> the counter is either 1 or 0; used for mutual
exclusion.
E.g. protecting a shared variable from concurrent modification.
2. Counting semaphore -> the counter can be any positive integer up to n,
where n is the number of threads that may access the critical region
concurrently.
E.g. a server-client model where multiple clients can access and
modify a single database.
Key operations -> only two atomic operations:
1. wait() (also called P() or down()):
- Decrement the semaphore value by 1.
- If the value is 0, the thread that tries to decrement it
blocks until the value increases.
2. signal() (also called V() or up()):
- Increment the semaphore value by 1.
- If there are any waiting threads, one of them is
unblocked.
These operations must be atomic, meaning no two threads can execute
them at the same time (a POSIX semaphore sketch follows below).
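A short sketch using POSIX semaphores (sem_wait/sem_post are the down/up operations above; compile with -pthread on Linux); the slot count of 3 is an arbitrary assumption for illustration:

#include <stdio.h>
#include <semaphore.h>   /* sem_init(), sem_wait(), sem_post() */

sem_t slots;             /* counting semaphore */

void use_slot(void) {
    sem_wait(&slots);    /* down: take a slot, block if none left */
    printf("working in a slot\n");   /* region that up to 3 threads may enter at once */
    sem_post(&slots);    /* up: release the slot */
}

int main(void) {
    sem_init(&slots, 0, 3);   /* 0 = shared between threads, initial value 3 */
    use_slot();
    sem_destroy(&slots);
    return 0;
}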
Real-world example -> shared data store (database)
readers-writers -> multiple processes can read concurrently, but only
one should have write access at a time, and no reader can read while someone is
writing.
sem mutex = 1;     // protects the critical sections used only by readers
sem canwrite = 1;  // critical section for N readers vs. 1 writer
int readcount = 0; // number of concurrent readers
writer() {
for (;;) {
down(&canwrite);
// write data
up(&canwrite);
}
}
reader() {
for (;;) {
down(&mutex);
readcount++; // critical section
if (readcount == 1) // first reader
down(&canwrite); // sleep or disallow the writer from writing
up(&mutex);
// do the read
down(&mutex);
readcount--; // critical section
if (readcount == 0)
up(&canwrite);
up(&mutex);
// other stuff
}
}
Messaging:
Rendezvous:
Two parties, sender and receiver: Neither party proceeds until both are
ready.
Steps in a Rendezvous:
Sender waits until the receiver is ready to accept the message.
Receiver waits until the sender is ready to send the message.
When both are ready, the message is transferred directly (often
without buffering).
Both parties continue execution after the exchange.
Advantages:
- No need for message buffering if on the same system
- Easy and efficient to implement
- Allows for tight synchronization
Disadvantages:
The receiver can't proceed until the sender is ready.
Both must arrive at the rendezvous point at the same time
(synchronization point).
So if one side is slower, the other is forced to wait; they move
together like two people walking while tied with a short rope.
Direct Addressing:
The sender knows exactly who should receive the message.
The address (or identifier) of the receiver is given in the send
operation.
The sender must know the identity of the receiver.
Indirect Addressing:
Messages are sent to a data structure, a FIFO queue.
Each queue is a mailbox, so multiple readers can easily read the
messages at any time.
Deadlock -> a condition among processes or threads in which everyone is waiting for
someone else and none of them is going to free its resources; the system gets stuck.
Example (code sketch below):
Two threads and two resources:
Thread A holds Resource 1 and is waiting for Resource 2.
Thread B holds Resource 2 and is waiting for Resource 1.
Both wait forever: deadlock.
Deadlocks are present when the graph of processes and resources contains a cycle.
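A minimal sketch (my own) of the two-thread / two-resource deadlock above, using two pthread mutexes as the resources:

#include <pthread.h>
#include <unistd.h>   /* sleep() */

pthread_mutex_t res1 = PTHREAD_MUTEX_INITIALIZER;  /* Resource 1 */
pthread_mutex_t res2 = PTHREAD_MUTEX_INITIALIZER;  /* Resource 2 */

void *thread_a(void *arg) {
    pthread_mutex_lock(&res1);   /* A holds Resource 1 */
    sleep(1);                    /* give B time to grab Resource 2 */
    pthread_mutex_lock(&res2);   /* A waits for Resource 2 -> deadlock */
    pthread_mutex_unlock(&res2);
    pthread_mutex_unlock(&res1);
    return NULL;
}

void *thread_b(void *arg) {
    pthread_mutex_lock(&res2);   /* B holds Resource 2 */
    sleep(1);
    pthread_mutex_lock(&res1);   /* B waits for Resource 1 -> deadlock */
    pthread_mutex_unlock(&res1);
    pthread_mutex_unlock(&res2);
    return NULL;
}

int main(void) {
    pthread_t a, b;
    pthread_create(&a, NULL, thread_a, NULL);
    pthread_create(&b, NULL, thread_b, NULL);
    pthread_join(a, NULL);       /* never returns: both threads are stuck */
    pthread_join(b, NULL);
    return 0;
}

Acquiring the two locks in the same order in both threads would remove the cycle and avoid the deadlock.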
To avoid deadlocks:
1. Break mutual exclusion -> allow processes to share resources.
2. Break the no-preemption condition -> allow the OS to forcibly take resources from
processes.
3. Use the Banker's algorithm:
Pretend to allocate the requested resource.
Check whether all processes can still finish eventually.
If not, deny or delay the request; otherwise grant the resource.
Process scheduling
Most tasks fall into one of two categories:
1. Short CPU burst -> uses the CPU for a short time, e.g. interactive tasks
2. Long CPU burst -> uses the CPU for a long time, e.g. computational tasks
Goals of scheduling:
- Maximize use of the CPU, improve throughput
- Let another task run when the current task is waiting on I/O
Four events may cause the scheduler to do something:
1. The current process goes from running to blocked state
2. The current process terminates
3. An interrupt occurs (a running process goes to the ready state): preemption
4. The current process goes from blocked to ready (I/O completed): preemption
Based on the above events, schedulers are divided into 3 types:
1. Preemptive scheduler
2. Cooperative scheduler (non-preemptive) -> a single process runs until a
system call takes place or the process exits.
3. Run-to-completion scheduler -> a process runs until it exits.
Scheduler algorithm goals:
1. Be fair -> each process gets an equal (fair) share of CPU time
2. Be efficient -> keep the CPU busy
3. Maximize throughput -> get processes completed ASAP
4. Minimize response time -> minimize the average time processes take to
complete
5. Be predictable -> tasks should take a predictable amount of time
6. Minimize overhead -> avoid unnecessary context switching
7. Avoid starvation -> avoid a process waiting indefinitely (starvation)
8. Enforce priorities -> between tasks
9. Degrade gracefully -> if the algorithm can't do its best, it should at least avoid the
worst
Scheduling algorithms:
1. First Come, First Served (FCFS) -> put processes into a FIFO queue, run
the task that arrived first, and move it to the blocked state on an I/O syscall.
It is non-preemptive, which makes it inefficient.
A process with a long CPU burst keeps the system busy.
Poor average response time (see the worked example below).
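A small worked example with assumed burst times: three processes arrive together with CPU bursts of 24, 3 and 3 ms. FCFS in that order gives waiting times of 0, 24 and 27 ms, so the average wait is (0 + 24 + 27) / 3 = 17 ms. Running the two short bursts first gives waiting times of 0, 3 and 6 ms, an average of only (0 + 3 + 6) / 3 = 3 ms, which is why burst-aware scheduling (like shortest-remaining-time-first below) improves average response time.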
2. Round-robin scheduling -> processes in a FIFO queue as in FCFS, but
the OS preempts them, so no process can run longer than a predefined quantum
(time slice).
Behavior depends on the quantum:
a long quantum makes it behave like FCFS
a short quantum increases the overhead of context switching
Pros:
Every process gets an equal amount of time
Easy to implement
Easy to compute the average response time
Cons:
Treating all processes equally isn't always what we want
3. Shortest remaining time first scheduling -> schedule tasks based on their
remaining CPU burst, which optimizes average response time.
See slide 20 of 07_scheduling.
Pros:
short-burst tasks complete first, which reduces the average response time.
Cons:
long bursts always finish last, which is not always acceptable.
So we need an algorithm that assigns priorities to tasks.
In priority scheduling algorithms each process has a predefined priority
number assigned to it.
4. Multilevel queues -> we arrange multiple queues at different
levels that define their priority, and each queue has its own class.
Each class has its own queue, e.g. system processes, interactive
processes, etc.; see slides 26-27 of 07_scheduling.
Each queue can have its own algorithm (round-robin or something else).
If a process waits for a long time because the high-priority class queues
never become empty, it starves; to avoid that:
We use aging (priority boosting) to avoid starvation and give CPU
time to a process that has been waiting for a long time.
With aging we shift a task from a low-priority class to a higher-priority
class if it has been waiting for a long time.
Memory Management:
The CPU does two things with memory, reads from it and writes to it,
but memory access is very slow, which requires a specific way of dealing with it.
Monoprogramming -> run one program at a time and share the memory between that program
and the OS.
Multiprogramming -> run more than one program at a time and share the memory among
multiple programs.
Here we need specific memory management to utilize the CPU better.
Relocatable addresses -> physical addresses are relocatable because physical address =
logical address + base address.
We introduce logical and physical addresses: the CPU uses logical addresses to
access data in memory, and a piece of hardware known as the Memory Management Unit (MMU)
converts each logical address into a physical address.
Every memory chunk of a program has a base address and a limit; using an offset we
can access data within the chunk, and the limit stops the program from accessing
memory beyond its predefined chunk.
Logical address < limit -> checking this on every access makes the CPU slower (worked example below).
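A small worked example (numbers assumed for illustration): if a program's base address is 0x4000 and its limit is 0x1000, then logical address 0x0123 passes the check (0x0123 < 0x1000) and maps to physical address 0x4000 + 0x0123 = 0x4123, while logical address 0x2000 fails the check (0x2000 >= 0x1000) and the MMU raises a fault instead of letting the program touch memory outside its chunk.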
Multiple fixed partitions:
We divide memory into multiple fixed partitions (segments) that are occupied
by programs.
When a program exits, a hole is created in memory (see slide 10 of
08_memory_management).
Moving programs into the holes would be slow, as the CPU has to move all of the
program's data, so the holes remain unused.
Variable-partition multiprogramming -> create partitions (segments) of whatever size a
program requires; the OS then has to look for holes.
If a process needs more memory, either allocate extra memory for every
process in advance, or
find a hole big enough to relocate the process.
Combine two holes by moving one process to another memory location (which
takes a huge amount of CPU time).
Do these during context switching.
Segmentation hardware -> we divide memory into segments of variable size (as we did
above, but now with hardware support), which makes the task very
fast. It is the same base, limit and offset idea, but implemented in hardware.
This technique is applied to both multiple fixed and variable partitions.
Allocation algorithms -> when we want to place a process, to avoid having to combine
holes (memory fragmentation) we use one of these algorithms (a first-fit sketch follows below):
1. first fit -> use the first hole that is big enough.
2. best fit -> use the hole that fits the process most tightly.
3. worst fit -> use the largest available hole. (bad)
These algorithms are applied to both multiple fixed and variable partitions.
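A tiny sketch (my own, with an assumed array of hole sizes) of the first-fit strategy:

#include <stdio.h>

/* Return the index of the first hole that can hold `request` KB, or -1 if none fits. */
int first_fit(const int holes[], int n, int request) {
    for (int i = 0; i < n; i++)
        if (holes[i] >= request)
            return i;
    return -1;
}

int main(void) {
    int holes[] = {100, 500, 200, 300, 600};   /* free hole sizes in KB (assumed) */
    printf("212 KB request -> hole index %d\n", first_fit(holes, 5, 212));  /* index 1 (500 KB) */
    return 0;
}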
Segmentation -> dividing memory into segments of variable size.
Paging -> dividing memory into frames of equal size.
We saw the problems with segmentation above; now we want to get rid of relocatable
addresses.
Paging -> another memory management scheme in which:
The logical address space is divided into fixed-size pages.
Physical memory is divided into frames of the same fixed size.
A page can be stored in any available frame; pages do not need to be contiguous.
Unlike in the older schemes, a process's pages are non-contiguous: any page can go into any
free frame of memory, and the OS tracks where each page is through a page table.
There is no external fragmentation, since no holes are created; only a small amount of
internal fragmentation can remain inside the last page.
Paging is implemented by the Memory Management Unit (MMU), a hardware
component in the CPU which:
translates logical addresses into physical addresses by looking up the page
number in the page table to get the frame address.
Example:
The CPU generates virtual address = (page number = 5, offset = 20).
The MMU looks up page 5 in the page table and finds frame 12 (page 5 maps to
frame 12).
Physical address = (frame 12, offset 20).
A 32-bit logical address can use 20 bits for the page number
and 12 bits for the offset, which lets the offset address up to 4 KB within a
page.
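A tiny sketch of the 20/12-bit split described above (the example address is assumed and chosen so it lands on page 5, offset 20):

#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint32_t logical = 0x00005014;             /* assumed logical address */
    uint32_t offset  = logical & 0xFFF;        /* low 12 bits -> offset within the 4 KB page */
    uint32_t page    = logical >> 12;          /* high 20 bits -> page number */
    printf("page number = %u, offset = %u\n", page, offset);   /* prints page 5, offset 20 */
    return 0;
}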
If we keep the page table only in main memory, the process becomes slow, as it
takes twice as many memory accesses to get the data compared to the other memory management
schemes.
To make it faster we cache page-table entries in a cache near the CPU called the Translation
Lookaside Buffer (TLB), but the TLB is small, so the CPU refreshes the
cache on each context switch.
In paging the MMU can also enforce protection by storing protection bits in the
page table.
The protection bits define read, write and execute permissions on pages to
keep processes protected.
Applications:
To mitigate malware we use paging -> every page has read,
write and execute protection bits. If a page has write permission, disallow execute, and vice
versa (see the sketch below).
Memory sharing is easy.
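A small sketch (my own) of the write-xor-execute idea using mmap/mprotect on a POSIX system: a page starts out writable but not executable, and is flipped to executable-but-not-writable, so it is never both at once.

#include <string.h>
#include <sys/mman.h>   /* mmap(), mprotect(), munmap() */

int main(void) {
    size_t len = 4096;  /* one page */
    /* writable, not executable */
    unsigned char *page = mmap(NULL, len, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    memset(page, 0, len);                       /* write into the page */
    /* flip to executable, not writable */
    mprotect(page, len, PROT_READ | PROT_EXEC);
    munmap(page, len);
    return 0;
}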
Page replacement:
If memory is full of page frames from multiple processes and a new process
wants to load, we move frames that are not being used out to disk space
known as the swap area (swap file / swap space); this is called swapping out, and reloading
them when required is called swapping in.
We can replace pages in FIFO order, which evicts the oldest pages, but we may
lose frequently used pages.
Another approach is Least Recently Used (LRU) pages, but that requires a
timestamp per access, which the MMU cannot easily maintain.
Another approach is Not Frequently Used (NFU), which keeps a
counter inside each page entry recording how many times the page has been
accessed; but a page that is no longer being used may still hold a high count.
Page fault -> occurs when the CPU wants to access a page that has been swapped out and
is no longer in physical memory. The OS needs to swap that page in again from the swap
file.
Thrashing -> loading more processes than memory can hold, which results in many page
faults.