Module 3
1- Distributed Scheduling
1.1 Introduction
➢ In a distributed system, multiple computers (nodes) are connected over a network. Each
computer can independently process tasks. However, sometimes a few computers may
be overloaded (too many tasks), while others are idle or lightly loaded.
➢ To handle this imbalance, Distributed Scheduling is used. It ensures that all tasks are
evenly and efficiently distributed among available nodes.
➢ Why it matters:
• It avoids delays caused by overloaded nodes.
• It utilizes all the computers efficiently.
• It increases the system’s overall speed and performance.
So, Distributed Scheduling is like teamwork – distributing tasks fairly so that no one is
overworked or idle.
1.2 Motivation (Why we need Distributed Scheduling)
Even in powerful systems, if tasks are not well managed, performance drops.
• Goals:
• Avoid overloading of individual systems.
• Ensure faster task execution.
• Increase throughput (number of tasks completed).
• Make use of idle systems.
• Real-life Analogy:
Imagine a kitchen with 5 cooks. If only one cook gets all the orders while others are free, he’ll
be overworked and slow. If the orders are shared among all, food gets cooked faster!
So, the motivation is to use all resources smartly and improve performance.
1.3 Issues in Load Distributing
1.3.1 Load
• What is Load?
Load means the amount of work a computer is handling.
• How to measure it?
• Number of active processes.
• CPU usage.
• Queue length (waiting tasks).
• Memory used.
• Why is it important?
To decide whether to send or receive tasks, a node must know its current load.
1.3.2 Classification of Load Distributing Algorithms
➢ Static Load Distributing Algorithms
• Task assignments are predefined.
• No decision-making during execution.
• Best suited when workload is predictable.
• Less flexible.
➢ Dynamic Load Distributing Algorithms
• Make decisions at runtime.
• Continuously monitor system state.
• Can adapt to varying loads.
• More efficient, but incurs more communication overhead.
1.3.3 Load Balancing vs Load Sharing
• Load Balancing
• Distributes tasks evenly across all systems.
• Tries to equalize CPU usage on all nodes.
• More complex, needs frequent task movements.
• Load Sharing
• Just ensures no node is overloaded.
• Doesn’t aim for perfect equality.
• Simpler, less communication needed.
Example:
Load Balancing → All nodes have about 50% load.
Load Sharing → One node has 70%, another 40%, but nobody has 100%.
1.3.4 Preemptive vs Nonpreemptive Transfers
➢ Preemptive Transfers
• A task that is already running is paused, transferred, and resumed on another
system.
• Needs extra work to save and transfer the state.
• Used for long-running jobs.
Example: A video rendering task that’s 50% done is paused and moved to a faster system to
finish.
➢ Nonpreemptive Transfers
• Only tasks that have not started are moved.
• Easier to implement.
• No need to save state.
Example: A task waiting in the queue is sent to another system before it begins.
1.4 Components of a Load Distributing Algorithm
This part explains how a distributed scheduling algorithm works. It has 4 key parts:
1. Transfer Policy: This decides when to move a task.
Conditions:
CPU usage > threshold (e.g., 80%)
Queue length is too long
Example: If CPU is overloaded, trigger task transfer.
2. Selection Policy: This decides which task should be moved.
Criteria:
The task hasn’t accessed too many local files.
It is small and easy to transfer.
It’s not urgent or real-time.
Example: Choose a background calculation task over a real-time video call.
3. Location Policy: This selects where (which system) to send the task.
Methods:
Random selection.
Round-robin (take turns).
Example: Find a node with CPU usage < 50%.
4. Information Policy: This defines how and when load information is collected.
Types:
Demand-driven: Collected only when needed.
Periodic: Collected at regular intervals.
State-change: Collected when there’s a major change (e.g., new task arrives).
Example: Every 5 seconds, check each node’s load.
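Below is a minimal sketch in Python of how the four policies might fit together on a single node. The thresholds, class fields, and function names are illustrative assumptions (only the 80% CPU and "CPU < 50%" figures come from the examples above); this is not a specific published algorithm.

import random
from dataclasses import dataclass, field

CPU_THRESHOLD = 0.80   # transfer policy: trigger when CPU usage exceeds 80% (example above)
RECEIVER_CPU = 0.50    # location policy: look for a node with CPU usage below 50% (example above)

@dataclass
class Task:
    size: int
    realtime: bool = False   # real-time tasks are poor migration candidates

@dataclass
class Node:
    name: str
    cpu_usage: float
    queue: list = field(default_factory=list)

def should_transfer(node: Node) -> bool:
    # Transfer policy: WHEN to move a task (CPU or queue length above a threshold).
    return node.cpu_usage > CPU_THRESHOLD or len(node.queue) > 5

def pick_task(node: Node):
    # Selection policy: WHICH task to move (small and not real-time).
    candidates = [t for t in node.queue if not t.realtime]
    return min(candidates, key=lambda t: t.size, default=None)

def pick_target(nodes):
    # Location policy: WHERE to send it (a randomly chosen lightly loaded node).
    light = [n for n in nodes if n.cpu_usage < RECEIVER_CPU]
    return random.choice(light) if light else None

# Information policy (demand-driven variant): load information is examined only
# when a transfer is being considered, i.e. inside should_transfer / pick_target.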
1.5 Stability in Distributed Scheduling
Definition of Stability:
A distributed scheduling system is said to be stable if it continues to operate efficiently without
overloading any node or letting task queues grow indefinitely. Stability ensures that the rate of
task arrival does not exceed the system's ability to process them over time.
There are two main ways to analyze the stability of a distributed system:
1. Queuing-Theoretic Perspective: This method uses queueing models to understand
the flow of tasks in the system.
Key Concepts:
λ (lambda) = Task arrival rate.
μ (mu) = Task service (processing) rate.
Common Models:
M/M/1 Queue:
Single server.
Poisson arrivals (exponential inter-arrival times).
Exponential service times.
M/M/n Queue:
Multiple servers (n processors).
Same arrival and service time distributions.
Stability Condition:
A queueing system is stable only if the average arrival rate is less than the average service rate:
λ<μ
If λ ≥ μ, the task queue keeps growing without bound — this means the system becomes
unstable, eventually crashing or refusing new requests.
Example: Imagine Node A receives tasks at 10 per second (λ = 10), but it can process only 8
per second (μ = 8).
Queue length increases → response time increases → system fails under pressure.
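The λ < μ condition can be checked numerically. The sketch below uses the standard M/M/1 formulas (utilization ρ = λ/μ and mean response time 1/(μ − λ) for a stable queue) with the rates from the Node A example; the function name is just for illustration.

def mm1_report(lam: float, mu: float) -> str:
    # Report M/M/1 utilization and mean response time, or flag instability.
    if lam >= mu:
        return f"lambda={lam}, mu={mu}: UNSTABLE, queue grows without bound"
    rho = lam / mu               # utilization
    response = 1.0 / (mu - lam)  # mean response time (waiting + service)
    return f"lambda={lam}, mu={mu}: rho={rho:.2f}, mean response time={response:.2f} s"

print(mm1_report(6, 8))    # stable: rho = 0.75, response time = 0.50 s
print(mm1_report(10, 8))   # the Node A example: lambda >= mu, so the queue diverges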
2. Algorithmic Perspective: This focuses on how the algorithm behaves dynamically as
it responds to load changes in real-time.
Unstable Behavior Can Occur If:
• Tasks keep ping-ponging between two nodes due to poor decision logic.
• The system overreacts to small changes in load.
• There’s no control on the rate of task migration.
Techniques to Ensure Stability:
• Hysteresis thresholds: Don’t initiate a task transfer unless the load is clearly above or below a set level. This prevents migrations triggered by minor fluctuations.
• Rate-limiting transfers: Don’t allow unlimited task migrations in a short time. Example: “Allow max 1 migration per second per node.”
• Load averaging: Use the average load over a time window (e.g., the last 5 seconds) instead of instantaneous values.
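A small Python sketch of these three stabilizing techniques working together (hysteresis, rate-limiting, and load averaging); the thresholds, window size, and class name are hypothetical values chosen only for illustration.

import time
from collections import deque

HIGH_LOAD = 8       # hysteresis: migrate only when clearly above this queue length
LOW_LOAD = 2        # hysteresis: request work only when clearly below this
MIN_GAP = 1.0       # rate limit: at most one migration per second per node
WINDOW = 5          # load averaging: average over the last 5 samples

class StableNode:
    def __init__(self):
        self.samples = deque(maxlen=WINDOW)   # recent queue-length samples
        self.last_migration = 0.0

    def record_load(self, queue_length: int):
        self.samples.append(queue_length)

    def average_load(self) -> float:
        return sum(self.samples) / len(self.samples) if self.samples else 0.0

    def may_migrate(self) -> bool:
        # Migrate only if the averaged load is clearly high and the per-node
        # migration rate has not been exceeded.
        now = time.monotonic()
        if now - self.last_migration < MIN_GAP:
            return False                      # rate-limited
        if self.average_load() <= HIGH_LOAD:
            return False                      # not clearly overloaded (hysteresis)
        self.last_migration = now
        return True

    def may_request_work(self) -> bool:
        # Become a receiver only when the averaged load is clearly low.
        return self.average_load() < LOW_LOAD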
1.6 Load Distributing Algorithms
Distributed systems often face load imbalance — some nodes are overloaded while others are
idle. To handle this, load distributing algorithms transfer tasks between nodes. These
algorithms can be categorized based on who initiates the transfer:
1.6.1 Sender-Initiated Algorithms
• The overloaded node (sender) starts the task transfer.
• When a node's queue length exceeds a threshold T, it becomes a sender and looks for a
receiver.
Policies Used:
1. Transfer Policy: A node becomes a sender if its task queue > threshold T.
2. Selection Policy: Only newly arrived tasks are considered for migration.
3. Location Policy (decides where to send the task):
a. Random:
• A random node is selected.
• Problem: May choose already overloaded node → useless transfer.
b. Threshold:
• Random node is selected, and its queue length is checked.
• If it's below T, it becomes the receiver.
• If not, another node is polled.
• Stops after polling PollLimit nodes.
c. Shortest Queue:
• From the randomly polled nodes, choose the one with the shortest queue.
4. Information Policy: Demand-driven: Sender collects load info only when it wants to
send a task.
Below Figure Explanation: (Sender-Initiated Flowchart)
1. A new task arrives; the node checks its own queue length.
2. If its own queue > T → it becomes a sender and looks for a receiver, starting with an empty poll set (Poll-set = NIL).
3. Select a node at random and poll it.
4. If the polled node’s queue < T → transfer the task to it.
5. Else → poll more nodes (up to PollLimit).
6. If no receiver is found → queue the task locally.
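A compact Python sketch of the sender-initiated threshold policy just described: on task arrival, an overloaded node polls up to PollLimit random nodes and transfers the task to the first one whose queue is below T. Node queues are modeled with a plain dict, and the values of T and PollLimit are illustrative.

import random

T = 4            # queue-length threshold
POLL_LIMIT = 3   # maximum number of nodes a sender will poll

def on_task_arrival(me: str, queues: dict, task: str) -> str:
    # Transfer policy: if my queue already holds T or more tasks, act as a sender.
    if len(queues[me]) < T:
        queues[me].append(task)            # not overloaded: keep the task
        return me
    # Location policy (threshold): poll random nodes, up to POLL_LIMIT of them.
    others = [n for n in queues if n != me]
    for node in random.sample(others, min(POLL_LIMIT, len(others))):
        if len(queues[node]) < T:          # polled node is a receiver
            queues[node].append(task)
            return node
    queues[me].append(task)                # no receiver found: queue locally
    return me

# Usage: node "A" is overloaded, "B" and "C" are lightly loaded.
queues = {"A": ["t1", "t2", "t3", "t4"], "B": [], "C": ["t5"]}
print(on_task_arrival("A", queues, "t6"))  # prints "B" or "C"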
1.6.2 Receiver-Initiated Algorithms
• The underloaded node (receiver) starts the transfer request.
• When its queue length falls below threshold T, it searches for a task to execute.
Policies Used:
1. Transfer Policy: Triggered when task finishes and queue length is < T.
2. Selection Policy: Algorithm decides what kind of tasks to request.
3. Location Policy: Poll a random node to check if it's a sender.
• If sender's queue > T → it sends task.
• Try up to PollLimit times or wait before retrying.
4. Information Policy: Demand-driven: Only collect load info when needed.
Below Figure Explanation: (Receiver-Initiated Flowchart)
1. Task departs at node i, now queue < T.
2. Select a node randomly → poll to see if it's a sender.
3. If yes → transfer task from it.
4. Else → try other nodes (max PollLimit).
5. If still no task → wait and retry.
1.6.3 Symmetrically-Initiated Algorithms
• In these algorithms, both overloaded (sender) and underloaded (receiver) nodes can
initiate task transfers.
• The role (sender/receiver) depends on the node's current load.
• How It Works:
• If load > threshold → node becomes sender.
• If load < threshold → node becomes receiver.
• They poll other nodes and transfer tasks when a match is found.
• Use: Good for systems with dynamic workloads.
1.6.4 Adaptive Algorithms
• These algorithms monitor system state and adapt their behavior accordingly.
• They switch between sender, receiver, or symmetric modes based on system load.
• How It Works:
• Low Load → Use sender-initiated
• High Load → Use receiver-initiated
• Medium Load → Use symmetric
• Use: Ideal for large-scale distributed systems like cloud or grid computing.
1.7 Performance Comparison
This section compares the performance of different load sharing algorithms (like sender-
initiated, receiver-initiated, stable, adaptive, etc.).
The comparison is based on:
• Mean response time (average time to complete tasks)
• Offered system load (how busy the system is)
• Algorithms compared:
• M/M/1 – No load sharing at all
• RECV – Receiver-initiated
• SEND – Sender-initiated
• RANDOM – Sender with random destination
• ADSYM – Stable Symmetric
• ADSEND – Stable Sender-Initiated
• M/M/K – Ideal (best-case) performance (theoretical)
1.7.1 Receiver-Initiated vs. Sender-Initiated
• At low to medium loads, Sender-Initiated (SEND) performs slightly better.
• At high loads, Receiver-Initiated (RECV) performs better because it avoids system
instability.
Conclusion:
• Use Sender-Initiated when the system isn’t heavily loaded.
• Use Receiver-Initiated at high loads to avoid overload issues.
1.7.2 Symmetrically-Initiated Load Sharing
• Combines both sender- and receiver-initiated.
• Works well for moderate to high loads.
• Better performance than using sender or receiver alone.
• But may still cause instability at very high loads due to excess polling.
1.7.3 Stable Load Sharing Algorithms
• ADSYM and ADSEND use thresholds and limit polling to avoid instability.
• ADSYM is very close in performance to M/M/K (ideal case).
• ADSEND is not as good as ADSYM but more efficient than unstable sender-initiated.
1.7.4 Performance Under Heterogeneous Workloads
• Heterogeneous workloads = different nodes receive tasks at different rates.
• RECV and SEND become unstable if fewer nodes receive most tasks.
• ADSYM adapts best and performs well under such situations.
• RANDOM performs poorly because of wasteful polling.
1.8 Selecting a Suitable Load Sharing Algorithm
Here’s how to choose an algorithm based on system behavior:
System Behavior → Recommended Algorithm
• Never reaches high load → Sender-Initiated
• May reach high load → Stable (ADSYM, ADSEND)
• Load varies often → ADSYM
• Has many completed tasks waiting → ADSEND
• Heterogeneous load arrival → ADSYM or Adaptive algorithms
1.9 Requirements for Load Distributing
1. Transparency: User should not know or manage task movement.
2. Fault Tolerance: Detect failed nodes.
• Reroute or retry task migration.
3. Scalability: Should work with 10 or 10,000 nodes.
• Avoid bottlenecks or centralized schedulers.
4. Low Overhead: Scheduling must not consume more resources than it saves.
5. Fairness: Ensure no node or user is starved of computation.
1.10 Load Sharing Policies: Case Studies
• V-System
• Unix-based system developed at Stanford.
• Used idle processor detection to migrate jobs.
• Used receiver-initiated method.
• Sprite
• Distributed OS supporting transparent preemptive migration.
• Worked on workstations during idle time.
• Tasks moved between nodes using process state transfer.
• Condor
• High-throughput batch system.
• Used idle desktop machines at night.
• Featured checkpointing, restart, and job rescheduling.
• Tech Scheduler
• Lightweight, LAN-based scheduler.
• Focused on low-complexity job distribution.
1.11 Task Migration
Definition: Process of transferring a process or task from one node to another, including all
required execution context.
• Steps:
1. Suspend execution.
2. Save state (program counter, memory, files).
3. Transmit state to new node.
4. Resume execution on target.
• Challenges:
1. Synchronization across nodes.
2. Handling open file descriptors, sockets.
3. Maintaining network connections.
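The suspend / save state / transmit / resume cycle can be illustrated with a toy Python example in which pickle stands in for state capture and the network transfer. This is only a sketch: a real migration mechanism must also capture CPU registers, open files, and network connections, which plain object serialization does not cover.

import pickle

class Counter:
    # A toy "task" whose only state is how far it has counted.
    def __init__(self, limit: int):
        self.limit = limit
        self.i = 0

    def run(self, steps: int):
        for _ in range(steps):
            if self.i >= self.limit:
                return
            self.i += 1

# 1. Suspend: stop running the task on the source node.
task = Counter(limit=100)
task.run(steps=50)                 # 50% done, like the rendering example above

# 2. Save state + 3. Transmit: serialization stands in for the network send.
wire_bytes = pickle.dumps(task)

# 4. Resume execution on the target node from exactly where it stopped.
migrated = pickle.loads(wire_bytes)
migrated.run(steps=50)
print(migrated.i)                  # 100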
1.12 Issues in Task Migration
1. State Transfer:
• CPU registers, memory image, I/O buffers.
• Checkpointing may be required.
2. Location Transparency:
• Application shouldn’t know task was moved.
• Use DNS, global name spaces, or stub layers.
3. Migration Mechanism:
• Stop → Serialize → Transmit → Deserialize → Resume
4. Performance:
• Migration Time must be < time saved from early completion.
• Otherwise, migration is not worth it.
2- Distributed Shared Memory
2.1 Introduction (Distributed Shared Memory – DSM)
Traditionally, distributed systems use message passing to communicate (like sending/receiving
messages).
DSM (Distributed Shared Memory) provides a shared address space across different nodes,
like virtual memory across machines.
No actual shared physical memory exists. Instead, software simulates it over the network.
2.2 Architecture and Motivation
• DSM allows data sharing like virtual memory but across different computers.
• A mapping manager translates shared memory addresses into physical locations.
• Advantages of DSM:
1. Removes the need for explicit message passing.
2. Easier to build complex distributed apps (pass by reference is allowed).
3. Takes advantage of locality (saves bandwidth).
4. Cheaper than tightly coupled systems.
5. Allows use of large memory pools across systems.
6. Avoids bottlenecks of single shared memory in traditional multiprocessors.
2.3 Algorithms for Implementing DSM
1. Central Server Algorithm
• A central server holds and manages all shared data.
• All read/write requests go to the central server.
• Simple, but the central server can become a bottleneck due to high traffic (a small sketch of this approach follows this list of algorithms).
FIGURE: Clients → Central Server → Data access.
2. Migration Algorithm
• Instead of accessing data remotely every time, data migrates to the requesting node.
• Only one node has access to the data at a time.
• Helps if same node accesses the data repeatedly.
• Can suffer from thrashing (too much movement between nodes).
FIGURE: Data migrates with access requests.
3. Read-Replication Algorithm
• Multiple nodes can read data at the same time.
• Only one node can write, and all others get invalidated or updated.
• Used when reads >> writes.
• Maintains a list of all nodes that have the data.
FIGURE: Multiple copies → Write = Invalidate all others.
4. Full-Replication Algorithm
• Every node has a copy and can read/write.
• A sequencer assigns sequence numbers to maintain order.
• Ensures consistency in write operations.
FIGURE: Clients send write → Sequencer → Sends update to all.
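To make the contrast concrete, here is a minimal Python sketch of the central-server approach (the first algorithm in this list): every client read and write becomes a request to a single server object, simulated here with ordinary method calls and a dict; all names are illustrative.

class CentralServer:
    # Holds all shared data; clients never keep their own copies.
    def __init__(self):
        self.store = {}

    def read(self, addr):
        return self.store.get(addr)

    def write(self, addr, value):
        self.store[addr] = value
        return "ack"

class Client:
    def __init__(self, server: CentralServer):
        self.server = server           # every access goes over the "network"

    def read(self, addr):
        return self.server.read(addr)

    def write(self, addr, value):
        return self.server.write(addr, value)

server = CentralServer()
a, b = Client(server), Client(server)
a.write("x", 42)
print(b.read("x"))                     # 42: coherent, but the server sees all traffic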
2.4 Memory Coherence
• Memory is coherent if every read returns the most recent write.
• DSM must ensure this using protocols like:
Write-invalidate: Invalidate others’ copies before writing.
Write-update: Update all copies.
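A compact Python sketch contrasting the two protocols. The replicas of one shared item are modeled as a dict mapping node name to its local copy; the node names and values are illustrative.

def write_invalidate(replicas: dict, writer: str, new_value):
    # Invalidate every other copy before writing: the writer keeps the only valid copy.
    for node in list(replicas):
        if node != writer:
            del replicas[node]
    replicas[writer] = new_value

def write_update(replicas: dict, writer: str, new_value):
    # Write locally and push the new value to every node that holds a copy.
    for node in replicas:
        replicas[node] = new_value
    replicas.setdefault(writer, new_value)

# Usage: three nodes share a copy whose value is 7.
replicas_a = {"n1": 7, "n2": 7, "n3": 7}
write_invalidate(replicas_a, "n1", 8)
print(replicas_a)      # {"n1": 8}: the other copies were invalidated

replicas_b = {"n1": 7, "n2": 7, "n3": 7}
write_update(replicas_b, "n1", 8)
print(replicas_b)      # {"n1": 8, "n2": 8, "n3": 8}: all copies were updated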
2.5 Coherence Protocols:
2.5.1 Cache Coherence in the PLUS System
PLUS uses a write-update protocol.
Key Concepts:
• MCM (Memory Coherence Manager) at each node maintains consistency.
• Copy-list: Linked list of replicas of a page. One is the master copy.
• On read: If local, read. If not, ask the remote MCM.
• On write: Update master copy first, then update all others using the copy-list.
FIGURE: Node 4 writes → Update goes via node 2 (master) → Updates node 1 and 3.
Steps:
1. Node 4 sends write request to node 2 (master).
2. Node 2 updates itself.
3. Node 2 updates node 1 and 3 (via copy list).
4. Acknowledgements are sent back.
Benefits:
• Ensures strong consistency across replicas.
• Supports concurrent access with proper synchronization.
2.5.2 Unifying Synchronization and Data Transfer in Clouds
This section explains how memory coherence (making sure all copies of data are up-to-date)
and process synchronization (making sure processes access shared data correctly) are closely
linked, especially in cloud environments using Distributed Shared Memory (DSM) systems.
• Why is this Important?
When multiple processes (or nodes) share memory, two main problems must be handled:
1. Coherence – All nodes should see the latest version of data.
2. Synchronization – Processes should not read or write simultaneously in a way that
causes errors.
In cloud-based DSM, these two are combined or unified to make the system more efficient and
easier to manage.
• Key Concepts
Memory Coherence
• Memory coherence ensures that shared data stays consistent across all nodes.
• For example, if Node A writes new data to memory, and Node B reads it, B should see
the latest value.
Process Synchronization
• Synchronization ensures correct access to shared data:
Only one process writes at a time.
Readers wait if a write is happening (and vice versa).
• Typical methods: Locks, Semaphores, Barriers, etc.
• How Are They Unified in the Cloud?
In cloud DSM systems, memory coherence and synchronization are tightly connected:
• When a process reads or writes shared data, the system:
1. Checks if the data is up-to-date (coherent),
2. Applies any synchronization rules (e.g., writer has to wait if readers are active).
This combined approach:
• Reduces system overhead,
• Avoids conflicts,
• Improves performance,
• Ensures correctness in cloud-based distributed systems.
Example: Reader-Writer Problem in Cloud DSM
DSM systems use data replication and write-invalidate techniques to manage this:
• Readers can read from any replica.
• Writers must invalidate all other replicas before writing, to maintain coherence.
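The reader-writer rule above can be sketched with Python’s threading primitives: any number of readers may proceed together, while a writer needs exclusive access. A real cloud DSM enforces this across machines (and pairs the write with replica invalidation), but the locking logic is the same in spirit; the class and variable names are illustrative.

import threading

class ReadWriteLock:
    # Many readers may hold the lock at once; a writer needs exclusive access.
    def __init__(self):
        self._readers = 0
        self._count_lock = threading.Lock()   # protects the reader counter
        self._resource = threading.Lock()     # held while writing, or while any reader is active

    def acquire_read(self):
        with self._count_lock:
            self._readers += 1
            if self._readers == 1:
                self._resource.acquire()      # first reader blocks writers

    def release_read(self):
        with self._count_lock:
            self._readers -= 1
            if self._readers == 0:
                self._resource.release()      # last reader lets writers in

    def acquire_write(self):
        self._resource.acquire()              # waits until no readers or writers

    def release_write(self):
        self._resource.release()

shared = {"x": 0}
rw = ReadWriteLock()

def writer():
    rw.acquire_write()
    shared["x"] += 1        # in DSM this is also where other replicas get invalidated
    rw.release_write()

def reader():
    rw.acquire_read()
    _ = shared["x"]         # readers may be served from any replica
    rw.release_read()

threads = [threading.Thread(target=f) for f in (writer, reader, reader)]
for t in threads: t.start()
for t in threads: t.join()
print(shared["x"])          # 1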
2.5.3 Type-Specific Memory Coherence in the Munin System
Munin system observes that not all data needs the same coherence policy.
Main Idea:
Apply different coherence strategies based on how data is accessed.
Types of Shared Data and Policies:
Data Type → Strategy Used → Why?
• Read-only → No updates needed → Safe to replicate
• Write-shared → Use write-invalidate → To maintain correctness
• Write-private → No coherence needed → Only one process uses it
• Migratory data → Use migration or delayed update → Written and then read by another
• Result data → Updated once, then read many times → Keep one final version
This approach reduces unnecessary communication and improves performance by using the
right strategy for each type of data.
2.6 Design Issues in DSM
2.6.1 Granularity
Granularity = Size of memory unit that DSM shares.
Types:
• Fine-grained DSM:
1. Shares small data (word/variable).
2. Offers precise sharing but high overhead.
• Coarse-grained DSM:
1. Shares full memory pages or blocks.
2. Easier to manage but leads to false sharing:
Example:
If one thread changes variable A and another changes variable B (both in the same page), the whole page is marked as modified even though A and B are unrelated.
2.6.2 Page Replacement
In DSM, memory is limited. So, some pages must be evicted (removed) when new pages are
needed.
Considerations:
• Don’t evict recently used or frequently accessed pages.
• Ensure evicted pages are either:
1. Not needed anymore, or
2. Can be retrieved easily if needed again.
Techniques Used:
• LRU (Least Recently Used),
• Clock algorithm,
• Reference counting.
Additionally, DSM must maintain coherence:
• If a page has many replicas, inform all when it is replaced or modified.
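A minimal LRU eviction sketch in Python using collections.OrderedDict, one of the techniques listed above. In a DSM, the eviction step would also notify other replica holders (or write back a dirty sole copy), which is only indicated by a comment here; the names are illustrative.

from collections import OrderedDict

class LRUPageCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.pages = OrderedDict()              # page_id -> data, least recently used first

    def access(self, page_id, loader):
        # Return a page, loading it (and possibly evicting another) if it is not cached.
        if page_id in self.pages:
            self.pages.move_to_end(page_id)     # mark as most recently used
            return self.pages[page_id]
        if len(self.pages) >= self.capacity:
            victim, _ = self.pages.popitem(last=False)   # evict the least recently used page
            # DSM coherence step: inform nodes holding replicas of the evicted page,
            # or write it back if this was the only (dirty) copy.
        data = loader(page_id)
        self.pages[page_id] = data
        return data

cache = LRUPageCache(capacity=2)
load = lambda pid: f"contents of {pid}"
cache.access("p1", load); cache.access("p2", load); cache.access("p1", load)
cache.access("p3", load)                        # evicts p2, the least recently used page
print(list(cache.pages))                        # ['p1', 'p3']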
2.7 Case Studies
Real implementations of DSM systems that help us understand how different ideas are used in
practice.
2.7.1 IVY System
• First software DSM system.
• Used page-level sharing.
• A central server maintained the mapping of pages and handled updates.
Features:
• Easy to implement.
• Suitable for early research setups.
• Used single-writer, multiple-reader model.
• Could cause performance bottleneck due to central control.
2.7.2 Mirage System
Improves over IVY using migration of data pages.
Working:
• Instead of central control, data block is moved (migrated) to the node that requests it.
• Node can locally access data until it is moved again.
• Reduces communication cost and improves locality.
Optimization:
• Uses tunable timeout values to reduce unnecessary migrations.
• Reduces thrashing (rapid data movement between nodes).
2.7.3 Clouds DSM
Designed for cloud environments using distributed memory over large scale.
Key Features:
• Uses replication + locking for synchronization.
• Maintains coherence using:
1. Write-invalidate,
2. Reader-writer locks,
3. Versioning.
3- Distributed File Systems
3.1 Introduction
Definition
A Distributed File System (DFS) is a type of file system that allows files to be stored across
multiple computers (machines) in a network but appear to users as a single, unified file system.
Goals of DFS
1. Network Transparency: Users can access files without knowing where they
are physically stored.
2. High Availability: Files should be available even if some machines fail.
3.2 Architecture of DFS
• In DFS, files can be stored anywhere in the system.
• Clients: Machines that request access to files.
• Servers: Machines that store files.
• Cache Manager: Reduces access time by storing recently used data locally.
➢ Two main components:
• Name Server – Maps filenames to objects like files or directories.
• Cache Manager – Stores frequently used files close to the user to reduce access time.
3.3 Mechanisms for Building Distributed File Systems
This section explains the core mechanisms that make distributed file systems work efficiently.
These systems operate over networks and allow users to access files located on remote servers
as if they were local.
The main idea is that we should be able to:
• Access files from any system in the network
• Maintain performance, consistency, and transparency
• Reduce network delays and file access time
3.3.1 Mounting
What is Mounting?
Mounting is a technique used to connect two file systems or attach remote files/directories to a
local file system.
• A mount point is a node in the directory structure where the new file system will be
attached.
• For example: you can "mount" a directory from Server Y to your local directory /shared,
and then access it like any other folder.
How it Works?
There are two main approaches to managing mount information:
1. Client-side Mounting
• Each client maintains its own list of mounted remote directories.
• Advantage: More flexible and independent.
• Example: Sun NFS system.
2. Server-side Mounting
• The server manages a global mount table.
• Advantage: All clients see the same directory structure.
• Example: Sprite File System.
• If any directory is moved, the server updates it for all clients.
3.3.2 Caching
Why Caching is Important?
Caching improves performance by reducing the delay in accessing files over the network.
How it Works:
• When a client accesses a remote file, it downloads a copy and stores it locally (cache).
• Later accesses are served from this local cache, reducing network use and time.
Types of Caches:
• Main memory cache (RAM)
• Disk cache (local disk storage)
Benefits:
• Reduces repeated server requests.
• Improves access speed due to temporal locality (frequently accessed data stays near).
• Increases scalability of the system.
3.3.3 Hints
What are Hints?
Hints are cached addresses or location info about files that help speed up access without fully
caching the data.
Example:
If the client remembers where a file was stored last time (server location), it can go directly
there.
3.3.4 Bulk Data Transfer
Why Bulk Transfer?
When large amounts of data are transferred across a network, it’s expensive and slow due to
packet overhead and communication protocols.
What is Done?
• Instead of transferring one block at a time, multiple blocks are sent together.
• This reduces:
1. Protocol overhead
2. Disk seeks
3. Number of acknowledgements
Result:
Faster and more efficient file transfers over the network.
3.3.5 Encryption
Encryption is used to secure data during transmission in a distributed file system.
• Why needed? Because the file may pass through untrusted networks.
• Data is encrypted at sender’s end and decrypted at receiver’s end.
• Authentication ensures that only authorized users can access the data.
3.4 Design Issues in DFS
3.4.1 Naming and Name Resolution
• Naming = Assigning names to files/directories.
• Name Resolution = Mapping names to their actual locations.
Three approaches:
1. Concatenated host + filename: Not flexible.
2. Mount remote directories: Transparent but needs updates.
3. Single global namespace: Common naming space for all.
Concept of Contexts
• Helps resolve the problem of duplicate names across systems.
• A context = scope where names are unique.
• Used in systems like x-Kernel, Tilde, etc.
Name Server
• Manages name resolution in distributed systems.
• Can be:
1. Centralized: One server, risky if it fails.
2. Distributed: Multiple servers, more reliable.
3.4.2 Cache on Disk or Memory
• Clients cache data in:
1. Main memory
2. Local disk
• Reduces delay and improves performance.
3.5 Case Study – Sun Network File System (NFS)
About:
• Developed by Sun Microsystems (1985).
• Goal: Keep file system independent of OS and hardware.
Architecture:
• Uses Remote Procedure Call (RPC) for communication.
• Data is transferred using XDR (External Data Representation).
• Uses VFS (Virtual File System) interface to work with different types of file systems.
Naming & Location:
• Each client can maintain its own namespace.
• Mounting lets clients attach remote filesystems.
• Uses vnode structure (virtual node) to refer to files.
Caching in NFS:
• Three caches:
1. File Block Cache
2. Filename-to-vnode Translation
3. File Attributes
• Validation is required to ensure cached data is fresh.
3.5.2 The Sprite File System
Developed At: University of California, Berkeley
Main Idea: Extends UNIX for distributed computing. Focuses on network transparency and
efficient sharing of files.
Key Features:
• Location-transparent naming: Users don’t need to know where files are stored.
• Uses file caching aggressively — even on local disk.
• Introduces process migration — processes can move between machines.
How It Works:
• Clients cache data locally.
• Consistency is maintained using callback mechanism — server tells client when cached
data is outdated.
Advantages:
• Great performance due to client-side caching.
• Transparent file access across machines.
3.5.3 Apollo DOMAIN Distributed File System
Developed By: Apollo Computers
Main Idea: Focuses on file sharing with security and reliability in a networked environment.
Key Features:
• Implements location-independent naming using network-wide naming service.
• Uses context-based naming: file names depend on the context of the user or process.
• Strong support for access control and protection.
Notable Concept:
• Contexts: Help users and processes view the file system differently based on their roles
or environment.
3.5.4 Coda File System
Developed At: Carnegie Mellon University
Based On: Andrew File System (AFS)
Main Idea: Designed for mobile users and systems with frequent disconnection.
Key Features:
• Supports disconnected operation — users can access files even when server is down.
• Uses hoarding: frequently used files are automatically cached before disconnection.
• Conflict resolution: Handles file update conflicts when system reconnects.
How It Helps:
• Ideal for laptops or mobile computing.
• Provides high availability through replication and caching.
3.5.5 The x-Kernel Logical File System
Used In: Experimental systems (e.g., Arizona’s x-Kernel)
Main Idea: Uses logical file systems and provides fine-grained naming and access through
contexts.
Key Features:
• Users can define their own file namespace.
• Introduces tilde naming — gives each process its own view of file system.
• Combines flexibility of naming and access control.
Context Use:
• Files are accessed relative to a user-defined logical tree.
• Useful in systems that require multiple naming views or custom hierarchies.
3.6 Log-Structured File Systems (LFS)
What is a Log-Structured File System?
A Log-Structured File System (LFS) is a special type of file system where all changes to files
(writes and updates) are written in a log-like structure, i.e., sequentially on disk — like a
continuous diary of file updates.
Instead of modifying files at random places on the disk, LFS just appends changes to the end
of the log.
Why Use It?
• Traditional file systems write data to multiple disk locations — causing slow
performance due to disk seeks.
• LFS solves this by writing sequentially, which is much faster, especially for modern
disks and flash storage.
3.6.1 Disk-Space Management in LFS
Challenge:
Since LFS keeps appending data at the end of the log, the old versions of data still occupy disk
space. This can waste space quickly if not managed.
So, we need Garbage Collection — to clean old/unused data and free up space.
How Disk-Space Management Works:
1. Segmented Disk Design:
• The disk is divided into segments.
• Each segment is written sequentially.
• A segment contains file data and metadata (like inode info).
2. Garbage Collection:
• The system finds segments with mostly outdated data (called "cold" segments).
• It copies the live (still used) blocks to a new segment.
• Then it reclaims the old segment as free space.
3. Segment Cleaning Policies:
To decide which segments to clean, LFS uses one of two policies (a small sketch of both follows this numbered list):
• Greedy Cleaning: Clean the segment with the least live data first (easy to reclaim).
• Cost-Benefit Cleaning: Considers both how old the segment is and how full it is.
4. Checkpointing:
• LFS maintains checkpoints — a snapshot of where the file system metadata is stored.
• During a crash, it uses the last checkpoint to recover quickly.
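A small Python sketch of the two cleaning policies referred to above. Greedy simply picks the segment with the least live data; cost-benefit ranks segments by (1 - u) * age / (1 + u), where u is the fraction of live data, which is the ratio used in the original Sprite LFS work. The segment records and numbers below are made up for illustration.

from dataclasses import dataclass

@dataclass
class Segment:
    seg_id: int
    utilization: float   # u: fraction of the segment that is still live (0..1)
    age: float           # time since the youngest block in the segment was written

def greedy_pick(segments):
    # Greedy: clean the segment with the least live data.
    return min(segments, key=lambda s: s.utilization)

def cost_benefit_pick(segments):
    # Cost-benefit: free space gained, weighted by age, per unit of copying cost.
    def score(s):
        return (1 - s.utilization) * s.age / (1 + s.utilization)
    return max(segments, key=score)

segs = [Segment(0, 0.10, age=1), Segment(1, 0.30, age=50), Segment(2, 0.90, age=100)]
print(greedy_pick(segs).seg_id)        # 0: the emptiest segment
print(cost_benefit_pick(segs).seg_id)  # 1: an older, fairly empty segment wins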
4 - Multimedia File Systems
A multimedia file system is designed to handle large amounts of data such as audio, video, and
images, which need to be accessed quickly and smoothly without delays. These systems must
support real-time performance, which means data must be available at the right time without
interruption (like buffering in videos).
Key Requirements:
1. Real-Time Data Access: Multimedia files (especially video/audio) must be
played in real-time, meaning the system must deliver the required data fast
enough to maintain smooth playback.
2. High Bandwidth: Audio and video files are large, so the file system must
support high-speed data transfer.
3. Low Latency: There should be minimal delay in fetching and playing
multimedia files.
4. Support for Continuous Media: The system must be able to handle streams of
data continuously, rather than just static files.
5 - File Placement in Multimedia Systems
File Placement refers to where and how files are stored on the disk to optimize performance.
➢ Why it's important?
Multimedia files are usually large and need to be read quickly and sequentially. So, careful file
placement helps reduce delays.
➢ Strategies:
Contiguous Placement
Store the entire file in one continuous block of disk space.
• Good for performance (fast reading).
• May cause fragmentation or need large free spaces.
Partitioned Placement
Large files are split into parts and each part is stored separately (like chapters in a book). This
allows parallel access.
• Helpful in video streaming.
• Requires smart scheduling to avoid playback issues.
Pre-Fetching
The system can predict and load upcoming parts of a file into memory before they’re actually
needed.
Helps in reducing playback pauses (a small sketch follows at the end of this list of strategies).
Disk Striping
Large files are split and stored across multiple disks (RAID technique).
• Improves speed by parallel reading.
• Increases complexity.
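The pre-fetching strategy above can be sketched in a few lines of Python: while block i is being played, block i+1 is fetched in the background, so playback rarely has to wait on a read. The fetch_block function and the timings are stand-ins, not a real multimedia API.

from concurrent.futures import ThreadPoolExecutor
import time

def fetch_block(i: int) -> bytes:
    # Stand-in for a slow disk or network read of video block i.
    time.sleep(0.05)
    return bytes([i % 256]) * 1024

def play(num_blocks: int):
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(fetch_block, 0)               # fetch the first block up front
        for i in range(num_blocks):
            block = future.result()                        # usually already fetched: no pause
            if i + 1 < num_blocks:
                future = pool.submit(fetch_block, i + 1)   # pre-fetch the next block
            time.sleep(0.05)                               # "render" the current block
            _ = block

play(5)
print("playback finished; each block was fetched while the previous one played")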
6 - Caching in Multimedia File Systems
Caching is storing frequently or recently accessed data in fast memory (like RAM) for quicker
access next time.
Role in Multimedia:
• Reduces access time.
• Avoids re-reading from slower disk.
• Ensures smooth playback by fetching data before it’s needed.
Types of Caching:
• Client-Side Caching: Data is cached at the client (user) end. For example, when you
stream a video, part of it is stored in a buffer.
• Server-Side Caching: Server stores frequently requested multimedia files in memory.
• Adaptive Caching: Uses intelligent algorithms to decide which parts of data to keep in
cache based on user behavior.
Challenges:
• Cache Size: Multimedia files are large, so managing limited memory space is hard.
• Consistency: Updated files may be out of sync with cached versions.
• Cache Replacement: Older/less useful data is removed to make space for new data
using LRU (Least Recently Used) or other policies.