DS Notes by BPS
DS Notes by BPS
Ans : Mutual exclusion distributed systems me ek important concept hai jo multiple processes
ko synchronize karne me help karta hai. Ye concept ensures karta hai ki ek time pe sirf ek
process ek shared resource ko access kar sake.
Mutual exclusion ka Main goal ye hai ki jab koi process ek shared resource ko use kar raha ho,
tab dusre processes ko us resource ko access nahi karne diya jata. Isse conflicts aur data
corruption ko prevent kiya jata hai.
Mutual exclusion ko implement karne ke liye various algorithms aur techniques hote hain, jaise
ki Lamport's Bakery Algorithm, Dekker's Algorithm, ya Peterson's Algorithm. In algorithms ka
istemal karke, distributed systems me mutual exclusion ko achieve kiya ja sakta hai, jisse ki
processes efficiently aur sahi tarah se work kar sake aur conflicts ko avoid kiya ja sake.
Ans : Deadlock handling distributed systems me difficult hai kyun ki distributed systems me
multiple nodes hote hain jo alag-alag machines par run karte hain aur communication network
ke through interact karte hain.
Yahan kuch reasons hain jo deadlock handling ko distributed systems me challenging banate
hain:
• Resource Distribution : Resources distributed hote hain across multiple nodes, jisse
deadlock detection aur recovery algorithms ko implement karna aur bhi challenging ho
jata hai.
# Edge Chasing Algorithm and Path Pushing Algorithm for deadlock
handling
Ans : Edge chasing aur path pushing do alag-alag approaches hain jo deadlock handling ke liye
distributed systems mein istemal kiye jaate hain
• Edge chasing algorithm mein, har ek node ek wait-for graph maintain karta hai jahan par
nodes processes ko represent karte hain aur directed edges wait-for relationships ko
represent karte hain.
• Jab koi process ek resource ka request karta hai, woh request message resource ke
owner tak bhejta hai. Agar resource available nahi hai, toh requesting process apne aap
ko wait-for graph mein resource ke taraf ek edge add karta hai.
• Agar koi process wait-for graph mein ek cycle detect karta hai, toh woh cycle ke kisi ek
edge ko choose karta hai aur uss edge ke through resource ka request bhejta hai. Isse
"edge chasing" kehte hain.
• Edge chasing algorithm wait-for graph mein cycles ki detection par nirbhar karta hai aur
unhe ek edge ko tod kar resolve karta hai.
• Path pushing algorithm mein bhi har ek process wait-for graph maintain karta hai, jo
edge chasing algorithm ke similar hota hai.
• Jab koi process detect karta hai ki woh resource ke availability ke karan aage nahi badh
sakta, toh woh wait-for graph ko examine karta hai ek possible cycle ko dhoondne ke
liye.
• Edge chasing algorithm ke mukabale, path pushing algorithm puri cycle ko push karne ki
koshish karta hai. Process cycles ke sabhi processes ko messages bhejta hai tak ki
request aage badh sake.
• Uddeshya yeh hota hai ki deadlock ko cycle ke ek point tak push karein jab tak koi aisa
process na ho jo cycle ko tod sake by releasing a resource.
Group communication distributed systems me various scenarios mein istemal hota hai:
• Multicast Communication
• Publish-Subscribe Systems
• Distributed Computing
Ans : Mounting aur caching, dono hi distributed systems me important concepts hain jo data
access aur performance improvement ke liye use kiye jaate hain.
# Mounting :
• Distributed systems me, mounting ek tarah ka remote access mechanism hai jisme ek
remote resource ko local filesystem par attach kiya jaata hai. Isse, remote resource local
filesystem par available ho jaata hai aur local processes us resource ko access kar sakte
hain jaise ki woh local filesystem par ho.
# Caching :
Caching ek technique hai jisme frequently accessed data ko temporary storage me store kiya
jaata hai tak ki future access ke liye use kiya ja sake.
Distributed systems me, caching ka istemal data access time ko kam karne aur overall
performance ko improve karne ke liye kiya jaata hai.
In dono concepts ka istemal distributed systems me performance optimization ke liye kiya jaata
hai. Mounting ke through remote resources ko local filesystem ke tarah access kiya jaata hai,
jabki caching ke through frequently accessed data ko local storage me temporary store kiya
jaata hai. Isse overall data access time kam hota hai aur system performance improve hoti hai.
# Replication in Distributed System
Ans : Replication ek important concept hai distributed systems me, jisme data copies ko
multiple locations par distribute kiya jaata hai. Har copy ko "replica" kehte hain.
Replication ka mukhya uddeshya hai data availability, fault tolerance, aur performance
improvement ko enhance karna.
Data Availability : Replication data ko multiple locations par distribute karta hai, jisse data
availability improve hoti hai. Agar koi replica ya node fail ho jata hai, toh dusre replicas se data
access kiya ja sakta hai.
Fault Tolerance : Replication ek fault tolerance mechanism bhi hai. Agar koi node ya replica fail
ho jata hai, toh dusre replicas se data access kiya ja sakta hai, jisse system ka reliability increase
hota hai.
Group Consensus: Group consensus ka mukhya udeshya hai ek group ke members ke beech me
agreement pana. Ismein, har ek member ka vote important hota hai aur overall decision group
ke consensus par depend karta hai. Har ek member ke vote ko consider kiya jata hai aur jab sab
members ek decision par agree karte hain, tab consensus achieve hota hai. Group consensus
algorithms me, jaise ki Raft aur Paxos, nodes ek distributed log ko maintain karte hain jismein
unke decisions record hote hain aur ye log consensus achieve karne me use hota hai.
Dynamic Vote Reassignment: Dynamic vote reassignment specific context me istemal hota hai
jahan nodes ke votes dynamically redistribute kiye jaate hain based on changing conditions.
Ismein, nodes ke votes unke performance, reliability, ya network conditions ke basis par
dynamically adjust kiye jaate hain. Jab kisi node ka performance degrade ho jaata hai ya
network conditions change ho jaate hain, toh uske votes ko redistribute kiya jaata hai taki
overall system performance maintain kiya ja sake.
In summary, dynamic vote reassignment ek approach hai jisme votes dynamically redistribute
kiye jaate hain based on changing conditions, aur ye group consensus ko achieve karne me aur
distributed systems ki resilience aur adaptability ko enhance karne me help karta hai.
Round Robin : Isme har incoming request ko sequentially different nodes ya servers par
forward kiya jata hai. Har node ek baar apne turn par aata hai, fir next request ko forward karta
hai. Ye evenly distribute karta hai load across all nodes.
Least Connection : Isme new request ko woh server handle karta hai jo kam active connections
handle kar raha ho. Isme workload ko evenly distribute karne ke bajaye, current load ke basis
par decisions liye jaate hain.
Weighted Round Robin : Isme har server ko ek weight assign kiya jata hai, jo uski capacity ko
represent karta hai. Jitna weight jyada hoga, utni zyada requests us server ko milengi. Isse,
servers ke capacity ke hisab se workload distribute hota hai.
Randomized Load Balancing : Isme requests randomly different servers par forward kiye jaate
hain. Ye simple approach hai jisme kisi specific logic ya criteria ke basis par decisions nahi liye
jaate hain.
Least Response Time : Isme requests ko woh server handle karta hai jiska response time kam
ho. Isme servers ke response time ke basis par load distribute hota hai.
IP Hash : Isme request ko handle karne wala server IP address ke hash ke basis par decide kiya
jata hai. Isse, ek client ko hamesha same server par redirect kiya ja sakta hai.
Overall, poll limit set karke, network traffic, resource utilization, load distribution, aur system
performance ko optimize kiya ja sakta hai. Isse system ka overall efficiency aur reliability
improve hoti hai.
Ans : Asynchronous approach of recovery me, typically distributed systems me istemal hone
wale data structures hote hain jo recovery process ko manage aur implement karne ke liye
optimize kiye gaye hote hain.
Write-ahead Log (WAL): Write-ahead log ek data structure hai jisme transactions ke changes ko
log kiya jaata hai transaction commit se pahle. Ye log disk par persist kiya jaata hai, taki agar
system failure hota hai, toh transaction ko recover kiya ja sake.
Checkpoint: Checkpoint ek point hai jahaan system ke current state ko durable storage me save
kiya jaata hai. Agar failure hota hai, toh checkpoint ke baad ke transactions ko redo karke
system ko restore kiya ja sakta hai. Checkpointing ke liye tree-based data structures ya linked
lists istemal kiye jaate hain.
Redo Log: Redo log ek data structure hai jisme transactions ke changes ko log kiya jaata hai.
Agar failure hota hai, toh redo log ke entries ko read karke transactions ko recovery kiya ja
sakta hai.
Undo Log: Undo log ek data structure hai jisme transactions ke changes ke reverse operations
ko log kiya jaata hai. Agar failure hota hai, toh undo log ke entries ko read karke transactions ko
rollback kiya ja sakta hai.
Distributed Data Structures: Agar distributed system me recovery implement kiya ja raha hai,
toh distributed data structures ka istemal hota hai. Isme distributed hash tables (DHTs),
distributed queues, aur distributed logs jaise data structures istemal kiye jaate hain jo
distributed environment me transactions ke recovery ko manage karne ke liye optimize kiye
gaye hote hain.
In data structures ka istemal asynchronous approach of recovery me kiya jaata hai taki system
ke failures ke baad bhi data consistency aur durability maintain ki ja sake. Har data structure ke
apne advantages aur limitations hote hain jo recovery process ko influence karte hain.
# Load Balancing Algorithm
Ans : Network File System (NFS) ek distributed file system protocol hai jo network file sharing
ko facilitate karta hai UNIX aur UNIX-like operating systems ke beech me.
NFS ka mukhya uddeshya hai file sharing aur access ko simple aur seamless banane ka. Isme, ek
server (NFS server) files ko share karta hai aur dusre systems (NFS clients) uss server se files ko
access kar sakte hain jaise ki local files ho. NFS kaam karta hai TCP/IP protocol suite ke upar aur
UNIX ke file system model ke saath integrate hota hai.
Network File System (NFS) ka architecture client-server model par based hai, jisme NFS server
files ko share karta hai aur NFS clients un files ko access karte hain. Yeh architecture NFS
protocol ke kisi bhi version ke liye similar hoti hai, lekin specifics version ke features aur
enhancements me variations ho sakti hain.
Overall, NFS ek widely used protocol hai distributed file sharing ke liye UNIX aur UNIX-like
operating systems mein, aur iska istemal network file access ko simple aur efficient banane ke
liye hota hai.
Ans : Two-phase commit (2PC) protocol aur three-phase commit (3PC) protocol, distributed
databases aur distributed systems me consistency maintain karne ke liye use hone wale
transaction commit protocols hain. Ye protocols transactions ke atomicity, consistency,
isolation, aur durability (ACID) properties ko maintain karne me help karte hain.
Subordinates apne local transaction ko prepare karte hain, aur phir coordinator ko "ready to
commit" ya "abort" message bhejte hain.
Commit Phase: Agar sab subordinates "ready to commit" message bhejte hain, toh coordinator
"commit" message bhejta hai, jisse subordinates apne local transaction ko commit karte hain.
Agar koi subordinate "abort" message bhejta hai ya timeout ho jaata hai, toh coordinator
"abort" message bhejta hai, jisse sab subordinates apne local transaction ko rollback karte hain.
Three-phase commit protocol me transactions ko commit karne ke liye teen phases hote hain
:
CanCommit Phase : Transaction coordinator sab participating nodes se "Can you commit?"
message bhejta hai.
Subordinates apne local transaction ko prepare karte hain aur "Yes" ya "No" response bhejte
hain.
PreCommit Phase : Agar sab subordinates "Yes" response bhejte hain, toh coordinator "Pre-
Commit" message bhejta hai.
DoCommit Phase : Coordinator "Commit" message bhejta hai, jisse subordinates apne local
transaction ko commit karte hain.
Subordinates "Ack" message bhejte hain, indicating ki unhone commit receive kiya hai.
Ans : AFS ka matlab ho sakta hai Andrew File System, jo ek distributed file system hai jo
Carnegie Mellon University me develop kiya gaya tha. AFS file sharing ke liye istemal hota hai
aur ye multiple clients ke beech me files ko share karta hai. Iska uddeshya files ko distributed
environment me accessible banane ka hai.
ai.
Distributed Architecture : AFS ek distributed file system hai jisme files ko multiple servers par
distribute kiya jata hai. Isme files ko multiple clients access kar sakte hain, jaise ki woh local files
ho.
Caching : AFS clients local caching ka istemal karte hain taki frequently accessed files ko local
storage me cache karke access kiya ja sake aur network traffic kam ho.
Security : AFS me security features shamil hain jaise ki access control lists (ACLs) aur encryption
taki sirf authorized users hi files ko access kar sake.
Scalability : AFS scalable hai aur large-scale distributed environments me easily deploy kiya ja
sakta hai.
Replication : AFS me replication ka concept hota hai jisme files ko multiple servers par replicate
kiya jata hai taki reliability aur fault tolerance improve ho.
Overall, AFS ek robust distributed file system hai jo files ko multiple clients ke beech me share
karne ke liye design kiya gaya hai.
# Fault Tolerance
Ans : Fault tolerance ka mtlab hai ki system ke kisi bhi hisse ka failure hone par bhi, system
apna operation continue kare aur users ko service provide kare, ideally without any noticeable.
Fault tolerance ke liye, redundancy, failure detection mechanisms, failover mechanisms, aur
recovery processes ka istemal kiya jata hai.
Failover Mechanism : Failover mechanism ek important part hai fault tolerance ka. Agar koi
component fail ho jata hai, toh failover mechanism use karke dusre healthy components ko
activate kiya jata hai taki system continue kare apna operation.
Recovery : Fault tolerance me recovery mechanism hota hai jise system ko wapas normal state
me laaya ja sake. Recovery process me data recovery, component replacement, aur system
reconfiguration shamil ho sakte hain.
Ans : Yeh terms distributed systems ya computer networks me failures aur communication
issues ko describe karne ke liye istemal kiye jaate hain:
Orphan Messages: Orphan messages un messages ko refer karte hain jo bheje gaye hote hain
lekin kisi reason se intended recipient tak pahunch nahi paate. Iska mukhya karan ho sakta hai
network failure, message corruption, ya communication channel me issue. Orphan messages ko
typically discard kiya jata hai ya phir unka retry kiya jata hai.
Lost Messages: Lost messages hote hain jab ek message bheja jata hai lekin wo destination tak
pahunch nahi paata. Iska karan ho sakta hai network congestion, hardware failure, ya
communication errors. Lost messages ko retrieve karna challenging ho sakta hai, aur iska
solution redundancy aur error correction techniques ka istemal karke kiya jaata hai.
Domino Effect: Domino effect ka concept hai ki ek failure ya issue ek se dusre components ko
bhi affect karta hai aur unme cascade of failures ya issues trigger karta hai. Jab ek component
fail hota hai, toh isse connected components par pressure badh jaata hai jisse wo bhi fail ho
sakte hain, aur is tarah se chain reaction shuru ho jaata hai.
Livelocks: Livelocks ek situation hote hain jahan system ka koi component ek deadlock situation
se bahar nahi nikal pata hai aur continuously retry karta rehta hai lekin kuch progress nahi hota
hai. Livelocks me, components apni actions ko repeat karte rahte hain lekin kisi resolution tak
nahi pahuchte. Iska solution usually deadlock detection aur resolution techniques ka istemal
karke kiya jaata hai.
# Agreement Protocal
Ans : Agreement protocol ek distributed systems me istemal hone wala protocol hai jo
multiple nodes ya processes ke beech me consensus achieve karne me madad karta hai. Iska
mukhya udeshya hai ki sabhi nodes ya processes ek common decision par agree kar sake, chahe
wo koi value ho, transaction ho, ya koi action ho.
Two-Phase Commit (2PC): Two-phase commit protocol ek agreement protocol hai jisme ek
coordinator node se subordinates tak 2 phases me communication hoti hai transaction commit
karne ke liye. Yeh protocol distributed databases me istemal hota hai.
Three-Phase Commit (3PC): Three-phase commit protocol ek extension hai 2PC ka jisme ek
extra phase add kiya gaya hai taki coordinator ki single point of failure (SPoF) aur deadlock ki
problems ko handle kiya ja sake.
Solution : Let's represent the state of the system using a table where each row represents a
site and each column represents the number of updates for each replica. Initially, each replica
has been updated 3 times, so the table will look like this:
d) E receives an update request and can only communicate with sites A, B, and C: