Distributed Systems
CH – 05
Time and State in Distributed
Systems
Baikuntha Acharya (baikunth2a@gmail.com)
Lecturer, Sagarmatha Engineering College, Sanepa, Lalitpur
© Baikuntha Acharya (baikunth2a@gmail.com)
Time in Distributed Systems
Problems
✓ There is no notion of a single global
physical time:
• Each machine in a DS has its own clock
• Crystals tick at different rates → Clock drift
• Attach GPS receiver → Accuracy ~40ns
✓ Time triggered systems:
• Certain activities are scheduled to occur at predefined moments in time.
• For the coordination over a DS we need a coherent notion of time.
• Example: time-triggered real-time systems
✓ Maintaining the consistency of distributed data is often based on:
• the time when a certain modification has been performed.
2
© Baikuntha Acharya (baikunth2a@gmail.com)
Time in Distributed Systems (Cont..)
✓ Consistency Problem: The make-program example:
• When the programmer has finished changing some source files he starts make;
make examines the times at which all object and source files were last modified
and decides which source files have to be (re)compiled.
• Although P.c is modified after P.o has been generated, because of the clock drift
the time assigned to P.c is smaller. → P.c will not be recompiled for the new
version!
3
© Baikuntha Acharya (baikunth2a@gmail.com)
Time in Distributed Systems (Cont..)
Types of Clock
✓ Physical Clocks
• Tied to the notion of real time
• Can be used to order events
✓ Logical Clocks
• Derived from the notion of potential cause-effect between events
• Not tied to the notion of real time
• Can be used to order events
• Different types
• Lamport's Logical Clock
• Vector Clocks
• …..
4
© Baikuntha Acharya (baikunth2a@gmail.com)
Time in Distributed Systems (Cont..)
Solutions
✓ Synchronization of physical clocks:
• Physical clock synchronization is needed for distributed real-time systems.
• Computer clocks can be synchronized with known degree of accuracy.
• Within the bounds of this accuracy we can coordinate activities.
✓ Logical clocks:
• Physical time might not be important - what is important is the relative order of
events!
• Relative ordering is based on a virtual notion of time - logical time.
• The order of events occurring at different processes is critical for many
distributed applications.
5
© Baikuntha Acharya (baikunth2a@gmail.com)
Time in Distributed Systems (Cont..)
Solutions (Cont..)
✓ Ordering can be based on two simple situations:
• If two events occurred in the same process then they occurred in the order
observed following the respective process;
• Whenever a message is sent between processes, the event of sending the message
occurred before the event of receiving it.
6
© Baikuntha Acharya (baikunth2a@gmail.com)
Physical Clock Synchronization
Synchronize with Time Server
✓ Simplest synchronization technique:
• Send a network request to obtain the time
• Set the time to the returned value.
✓ Do not consider the network and processing latency.
7
© Baikuntha Acharya (baikunth2a@gmail.com)
Physical Clock Synchronization (Cont..)
Cristian's Algorithm
✓ Based on time server considering the compensation for delays.
• T0 → Request sent time
• T1 → Reply received time
✓ Assume network delays are symmetric.
✓ Assume both T0 and T1 are measured with the same clock
✓ Requires a special node with time source → Single point of failure
8
© Baikuntha Acharya (baikunth2a@gmail.com)
Physical Clock Synchronization (Cont..)
Berkeley Algorithm
✓ Centralized as in Cristian’s, but the time server is active.
✓ Assumes no machine has an accurate time source.
✓ Time Daemon asks all other machines for their clock values:
• Computes average time
• Tells other machines how to adjust their clocks
✓ No separate time source needed (No single point of failure as in Cristian’s)
✓ Can elected a leader:
• Acts as time server
• Fault tolerant (can be elected new)
✓ Issues:
• Load in central server
• Variation in message delay in large networks
9
© Baikuntha Acharya (baikunth2a@gmail.com)
Physical Clock Synchronization (Cont..)
Network Time Protocol (NTP)
✓ Problem: Cristian algorithm → Not scalable for internet-scale synchronization
✓ Solution: Use a hierarchical approach arranging in strata.
✓ Enables clients across Internet to be synchronized to UTC despite message
delays.
✓ Provides reliable service using redundant paths and server.
• A requests time of B at its own T1
• B receives request at its T2, records T2
• B responds at its T3, sending values of T2 and T3
• A receives response at its T4
• Question: what is θ = TB – TA?
• Assume transit time is approximately the same both ways
• Assume that B is the time server that A wants to synchronize to
10
© Baikuntha Acharya (baikunth2a@gmail.com)
Physical Clock Synchronization (Cont..)
Network Time Protocol (NTP) (Cont..)
✓ A knows (T4 – T1) from its own clock.
✓ B reports T3 and T2 in response to NTP request.
✓ A computes total transit time of: 𝑇4 − 𝑇1 − 𝑇3 − 𝑇2
✓ One-way transit time, approx. ½ of total:
𝑇4 −𝑇1 − 𝑇3 −𝑇2
•
2
✓ B’s clock at T4 reads approximately:
𝑇4 −𝑇1 − 𝑇3 −𝑇2 𝑇4 −𝑇1 − 𝑇2 +𝑇3
• 𝑇3 + =
2 2
✓ Difference between B and A clocks at T4 is:
𝑇4 −𝑇1 − 𝑇2 +𝑇3 𝑇2 −𝑇1 + 𝑇3 −𝑇4
• θ= − 𝑇4 =
2 2
11
© Baikuntha Acharya (baikunth2a@gmail.com)
Physical Clock Synchronization (Cont..)
Network Time Protocol (NTP) (Cont..)
✓ Arranged in strata:
• Stratum 0 → Master clock
• 1st stratum: systems connected directly to accurate time source (Satellite, radio, etc)
• 2nd stratum: systems synchronized from 1st stratum systems
• …..
• 15th stratum: systems synchronized from 14th stratum systems
✓ NTP Synchronization Modes
• Multicast mode
• lower accuracy but efficient for high speed LANs
• Procedure call mode
• Cristian’s algorithm All messages are delivered unreliably with UDP (port 123)
• Symmetric mode
• Peer systems can synchronize with each other (usually used with stratum 1 & 2 servers)
12
© Baikuntha Acharya (baikunth2a@gmail.com)
Problems with Physical Time
✓ Users can mess up clocks
• (and forget to set their time zones!)
✓ Unpredictable delays in Internet.
✓ Relativistic issues:
• If A and B are far apart physically, and two events TA and TB are very close in
time, then which comes first? how do you know?
13
© Baikuntha Acharya (baikunth2a@gmail.com)
Logical Clocks
Lamport’s Logical Clocks
✓ Not “clocks” at all.
• monotonically increasing software counter.
• There is a logical clock CPi at each process Pi in the system.
• The value of the logical clock is used to assign timestamps to events.
• CPi(a) is the timestamp of event a in process Pi.
• There is no relationship between a logical clock and any physical clock.
Without Logical Clock With Logical Clock 14
© Baikuntha Acharya (baikunth2a@gmail.com)
Lamport’s Logical Clocks
✓ Ordering by Lamport is based on the happened before relation
(denoted by →):
• a → b, if a and b are events in the same process and a occurred before b;
• a → b, if a is the event of sending a message m in a process, and b is the event of
the same message m being received by another process;
• If a → b and b → c, then a → c (the relation is transitive).
• If a → b, event a causally affects event b. The two events are causally related.
• If both a → e and e → a are false, then a and e are concurrent events; we write a || e.
• To capture the happened-before relation, logical clocks have to be implemented so
that: if a → b, then C(a) < C(b).
• To capture the happened-before relation, if a → b, then C(a) < C(b)
15
© Baikuntha Acharya (baikunth2a@gmail.com)
Implementation - Lamport’s Logical Clocks
✓ [R1]:
• CPi is incremented before each event is issued at process Pi : CPi := CPi + 1.
✓ [R2]:
• a) When a is the event of sending a message m from process Pi, then the timestamp tm =
CPi(a) is included in m (CPi(a) is the logical clock value obtained after applying rule R1).
• b) On receiving message m by process Pj, its logical clock CPj is updated as follows: CPj :=
max(CPj, tm).
• c) The new value of CPj is used to timestamp the event of receiving message m by Pj
(applying rule R1).
✓ If a and b are events in the same process
• a occurred before b, then a → b, and (by R1) C(a) < C(b).
✓ If a is the event of sending a message, and b is the event of received by
another process:
• then a → b, and (by R2) C(a) < C(b).
✓ If a → b and b → c, then a → c, and (by induction) C(a) < C(c).
16
© Baikuntha Acharya (baikunth2a@gmail.com)
Lamport’s Logical Clocks (Cont..)
✓ For the make-program example:
• The compilation process notifies to the editor process, through a message about
the event P.o created ⇒ a logical clock can be used to correctly timestamp the
files.
17
© Baikuntha Acharya (baikunth2a@gmail.com)
Lamport’s Logical Clocks (Cont..)
Problems
✓ It impose only a partial order on the set of events;
• pairs of distinct events generated by different processes can have identical
timestamp.
• For certain applications a total ordering is needed;
• they consider that no two events can occur at the same time.
• In order to enforce total ordering a global logical timestamp is
introduced:
• A pair (CPi(a), i), where i is an identifier of process Pi;
• We define:
• (CPi(a), i) < (CPj(b), j) if and only if
• CPi(a) < CPj(b), or CPi(a) = CPj(b) and i < j.
18
© Baikuntha Acharya (baikunth2a@gmail.com)
Lamport’s Logical Clocks (Cont..)
Problems (Cont..)
✓ Not powerful enough to perform a causal ordering of events.
• if a → b, then C(a) < C(b).
• However, the reverse is not always true (if the events occurred in different processes):
• if C(a) < C(b), then a → b is not necessarily true. (it is only guaranteed that b → a is not
true).
✓ From the diagram:
• C(e) < C(b), however there is no causal
relation from event e to event b.
• By just looking at the timestamps of the
events, we cannot say whether two events
are causally related or not.
19
© Baikuntha Acharya (baikunth2a@gmail.com)
Lamport’s Logical Clocks (Cont..)
Problems (Cont..)
✓ Need to process message according to their causal order.
• we use the associated timestamp for this purpose.
✓ Process P3 receives messages M1,
M2, and M3.
• send(M1) → send(M2), send(M1) → send(M3),
send(M3) || send(M2)
✓ M1 has to be processed before M2
and M3.
• However, P3 has not to wait for M3 in order to
process it before M2 (although M3’s logical
clock timestamp is smaller than M2’s).
20
© Baikuntha Acharya (baikunth2a@gmail.com)
Vector Clocks
✓ Able to perform a causal ordering of events using their timestamp.
✓ Each process Pi has a clock 𝐶𝑝𝑣𝑖 which is an integer vector of length
n (n is the number of processes).
✓ 𝐶𝑝𝑣𝑖 (𝑎) is the timestamp of event a in process Pi.
✓ 𝐶𝑝𝑣𝑖 𝑖 is the ith entry of 𝐶𝑝𝑣𝑖 .
✓ 𝐶𝑝𝑣𝑖 𝑗 , 𝑗 ≠ i, is the (logical) time of occurrence of the last event at Pj
• It is the “best guess” of the logical time at Pj.
21
© Baikuntha Acharya (baikunth2a@gmail.com)
Vector Clocks (Cont..)
Implementation
✓ [R1]: 𝐶𝑝𝑣𝑖 is incremented before each event is issued at process Pi:
• 𝐶𝑝𝑣𝑖 [𝑖] ≔ 𝐶𝑝𝑣𝑖 𝑖 + 1
✓ [R2]:
a) When a is the event of sending a message m
from process Pi , then the timestamp:
𝑡𝑚 = 𝐶𝑝𝑣𝑖 [𝑖] is included in m.
b) On receiving message m by process Pj , its
vector clock 𝐶𝑝𝑣𝑗 is updated as follows:
𝐶𝑝𝑣𝑗 𝑘 ≔ max 𝐶𝑝𝑣𝑗 𝑘 , 𝑡𝑚 𝑘 , ∀𝑘𝜖 1,2, . . 𝑛 .
c) The new value of 𝐶𝑝𝑣𝑗 is used to timestamp
the event of receiving message m by Pj
22
© Baikuntha Acharya (baikunth2a@gmail.com)
Vector Clocks (Cont..)
Causal Ordering from Timestamp
✓ For any two vector timestamps u and v, we have:
✓ Two events a and b are causally related if and only if
Cv(a) < Cv(b) or Cv(b) < Cv(a). Otherwise the events are
concurrent.
✓ Provides causal ordering of events:
• a → b if and only if Cv (a) < Cv (b).
• Timestamps can be used to determine causal order.
✓ If Send(M1) → Send(M2)
• then every recipient of both messages M1 and M2 must receive M1 before M2.
23
© Baikuntha Acharya (baikunth2a@gmail.com)
References
✓ Dawadi, B.R. "OpratingSystem-MS". Retrieved from: baburd.com.np
24
© Baikuntha Acharya (baikunth2a@gmail.com)