CS 417 – DISTRIBUTED SYSTEMS
Week 3: Part 1
Clock synchronization
© 2023 Paul Krzyzanowski. No part of this
Paul Krzyzanowski content, may be reproduced or reposted in
whole or in part in any manner without the
permission of the copyright owner.
Synchronization
Synchronization covers interactions among distributed processes
Clocks Identify when something happened
Mutual exclusion Only one entity can do an operation at a time
Leader election Who coordinates activity?
Message consistency Does everyone have the same view of events?
Agreement Can everyone agree on a proposed value?
All of these are trivial in non-distributed systems
All of these are tricky in distributed systems
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 2
Clock Synchronization
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 3
Why?
• Allow a process to identify "now" in a way that's consistent with other
processes on other systems
– Scheduling jobs in process control environments
– Applications where time-based billing or access control is needed
– Certificate validation
• Temporal ordering of events from concurrent processes
– Event logging: debugging, root cause analysis, tracking breaches
– Consistent file modification times in shared file systems
– Identifying latest versions
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 4
Logical vs. physical clocks
• Physical clocks keep time of day
– Consistent across systems
• Logical clock keeps track of event ordering
– among related (causal) events
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 5
Synchronizing physical clocks
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 6
Problem: Get two systems to agree on time
• Why is it hard?
– Two clocks hardly ever agree
– Quartz oscillators oscillate at slightly different frequencies
• Clocks tick at different rates
– Create ever-widening gap in perceived time ⇒ Clock Drift
• Relative offset = Difference between two clocks at one point in time
• Jitter = Short-term variation in frequency
• Also note — astronomical time vs. relative time
– Time of day vs. count of seconds from epoch
(e.g., Unix/Linux counts seconds from 00:00:00 UTC on 1 January 1970)
– Time of day takes time zones, daylight saving time, leap seconds, etc. into account
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 7
Dealing with drift
Not good idea to set a clock back
– Illusion of time moving backwards can confuse message ordering and
software development environments
Apply gradual clock correction
If fast (ahead):
Make the clock run slower until it synchronizes
If slow (behind):
Make the clock run faster until it synchronizes
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 8
Dealing with drift
The OS can do this:
1. Redefine the rate at which system time is advanced with each interrupt
or
2. Read the counter but compensate for drift
Adjustment changes slope of system time:
Drift compensation via a linear compensation function
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 9
Compensating for a fast clock
Computer’s time, C Clock synchronized
offset
Drift compensation
function applied
e
tim
ct
rfe
pe
UTC time, t
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 10
Compensating for a fast clock
Computer’s time, C
Now we’re drifting again!
e
tim
ct
rfe
pe
UTC time, t
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 11
Resynchronizing
After synchronization period is reached
– Resynchronize periodically
– Successive adjustment of a drift compensation function can bring us closer
to true slope
Long-term clock stability is not guaranteed
The system clock will still drift based on changes in temperature, pressure,
humidity, and age of the crystal
Keep track of adjustments and apply continuously
– e.g., Linux adjtimex system calls and hwclock command
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 12
Going to sleep
• RTC keeps on ticking when the system is off (or sleeping)
• OS cannot apply correction continually
• Record time when going to sleep
– Read hardware clock on wake-up
– Estimate drift for the interval and apply a correction factor
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 13
Getting accurate time
• Attach GPS receiver to each computer
– Accurate to ~40 ns
• Not a practical solution for every machine
– Cost, power, convenience, environment
– Accuracy gets worse near buildings, bridges, trees, …
• Chip-scale atomic clock
– Nice, but around $2,000
– Most computers won’t have this either
– And if you have it, you still need to set it
to give you the right time
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 14
Synchronize from a time server
Simplest synchronization technique
– Send a network request to obtain the time
– Set the time to the returned value
what’s the time?
time
client
server
3:42:19
Does not account for network or processing latency
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 15
Cristian’s algorithm
Compensate for delays
– Note times:
• request sent: T0
• reply received: T1
– Assume network delays are symmetric
Tserver
server
request reply
client
T0 T1 time
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 16
Cristian’s algorithm
Client sets time to:
Tserver
server
request reply
client
T0 T1 time
𝑇! − 𝑇" = estimated overhead
2 in each direction
T1 −T0
T new =Tserver +
2
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 17
Error bounds
If the minimum message transit time (Tmin) is known:
Place bounds on accuracy of result
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 18
Error bounds
Tserver
server
request reply
client
T0 T1 time
Tmin Tmin
Earliest time message arrives Latest time message leaves
range = T1 - T0 - 2Tmin
T1 −T0
accuracy of result = ± −T min
2
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 19
Cristian’s algorithm: example
• Send request at 5:08:15.100 (T0)
Note:
1,000 ms = 1 s
• Receive response at 5:08:15.900 (T1) 1,000,000 µs = 1s
• Response contains 5:09:25.300 (Tserver)
Elapsed time is T1 -T0 = 5:08:15.900 - 5:08:15.100 = 800 ms
Best guess: timestamp was generated 400 ms ago
Set time to Tserver+ elapsed time = 5:09:25.300 + 0.400 = 5:09.25.700
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 20
Cristian’s algorithm: example
If best-case message time=200 ms
Tserver
T0 = 5:08:15.100
server
T1 = 5:08:15.900
request reply
Ts = 5:09:25:300
client
T0 T1 time Tmin = 200 ms
200 200
800
#""$!"" &""
Error = ± % − 200 = ± %
− 200 = ±200 𝑚𝑠
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 21
Berkeley Algorithm
Gusella & Zatti, 1989
• Designed for intranets (e.g., data centers)
• Assumes no machine has an accurate time source
• Obtains time from participating computers
• Synchronizes all clocks to a fault-tolerant average
– Select the largest set of time values that don’t differ from each other by some quantity
– Avoids averaging values of malfunctioning clocks or clocks that drifted too far
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 22
Berkeley Algorithm: example
leader
3:00
9:
: 25 2:50
10
3
3:25 2:50 9:10
1. Request timestamps from all followers
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 23
Berkeley Algorithm: example
leader
3:00
9:
: 25 2:50
10
3
3:25 2:50 9:10
Suppose max ∂=0:45
2. Compute fault-tolerant average:
3 : 25 + 2 : 50 + 3 : 00
= 3 : 05
3
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 24
Berkeley Algorithm: example
leader
3:00
-6
: 20 :0
5
-0 +0:15
3:25 2:50 9:10
3. Send offset to each client
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 25
Network Time Protocol, NTP
• 1991, 1992
– Internet Standard, version 3: RFC 1305
• June 2010
– Internet Standard, version 4: RFC 5905-5908
– IPv6 support
– Improve accuracy to tens of microseconds
– Dynamic server discovery
• November 2020
– Draft standard, NTP v5
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 26
NTP Goals
• Enable clients across Internet to be accurately synchronized to UTC despite
message delays
– Use statistical techniques to filter data and gauge quality of results
• Provide reliable service
– Survive lengthy losses of connectivity
– Redundant paths, redundant servers
• Provide scalable service
– Enable huge numbers of clients to synchronize frequently
– Offset effects of clock drift
• Provide protection against interference
– Authenticate source of data
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 27
NTP servers
clock clock 0
Arranged in strata
1
– Stratum 0 = master clock
– Stratum 1: systems connected 2
directly to accurate time source
– Stratum 2: systems synchronized 3
from 1st stratum systems
– … 4
– Stratum 15: systems synchronized
from 14th stratum systems
Synchronization Subnet
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 28
NTP Synchronization Modes
Broadcast (or multicast) mode Not supported in NTPv5
– Lower accuracy but efficient; for high-speed LANs
Client-server (procedure call) mode
– Cristian’s algorithm
Symmetric (peer-to-peer) mode Not supported in NTPv5
– Peer servers can synchronize with each other to provide mutual backup
• Usually used with stratum 1 & 2 servers
• Pair of servers retain data to improve synchronization over time
• Both devices act as requesters and responders – they operate in the same
stratum and the times converge to each other
All messages are delivered unreliably with UDP (port 123)
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 29
NTP Clock Quality
• Precision
– Smallest increment of time that can be read from the clock
• Jitter (dispersion)
– Difference in successive measurements
– Due to network delays, OS delays, and clock oscillator instability
• Accuracy
– How close is the clock to UTC?
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 30
NTP messages
• Procedure call and symmetric mode
– Messages exchanged in pairs: request & response
• Time encoded as a 64-bit value:
– Divide by 232 to get the number of seconds since Jan 1 1900 UTC
• NTP calculates:
– Offset for each pair of messages (θ): Estimate of time difference between two clocks
= correction that needs to be applied to a client clock to synchronize it
– Delay (δ): Travel time: ½ of total delay minus remote processing time
– Dispersion = jitter: Maximum offset error relative to reference clock
• Found via repeated synchronizations
• Use this data to find the preferred NTP server:
– Probe multiple servers – each several times
– Pick lowest dispersion … at the lowest stratum if tied
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 31
Simple Network Time Protocol (SNTP)
• Based on Unicast mode of NTP
– Subset of NTP, not new protocol
• Operates in multicast or procedure call mode
• Recommended for environments where server is root node and client is
leaf of synchronization subnet
• Root delay, root dispersion, reference timestamp ignored
• Does not
– Account for servers that provide inconsistent time stamps
– Detect malicious interference
v3 RFC 2030, October 1996; v4 RFC 5905, June 2010
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 32
SNTP Example
T2 T3
server
request reply
client
T1 T4 time
Round-trip network delay: Time offset:
(𝑇% −𝑇! ) + (𝑇' − 𝑇( )
𝜕 = 𝑇( − 𝑇! − (𝑇' − 𝑇% ) 𝑡=
2
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 33
SNTP example
T2=800 T3=850
server
request reply
time
client
T1=1100 T4=1200
Offset = ((800 - 1100) + (850 - 1200)) / 2
= ((-300) + (-350)) / 2 (𝑇!−𝑇") + (𝑇# − 𝑇$)
= -650 / 2 = -325 Time offset: 𝑡 =
2
Set time to T4 + t = 1200 - 325 = 875
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 34
SNTP = Cristian’s algorithm
T2 T3
server
request reply
Ts time
client
T1 T4
𝑇" + 𝑇#
Just define 𝑇! =
2
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 35
Key Points: Physical Clocks
• Cristian’s algorithm & SNTP
– Set clock from server
– But account for network delays
– Error: uncertainty due to network/processor latency
• Errors are additive
• Example: ±10 ms and ±20 ms = ±30 ms
– NTP: track jitter, error, stratum, and delay among several NTP servers to
choose the best one to synchronize from
• Adjust for local clock drift
– Linear compensation function
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 36
Precision Time Protocol
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 37
More accurate clock synchronization
• Why?
– Industrial process control: synchronizing equipment
– High frequency trading
– Power-grid controls
– Audio/video sync
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 38
PTP: IEEE 1588 Precision Time Protocol
• Designed to synchronize clocks on a LAN to sub-microsecond precision
– Designed for LANs, not global: low jitter, low latency
– Timestamps ideally generated at the MAC or PHY layers to minimize delay and jitter
• Determine master clock (called the Grandmaster)
– Use a Best Master Clock algorithm to determine which clock is most precise
– The Grandmaster sends periodic synchronization messages to others (slave devices)
• Two phases in synchronization
1. Offset correction
2. Delay correction
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 39
PTP: Choose the “best” clock
Best Master Clock
• Distributed election based on properties of clocks
• Criteria from highest to lowest:
– Priority 1 (admin-defined hint)
– Clock class
– Clock accuracy
– Clock variance: estimate of stability based on past syncs
– Priority 2 (admin-defined hint #2)
– Unique ID (tie-breaker)
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 40
PTP: Master initiates sync
master T1
time
syn
c
slave
T2 time
Master initiates the protocol by sending a sync message containing a timestamp
Slave timestamps arrival with a timestamp from its local clock
Offset + Delay = T2 - T1
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 41
PTP: Send delay request
master T1 T4
time
st
ue
Offset + Delay = T2 - T1
syn
req
c
lay
de
slave
T2 T3 time
Slave needs to figure out the network delay. Send a delay request
Note the time it was sent
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 42
PTP: Receive delay response
master T1 T4
time
de
st
lay
ue
Offset + Delay = T2 - T1
syn
req
res
c
po
lay
nse
de
slave
T2 T3 time
Master marks the time of arrival and returns it in a delay response
Delay response = Delay - Offset = T4 – T3
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 43
PTP: Slave computes offset
master T1 T4
time
de
st
lay
ue
Delay + Offset = T2 - T1
syn
req
res
c
Delay - Offset = T4 - T3
po
lay
nse
de
slave
T2 T3 time
master_slave_difference = T2 – T1 = delay + offset
- slave_master_difference = T4 – T3 = delay – offset
master_slave_difference – slave_master_difference = 2(offset)
(T2 – T1) – (T4 – T3) = T2 – T1 – T4 + T3 = 2(offset)
offset = ( T2 – T1 – T4 + T3 ) ÷ 2
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 44
PTP: Example
825 865 885 925 950 990 Time at the master
40 40 40
master T1 T4
time
de
20
st
lay
ue
syn
delay = 40
req
res
offset = 235
po
lay
... but we don’t know this yet
nse
de
slave
T2 T3 time
1060 1100 1120 1160 1185 1225
T2 – T1 = 1100-825 = 275 = delay + offset
T4 – T3 = 925-1120 = -195 = delay – offset
master_slave_difference = T2 – T1 = delay + offset
275 – (-195) = 470 = 2(offset)
slave_master_difference = T4 – T3 = delay – offset
offset = 470/2 = 235
master_slave_difference – slave_master_difference = 2(offset) Time is set to 1225 - offset
offset = (T2 – T1 – T4 + T3) ÷ 2 = 1225 – 235 = 990
when we receive last msg
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 45
NTP vs. PTP
• Range
– NTP: nodes widely spread out on the Internet
– PTP: local area networks
• Often implemented at the physical layer to eliminate OS & scheduling overhead
• Accuracy
– NTP usually several milliseconds on WAN
– PTP usually sub-microsecond on LAN (around 1 µs)
• PTP can be 10,000x more precise than NTP!
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 46
The End
February 5, 2023 CS 417 © 2023 Paul Krzyzanowski 47