Distributed Systems
(4th edition, version 01)
Chapter 01: Introduction
Introduction From networked systems to distributed systems
Distributed versus Decentralized
What many people state
[Figure: three network organizations, labeled Centralized, Decentralized, and Distributed]
When does a decentralized system become distributed?
• Adding 1 link between two nodes in a decentralized system?
• Adding 2 links between two other nodes?
• In general: adding k > 0 links?
Alternative approach
Two views on realizing distributed systems
• Integrative view: connecting existing networked computer systems into a
larger system.
• Expansive view: an existing networked computer system is extended
with additional computers.
Two definitions
• A decentralized system is a networked computer system in which
processes and resources are necessarily spread across multiple
computers.
• A distributed system is a networked computer system in which processes
and resources are sufficiently spread across multiple computers.
Some common misconceptions
Centralized solutions do not scale
Distinguish between logically and physically centralized solutions. The root of
the Domain Name System is:
• logically centralized
• physically (massively) distributed
• decentralized across several organizations
Centralized solutions have a single point of failure
Generally not true (e.g., the root of DNS). A single point of failure is often:
• easier to manage
• easier to make more robust
Important
There are many poorly founded misconceptions regarding scalability, fault
tolerance, security, etc. We need to develop skills by which distributed systems
can be readily understood, so that we can judge such misconceptions.
Perspectives on distributed systems
Distributed systems are complex: take perspectives
• Architecture: common organizations
• Process: what kind of processes, and their relationships
• Communication: facilities for exchanging data
• Coordination: application-independent algorithms
• Naming: how do you identify resources?
• Consistency and replication: performance requires replication of data, and
the replicas need to be the same
• Fault tolerance: keep running in the presence of partial failures
• Security: ensure authorized access to resources
Introduction Design goals
What do we want to achieve?
Overall design goals
• Support sharing of resources
• Distribution transparency
• Openness
• Scalability
Sharing resources
Canonical examples
• Cloud-based shared storage and files
• Peer-to-peer assisted multimedia streaming
• Shared mail services (think of outsourced mail systems)
• Shared Web hosting (think of content distribution networks)
Observation
“The network is the computer”
(quote from John Gage, then at Sun Microsystems)
Distribution transparency
What is transparency?
The phenomenon by which a distributed system attempts to hide the fact that
its processes and resources are physically distributed across multiple
computers, possibly separated by large distances.
Observation
Distribution transparency is handled through many different techniques in a
layer between applications and operating systems: a middleware layer.
Types
Transparency   Description
Access         Hide differences in data representation and how an object is accessed
Location       Hide where an object is located
Relocation     Hide that an object may be moved to another location while in use
Migration      Hide that an object may move to another location
Replication    Hide that an object is replicated
Concurrency    Hide that an object may be shared by several independent users
Failure        Hide the failure and recovery of an object
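To make the first row concrete, here is a minimal sketch of access transparency for data representation, assuming a hypothetical record of two integers. The function names and the record layout are illustrative assumptions, not taken from the slides. The sender marshals values into one fixed wire format, so the receiver never sees the sender's native byte order or word size.

```python
import struct
from typing import Tuple

# Hypothetical wire format: unsigned 32-bit id followed by signed 64-bit amount.
# '!' selects network byte order (big-endian) with standard sizes and no padding,
# so the bytes do not depend on the sending machine's native representation.
WIRE_FORMAT = "!Iq"


def encode_record(user_id: int, balance_cents: int) -> bytes:
    """Marshal two integers into the machine-independent wire format."""
    return struct.pack(WIRE_FORMAT, user_id, balance_cents)


def decode_record(payload: bytes) -> Tuple[int, int]:
    """Unmarshal the wire format back into native Python integers."""
    user_id, balance_cents = struct.unpack(WIRE_FORMAT, payload)
    return user_id, balance_cents


wire = encode_record(42, -1999)
```

Both sides only agree on `WIRE_FORMAT`; how each machine stores integers internally stays hidden behind the two functions.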
Degree of transparency
Aiming at full distribution transparency may be too much
• There are communication latencies that cannot be hidden
• Completely hiding failures of networks and nodes is (theoretically and
practically) impossible
  • You cannot distinguish a slow computer from a failing one
  • You can never be sure that a server actually performed an operation
  before a crash
• Full transparency will cost performance, exposing distribution of the
system
  • Keeping replicas exactly up-to-date with the master takes time
  • Immediately flushing write operations to disk for fault tolerance
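The point that a slow computer cannot be told apart from a failing one can be illustrated with a small simulation. This is a sketch under stated assumptions: the three "servers" are hypothetical local functions standing in for remote nodes, and `call_with_timeout` is an illustrative helper, not an API from any middleware. The client can only wait for a bounded time, so a live but slow server and a crashed one produce the same observation.

```python
import queue
import threading
import time


def call_with_timeout(server, timeout_s):
    """Run `server` in a background thread; return its reply, or 'SUSPECTED'
    if no reply arrives within `timeout_s` seconds."""
    replies = queue.Queue()
    worker = threading.Thread(target=lambda: replies.put(server()), daemon=True)
    worker.start()
    try:
        return replies.get(timeout=timeout_s)
    except queue.Empty:
        # No reply in time: the client cannot tell why.
        return "SUSPECTED"


def fast_server():
    return "ok"


def slow_server():
    time.sleep(0.5)  # alive, merely slow
    return "ok"


def crashed_server():
    threading.Event().wait()  # simulated crash: never replies
    return "unreachable"


fast_result = call_with_timeout(fast_server, 1.0)
slow_result = call_with_timeout(slow_server, 0.05)
crashed_result = call_with_timeout(crashed_server, 0.05)
```

With these settings, `slow_result` and `crashed_result` are identical from the client's point of view. Choosing a longer timeout reduces false suspicions but makes the delay visible to the user, which is the performance cost mentioned above.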
Exposing distribution may be good
• Making use of location-based services (finding your nearby friends)
• When dealing with users in different time zones
• When it makes it easier for a user to understand what’s going on (e.g.,
when a server does not respond for a long time, report it as failing).
Conclusion
Distribution transparency is a nice goal, but achieving it is a different story, and
it should often not even be aimed at.
Openness of distributed systems
Open distributed system
A system that offers components that can easily be used by, or integrated into,
other systems. An open distributed system itself will often consist of
components that originate from elsewhere.
What are we talking about?
Be able to interact with services from other open systems, irrespective of the
underlying environment:
• Systems should conform to well-defined interfaces
• Systems should easily interoperate
• Systems should support portability of applications
• Systems should be easily extensible
Policies versus mechanisms
Implementing openness: policies
• What level of consistency do we require for client-cached data?
• Which operations do we allow downloaded code to perform?
• Which QoS requirements do we adjust in the face of varying bandwidth?
• What level of secrecy do we require for communication?
Implementing openness: mechanisms
• Allow (dynamic) setting of caching policies
• Support different levels of trust for mobile code
• Provide adjustable QoS parameters per data stream
• Offer different encryption algorithms
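The split between the first policy question (consistency of client-cached data) and the first mechanism (dynamic setting of caching policies) might be sketched as follows. All names here (`ClientCache`, `max_age`, `always_validate`) are hypothetical and chosen for illustration; time is passed in explicitly to keep the example deterministic. The cache is the mechanism; the freshness rule is a policy that can be replaced at run time without touching the cache code.

```python
from typing import Callable, Dict, Optional, Tuple

# A policy decides whether a cached entry of a given age (in seconds) may be used.
FreshnessPolicy = Callable[[float], bool]


def always_validate(age_s: float) -> bool:
    """Strict policy: never serve from the cache."""
    return False


def max_age(limit_s: float) -> FreshnessPolicy:
    """Relaxed policy: serve entries that are at most `limit_s` seconds old."""
    return lambda age_s: age_s <= limit_s


class ClientCache:
    """Mechanism: stores entries and consults whichever policy is installed."""

    def __init__(self, policy: FreshnessPolicy) -> None:
        self._policy = policy
        self._entries: Dict[str, Tuple[str, float]] = {}

    def set_policy(self, policy: FreshnessPolicy) -> None:
        # The policy can be swapped dynamically; the mechanism is unchanged.
        self._policy = policy

    def put(self, key: str, value: str, now: float) -> None:
        self._entries[key] = (value, now)

    def get(self, key: str, now: float) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, stored_at = entry
        # Return the value only if the current policy accepts its age.
        return value if self._policy(now - stored_at) else None


cache = ClientCache(max_age(10.0))
cache.put("page", "cached copy", now=0.0)
fresh_hit = cache.get("page", now=5.0)
stale_miss = cache.get("page", now=11.0)
cache.set_policy(always_validate)
strict_miss = cache.get("page", now=1.0)
```

Every additional policy hook of this kind is one more thing to configure, which is the trade-off between flexibility and management complexity.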
On strict separation
Observation
The stricter the separation between policy and mechanism, the more we need
to ensure proper mechanisms, potentially leading to many configuration
parameters and complex management.
Finding a balance
Hard-coding policies often simplifies management, and reduces complexity at
the price of less flexibility. There is no obvious solution.