DC - Unit 1 - Introduction Final

The document outlines the syllabus for a course on Distributed Computing, covering key topics such as introduction to distributed systems, logical time, distributed mutual exclusion, consensus and recovery, and cloud computing. It discusses the characteristics, motivations, benefits, applications, and drawbacks of distributed computing, as well as various communication primitives and algorithms. Additionally, it provides references and textbooks for further reading on the subject.


CS3551 - DISTRIBUTED COMPUTING

UNIT I INTRODUCTION 8
Introduction: Definition – Relation to Computer System Components – Motivation – Message-Passing Systems versus Shared Memory Systems – Primitives
for Distributed Communication – Synchronous versus Asynchronous Executions – Design Issues and Challenges; A Model of Distributed Computations: A
Distributed Program – A Model of Distributed Executions – Models of Communication Networks – Global State of a Distributed System.
UNIT II LOGICAL TIME AND GLOBAL STATE 10
Logical Time: Physical Clock Synchronization: NTP – A Framework for a System of Logical Clocks – Scalar Time – Vector Time; Message Ordering and Group
Communication: Message Ordering Paradigms -Asynchronous Execution with Synchronous Communication – Synchronous Program Order on Asynchronous
System – Group Communication – Causal Order – Total Order; Global State and Snapshot Recording Algorithms: Introduction – System Model and Definitions
– Snapshot Algorithms for FIFO Channels.
UNIT III DISTRIBUTED MUTEX AND DEADLOCK 10
Distributed Mutual Exclusion Algorithms: Introduction – Preliminaries – Lamport’s Algorithm – Ricart–Agrawala’s Algorithm – Token-Based Algorithms –
Suzuki-Kasami’s Broadcast Algorithm; Deadlock Detection in Distributed Systems: Introduction – System Model – Preliminaries – Models of Deadlocks –
Chandy-Misra-Haas Algorithm for the AND model and OR Model.
UNIT IV CONSENSUS AND RECOVERY 10
Consensus and Agreement Algorithms: Problem Definition – Overview of Results – Agreement in a Failure-Free System (Synchronous and Asynchronous) –
Agreement in Synchronous Systems with Failures; Checkpointing and Rollback Recovery: Introduction – Background and Definitions – Issues in Failure
Recovery – Checkpoint-based Recovery – Coordinated Checkpointing Algorithm – Algorithm for Asynchronous Checkpointing and Recovery

UNIT V CLOUD COMPUTING 7


Definition of Cloud Computing – Characteristics of Cloud – Cloud Deployment Models – Cloud Service Models – Driving Factors and Challenges of Cloud –
Virtualization – Load Balancing – Scalability and Elasticity – Replication – Monitoring – Cloud Services and Platforms: Compute Services – Storage Services –
Application Services
CS3551 - DISTRIBUTED COMPUTING

TEXT BOOKS
1. Kshemkalyani Ajay D, Mukesh Singhal, “Distributed Computing: Principles, Algorithms and
Systems”, Cambridge Press, 2011.
2. Mukesh Singhal, Niranjan G Shivaratri, “Advanced Concepts in Operating Systems”, McGraw-Hill
Publishers, 1994.

REFERENCES
1. George Coulouris, Jean Dollimore, Tim Kindberg, “Distributed Systems: Concepts and Design”, Fifth
Edition, Pearson Education, 2012.
2. Pradeep K Sinha, “Distributed Operating Systems: Concepts and Design”, Prentice Hall of India, 2007.
3. Tanenbaum A S, Van Steen M, “Distributed Systems: Principles and Paradigms”, Pearson Education, 2007.
4. Liu M L, “Distributed Computing: Principles and Applications”, Pearson Education, 2004.
5. Nancy A Lynch, “Distributed Algorithms”, Morgan Kaufman Publishers, 2003.
6. Arshdeep Bahga, Vijay Madisetti, “Cloud Computing: A Hands-On Approach”, Universities Press,
2014.

UNIT I – INTRODUCTION
Introduction
 Father: Leslie B. Lamport
 A sibling discipline of operating systems (OS)
 OS and DC
 Need for Distributed computing
 - Efficiency
 - Fault Tolerance
 - Scalability
 Distributed computing and parallel computing
 Distributed computing and Distributed system
Definition and Characteristics

• Definition
A distributed computing system is a collection of loosely coupled computers, each with its own
memory and operating system, that communicate by message passing over a communication
network and cooperate to solve a problem collectively.

• Characteristics
1. No common physical clock - independent start time
2. No shared memory - Direct or indirect communication through message passing
3. Geographical separation
- Resides at any location communicating through networking
- NOW / COW ( Network / Cluster of Workstation)
4. Autonomy and heterogeneity - Nodes are autonomous, may run at different speeds, and may use different operating systems
Motivation / Benefits

1) Inherently distributed computation

Some computations are inherently distributed because they involve communication between
geographically distant parties, e.g., money transfer in banking.

2) Resource sharing
• Datasets are too large to be held at a single site and are spread across several servers, e.g.,
a distributed database such as DB2
• Replication of resources at multiple sites is also possible

3) Access to geographically remote data and resources

• Database - e.g., the payroll data of an MNC held at a remote central site
• Supercomputer - an expensive resource that remains centralized and is accessed remotely
• Wireless technology - enables such remote access from mobile devices
Motivation / Benefits
4) Enhanced reliability
• Availability - the resource should be accessible at all times
• Integrity - the value/state of the resource must remain correct despite concurrent access
• Fault tolerance - the ability to recover from failures (a backup must exist)

5) Increased performance / cost ratio

• Distributing the computation across nodes usually gives a better performance-to-cost ratio
than a single centralized system

6) Scalability
• The system continues to serve requests as the number of users (the load) increases
Motivation / Benefits

7) Modularity and incremental expandability


• Modularity - each node can have a different configuration; heterogeneous processors may coexist
• Incremental expandability - nodes can be added, replaced, or upgraded gradually
Applications

 Applications
1) Finance and commerce - Amazon
2) Information society - Search engines
3) Cloud technologies - AWS
4) Entertainment - Online gaming, YouTube
5) Healthcare - Online patient records
6) Education - E-learning
7) Transport and logistics - Google Maps
8) Environment management - Sensor technologies
Unique Drawbacks

1) Communication
• Communication does not come for free
• There is a limit on the quantity of information that can be sent
• There is a limit on the speed at which information can be sent
• The cost of transmission is high

2) Incomplete knowledge
• Each processor has only limited knowledge of the system and of the computations carried out
by other processors, i.e., it does not know, for example, the size of the input held at another
processor
Relation to computer system components

A distributed system connects processors by a communication network

• The structure of a distributed computing system is a set of computers (nodes), each with its
own processor and memory, all connected via a WAN / LAN.
• The communication network carries the information transfer between the nodes.
Relation to computer system components

Interaction of software components at each processor

• Network protocol stack - application-layer protocols such as HTTP, mail, FTP, and telnet

• Middleware - the software layer that holds the libraries required by distributed computing
  - Examples: OMG (Object Management Group) CORBA - Common Object Request Broker
    Architecture, RPC - Remote Procedure Call, MPI - Message Passing Interface
Relation to computer system components

Interaction of software components at each processor


• At each processor, the middleware communicates with the network protocol stack and the OS.
• Application-layer functions (such as HTTP, mail, FTP, telnet) are not part of the middleware.
• Middleware libraries provide services such as reliable and ordered multicasting.
Message passing system versus shared memory system

In a single-processor system:

Message passing - communication through a common message queue
Shared memory - communication through a common memory area
Message passing system versus shared memory system

• In conventional (single-machine) systems, shared memory is the usual mechanism:
  - Communication takes place through shared data variables and control variables.
  - Semaphores and monitors are used to achieve synchronization on the shared memory.
• Distributed computing
  Processes communicate by message passing; the shared-memory abstraction built on top of
  message passing is called distributed shared memory (DSM).
  Message passing uses two primitives:
  send (dest_name, message)
  receive (source_name, message)
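A minimal sketch (not the textbook's API) of these two primitives, using Python's standard library on a single machine: one queue per destination plays the role of the channel, and the names send/receive/producer/consumer are illustrative assumptions.

from multiprocessing import Process, Queue

def send(dest_queue, message):
    dest_queue.put(message)              # copy the message onto the destination's channel

def receive(own_queue):
    return own_queue.get()               # block until a message is available

def producer(q_consumer):
    send(q_consumer, "transfer $100")

def consumer(q_consumer):
    print("received:", receive(q_consumer))

if __name__ == "__main__":
    channel = Queue()                    # acts as the channel C_ij between the two processes
    p1 = Process(target=producer, args=(channel,))
    p2 = Process(target=consumer, args=(channel,))
    p1.start(); p2.start(); p1.join(); p2.join()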
Emulating Message Passing on a Shared Memory system
(MP → SM)

• The shared address space is divided into multiple parts (partitions).
• Each partition is assigned to a different processor.
• The message-passing primitives
  send (dest_name, message)
  receive (source_name, message)
  have to be emulated using the shared-memory primitives Read and Write.
  - Send is emulated as a Write to a separate location reserved in the receiver's partition
    as a mailbox.
  - Receive is emulated as a Read from that mailbox.
• The Read and Write operations have to be synchronized.
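A hedged sketch of this emulation: Python threads stand in for the processors, a plain list stands in for the shared address space, and one slot per process acts as the mailbox; the helper names and the single-slot mailbox are illustrative assumptions.

import threading

N = 3                                    # number of processes sharing the memory
shared = [None] * N                      # shared address space: one mailbox slot per process
cond = threading.Condition()             # Read and Write must be synchronized

def send(dest, message):                 # Send emulated as a synchronized Write
    with cond:
        while shared[dest] is not None:  # wait until the destination mailbox is free
            cond.wait()
        shared[dest] = message
        cond.notify_all()

def receive(own_id):                     # Receive emulated as a synchronized Read
    with cond:
        while shared[own_id] is None:    # wait until a message has been written
            cond.wait()
        msg, shared[own_id] = shared[own_id], None
        cond.notify_all()
        return msg

receiver = threading.Thread(target=lambda: print("P1 got:", receive(1)))
receiver.start()
send(1, "hello")                         # P0 "sends" by writing into P1's mailbox
receiver.join()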
Emulating Shared Memory on a Message Passing system
(SM → MP)

• Read is emulated as Receive and Write is emulated as Send

• Each shared location is modeled as a separate process

(a) A Write to a shared location is emulated by sending an update message to the
corresponding owner process.

(b) A Read from a shared location is emulated by sending a query message to the
corresponding owner process.

• With this shared-memory emulation, the latency involved in read and write operations may be
high, because each access requires a message round trip to the owner process.
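A hedged sketch of the owner-process idea: a Python thread stands in for the owner of one shared location x, and the (op, payload, reply) message format is an illustrative assumption, not the book's notation.

import queue, threading

requests = queue.Queue()                           # channel carrying messages to the owner of x

def owner_of_x():
    value = 0                                      # the state of the shared location x
    while True:
        op, payload, reply = requests.get()
        if op == "write":                          # Write emulated by an update message
            value = payload
        elif op == "read":                         # Read emulated by a query message
            reply.put(value)
        elif op == "stop":
            break

threading.Thread(target=owner_of_x, daemon=True).start()

requests.put(("write", 42, None))                  # client writes x := 42 purely by message passing
answer = queue.Queue()
requests.put(("read", None, answer))               # client reads x purely by message passing
print("x =", answer.get())                         # each access costs one message round trip
requests.put(("stop", None, None))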
Emulating Shared Memory on a Message Passing system
(SM → MP)

• An application can use combination of shared memory and message passing.

• In a MIMD (Multiple Instruction, Multiple Data) message-passing multicomputer system, each
node may itself be a multiprocessor:
  - Within a node, the processors communicate via shared memory.
  - Communication between two computers (nodes) is done using message passing.
Difference between Shared Memory and Message Passing

• Purpose: Shared memory is mainly used for data communication between processes on the same
machine. Message passing is mainly used for communication (and synchronization) between
processes.

• Speed: Shared memory offers the maximum speed of computation because communication is
completed via the shared memory; system calls are required only to establish the shared memory
region. Message passing takes much longer because every exchange is performed via the kernel
(system calls).

• Programming effort: With shared memory, the code for reading and writing the data must be
written explicitly by the developer. With message passing, no such code is required because
the message-passing facility itself provides communication and synchronization of the
activities executed by the communicating processes.
Difference between Shared Memory and Message Passing

• Setting: Shared memory is used on single-processor and multiprocessor systems in which the
communicating processes reside on the same machine and share the same address space. Message
passing is most commonly used in a distributed setting where the communicating processes are
spread over multiple devices linked by a network.

• Speed: Shared memory is a faster communication strategy than message passing, which is
relatively slower.

• Safety: With shared memory, care must be taken that processes are not writing to the same
address simultaneously. Message passing is useful for sharing small quantities of data without
causing conflicts.
Primitives for Distributed Communication
• Message sending and receiving primitives : Send() and Receive()

• Parameters of Send() primitive :


(a) Destination
(b) Buffer in user space, containing the data to be sent.

• Two ways of sending data when Send() is invoked:


(a) Buffered - the data is copied from the user buffer to a kernel buffer, and later from the
kernel buffer onto the network.

(b) Unbuffered - the data is copied directly from the user buffer onto the network.
Primitives for Distributed Communication

• Parameters of Receive() primitive :


(a) Source from which the data is to be received
(b) User buffer into which the data is to be received

• Way of receiving data when Receive() is invoked:

(a) Buffered - the Receive is typically buffered, since the data may already have arrived
before the primitive is invoked and then needs a storage place in the kernel.
Primitives for Distributed Communication

Types of primitives

1. Synchronous primitives ( Send/Receive - A cares for B and B cares for A )

• Handshake between sender and receiver


• Send completes when Receive completes
• Receive completes when data copied into buffer

2. Asynchronous primitives ( Send - A cares only about itself )


A Send primitive is said to be asynchronous if control returns back to the invoking
process after the data item to be sent has been copied out of the user-specified buffer.
Primitives for Distributed Communication

3. Blocking primitives ( Send/Receive - waits until its own task completes )


A primitive is blocking if control returns to the invoking process after the
processing for the primitive (whether in synchronous or asynchronous mode)
completes.
- Synchronous messaging
- Asynchronous messaging

4. Nonblocking primitives (Send/Receive - Not bothered about its own task )


• A primitive is non-blocking if control returns back to the invoking process
immediately after invocation, even though the operation has not completed.
• Send: even before data copied out of user buffer
• Receive: even before data may have arrived from sender
Primitives for Distributed Communication

• For nonblocking primitives, a return parameter on the primitive call returns a
system-generated handle, which can be used to check the status of completion of the call.
Checking can be done in two ways:

1. Polling - the process keeps checking (in a loop, or periodically) whether the handle has
   been posted
   Send(X, destination, handle_k)    // handle_k is a return parameter

2. Wait with handles - the process issues a Wait with a list of handles as parameters
   Wait(handle_1, handle_2, ..., handle_k, ..., handle_m)    // Wait always blocks
   The Wait call blocks until one of the stipulated handles is posted.

• The kernel detects the completion of the processing of the primitive and posts it to the
corresponding handle.

Primitives for Distributed Communication

• Four versions of Send primitive


Primitives for Distributed Communication

• All the versions are illustrated using timing diagram


• A timing diagram has three timelines:
1. Process execution
2. User buffer from / to which data is sent / received
3. Kernel / communication subsystem
Primitives for Distributed Communication
Pi is sender and Pj is receiver
(a) Blocking synchronous Send and blocking synchronous Receive
(b) Non-blocking synchronous Send and non-blocking synchronous Receive
Primitives for Distributed Communication
(c) Blocking asynchronous Send
(d) Non-blocking asynchronous Send
[Timing diagrams illustrating versions (a)-(d) appear in the accompanying figures.]
Primitives for Distributed Communication
Uses:
• A synchronous Send is easier to use from a programmer's perspective because the handshake
between the Send and the Receive makes the communication appear instantaneous, thereby
simplifying the program logic.

• The nonblocking synchronous Send also avoids the potentially large delays for
handshaking, particularly when the receiver has not yet issued the Receive call.
• The nonblocking asynchronous Send is useful when a large data item is being
sent because it allows the process to perform other instructions in parallel with
the completion of the Send.
• The nonblocking Receive is useful when a large data item is being received
and/or when the sender has not yet issued the Send call, because it allows the
process to perform other instructions in parallel with the completion of the
Receive.
Primitives for Distributed Communication

Processor Synchrony
• Processor synchrony means that all the processors execute in lock-step with their clocks
synchronized.

• Such exact synchrony is not attainable in a distributed system.

• What is attainable is synchrony at a larger granularity of code, usually termed a step.

• This synchronization can be achieved using barrier synchronization.


• Barrier synchronization:
Ensures that no processor begins executing the next step of code until all the
processors have completed executing the previous steps of code assigned to
each of the processors.
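A minimal sketch of barrier synchronization using the standard library's threading.Barrier, with threads standing in for processors; in MPI the corresponding call would be comm.Barrier().

import threading

N = 4
barrier = threading.Barrier(N)

def worker(pid):
    print(f"P{pid}: finished step 1")
    barrier.wait()                       # no thread proceeds until all N have reached this point
    print(f"P{pid}: starting step 2")

threads = [threading.Thread(target=worker, args=(i,)) for i in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()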
Primitives for Distributed Communication
Libraries and Standards
• In computer systems, many different software products and scientific tools are used for
sending messages and making remote calls.

• Big companies use their own custom methods, such as IBM's CICS (Customer Information
Control System) software.

• The scientific community uses libraries such as MPI (Message Passing Interface) and PVM
(Parallel Virtual Machine).

• Commercial software commonly uses RPC (Remote Procedure Call), which is used to call
functions on different computers. Widely used variants are Sun RPC and DCE (Distributed
Computing Environment) RPC.

• Sockets provide the underlying mechanism that makes these remote calls work.



Primitives for Distributed Communication

Libraries and Standards


• Two other mechanisms for communication are messaging (which focuses on point-to-point
communication between specific applications) and streaming (for broader data dissemination
and real-time processing).

• For object-based software, RMI (Remote Method Invocation) and ROI (Remote Object Invocation)
are used, along with large standardized systems such as CORBA (Common Object Request Broker
Architecture) and DCOM (Distributed Component Object Model).
Synchronous VS Asynchronous Executions
SYNCHRONOUS
• Processors are synchronized and the clock drift rate between any two processors is bounded.
• Message delivery (transmission + delivery) times are such that delivery occurs in one
  logical step or round.
• There is a known upper bound on the time taken by a process to execute a step.

ASYNCHRONOUS
• There is no processor synchrony and there is no bound on the drift rate of processor clocks.
• Message delays (transmission + propagation times) are finite but unbounded.
• There is no upper bound on the time taken by a process to execute a step.
Synchronous VS Asynchronous Executions

Code for synchronous execution

All the messages sent in a round are received within that same round.

Sync_Execution(int k, n)          // there are k rounds
  for r = 1 to k do
    process i sends a message to (i + 1) mod n and (i − 1) mod n;
    each process i receives a message from (i + 1) mod n and (i − 1) mod n;
    compute some application-specific function on the received values.
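A hedged mpi4py rendering of the pseudocode above (it assumes mpi4py is installed and the program is launched with mpirun -np <n> python rounds.py); the choice of k = 3 rounds and the averaging function are illustrative assumptions.

from mpi4py import MPI

comm = MPI.COMM_WORLD
i, n, k = comm.Get_rank(), comm.Get_size(), 3          # process id, number of processes, rounds

value = float(i)
for r in range(k):
    left, right = (i - 1) % n, (i + 1) % n
    sends = [comm.isend(value, dest=right, tag=r),      # send to (i + 1) mod n
             comm.isend(value, dest=left, tag=r)]       # send to (i - 1) mod n
    from_left = comm.recv(source=left, tag=r)           # receive from (i - 1) mod n
    from_right = comm.recv(source=right, tag=r)         # receive from (i + 1) mod n
    MPI.Request.Waitall(sends)
    value = (value + from_left + from_right) / 3.0      # application-specific function
print(f"P{i}: value after {k} rounds = {value}")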
Synchronous VS Asynchronous Executions
Emulation

A → S: an asynchronous program can be emulated on a synchronous system trivially, since the
synchronous system is a special case - all communication finishes within the same round in
which it is initiated.
S → A: a synchronous program is emulated on an asynchronous system using a tool called a
synchronizer.

Emulation among the principal system classes in a failure free system


Design Issues and Challenges
1. From the system perspective
2. Algorithmic challenges
3. Applications of distributed computing

From the system perspective:
1. Communication
2. Processes
3. Naming
4. Synchronization
5. Data storage and access
6. Consistency and replication
7. Fault tolerance
8. Security
9. Applications Programming Interface (API) and transparency
Design Issues and Challenges
1. Communication :
involves designing appropriate mechanisms for communication among the
processes in the network.
Eg: Remote Procedure Call (RPC), Remote Object Invocation (ROI), message-oriented
communication versus stream-oriented communication

2. Processes :
management of processes and threads at clients/servers; code migration; and the
design of software and mobile agents

3. Naming :
Devising easy to use and robust schemes for names, identifiers, and addresses is essential
for locating resources and processes in a transparent and scalable manner. Naming in
mobile systems provides additional challenges because naming cannot easily be tied to any
static geographical topology
Design Issues and Challenges
4. Synchronization :
Mechanisms for synchronization or coordination among the processes are essential.
Eg: Mutual Exclusion and Leader election algorithms are required
Physical and logical clocks have to be synchronized
Global state recording algorithms require different forms of synchronization

5. Data storage and access :


Proper schemes are required for storing and accessing the data in a fast and
scalable manner across the network in order to provide efficiency.

6. Consistency and replication :


Replication of data objects is highly desirable for fast access of data and
scalability. Dealing with the consistency among the replicas in distributed
setting is another challenge.
Design Issues and Challenges
7. Fault Tolerance :
(a) Fault tolerance requires maintaining correct and efficient operation in spite of
any failures of links, nodes, and processes.
(b) Process resilience, reliable communication, distributed commit,
checkpointing and recovery, agreement and consensus, failure detection, and self-
stabilization are some of the mechanisms to provide fault-tolerance.
8. Security :
involves various aspects of cryptography, secure channels, access control, key
management– generation and distribution, authorization, and secure group
management.

9. Applications Programming Interface and Transparency :


The API for communication and other specialized services is important for the ease of
use and wider adoption of the distributed systems services by non-technical users.
Design Issues and Challenges
Transparency :
Transparency deals with hiding the implementation policies from the user

(a) Access Transparency : hides differences in data representation on different


systems and provides uniform operations to access
system resources.

(b) Location Transparency : makes the locations of resources transparent to the


users.

(c) Migration Transparency : allows relocating resources without the users


noticing it.
Design Issues and Challenges

(d) Relocation Transparency : the ability to relocate the resources while they are
being accessed

(e) Replication Transparency : does not let the user become aware of any
replication

(f) Concurrency Transparency : deals with masking the concurrent use of shared
resources for the user.

(g) Failure Transparency : refers to the system being reliable and fault-tolerant.
Design Issues and Challenges

10. Scalability and Modularity :


The algorithms, data (objects), and services must be as distributed as
possible. Various techniques such as replication, caching and cache
management, and asynchronous processing help to achieve scalability.

Algorithmic challenges in Distributed Computing


Distributed Program

• It is composed of a set of n asynchronous processes p1, p2, ..., pi, ..., pn that
communicate by message passing over the communication network.

• Each process runs on a different processor

• Cij denote the channel from process pi to process pj

• mij denote a message sent by pi to pj.

• The processes do not share a global memory and communicate solely by passing
messages.

• processes do not share a global clock


Distributed Program

• Process execution and message transfer are asynchronous (a process does not wait for
the delivery of a sent message to complete)

• The global state of a distributed computation is composed of the states of the


processes and the communication channels
- The state of a process is characterized by the state of its local memory
- The state of a channel is characterized by the set of messages in transit
in the channel
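A minimal sketch of this definition: a global state is simply the collection of local process states together with, for every channel Cij, the set of messages in transit; the dataclass and its field names are illustrative assumptions.

from dataclasses import dataclass, field

@dataclass
class GlobalState:
    process_states: dict = field(default_factory=dict)   # pid -> snapshot of the local memory
    channel_states: dict = field(default_factory=dict)    # (i, j) -> messages in transit on C_ij

gs = GlobalState(
    process_states={1: {"balance": 500}, 2: {"balance": 300}},
    channel_states={(1, 2): ["transfer $100"], (2, 1): []},
)
print(gs.channel_states[(1, 2)])        # messages sent by p1 but not yet received by p2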
Model of Distributed Execution
• For every message m, send(m) →msg rec(m): the send event of m causally precedes its receive event.

• In the space-time diagram: a dot indicates an event, a horizontal line shows the progress
of a process, and a slanted arrow indicates a message transfer.
• The actions of a process are modeled as three types of events:
internal events, message send events, and message receive events.
• Let e_i^x denote the xth event at process pi.
• For message m, send(m) and rec(m) denote its send and receive events,
respectively.
Model of Distributed Execution

• The occurrence of events changes the states of respective processes and


channels.
- An internal event changes the state of the process at which it occurs.

- A send event (or a receive event) changes the state of the process that
sends (or receives) the message and the state of the channel on which
the message is sent (or received).

• The events at a process are linearly ordered by their order of occurrence.


Model of Distributed Execution

• The events at a process are linearly ordered by their order of occurrence.

• The execution of process pi produces a sequence of events e_i^1, e_i^2, ..., e_i^x, e_i^(x+1), ...
  and is denoted by H_i
• H_i = (h_i, →_i), where
  h_i is the set of events produced by pi, and
  the binary relation →_i defines a linear order on these events.
• The relation →_i expresses causal dependencies among the events of pi.

• The execution of an event takes a finite amount of time.


Model of Distributed Execution

Causal Precedence Relation

• The causal precedence relation → (Lamport's "happened before" relation) on the events of a
distributed execution is obtained from the process order →_i, the message order →msg, and
their transitive closure.
• For any two events ei and ej, if ei → ej, then ej ↛ ei (the relation is an irreflexive
partial order).
• For any two events ei and ej, ei ↛ ej does not imply ej → ei; the two events may be
concurrent, denoted ei ∥ ej.
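A small sketch for a toy execution with two processes: causal precedence is the transitive closure of the process-order and message-order edges; the event names (a1, a2, a3 on one process, b1, b2 on the other) and the helper happened_before are illustrative assumptions.

# process-order edges (events on the same process) and message-order edges (send -> receive)
process_order = [("a1", "a2"), ("a2", "a3"), ("b1", "b2")]
message_order = [("a2", "b2")]                       # send(m) at a2, rec(m) at b2
edges = set(process_order) | set(message_order)

def happened_before(e, f):
    """True if e -> f, i.e. f is reachable from e via process-order / message-order edges."""
    stack = [y for (u, y) in edges if u == e]        # start from direct successors (irreflexive)
    seen = set()
    while stack:
        x = stack.pop()
        if x == f:
            return True
        if x in seen:
            continue
        seen.add(x)
        stack += [y for (u, y) in edges if u == x]
    return False

print(happened_before("a1", "b2"))                   # True: a1 -> a2 -> b2 (via the message)
print(happened_before("a3", "b1"), happened_before("b1", "a3"))   # False False: a3 || b1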
