03 Communication PDF
03 Communication PDF
Communication 2
Communication 3
• Low level layers
– Physical layer: describes how bits are transmitted between two directly
connected nodes
– Data link layer: describes how a series of bits is packed into a frame to
allow for error and flow control
– Network layer: describes how packets in a network of computers are to be
routed
• Transport layer
– Describes how data is transmitted among two nodes, offering a service
independent from the lower layers
– It provides the actual communication facilities for most distributed systems
– Standard Internet protocols
• TCP: connection-oriented, reliable, stream-oriented communication
• UDP: unreliable (best-effort) datagram communication
• Higher level layers
– Merged together in the current, Internet practice
Communication 4
Communication 5
Middleware is build above transport level, in the middle in between transport layer and application layer
Communication 6
Transient: communication only happens only if the two parts of the communication are active and ready to receive, if one of the two is not active
the network is lost.
Persistent: communication can happen even if one of the two are not connected (e.g. e-mail, WhatsApp), there is something in the middle that stores
the message. It can be stored at various level.
In WhatsApp:
first thick, synchr with server
second thick, synchr with operating system on the other side (on the receiver)
blue thick, synchr with other user
Communication 7
• Fundamentals • Message oriented
– Protocols and protocol communication
stacks – Fundamentals
– Middleware – Message passing (sockets
• Remote procedure call and MPI)
– Fundamentals – Message queuing
– Discovering and binding – Publish/subscribe
– Sun and DCE • Stream-oriented
implementations communication
• Remote method invocation – Fundamentals
– Fundamentals – Guaranteeing QOS
Communication 8
• Parameter passing in a local procedure call
– the stack before (a) and after (b) the call to:
count = read(fd, buf, nbytes)
Communication 9
• Different mechanisms to pass parameters
– By value. Like in C when passing basic data types
– By reference. Like in C when passing arrays or in Java
when passing objects
– By copy/restore. Similar but slightly different than
previous one
Copy/Restore = you pass 3, it's elaborated and becomes 5, then 5 is copied back in the main program.
It's equivalent to pass by reference. It's seems like the value passed is shared.
It can loose the fact is that two variables references to the same value.
Communication 10
• Considerations
– Application developers are familiar with procedure call
– Well-engineered procedures operate in isolation (black box) helping
structuring code
– There is no fundamental reason not to execute procedures on separate
machine
• Conclusion
– Remote communication
can be hidden by using
procedure-call
mechanism
Communication 11
procedure execution
application application
code code
local invocation result procedure
call return transfer call
reply message
invocation message
Communication 12
Communication 13
The language of the signature is written in a special language, IDL, interface definition language, that is a language that describes signatures.
This solve the case in which the code in two machines are written in different languages.
Communication 15
Passing by reference means to pass a pointer to another machine, which makes no sense. So the solution is passing by copy/restore (it simula
But you lose referential integrity.
Communication 17
Communication 18
• Problem: find out which server (process)
provides a given service
– Hard-wiring this information in the client code is
highly undesirable
– Two distinct problems:
• Find out where the server process is
• Find out how to establish communication with it
Communication 19
#
• Introduce a daemon process (portmap) that binds calls and
server/ports:
– The server picks an available port and tells it to portmap, along with the
service identifier
– Clients contact a given portmap and:
• Request the port necessary to establish communication
• portmap provides its services only to local clients, i.e., it solves
only the second problem
– The client must know in advance where the service resides
In sun you have to specify the IP
– However:
• A client can multicast a query to multiple daemons
• More sophisticated mechanisms can be built or integrated
– e.g., directory services
Communication 20
" #
Communication 21
"
• Problem: server processes may remain active even in
absence of requests, wasting resources
• Solution: introduce another (local) server daemon
that:
– Forks the process to serve the request
– Redirects the request if the process is already active
– Clearly, the first request is served less efficiently Cause you have to wait for it
to start.
• In Sun RPC:
– inetd daemon
– The mapping between requested service and server
process is stored in a configuration file (/etc/services)
Communication 22
In the case the two processes are on the same
machine
They can share memory
Communication 23
Up to know, RPC was synchronous, the caller is suspended until the called ends.
There are some cases in which I don't need to be sure that the function called has to be done until my next instruction.
In this case it is an asynchronous RPC.
$%
• Sun RPC includes the ability to perform batched RPC
– RPCs that do not require a result are buffered on the client
– They are sent all together when a non-batched call is requested (or when a timeout
expires)
– Enables yet another form of asynchronous RPC
• A similar concept can be used to deal with mobility (as in the Rover toolkit by
MIT):
– If a mobile host is disconnected between sending the request and receiving the
reply, the server periodically tries to contact the mobile host and deliver the reply
– Requests and replies can come through different channels
– Depending on network conditions and application requirements, the network
scheduler module may decide to:
• Send requests in batches
• Compress the data
• Reorder requests and replies in a non-FIFO order, e.g., to suit application-specified priorities
– Promises are used at the client
Communication 25
• Fundamentals • Message oriented
– Protocols and protocol stacks communication
– Middleware – Fundamentals
• Remote procedure call – Message passing (sockets
and MPI)
– Fundamentals
– Message queuing
– Discovering and binding
– Publish/subscribe
– Sun and DCE
implementations • Stream-oriented
• Remote method invocation communication
– Fundamentals – Fundamentals
– Guaranteeing QOS
Communication 26
• Same idea as RPC, different programming
constructs
– The aim is to obtain the advantages of OOP also in
the distributed setting
• Important difference: remote object references
can be passed around
– Need to maintain the aliasing relationship
• Shares many of the core concepts and
mechanisms with RPC
– Sometimes built on top of an RPC layer
Communication 27
Invoke method on remote object.
Communication 28
• Java RMI
– Single language/platform (Java and the Java Virtual
Machine)
– Easily supports passing parameters by reference or “by
value” even in case of complex objects
– Supports for downloading code (code on demand)
• OMG CORBA
– Multilanguage/multiplatform So no passing by value
– Supports passing parameters by reference or by value
• If objects are passed by value (valuetype) it is up to the
programmer to guarantee the same semantics for methods on
the sender and receiver sides
Communication 29
• Fundamentals • Message oriented
– Protocols and protocol communication
stacks – Fundamentals
– Middleware – Message passing (sockets
• Remote procedure call and MPI)
– Fundamentals – Message queuing
– Discovering and binding – Publish/subscribe
– Sun and DCE • Stream-oriented
implementations communication
• Remote method invocation – Fundamentals
– Fundamentals – Guaranteeing QOS
Communication 30
Asynchronous
Communication 31
• Synchronous vs. asynchronous
– Synchronous: the sender is blocked until the recipient has stored (or
received, or processed) the message
– Asynchronous: the sender continues immediately after sending the message
• Transient vs. persistent
– Transient: sender and receiver must both be running for the message to be
delivered
– Persistent: the message is stored in the communication system until it can
be delivered
• Several alternatives (and combinations) are provided in practice
Communication 32
Pure
Asynchronous
Receipt-based
synchronous
Response-based
Delivery-based synchronous
synchronous
Communication 33
Synchronous
Asynchronous
Communication 34
• The most straightforward form of message oriented communication is message passing
– Typically directly mapped on/provided by the underlying network OS functionality (e.g.,
socket)
– A (kind of) middleware provides another form of message passing called MPI
• Message queuing and publish/subscribe are two different models provided at the middleware
layer
– By several “communication servers”
– Through what is nowadays called an overlay network
Communication 35
From A to some middle machines (communication server, as proxes) and in the end to B (the receiver).
In the middle there are buffers that stores temporally messages.
• Unicast TCP and UDP (and multicast IP) are well know network
protocols
– The related RFCs describe how they work in practice (on top of IP)
– But how a poor programmer can take advantage of such protocols?
• Berkeley sockets are the answer!
– First appeared in Unix BSD in 1982
– Today available for every platform
• Sockets provide a common abstraction for inter-process
communication
– Unix and Internet sockets exists. Here we are interested in the latter
– Allows for connection-oriented (stream i.e., TCP) or connectionless
(datagram, i.e., UDP) communication
Communication 36
• The server accepts connection on a port
• The client connects to the server
• Each (connected) socket is uniquely identified by 4 numbers: The
IP address of the server, its “incoming” port, the IP address of
the client, its “outgoing” port
P3
Client
P1
Server
P2
Client
Communication 37
Communication 38
No difference between client and server anymore
"
• Client and server use the same approach to send and receive
datagrams
• Both create a socket bound to a port and use it to send and
receive datagrams
• There is no connection and the same socket can be used to send
(receive) datagrams to (from) multiple hosts
P1
Peer 1
P3
Peer 3
P2
Peer 2
Communication 39
"
socket() socket()
rcvfrom() sendto()
sendto() recvfrom()
close() close()
Communication 40
Form of group communication. All nodes connected to the group receive a copy of the packet. Group are identified by special IP
addresses. Groups are open.
Communication 41
• Limitation of sockets
– Low level
– Protocol independent (and so awkward to use)
• In high performance networks (e.g., clusters of
computers) we need higher level primitives for
asynchronous, transient communication...
• ...providing different services besides pure read
and write
• MPI was the (platform independent) answer
Aww so good
Communication 42
• Communication takes place within a known group of processes
• Each process within a group is assigned a local id
– The pair (groupID, processID) represents a source or destination address
– Messages can also be sent in broadcast to the entire group (MPI_Bcast, MPI_Reduce,
MPI_Scatter, MPI_Gather)
• No support for fault tolerance (crashes are supposed to be fatal)
• Main MPI primitives
Communication 43
Socket, MPI are not persistent, if the receiver is offline the message is lost.
With message queuing the message are not lost if the receiver is offline.
%
• Point-to-point persistent asynchronous communication
– Typically guarantee only eventual insertion of the message in the recipient
queue (no guarantee about the recipient’s behavior)
– Communication is decoupled in time and space
– Can be regarded as a generalization of the e-mail
• Intrinsically peer-to-peer architecture
• Each component holds an input queue and an output queue
• Many commercial systems:
– IBM MQSeries (now WebSphere MQ), DECmessageQ, Microsoft Message
Queues (MSMQ), Tivoli, Java Message Service (JMS), …
Communication 44
Primitive Meaning
Put Append a message to a specified queue
Get Block until the specified queue is nonempty, and remove the first message
Poll Check a specified queue for messages, and remove the first. Never block
Notify Install a handler to be called when a message is put into the specified queue
program A program B
begin begin
… MOM …
attach_queues(); middleware attach_queues();
… …
loop loop
put(msg); get(msg);
… …
… …
… …
get(msg); put(msg);
end end
end end
Communication 45
& %
• Clients send requests to the server’s queue
• The server asynchronously fetches requests, processes
them, and returns results in the clients’ queues
– Thanks to persistency
and asynchronicity, Client
clients need not
remain connected
Client Server
– Queue sharing
simplifies load
balancing Client Server
Communication 46
%
• Queues are identified by symbolic names
– Need for a lookup service, possibly distributed, to convert
queue-level addresses in network addresses
– Often pre-deployed static topology/naming
Communication 47
%
• Queues are manipulated by queue managers
– Local and/or remote, acting as relays (a.k.a. applicative routers)
• Relays often organized in an overlay network
– Messages are routed by using application-level criteria, and by relying on a
partial knowledge of the network
– Improves fault tolerance
– Provides applications with
multi-point without IP-level
multicast
Communication 48
%
• Message brokers provide application-level gateways
supporting message conversion
– Useful when integrating sub-systems
Communication 49
&
• Application components can publish asynchronous event notifications, and/or
declare their interest in event classes by issuing a subscription
– extremely simple API: only two primitives (publish, subscribe)
– event notifications are simply messages
• Subscriptions are collected by an event dispatcher component, responsible for
routing events to all matching subscribers
– Can be centralized or distributed
• Communication is:
– Transiently asynchronous
– Implicit
– Multipoint
• High degree of decoupling among components
– Easy to add and remove components
– Appropriate for dynamic environments
Communication 50
• The expressiveness of the subscription language allows one to distinguish
between:
– Subject-based (or topic-based)
• The set of subjects is determined a priori
• Analogous to multicast
• e.g., subscribe to all events about “Distributed Systems”
– Content-based
• Subscriptions contain expressions (event filters) that allow clients to filter events based
on their content
• The set of filters is determined by client subscriptions
• A single event may match multiple subscriptions
• e.g., subscribe to all events about a “Distributed System” class with date greater than
16.11.2004 and held in classroom D04
• The two can be combined
• Tradeoffs:
– Complexity of the implementation vs. expressiveness
– However, expressiveness allows additional filtering!
Communication 51
• In event-based systems a special component of the architecture,
the event dispatcher, is in charge of collecting subscriptions and
routing event notifications based on such subscriptions
– For scalability reasons, its implementation can be distributed
Communication 52
"
Message
Dispatcher
• Centralized
– A single component is in charge of
collecting subscriptions and forward
messages to subscribers
• Distributed
– A set of message brokers organized in an
overlay network cooperate to collect
subscriptions and route messages
– The topology of the overlay network
and the routing strategy adopted may
vary
• Acyclic vs. cyclic overlay
Communication 53
• Every broker stores only subscriptions coming from directly
connected clients
• Messages are forwarded from broker to broker...
• ...and delivered to clients only if they are subscribed
Communication 54
• Every broker forwards subscriptions to the others
• Subscriptions are never sent twice over the same link
• Messages follow the routes laid by subscriptions
• Optimizations may exploit coverage relationships
– E.g., “Distributed *” > “Distributed systems”
– Fusion, subsumption, summarization
Communication 55
"
• Each time a broker receives a message it must
match it against the list of received filters to
determine the list of recipients
• The efficiency of this process may vary,
depending on the complexity of the subscription
language, but also on the forwarding algorithm
chosen
– It greatly influences the overall performance of the
system
Communication 56
• Assumes a rooted overlay tree
• Both messages and subscriptions are forwarded by brokers
towards the root of the tree
• Messages flow “downwards” only if a matching subscription had
been received along that route
Communication 57
"
Communication 58
&
Communication 59
&
• Every source defines a
shortest path tree (SPT)
• Forwarding table keeps
information organized per
source
– For each source v the
children in the SPT
associated with v
– For each children u a
predicate which joins the
predicates of all the nodes
reachable from u along the
SPT
Communication 60
• Same as PSF but leveraging
the concept of indistinguishable
sources
• Two sources A e B with SPT
T(A) and T(B) are
indistinguishable from a node
n if n has the same children
for T(A) and T(B) and
reaches the same nodes along
those children
• This concept brings several
advantages
– Smaller forwarding tables
– Easier to build them
Communication 61
&
• The source of a message calculates
the set of receivers and add them to
the header of the message
• At each hop the set of recipients is
partitioned
• Two different tables:
– The unicast routing table
– A forwarding table with the
predicate for each node in the
network
Communication 62
" ' "'
• Builds minimum latency SPTs
• Use request packets (config)
and reply ones (config
response).
• Every node acquire a local
view of the network (its
neighbors in the SPT)
Communication 63
&
• Allows to build SPTs based
on different metrics
• Use packets carrying
information about the known
state of the network (Link-
State Packet - LSP) forwarded
when a node acquire new
information
• Every node discovers the
topology of the whole
network
• SPTs are calculated locally
and independently by every
node
Communication 64
(
• CEP systems adds the ability to deploy rules that
describe how composite events can be generated from
primitive (or composite) ones
– Recently a number of languages and systems have been
proposed to support such architecture (both under the CEP
and DSMS labels)
Communication 65
• The rule language
– Find a balance between expressiveness and processing
complexity
• The processing engine
– How to efficiently match incoming (primitive) events to build
complex ones
• Distribution
– How to distribute processing
• Clustered vs. networked
solutions
Communication 66
• Fundamentals • Message oriented
– Protocols and protocol communication
stacks – Fundamentals
– Middleware – Message passing (sockets
• Remote procedure call and MPI)
– Fundamentals – Message queuing
– Discovering and binding – Publish/subscribe
– Sun and DCE • Stream-oriented
implementations communication
• Remote method invocation – Fundamentals
– Fundamentals – Guaranteeing QOS
Communication 67
&
• Data stream: A sequence of data units
– Information is often organized as a sequence of data units. E.g., text,
audio,...
• Time usually does not impact the correctness of the communication
– Just its performance
• In some cases this is not the case
– E.g., when sending a video “in streaming”, i.e., to be played “on-line”
• Transmission modes
– Asynchronous: The data items in a stream are transmitted one after the other
without any further timing constraints (apart ordering)
– Synchronous: There is a max end-to-end delay for each unit in the data
stream
– Isochronous: There is max and a min end-to-end delay (bounded jitter)
• Stream types: Simple vs. complex streams (composed of substreams)
Communication 68
• Non-functional requirements are often expressed as Quality of Service (QoS)
requirements
– Required bit rate
– Maximum delay to setup the session
– Maximum end-to-end delay
– Maximum variance in delay (jitter)
Communication 69
"
• IP is a best effort protocol!
– So much for QoS :-)
• Actually it offers a Differentiated Services field (aka Type
Of Service - TOS) into its header
– 6 bits for the Differentiated Services Code Point (DSCP) field
– 2 bits for the Explicit Congestion Notification (ECP) field
– The former encodes the Per Hop Behaviour (PHB)
• Default
• Expedited forwarding. Usually less than 30% of traffic and often much
less
• Assured forwarding (further divided into 4 classes)
• Not necessarily supported by Internet routers
Communication 70
• Buffering
– Control max jitter by
sacrificing session
setup time
• Forward error
correction
• Interleaving data
– To mitigate the impact of
lost packets
Communication 71
!
• Synchronizing two or more streams is not easy
• It could mean different things
– Left and right channels at CD quality → each sample must be synchronized → 23
µsec of max jitter
– Video and audio for lip synch → each audion interval must be in synch with its
frame → at 30 fps 33 msec of max jitter
• Synchronization may take place at the sender or the receiver side
– In the former case the different streams can be merged together
• It may happen at different layers (application vs. middleware)
Communication 72