Darshan Institute of Engineering & Technology
Q1. Define Distributed System. What are the advantages of Distributed System?
• Definition: A distributed system is a collection of independent computers that appears to its
users as a single coherent (logical) system.
• A distributed system consists of concurrent (parallel) processes accessing distributed resources.
• A distributed system consists of autonomous computers linked by a computer network and
equipped with distributed system software.
• Resources are shared through message passing in a network environment that may be
unreliable and contain un-trusted components.
Characteristics of Distributed System
• Differences between the various computers, and the ways in which they communicate, are mostly
hidden from users.
• Users and applications can interact with a distributed system in a consistent (uniform) way,
regardless of where and when the interaction takes place.
• It should be relatively easy to expand.
• A DS will normally be available continuously, even if some parts are temporarily out of order.
Advantages of Distributed System
• Inherently distributed applications
• Information sharing among geographically distributed users
• Resource sharing: hardware and software
• Better price performance ratio
• Shorter response time and higher throughput
• Higher reliability and availability against component failure
• Extensibility and incremental growth
• Better flexibility in meeting user’s needs
Q2. What is the difference between Shared Memory Architecture & Distributed
Memory Architecture? Explain it with diagrams.
Shared memory architecture:
• Shared memory systems form a major category of multiprocessors. In this category, all processors
share a global memory. Communication between tasks running on different processors is
performed through writing to and reading from the global memory.
• All interprocessor coordination and synchronization is also accomplished via the global memory.
Two main problems need to be addressed when designing a shared memory system: performance
degradation due to contention, and coherence problems.
Distributed memory architecture:
• In this architecture each processor has its own local memory and all the processing is done locally.
• All systems are interconnected through a LAN.
• No sharing among the address spaces.
• Easy to implement, built through commodity hardware with the interconnection being
established through a standard Ethernet or ATM switches.
• If a processor wants to access a remote memory location, the information will be sent as a
message transfer through the communication network.
The comparison between the workstation model and the workstation-server model is as follows:
• It is economically more viable to use a few high-end costly servers and more diskless
workstations. Diskless workstations are easier to maintain than diskful ones.
• In the workstation-server model, the request–response protocol means that the client does not
get burdened, and process migration becomes unnecessary.
• The user also has the flexibility of changing his workstation.
• Workstations are connected in a suitable network configuration, such as a star topology.
• Workstation-server model consists of multiple workstations coupled with powerful servers with
extra hardware to store the file systems and other software.
• Workstation-server model is suitable for sharing the resources between different systems in a
modular fashion.
• A brief comparison of the other two models, the processor-pool model and the workstation-server
model, follows:
• The processor-pool model uses computing resources more effectively, all the resources of the
system being available to the currently working users.
• The workstation-server model offers services only to individual clients.
• An advantage of the processor-pool model is that some of its processors can work as servers if the
load increases or if more users are logged in and demanding new services.
• A hybrid model of a distributed system may be built by combining the features of the
workstation-server model and the processor-pool model.
Workstation-server Model:
• The Workstation-server model consists of multiple workstations coupled with powerful servers
with extra hardware to store the file systems and other software like databases.
Processor-pool Model:
• The processor-pool model consists of multiple processors and a group of workstations.
• The model is based on the observation that most of the time a user does not need any computing
power, but occasionally may need a large amount of it for a short time.
• In this model the processors are pooled together to be shared by the users as needed.
• The processor pool consists of a large number of microcomputers and minicomputers attached to
the network.
• Each processor in the pool has its own memory in which to load and run programs.
• The processors in the pool have no terminals attached directly to them; users access the
system from terminals that are attached to the network via special devices.
• An appropriate number of processors are temporarily assigned to his or her job by the run server.
Q5. What are the various issues related to Design Distributed System?
Transparency:
• The various forms of transparency in a distributed system are summarized in the table below.
Reliability:
• Reliability in terms of data means that data should be available without any errors.
• In a distributed system multiple processors are available, so the failure of some of them can be
masked and the system becomes more reliable.
Scalability:
• The three techniques commonly used to handle scalability issues in distributed systems are:
• Hide communication latencies: use asynchronous communication, or move part of the
computation (such as form checking) to the client, as the figure below shows.
• Hide distribution: conceal that components are spread across multiple machines.
• Hide replication: conceal that resources are copied on several machines.
Figure. The difference between letting (a) a server or (b) a client check forms as they are being filled.
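To make the latency-hiding idea concrete, here is a hedged Java sketch of the asynchronous alternative: the client fires off the remote request and does useful local work (such as form validation) until the reply is actually needed. fetchFromServer is a hypothetical stand-in for a real remote call.

import java.util.concurrent.CompletableFuture;

public class AsyncClient {
    // Hypothetical stand-in for a remote call with noticeable network latency.
    static String fetchFromServer(String query) {
        try { Thread.sleep(200); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        return "result for " + query;
    }

    public static void main(String[] args) {
        // Send the request without blocking on the reply.
        CompletableFuture<String> reply = CompletableFuture.supplyAsync(() -> fetchFromServer("q1"));
        // Do useful local work (e.g. client-side form checking) in the meantime.
        System.out.println("validating form fields locally...");
        // Block only at the point where the reply is actually needed.
        System.out.println(reply.join());
    }
}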
Flexibility:
• Microkernels use a minimalist, modular approach with accessibility to other services as
needed.
Performance:
• Response time, throughput, system utilization, bandwidth requirements
Security:
Security consists of three main aspects, namely,
• Confidentiality: Protection against unauthorized access.
• Integrity: Protection of data against corruption.
• Availability: Protection against failure and always being accessible.
• When one expects real-time interaction with the distributed system, communication delays may
be very noticeable.
The following table summarizes the different forms of transparency:
• Access: Hide differences in data representation and how a resource is accessed
• Location: Hide where a resource is located
• Migration: Hide that a resource may move to another location
• Relocation: Hide that a resource may be moved to another location while in use
• Replication: Hide that a resource is replicated
• Concurrency: Hide that a resource may be shared by several competitive users
• Failure: Hide the failure and recovery of a resource
• Persistence: Hide whether a (software) resource is in memory or on disk
Process addressing:
• In process addressing, a machine ID and a process ID are used for addressing.
• The client kernel uses the machine ID to locate the correct machine.
• Server kernel uses the process ID to locate the process on that machine.
• This method is not transparent because the user knows the location of the server.
Name server technique:
• Use an extra machine to map ASCII-level names to machine addresses. The ASCII strings are
embedded in programs. This special server is called a name server.
• Hardwire the machine number into the client code.
• Processes pick random addresses, and machines are located by a broadcast method.
• Use a two-level naming scheme with mapping carried out by the name server.
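A minimal Java sketch of the two-level naming scheme, under the assumption that the name server is a simple in-memory table; the NameServer class, its register/lookup methods, and the IDs are illustrative, not a real system's API.

import java.util.HashMap;
import java.util.Map;

public class NameServer {
    // Second naming level: the (machine ID, process ID) pair a name maps to.
    record Address(int machineId, int processId) { }

    private final Map<String, Address> table = new HashMap<>();

    // A server registers its ASCII-level name with the name server.
    public void register(String name, int machineId, int processId) {
        table.put(name, new Address(machineId, processId));
    }

    // A client resolves the name; the kernel then uses machineId to reach
    // the machine and processId to locate the process on it.
    public Address lookup(String name) { return table.get(name); }

    public static void main(String[] args) {
        NameServer ns = new NameServer();
        ns.register("file-server", 12, 7);
        System.out.println(ns.lookup("file-server")); // Address[machineId=12, processId=7]
    }
}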
Figure: The three-tier architecture (Tier 1, Tier 2, Tier 3).
Q10. How does a Distributed Operating System differ from a Network Operating
System?
• Network operating system is a software architecture used to build a distributed system from a
network of workstations connected by high speed networks.
• A distributed operating system enables a distributed system to behave like a virtual uniprocessor,
even though the system operates on a collection of machines. This is possible using various
characteristics like:
• Interprocess communication
• Uniform process management mechanism
The following types of networks are discussed: (a) wireless LAN, (b) internetworks, and (c) WAP.
(a) Wireless LAN:
• WLAN technology allows computer network devices to have network connectivity without the use
of wires i.e. using radio waves.
• This network uses Bluetooth, an infrared link, or a low-power radio network operating at a
bandwidth of 1–2 Mbps over a distance of 10 m.
• Users are connected to the LAN through a wireless client adapter.
• In figure 2-4, at the top level, wireless networks are classified as wireless LAN, wireless MAN, and
wireless WAN.
• Wireless LANs are classified as PANs and business LANs.
• Wireless MANs are wireless local loops.
• The wireless WANs are classified as cellular networks, satellite systems and paging networks.
(b) Internetworks:
• As shown in figure 2-5, an internetwork consists of several interconnected networks providing
data communication facilities.
• The heterogeneous nature of protocols, technologies like Ethernet, FDDI, and token ring, and the
methods for interconnection are hidden from the users.
• The major components of an internetwork are hardware interconnection by routers, gateways
and software layer which support addressing and data transmission.
• The seven layers of the OSI reference model are shown in figure 2-7.
• Each layer performs an independent network function dealing with a specific aspect of
communication.
• Figure 2-8 demonstrates communication involving the data link layer: for example, the data link
layer of system A communicates with the network layer and the physical layer of system A, and
with the peer data link layer of system B.
• UNI standard:
• The User-Network Interface (UNI) standard contains the Generic Flow Control (GFC) field and a
one-octet Virtual Path Identifier (VPI) field.
• NNI standard:
• The Network-Network Interface (NNI) standard has a VPI field one and a half octets long but does
not contain the GFC field.
• As shown in figure 2-18, an ATM frame consists of a frame header and a group of cells, which can
be in use or idle.
In figure 2-20, ATM services are classified based on the time delay, bit rate, and connection mode.
• The VMTP request–response behaviour is depicted in figure 2-22. This technique provides
feedback when the transmission rate is high.
• Rate-based flow control mechanism: VMTP uses a burst protocol and sends blocks of packets to
the receivers with spaced inter-packet gaps.
• This technique helps in matching data transmission and reception speeds.
• For example, money transfers from a bank account: the server must not respond to such requests
by executing them more than once.
• Conditional delivery for real time communication: message should be delivered only if the server
can immediately process it.
Q16. What is Message Passing? How does the Message Passing Approach differ
from the Shared Memory Approach?
Definition: Message passing is a type of communication between processes, used in parallel
programming and object-oriented programming. Communication is accomplished by sending
messages (functions, signals, and data packets) to recipients.
• Unlike the shared memory approach, where processes communicate by reading and writing a
common address space, in the message passing approach processes share no memory and all
interaction takes place through explicit send and receive operations, as the sketch below shows.
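A minimal, single-JVM Java sketch of the contrast: a BlockingQueue stands in for the network channel, so the only interaction between the two threads is send (put) and receive (take); nothing is shared.

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class MessagePassingDemo {
    public static void main(String[] args) throws InterruptedException {
        // The channel stands in for the network; it is the ONLY thing both sides touch.
        BlockingQueue<String> channel = new ArrayBlockingQueue<>(10);

        Thread sender = new Thread(() -> {
            try {
                channel.put("hello via message"); // send
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        sender.start();

        String msg = channel.take(); // receive: blocks until a message arrives
        System.out.println("received: " + msg);
        sender.join();
    }
}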
Q17. Why is a Message Format required in IPC? Explain the components of a
Message Format with a diagram.
• A message consists of data and control information, formatted by the sender so that it is
meaningful to the receiver. The sender process adds control information to the data block as
headers and trailers before transmission; this is why a message format is required for IPC.
• The header of a message block consists of the following elements, as shown in figure 3-4.
• Address: it uniquely identifies the sender and receiver processes in the network by specifying the
source and destination address.
• Sequence number: a message identifier which takes care of duplicate and lost messages in case of
system failure.
• Structural information: it has two parts. The type part specifies whether the message carries data
or only pointers to data; the length part specifies the length of the variable-sized data in bytes.
• In a message passing system, the sender is responsible for formatting the contents of the
message.
• The receiver must be fully aware of the message format used by the sender in the communication
process.
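The header layout described above can be sketched in Java as follows; the Message class and the field widths are assumptions for illustration, not a real protocol definition.

import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class Message {
    int sourceAddr, destAddr; // address: uniquely identifies sender and receiver
    long sequenceNumber;      // detects duplicate and lost messages
    byte type;                // structural info: data itself, or only pointers to data
    byte[] data;              // variable-sized payload

    // Sender-side formatting: header fields first, then the data block.
    byte[] marshal() throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        out.writeInt(sourceAddr);
        out.writeInt(destAddr);
        out.writeLong(sequenceNumber);
        out.writeByte(type);
        out.writeInt(data.length); // length part of the structural information
        out.write(data);
        return bos.toByteArray();  // header + data, ready for transmission
    }
}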
• A blocking primitive’s invocation blocks the execution of the invoker. The receiver blocks itself
until a message arrives.
• Invocation of a non blocking primitive does not block the execution of the invoker.
• Non-blocking operations return straight away and allow the program to continue execution.
• Figure 3-9 shows the blocking and non-blocking primitives for client–server communication.
• When the client sends a message, it traps to the kernel and blocks itself.
• Message passing primitives like blocking and non-blocking operations are used to achieve
synchronization during IPC.
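A small Java sketch of the two receive primitives, using a BlockingQueue as the message channel: take() models the blocking receive and poll() the non-blocking one.

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class ReceivePrimitives {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> channel = new LinkedBlockingQueue<>();

        // Non-blocking receive: returns immediately, null if nothing has arrived.
        System.out.println("non-blocking receive got: " + channel.poll());

        channel.put("request"); // a message arrives on the channel

        // Blocking receive: would suspend the invoker until a message arrives.
        System.out.println("blocking receive got: " + channel.take());
    }
}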
Q20. How does the Single-Message Buffering Scheme differ from the Multiple-Message
Buffering Scheme?
• For synchronous communication, a single-message buffer strategy serves the purpose. In case the
receiver is not ready to receive the message, it is stored in the buffer.
• Message is readily available to the receiver when it is ready.
• This message passing technique involves two copy operations, as shown in figure 3-14.
• The message buffer can be located either in the receiver’s address space or the kernel process’s
address space.
• Multiple-message buffering provides a buffer with a capacity of a few messages on the receiver
side.
• Multiple message buffers reduce, but do not eliminate, the buffer overflow issue: the sender may
not be aware that the receiver's buffer has overflowed and will keep on sending messages. The
sketch below contrasts the two schemes.
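The two schemes can be sketched in Java with bounded queues: a capacity-1 queue models single-message buffering and a larger queue models multiple-message buffering; offer() returning false plays the role of the overflow the sender might never notice.

import java.util.concurrent.ArrayBlockingQueue;

public class Buffering {
    public static void main(String[] args) {
        // Single-message buffer: holds exactly one message for the receiver.
        ArrayBlockingQueue<String> single = new ArrayBlockingQueue<>(1);
        System.out.println(single.offer("m1")); // true: stored until the receiver is ready
        System.out.println(single.offer("m2")); // false: the buffer already holds a message

        // Multiple-message buffer: capacity of a few messages on the receiver side.
        ArrayBlockingQueue<String> multiple = new ArrayBlockingQueue<>(4);
        for (int i = 1; i <= 5; i++) {
            boolean ok = multiple.offer("m" + i); // the fifth message overflows
            System.out.println("m" + i + " buffered: " + ok);
        }
    }
}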
Multicast Communication:
• One-to-many group communication is called multicast communication, and it is widely used in
distributed systems.
• Multicasting is shown in figure 3-27. For example a server manager needs a volunteer for load
balancing.
• In multicast communication, the client sends a request to all server processes and selects the one
which responds first.
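A hedged Java sketch of one-to-many communication using java.net.MulticastSocket; the group address 230.0.0.1 and the port are illustrative assumptions.

import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.MulticastSocket;

public class MulticastDemo {
    public static void main(String[] args) throws Exception {
        InetAddress group = InetAddress.getByName("230.0.0.1"); // illustrative group address
        int port = 4446;

        // Server side: join the multicast group and wait for a request.
        MulticastSocket receiver = new MulticastSocket(port);
        receiver.joinGroup(group); // deprecated in recent JDKs but still functional

        // Manager side: one send reaches every process that joined the group.
        byte[] req = "who volunteers for load balancing?".getBytes();
        new DatagramSocket().send(new DatagramPacket(req, req.length, group, port));

        byte[] buf = new byte[256];
        DatagramPacket pkt = new DatagramPacket(buf, buf.length);
        receiver.receive(pkt); // blocks until the multicast request arrives
        System.out.println(new String(pkt.getData(), 0, pkt.getLength()));
    }
}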
• The client stub reads the network messages from the local kernel.
• After converting the return values, the client stub returns to the client function.
• When the message arrives at the server, the stub examines the message to see which procedure
is needed and then makes the appropriate call. If the server also supports other remote
procedures, the server stub might have a switch statement in it to select the procedure to be
called, depending on the first field of the message.
• The actual call from the stub to the server looks much like the original client call, except that the
parameters are variables initialized from the incoming message.
• As long as the client and server machines are identical and all the parameters and results are
scalar types, such as integers, characters, and Booleans, this model works fine. However, in a
large distributed system, it is common that multiple machine types are present.
• Each machine often has its own representation for numbers, characters, and other data items.
For example, IBM mainframes use the EBCDIC character code, whereas IBM personal computers
use ASCII. As a consequence, it is not possible to pass a character parameter from an IBM PC
client to an IBM mainframe server using the simple scheme of Fig. 2-3: the server will interpret
the character incorrectly.
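One common remedy is to marshal every parameter into an agreed network representation instead of shipping raw memory. A hedged Java sketch, assuming big-endian integers and UTF-8 strings as the agreed format:

import java.io.*;
import java.nio.charset.StandardCharsets;

public class Marshaller {
    // Encode parameters in an agreed wire format, not raw machine memory.
    static byte[] marshal(int id, String name) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        out.writeInt(id); // big-endian by definition of DataOutputStream
        byte[] nameBytes = name.getBytes(StandardCharsets.UTF_8); // agreed character encoding
        out.writeInt(nameBytes.length);
        out.write(nameBytes);
        return bos.toByteArray();
    }

    // The server decodes using the same agreed format, whatever its native representation.
    static void unmarshal(byte[] wire) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire));
        int id = in.readInt();
        byte[] nameBytes = new byte[in.readInt()];
        in.readFully(nameBytes);
        System.out.println(id + " / " + new String(nameBytes, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        unmarshal(marshal(42, "réservé")); // round-trips identically on any machine
    }
}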
RRA protocol:
• The RRA protocol requires that the client acknowledge the receipt of reply messages.
• RRA protocol involves three messages per call, as seen in figure 4-16
• Unique IDs are assigned to request messages.
• Reply messages are matched with the corresponding acknowledgement messages.
• The client acknowledges a reply message only if it has received the replies to all earlier requests.
• It depends on number of messages involved in the communication between the client and the
server for RPC execution.
Q27. Define Client-Server binding. Briefly explain all the issues in client-server
binding.
• Client Stub must know the location of a server before RPC can take place between them.
• Process by which client gets associated with server is known as BINDING.
• Servers export their operations to register their willingness to provide services, and clients import
operations, asking the RPC runtime to locate a server and establish any state that may be needed
at each end.
• When a binding is altered by the concerned server, it is important to ensure that any state data
held by the server is no longer needed or can be duplicated in the replacement server.
Maybe semantics
• In this semantic, the client does not know whether the remote method was executed once or not
at all.
• Useful in applications where failed invocations are acceptable.
Server class:
• This class contains the implementation of the server-side code, i.e., the objects which run on the
server. It also consists of a description of the object state and the implementation of methods
which operate on that state. The skeleton, the server-side stub, is generated from the interface
specification of the object.
Client class:
• This class contains the implementation of the client-side code and the proxy. It is generated from
the object interface specification. The proxy converts each method call into a message that is sent
to the server-side implementation of the remote object.
• Communication is set up to the server and is torn down when the call is complete. The proxy state
contains the server's network address, the endpoint at the server, and the local ID of the object.
• In Java, proxies are serialized, marshalled, and sent as a series of bytes to another process, where
they are unmarshalled and can invoke methods on remote objects.
• Marshalling code is replaced with the implementation handle for remote object reference. In
java, RMI references to objects are a few hundred bytes.
• Java RMI allows object-specific solutions, which makes it flexible. Objects whose state rarely
changes can be made truly distributed.
• At each invocation, the client checks the state at binding to ensure consistency. Each process
executes the same JVM.
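A hedged sketch of this server class / client class split using the standard Java RMI API; the Echo interface and the registry name are assumptions for illustration. The lookup returns the proxy, and each call on it is marshalled, shipped to the server object, executed there, and the result returned.

import java.rmi.Naming;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.server.UnicastRemoteObject;

// Object interface specification: both proxy and skeleton derive from it.
interface Echo extends Remote {
    String echo(String msg) throws RemoteException;
}

// Server class: object state plus the methods operating on that state.
class EchoServer extends UnicastRemoteObject implements Echo {
    EchoServer() throws RemoteException { }
    public String echo(String msg) { return "echo: " + msg; }
}

public class RmiDemo {
    public static void main(String[] args) throws Exception {
        LocateRegistry.createRegistry(1099);                      // in-process registry
        Naming.rebind("rmi://localhost/Echo", new EchoServer());  // server exports (binds) the object

        // Client class side: lookup returns the proxy; each call is converted
        // into a message to the server-side implementation.
        Echo proxy = (Echo) Naming.lookup("rmi://localhost/Echo");
        System.out.println(proxy.echo("hello"));
    }
}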
Q33. What do you mean by physical clock and what are the problems with
unsynchronized clocks?
Physical clock:
• Every computer needs a timer mechanism to keep track of the current time and for various
accounting purposes, such as calculating the time spent by a process on CPU utilization, disk I/O,
etc.
• In a DS, an application may have processes that run concurrently on multiple nodes of the system.
• For correct results, such applications require that the clocks of the nodes are synchronized with
each other.
• For example: Distributed on-line reservation system.
• Each CPU has its own clock, so it is required that all the clocks in the system display the same time.
Problems with unsynchronized clocks:
• Consider a distributed online reservation system in which the last available seat may get
booked from multiple nodes if their local clocks are not synchronized.
• Synchronized clocks enable measuring the time duration of distributed activities which start on
one node and terminate on another node. For example, there may be a need to calculate the time
taken to transmit a message from one node to another at any time.
• As shown in figure 5-1, an event that occurred after another event may nevertheless be assigned
an earlier time.
• In the figure, the newly modified ‘output.c’ will not be recompiled by the ‘make’ program because
of a slightly slower clock on the editor’s machine.
• For correctness, all distributed applications need the clocks on the nodes to be synchronized with
each other.
Passive time server centralized algorithm:
• In this algorithm, each node periodically sends a request message to the time server.
• When the time server node receives the message, it responds with a ‘time = T’ message. This is
depicted in figure 5-7.
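A hedged Java sketch of the client-side correction (essentially Cristian's algorithm): the client measures the round-trip time and assumes the reply travelled for half of it. askTimeServer is a hypothetical stand-in for the real network request.

public class ClockSync {
    // Hypothetical stand-in for the network request to the time server:
    // here the server's clock runs 5 s ahead and the network adds delay.
    static long askTimeServer() {
        try { Thread.sleep(30); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        return System.currentTimeMillis() + 5_000;
    }

    public static void main(String[] args) {
        long t0 = System.currentTimeMillis();  // request sent
        long serverTime = askTimeServer();     // reply carrying time = T
        long t1 = System.currentTimeMillis();  // reply received
        // Assume the reply travelled for half the measured round-trip time.
        long corrected = serverTime + (t1 - t0) / 2;
        System.out.println("adjust local clock by " + (corrected - t1) + " ms");
    }
}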
Active time server centralized algorithm:
• In the active server method, the time server periodically broadcast its clock time ‘time=T’.
• All other nodes receive the broadcast message and use the clock time in the message for
correcting their own clocks.
• This method is not fault tolerant.
• If a broadcast message reaches a node a little late due to, say, a communication link failure, the
clock of that client node will be readjusted to an incorrect value.
• The centralized heuristic algorithm does not require advance information and is also called the
up-down algorithm.
• In figure 6-4, a coordinator maintains the usage table with one entry for every user and this is
initially zero. When events happen, messages are sent to the coordinator and the table is
updated.
• For example: the entries could be that a processor is requested, a processor becomes free, or a
clock tick has occurred.
• When a machine becomes overloaded, it asks the coordinator to allocate a processor to it. The
request is granted if a processor is available and no one else wants it; if no processors are free,
the request is temporarily denied and a note is made of it.
• It always favours a light user over a heavy one.
• The heuristic used for processor allocation is that when a CPU becomes free, the pending request
whose owner has the lowest score wins (see the sketch below).
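A hypothetical Java sketch of the coordinator's usage table: scores rise while a user occupies a borrowed processor, fall while the user has pending unsatisfied requests, and a freed processor goes to the pending request with the lowest score.

import java.util.*;

public class UsageTable {
    private final Map<String, Integer> score = new HashMap<>(); // one entry per user, initially zero
    private final List<String> pending = new ArrayList<>();     // unsatisfied processor requests

    // Called on every clock tick reported to the coordinator.
    void tick(Set<String> usingBorrowedProcessor) {
        for (String u : usingBorrowedProcessor) score.merge(u, +1, Integer::sum); // score rises while using
        for (String u : pending) score.merge(u, -1, Integer::sum);                // score falls while waiting
    }

    void request(String user) { pending.add(user); } // an overloaded machine asks for a processor

    // When a processor becomes free, the pending request with the lowest score wins.
    String processorFreed() {
        pending.sort(Comparator.comparingInt(u -> score.getOrDefault(u, 0)));
        return pending.isEmpty() ? null : pending.remove(0);
    }
}

Because light users accumulate low scores, this policy naturally favours them over heavy users, as noted above.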
• In this algorithm, the distributed entities cooperate with each other to make scheduling
decisions.
• It is more complex, but its stability is better than that of non-cooperative algorithms.
• Process migration is the relocation of a process from its current location to another node.
• It may be migrated either before it starts executing on its source node or during the course of
its execution.
• The former is known as non-preemptive process migration and,
• The latter is known as preemptive process migration.
• Preemptive process migration is costlier than non-preemptive migration.
• Process migration involves the following major steps:
• Selection of process that should be migrated.
• Selection of the destination node to which the selected process should be migrated.
• Actual transfer of the selected process to the destination node.
• Once we identify the process to be migrated and the node to which it is to be migrated, the next
step is to carry out the migration.
The major steps involved in process migration are:
• Freezing process on the source node
• Starting process on the destination node
• If two threads want to double the same global variable, it is best done one after another. One
thread should exclusively access it, double it, and pass the control to the other thread.
• A critical region is implemented using a mutex variable which is a binary semaphore having two
states: locked and unlocked.
• In figure 6-26, the lock operation attempts to lock the mutex. It succeeds if the mutex is unlocked,
which then becomes locked in a single atomic action.
• If more than one thread is waiting on the mutex, one is released, while the others continue to wait.
• The trylock operation also attempts to lock a mutex, but never blocks: if the mutex is unlocked,
trylock locks it and returns a status code indicating success; otherwise it immediately returns a
status code indicating failure.
• The thread package provides another synchronization feature called the condition variable
which is associated with a mutex at the time it is created.
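A small Java sketch of these operations using ReentrantLock as the mutex: lock() blocks, tryLock() returns a status instead of blocking, and a Condition created from the mutex serves as the associated condition variable. It doubles a global variable twice, one thread after another, as in the example above.

import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class MutexDemo {
    private static final ReentrantLock mutex = new ReentrantLock();
    private static final Condition doubled = mutex.newCondition(); // condition variable tied to mutex
    private static int shared = 1;

    public static void main(String[] args) throws InterruptedException {
        Thread other = new Thread(() -> {
            mutex.lock();                       // lock: blocks until the mutex is free
            try { shared *= 2; doubled.signal(); }
            finally { mutex.unlock(); }
        });

        mutex.lock();
        try {
            other.start();
            while (shared < 2) doubled.await(); // atomically releases the mutex and waits
            if (mutex.tryLock()) {              // trylock: reentrant here, returns true at once
                try { shared *= 2; }            // second doubling, still exclusive
                finally { mutex.unlock(); }
            }
        } finally { mutex.unlock(); }
        other.join();
        System.out.println("shared = " + shared); // 4: the doublings ran one after another
    }
}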
• A DSM system provides a logical view of shared memory, built over a set of interconnected nodes
having physically distributed memories.
• As shown in figure 7-1, DSM is a collection of workstations connected by a LAN and sharing a
single virtual address space.
• Processes running on different nodes can exchange messages through the implementation of a
simple message passing system.
• The DSM abstraction exists only virtually and presents a consolidated memory space to all the
nodes.
• DSM systems are claimed to be scalable and to achieve very high computing power.
• The commonly used replacement algorithms are divided into the following groups: usage-based
versus non-usage-based algorithms, and fixed versus variable-space algorithms (see figure 7-34).
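As a concrete instance of a usage-based, fixed-space algorithm, here is a minimal LRU sketch in Java (an illustration, not any DSM system's actual code): a LinkedHashMap in access order evicts the least recently used page once the fixed capacity is exceeded.

import java.util.LinkedHashMap;
import java.util.Map;

public class PageCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public PageCache(int capacity) {
        super(16, 0.75f, true);       // accessOrder = true gives LRU ordering
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;     // evict when over the fixed space budget
    }

    public static void main(String[] args) {
        PageCache<Integer, String> cache = new PageCache<>(2);
        cache.put(1, "page1");
        cache.put(2, "page2");
        cache.get(1);                 // touch page 1, so page 2 becomes least recently used
        cache.put(3, "page3");        // evicts page 2
        System.out.println(cache.keySet()); // [1, 3]
    }
}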
Munin:
• Munin treats the collection of all memories in the distributed system as a single address space
with coherence enforced by the software.
• Munin views memory on each machine as a collection of disjoint segments.
• The virtual address space of each processor is partitioned into shared and private areas. The
private area is local to each processor and contains non-shared data; the runtime structures are
used to manage memory coherence; and the system memory map is used to record the
segments of global shared memory that are currently mapped into the local portion of shared
memory.
• The system map may also contain hints about other processors' shared memory areas, which
may not always be reliable, due to infrequent updates.
• Munin servers on each machine interact with the application program and the underlying
distributed OS to ensure that the segments are correctly mapped into the local memory when
they are accessed.
• Munin performs fault handling in a manner analogous to page fault handling in a virtual
memory system. When a thread accesses an object for which there is no local copy, a memory
fault occurs. This causes Munin to suspend the faulting thread and invoke the associated server
to handle the fault.
• The server checks to see what type of shared object the thread faulted on and invokes the
appropriate fault handler. The suspended thread is then resumed after handling the fault.
• Munin treats each shared data object as one of the following nine types: private, write-once,
write-many, result, synchronization, migratory, producer-consumer, read-mostly, and general
read-write.
Linda:
• Out operation: outs are performed locally, and the tuples are stored only on the machine where
they were generated.
• In operation: the machine must broadcast a template to find and remove a matching tuple.
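A minimal single-JVM Java sketch of Linda-style out/in on a tuple space (an illustrative toy, not Linda's real distributed implementation): out stores a tuple, and in blocks until a tuple matching the template exists, then removes and returns it; null in a template matches anything.

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

public class TupleSpace {
    private final List<Object[]> tuples = new ArrayList<>();
    private final Lock lock = new ReentrantLock();
    private final Condition added = lock.newCondition();

    public void out(Object... tuple) {              // out: store a tuple
        lock.lock();
        try { tuples.add(tuple); added.signalAll(); }
        finally { lock.unlock(); }
    }

    public Object[] in(Object... template) throws InterruptedException {
        lock.lock();
        try {
            while (true) {
                for (Iterator<Object[]> it = tuples.iterator(); it.hasNext(); ) {
                    Object[] t = it.next();
                    if (matches(template, t)) { it.remove(); return t; }
                }
                added.await();                      // wait until a matching tuple is out-ed
            }
        } finally { lock.unlock(); }
    }

    private static boolean matches(Object[] tmpl, Object[] t) {
        if (tmpl.length != t.length) return false;
        for (int i = 0; i < tmpl.length; i++)
            if (tmpl[i] != null && !tmpl[i].equals(t[i])) return false;
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        TupleSpace ts = new TupleSpace();
        ts.out("point", 1, 2);                      // out: store ("point", 1, 2)
        Object[] t = ts.in("point", null, null);    // in: match and remove ("point", ?, ?)
        System.out.println(t[1] + "," + t[2]);      // 1,2
    }
}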
Centralized approach:
• This approach is used for generating unstructured names and is shown in figure 9-14. The system
incorporates a centralized global unique identifier generator that generates a standard global
unique identifier for each object in the system.
• This approach is simple and easy but it exhibits poor efficiency and reliability.
• In this approach each node may either bind or map the global identifier to the locally created
object.
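A minimal Java sketch of such a centralized generator, assuming a single counter server; the UidServer class is illustrative. The single counter explains both the simplicity of the approach and its poor efficiency and reliability: every node depends on one process.

import java.util.concurrent.atomic.AtomicLong;

public class UidServer {
    private final AtomicLong next = new AtomicLong(); // single global counter

    // Every object in the system gets the next value, so IDs never collide.
    public long generate() { return next.incrementAndGet(); }

    public static void main(String[] args) {
        UidServer server = new UidServer();
        System.out.println(server.generate()); // 1
        System.out.println(server.generate()); // 2
    }
}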