[go: up one dir, main page]

0% found this document useful (0 votes)
119 views30 pages

Module 3 - Distributed Objects & Remote Invocation

Distributed Objects and Remote Invocation Lecture Notes by Steve Donald and Noura Limam Modified by Maxwell Young Last Class covered communication in networks Defining a protocol'' OSI model (layered protocols), message format for protocol stack Connection-less, connection-oriented protocols Discussion of TCP, UDP Criteria for applications: data loss, bandwidth, timing interface between Transport and Application layers (sockets)

Uploaded by

Dusan Namasi
Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
119 views30 pages

Module 3 - Distributed Objects & Remote Invocation

Distributed Objects and Remote Invocation Lecture Notes by Steve Donald and Noura Limam Modified by Maxwell Young Last Class covered communication in networks Defining a protocol'' OSI model (layered protocols), message format for protocol stack Connection-less, connection-oriented protocols Discussion of TCP, UDP Criteria for applications: data loss, bandwidth, timing interface between Transport and Application layers (sockets)

Uploaded by

Dusan Namasi
Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 30

Module 3 - Distributed Objects & Remote Invocation

Lecture Notes by Steve Donald and Noura Limam Modified by Maxwell Young

Last Class

Covered communication in networks


Defining a ``protocol'' OSI model (layered protocols), message format for protocol stack Connection-less, connection-oriented protocols Discussion of TCP, UDP Criteria for applications: data loss, bandwidth, timing Interface between Transport and Application layers (sockets) Hardware infrastructure of the Internet Circuit Switching vs Packet Switching, Delay

Course Modules

Module 1: Architectures/Models of Distributed Systems Module 2: Overview of Computer Networks Module 3: Distributed Objects and Remote Invocation Module 4: Distributed Naming Module 5: Distributed File Systems Module 6: Synchronization Module 7: Data Replication Module 8: Fault Tolerance Module 9: Security

CS454/654

0-3

Programming Models for Distributed Applications

Remote Procedure Call (RPC)


Extension of the conventional procedure call model Allows client programs to call procedures in server programs Object-oriented Extension of local method invocation Allows an object in one process to invoke the methods of an object in another process Allows sender and receiver to communicate without blocking and without the receiver having to be running
3-4

Remote Method Invocation (RMI)


Message-Based Communication

CS454/654

Essentials of Interprocess Communication

Ability to communicate - exchange messages

This is the responsibility of the networks and the request-reply protocol Interfaces Processes have to be able to understand what each other is sending

Ability to talk meaningfully


Agreed standards for data representation Well discuss this later

CS454/654

3-5

Interface

Defines what each module can do in a distributed system Service interface (in RPC model)

specification of procedures offered by a server specification of methods of an object that can be invoked by objects in other processes

Remote interface (in RMI model)

Example: CORBA IDL


//In file Person.idl Struct Person { string name; string place; long year; }; Inerface PersonList { readonly attribute string listname; void addPerson(in Person p); void getPerson(in string name, out Person p); long number(); };

CS454/654

3-6

Remote Procedure Call (RPC)


Extension of the familiar procedure call semantics to distributed systems (with a few twists) Allows programs to call procedures on other machines Process on machine A calls a procedure on machine B:

The calling process on A is suspended; and execution of the called procedure takes place on B; Information can be transported from the caller to the callee in the parameters; and information can come back in the procedure result No message passing or I/O at all is visible to the programmer Calling and called procedures execute in different address spaces Parameters and results have to be passed, sometimes between different machines Both machines can crash
3-7

Subtle problems:

CS454/654

Local Procedure Call

Consider a C program with the following call in the main program


count = read(fd, buf, nbytes) where fd (file descriptor) and nbytes (no. bytes to read) are integers, and buf (buffer

into which data are read) is an array of characters


Local variables for main Local variables for main

Local variables for main

SP

nbytes buf fd Return address Reads local variables

SP

0 Before call to read

0 While read is active

SP

0 After return to main

The read routine is extracted from library by linker and inserted into the object program. It puts the parameters in registers and issues a READ system call by trapping to the kernel. CS454/654
From Tanenbaum and van Steen, Distributed Systems: Principles and Paradigms Prentice-Hall, Inc. 2002

3-8

RPC Operation

Objective: Make interprocess communication among remote processes similar to local ones (i.e., make distribution of processes transparent) RPC achieves its transparency in a way analogous to the read operation in a traditional single processor system

When read is a remote procedure (that will run on the file servers machine), a different version of read, called client stub, is put into the library Like the original read, client stub is called using the calling sequence described in previous slide Also, like the original read, client stub traps to the kernel Only unlike the original read, it does not put the parameters in registers and ask the kernel to give it data Instead, it packs the parameters into a message and asks the kernel to send the message to the server Following the call to send, client stub calls receive, blocking itself until the reply comes back When reply returns, extract results and return to caller

CS454/654

3-9

RPC Operation (2)


Client
Client procedure Client stub
Marshall params

Server
Comm. (OS) Comm. (OS) (3) Call packet(s) (8) Result packet(s) Server stub Server procedure

Call() . . .

(1)

(2)

Transmit

Receive

(4) (7)

Unmarshall arguments Marshall results

(5) (6)

Call

(10) Unmarshall (9)


results

Receive

Transmit

Return

1. 2. 3. 4. 5.

Client procedure calls the stub as local call Client stub builds the message and calls the local OS Client OS sends msg to server OS Server OS hands over the message to server stub Server stubs unpacks parameters and calls server procedure

6. 7. 8. 9. 10.

Server procedure does its work and returns results to server stub Server stub packs results in a message and calls its local OS Server OS sends msg to client OS Client OS hands over the message to client stub Client stubs unpacks result and returns from procedure call

CS454/654

3-10

Parameter Passing

Parameter passing by value


Compose messages that consist of structures for values Problem: Multiple machine types in a distributed system. Each often has its own representation for data! Agree on data representation Difficult Use copy/restore

Parameter passing by reference


Client machine . . n=sum(4,7); . . Message sum 4 7

Stubs

Server machine sum(i,j) int i,j; { return(i+j); } Kernel

sum 4 7

Kernel Parameter passing by value CS454/654

3-11

Data Representation

Not a problem when processes are on the same machine Different machines have different representations Examples:

Characters: IBM Mainframes use EBCDIC character code and IBM PCs use ASCII. It is not possible to pass a character from an IBM PC client to an IBM Mainframe server. Integers: Different computers may store bytes of an integer in different order in memory (e.g.,: little endian, big endian), called host byte order
00000000 00000001 5 J
0 4

Little Endian (80*86, Intel 486) Big Endian (IBM 370, Sun SPARC) 0
3 7

00000000 00000000 0
2 6

00000000 00000000 0 L
0 4

00000001 00000000
1 5

2 6

1 5

0 4

1 5

3 7

2 6

3 7

L L I J Original message on Intel 486

I L L Message after receipt on SPARC

L I J Message after being inverted

CS454/654

3-12

The Way Out

Additional information needed about integers and strings

Available, since the items in the message correspond to the procedure identifier and the parameters.

Once a standard has been agreed upon for representing each of the basic data types, given a parameter list and a message, it is possible to deduce which bytes belong to which parameter, and thus to solve the problem Given these rules, the requesting process knows that it must use this format, and the receiving process knows that incoming message will have this format Having the type information of parameters makes it possible to make necessary conversions

foobar (x,y,z) char x; float y; int z[5]; { }

foobar y 5 Z[0] Z[1] Z[2] Z[3] Z[4] x

CS454/654

3-13

Data Transmission Format

Option 1: Devise a canonical form for data types and require all senders to convert their internal representation to this form while marshalling.

Example: Use ASCII for characters, 0 (false) and 1 (true) for Booleans, etc., with everything stored in little endian Consequence: receiver no longer needs to worry about the byte ordering employed by the sender Sometimes inefficient: Both processes talking big endian; unnecessary conversion at both ends!

Option 2: Sender uses its own native format and indicates in first byte of message which format is used. Receiver stub examines first byte to see what the client is. Everyone can convert from everyone elses format.

Everyone has to know everyone elses format Insertion of a new machine with another format is expensive (luckily, does not happen too often) CORBA uses this approach

CS454/654

3-14

Question of the Day


2 math grads meet that havent seen each other in over 20 years

First grad asks: How have you been? Second: Great! I got married and I have three daughters. First: How old are they? Second: The product of their ages is 72, and the sum of their ages is the same as the number on that building over there.. First: Right, ok.. oh wait.. hmm, i still dont know Second: Oh sorry, the oldest one just started to play the piano First: Wonderful! my oldest is the same age!

Question: how old are the daughters? From www.techinterview.org

CS454/654

3-15

RPC in Presence of Failures

Five different classes of failures can occur in RPC systems


The client is unable to locate the server The request message from the client to the server is lost The reply message from the server to the client is lost The server crashes after receiving a request The client crashes after sending a request

CS454/654

3-16

Client Cannot Locate the Server

Examples:

Server might be down Server evolves (new version of the interface installed and new stubs generated) while the client is compiled with an older version of the client stub Use a special code, such as -1, as the return value of the procedure to indicate failure. In Unix, add a new error type and assign the corresponding value to the global variable errno.

Possible solutions:

-1 can be a legal value to be returned, e.g., sum(7, -8) Not every language has exceptions/signals (e.g., Pascal). Writing an exception/signal handler destroys the transparency

Have the error raise an exception (like in ADA) or a signal (like in C).

CS454/654

3-17

Lost Request Message

Kernel starts a timer when sending the message:


If timer expires before reply or ACK comes back: Kernel retransmits If message truly lost: Server will not differentiate between original and retransmission everything will work fine If many requests are lost: Kernel gives up and falsely concludes that the server is down we are back to Cannot locate server

CS454/654

3-18

Lost Reply Message

Kernel starts a timer when sending the request message:

If timer expires before reply comes back: Retransmits the request

Problem: Not sure why no reply (reply/request lost or server slow) ? Problem: What if the request is not idempotent, e.g. money transfer

If server is just slow: The procedure will be executed several times

Way out: Clients kernel assigns sequence numbers to requests to allow servers kernel to differentiate retransmissions from original

CS454/654

3-19

Server crashes
REQ

Server
Receive Execute Reply (a)

REQ No REP

Server
Receive Execute Crash (b)

REQ No REP

Server
Receive Crash (c)

REP

Problem: Clients kernel cannot differentiate between (b) and (c)


(Note: Crash can occur before Receive, but this is the same as (c).)

3 schools of thought exist on what to do here:


Wait until the server reboots and try the operation again. Guarantees that RPC has been executed at least one time (at least once semantic) Give up immediately and report back failure. Guarantees that RPC has been carried out at most one time (at most once semantics) Client gets no help. Guarantees nothing (RPC may have been carried out anywhere from 0 to a large number). Easy to implement.
3-20

CS454/654

Client Crashes

Client sends a request and crashes before the server replies: A computation is active and no parent is waiting for result (orphan)

Orphans waste CPU cycles and can lock files or tie up valuable resources Orphans can cause confusion (client reboots and does RPC again, but the reply from the orphan comes back immediately afterwards) Extermination: Before a client stub sends an RPC, it makes a log entry (in safe storage) telling what it is about to do. After a reboot, the log is checked and the orphan explicitly killed off.

Possible solutions

Expense of writing a disk record for every RPC; orphans may do RPCs, thus creating grandorphans impossible to locate; impossibility to kill orphans if the network is partitioned due to failure.

CS454/654

3-21

Client Crashes (2)

Possible solutions (contd)

Reincarnation: Divide time up into sequentially numbered epochs. When a client reboots, it broadcasts a message declaring the start of a new epoch. When broadcast comes, all remote computations are killed. Solve problem without the need to write disk records
If network is partitioned, some orphans may survive. But, when they report back, they are easily detected given their obsolete epoch number Gentle reincarnation: A variant of previous one, but less Draconian

When an epoch broadcast comes in, each machine that has remote computations tries to locate their owner. A computation is killed only if the owner cannot be found

Expiration: Each RPC is given a standard amount of time, T, to do the job. If it cannot finish, it must explicitly ask for another quantum.

Choosing a reasonable value of T in the face of RPCs with wildly differing requirements is difficult

CS454/654

3-22

Asynchronous RPC

Traditional (synchronous) RPC interaction

Asynchronous RPC interaction

CS454/654

From Tanenbaum and van Steen, Distributed Systems: Principles and Paradigms Prentice-Hall, Inc. 2002

3-23

Asynchronous RPC (2)

A client and server interacting through two asynchronous RPCs


CS454/654
From Tanenbaum and van Steen, Distributed Systems: Principles and Paradigms Prentice-Hall, Inc. 2002

3-24

Where do stub procedures come from?


Client
Client procedure Client stub
Marshall params

Server
Comm. (OS) Comm. (OS) (3) Call packet(s) (8) Result packet(s) Server stub Server procedure

Call() . . .

(1)

(2)

Transmit

Receive

(4) (7)

Unmarshall arguments Marshall results

(5) (6)

Call

(10) Unmarshall (9)


results

Receive

Transmit

Return

In many systems, stub procedures are generated automatically by a special stub compiler Given a specification of the server procedure and the encoding rules, the message format is uniquely determined It is, thus, possible to have a compiler read the server specification and generate a client stub that packs its parameters into the officially approved message format Similarly, the compiler can also produce a server stub that unpacks the parameters and calls the server Having both stub procedures generated from a single formal spec. of the server not only make life easier for the programmers, but reduces the chance of error and makes the system transparent with respect to differences in internal representation of data.

CS454/654

3-25

Writing Clients/Servers (using DCE RPC)

CS454/654

From Tanenbaum and van Steen, Distributed Systems: Principles and Paradigms Prentice-Hall, Inc. 2002

3-26

How does the client locate the server?

Hardwire the server address into the client

Fast but inflexible!

A better method is dynamic binding:

When the server begins executing, the call to initialize outside the main loop exports the server interface This means that the server sends a message to a program called a binder, to make its existence known. This process is referred to as registering To register, the server gives the binder its name, its version number, a unique identifier (32-bits), and a handle used to locate it The handle is system dependent (e.g., Ethernet address, IP address, an X.500 address, ) Call Input Output Register Deregister Name, version, handle, unique id Name, version, unique id 3-27

Binder interface CS454/654

Dynamic Binding

When the client calls one of the remote procedures for the first time, say, read:

The client stub sees that it is not yet bound to a server, so it sends a message to the binder asking to import version x of server interface The binder checks to see if one or more servers have already exported an interface with this name and version number.

If no currently running server is willing to support this interface, the read call fails If a suitable server exists, the binder gives its handle and unique identifier to the client stub

Client stub uses the handle as the address to send the request message to. The message contains the parameters and a unique identifier, which the servers kernel uses to direct incoming message to the correct server
Call Look up Input Name, version Output Handle, unique id 3-28

Binder interface CS454/654

Dynamic Binding (2)

Advantages

Flexibility Can support multiple servers that support the same interface, e.g.:

Binder can spread the clients randomly over the servers to even load Binder can poll the servers periodically, automatically deregistering the servers that fail to respond, to achieve a degree of fault tolerance Binder can assist in authentication: For example, a server specifies a list of users that can use it; the binder will refuse to tell users not on the list about the server

The binder can verify that both client and server are using the same version of the interface The extra overhead of exporting/importing interfaces costs time The binder may become a bottleneck in a large distributed system
3-29

Disadvantages

CS454/654

Distributed Object Management

Bad news:

Additional complications
Distribution problems Object-orientation problems

Object ids Object state

Good news:

A good model for distributed computing

Computation is distributed among objects; now assume objects are located on different computers

Object-orientation features
Encapsulation Abstraction Extensible type system Code reuse (inheritance)

CS454/654

3-30

You might also like