Book: Distributed Operating Systems
7.5.1. Remote Procedure Call
Normal point-to-point communication in Amoeba consists of a client sending a message to a server followed by the server sending a reply back to the client. It is not possible for a client just to send a message and then go do something else except by bypassing the RPC interface, which is done only under very special circumstances. The RPC primitive that sends the request automatically blocks the caller until the reply comes back, thus forcing a certain amount of structure on programs. Separate send and receive primitives can be thought of as the distributed system's answer to the goto statement: parallel spaghetti programming. They should be avoided by user programs and used only by language runtime systems that have unusual communication requirements.
Each standard server defines a procedural interface that clients can call. These library routines are stubs that pack the parameters into messages and invoke the kernel primitives to send the message. During message transmission, the stub, and hence the calling thread, is blocked. When the reply comes back, the stub returns the status and results to the client. Although the kernel-level primitives actually perform message passing, the use of stubs makes this mechanism look like RPC to the programmer, so we will refer to the basic communication primitives as RPC, rather than the slightly more precise "request/reply message exchange."
In order for a client thread to do an RPC with a server thread, the client must know the server's address. Addressing is done by allowing any thread to choose a random 48-bit number, called a port, to be used as the address for messages sent to it. Different threads in a process may use different ports if they so desire. All messages are addressed from a sender to a destination port. A port is nothing more than a kind of logical thread address. There is no data structure and no storage associated with a port. It is similar to an IP address or an Ethernet address in that respect, except that it is not tied to any particular physical location. The first field in each capability gives the port of the server that manages the object (see Fig. 7-3).
RPC Primitives
The RPC mechanism makes use of three principal kernel primitives:
1. get_request — indicates a server's willingness to listen on a port.
2. put_reply — done by a server when it has a reply to send.
3. trans — send a message from client to server and wait for the reply.
The first two are used by servers. The third is used by clients to transmit a message and wait for a reply. All three are true system calls, that is, they do not work by sending a message to a communication server thread. (If processes are able to send messages, why should they have to contact a server for the purpose of sending a message?) As usual, however, users access the calls through library procedures.
When a server wants to go to sleep waiting for an incoming request, it calls get_request. This procedure has three parameters, as follows:
get_request(&header, buffer, bytes)
The first parameter points to a message header, the second points to a data buffer, and the third tells how big the data buffer is. This call is analogous to
read(fd, buffer, bytes)
in UNIX or MS-DOS in that the first parameter identifies what is being read, the second provides a buffer in which to put the data, and the third tells how big the buffer is.
When a message is transmitted over the network, it contains a header and (optionally) a data buffer. The header is a fixed 32-byte structure and is shown in Fig. 7-8. What the first parameter of the get_request call does is tell the kernel where to put the incoming header. In addition, prior to making the get_request call, the server must initialize the header's Port field to contain the port it is listening to. This is how the kernel knows which server is listening to which port. The incoming header overwrites the one initialized by the server.
Fig. 7-8. The header used on all Amoeba request and reply messages. The numbers in parentheses give the field sizes in bytes.
When a message arrives, the server is unblocked. It normally first inspects the header to find out more about the request. The Signature field has been reserved for authentication purposes, but is not currently used.
The remaining fields are not specified by the RPC protocol, so a server and client can agree to use them any way they want. The normal conventions are as follows. Most requests to servers contain a capability, to specify the object being operated on. Many replies also have a capability as a return value. The Private part is normally used to hold the rightmost three fields of the capability.
Most servers support multiple operations on their objects, such as reading, writing, and destroying. The Command field is conventionally used on requests to indicate which operation is needed. On replies it tells whether the operation was successful or not, and if not, it gives the reason for failure.
The last three fields hold parameters, if any. For example, when reading a segment or file, they can be used to indicate the offset within the object to begin reading at, and the number of bytes to read.
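As an illustration, the header described above can be sketched as a C structure. This is only a sketch under stated assumptions: the type and field names are hypothetical, and because the field sizes from Fig. 7-8 are not reproduced in the text, the sizes below are assumed from the 32-byte total, the 48-bit port, and a capability whose three rightmost fields occupy 3, 1, and 6 bytes. The command code CMD_READ is likewise invented for the example.

```c
#include <assert.h>   /* assert, static_assert (C11) */
#include <stdint.h>
#include <string.h>

/* All names below are hypothetical; sizes are assumptions (see lead-in). */
typedef struct { uint8_t bytes[6]; } port_t;   /* 48-bit port */

typedef struct {
    uint8_t object[3];   /* object number (assumed 3 bytes) */
    uint8_t rights;      /* rights bits   (assumed 1 byte)  */
    uint8_t check[6];    /* check field   (assumed 6 bytes) */
} private_t;             /* a capability minus its port: 10 bytes */

typedef struct {
    port_t    port;       /* port the message is addressed to         (6)  */
    port_t    signature;  /* reserved for authentication, unused      (6)  */
    private_t priv;       /* rightmost three capability fields        (10) */
    uint16_t  command;    /* operation on requests, status on replies (2)  */
    uint32_t  offset;     /* first parameter, e.g. where to start     (4)  */
    uint16_t  size;       /* second parameter, e.g. how many bytes    (2)  */
    uint16_t  extra;      /* third parameter, free for other uses     (2)  */
} header_t;

/* The layout has no padding, so the structure is exactly 32 bytes. */
static_assert(sizeof(header_t) == 32, "header must be 32 bytes");

enum { CMD_READ = 1 };    /* hypothetical command code */

/* Fill in a header for "read nbytes starting at offset" on one object. */
static void make_read_request(header_t *h, const port_t *server,
                              const private_t *object,
                              uint32_t offset, uint16_t nbytes)
{
    assert(h != NULL && server != NULL && object != NULL);
    memset(h, 0, sizeof *h);      /* signature and extra stay zero */
    h->port    = *server;
    h->priv    = *object;
    h->command = CMD_READ;
    h->offset  = offset;
    h->size    = nbytes;
}
```

Note that a read request built this way needs no data buffer at all: the object, the offset, and the byte count all travel in the header.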
Note that for many operations, no buffer is needed or used. In the case of reading again, the object capability, the offset, and the size all fit in the header. When writing, the buffer contains the data to be written. On the other hand, the reply to a read contains a buffer, whereas the reply to a write does not. After the server has completed its work, it makes a call
put_reply(&header, buffer, bytes)
to send back the reply. The first parameter provides the header and the second provides the buffer. The third tells how big the buffer is. If a server does a put_reply without having previously done an unmatched get_request, the put_reply fails with an error. Similarly, two consecutive get_request calls fail. The two calls must be paired in the correct way.
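The pairing rule can be pictured as a one-bit state machine per server thread. The sketch below is a user-level model of that rule only, not the kernel's code: the functions model_get_request and model_put_reply, the type server_thread_t, and the error code are all hypothetical, and no message is actually received or sent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-thread state: is there a request not yet replied to? */
typedef struct { bool request_pending; } server_thread_t;

enum { RPC_OK = 0, RPC_ERR_NOT_PAIRED = -1 };   /* hypothetical codes */

/* Models get_request: legal only if the previous request was answered. */
static int model_get_request(server_thread_t *t)
{
    assert(t != NULL);
    if (t->request_pending)
        return RPC_ERR_NOT_PAIRED;   /* two consecutive get_requests */
    t->request_pending = true;
    return RPC_OK;
}

/* Models put_reply: legal only if a request is waiting for its reply. */
static int model_put_reply(server_thread_t *t)
{
    assert(t != NULL);
    if (!t->request_pending)
        return RPC_ERR_NOT_PAIRED;   /* reply without a matching request */
    t->request_pending = false;
    return RPC_OK;
}
```

A well-behaved server therefore loops over get_request, does the work, and calls put_reply, in strict alternation.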
Now let us turn from the server to the client. To do an RPC, the client calls a stub which makes the following call:
trans(&header1, buffer1, bytes1, &header2, buffer2, bytes2)
The first three parameters provide information about the header and buffer of the outgoing request. The last three provide the same information for the incoming reply. The trans call sends the request and blocks the client until the reply has come in. This design forces processes to stick closely to the client-server RPC communication paradigm, analogous to the way structured programming techniques prevent programmers from doing things that generally lead to poorly structured programs (such as using unconstrained GOTO statements).
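To make the role of the stub concrete, here is a minimal sketch of a read stub built on a trans-style call. Everything in it is an assumption made for illustration: the header is reduced to three fields, mock_trans is an in-process stand-in that plays both kernel and server (it serves reads from a small in-memory string), and the command and status codes are invented. The real trans crosses the network and blocks the caller until the reply arrives.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reduced, hypothetical header: only the fields this example needs. */
typedef struct {
    uint16_t command;   /* operation on request, status on reply */
    uint32_t offset;    /* where to start reading */
    uint16_t size;      /* how many bytes to read / were read */
} hdr_t;

enum { CMD_READ = 1 };                                   /* hypothetical */
enum { STATUS_OK = 0, STATUS_BAD_CMD = 1, STATUS_BAD_RANGE = 2 };

static const char demo_file[] = "hello, amoeba";         /* the "object" */

/* Stand-in for trans: handles the request locally and fills in the reply.
 * Returns the number of bytes placed in the reply buffer (an assumption). */
static long mock_trans(const hdr_t *req_hdr, const void *req_buf, size_t req_len,
                       hdr_t *rep_hdr, void *rep_buf, size_t rep_len)
{
    (void)req_buf; (void)req_len;            /* a read sends no data buffer */
    assert(req_hdr != NULL && rep_hdr != NULL && rep_buf != NULL);
    memset(rep_hdr, 0, sizeof *rep_hdr);

    size_t file_len = sizeof demo_file - 1;
    if (req_hdr->command != CMD_READ) { rep_hdr->command = STATUS_BAD_CMD;   return 0; }
    if (req_hdr->offset > file_len)   { rep_hdr->command = STATUS_BAD_RANGE; return 0; }

    size_t n = file_len - req_hdr->offset;   /* bytes available from offset */
    if (n > req_hdr->size) n = req_hdr->size;
    if (n > rep_len)       n = rep_len;
    memcpy(rep_buf, demo_file + req_hdr->offset, n);
    rep_hdr->command = STATUS_OK;
    rep_hdr->size    = (uint16_t)n;
    return (long)n;
}

/* Client stub: packs the parameters into the header, does the transaction,
 * and hands status and data back to the caller like an ordinary procedure. */
static long file_read(uint32_t offset, void *buf, uint16_t nbytes)
{
    hdr_t req, rep;
    memset(&req, 0, sizeof req);
    req.command = CMD_READ;
    req.offset  = offset;
    req.size    = nbytes;
    long n = mock_trans(&req, NULL, 0, &rep, buf, nbytes);
    return rep.command == STATUS_OK ? n : -1;
}
```

From the caller's point of view, file_read(7, buf, 6) looks like a local procedure call; the request and reply headers never appear outside the stub.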
If Amoeba actually worked as described above, it would be possible for an intruder to impersonate a server just by doing a get_request on the server's port. These ports are public after all, since clients must know them to contact the servers. Amoeba solves this problem cryptographically. Each port is actually a pair of ports: the get-port, which is private, known only to the server, and the put-port, which is known to the whole world. The two are related through a one-way function, F, according to the relation
put-port = F(get-port)
The one-way function is in fact the same one as used for protecting capabilities, although it need not be, since the two concepts are unrelated.
When a server does a get_request, the corresponding put-port is computed by the kernel and stored in a table of ports being listened to. All trans requests use put-ports, so when a packet arrives at a machine, the kernel compares the put-port in the header to the put-ports in its table to see if any match. Since get-ports never appear on the network and cannot be derived from the publicly known put-ports, the scheme is secure. It is illustrated in Fig. 7-9 and described in more detail in (Tanenbaum et al., 1986).
Fig. 7-9. Relationship between get-ports and put-ports.
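The port check can be sketched as follows. This is an illustration under loud assumptions: toy_F is a placeholder built from the FNV-1a hash truncated to 48 bits, which is NOT a cryptographic one-way function and is not the function Amoeba uses; the table type and the function names kernel_listen and kernel_has_listener are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PORT_MASK     0xFFFFFFFFFFFFULL   /* ports are 48 bits wide */
#define MAX_LISTENERS 16

/* Placeholder for F. FNV-1a over the 6 port bytes, truncated to 48 bits.
 * NOT secure: it only stands in for a real one-way function. */
static uint64_t toy_F(uint64_t get_port)
{
    uint64_t h = 0xcbf29ce484222325ULL;          /* FNV-1a offset basis */
    for (int i = 0; i < 6; i++) {
        h ^= (get_port >> (8 * i)) & 0xFF;
        h *= 0x100000001b3ULL;                   /* FNV-1a prime */
    }
    return h & PORT_MASK;
}

/* Hypothetical per-machine table of put-ports being listened to. */
typedef struct {
    uint64_t put_ports[MAX_LISTENERS];
    size_t   count;
} port_table_t;

/* get_request side: the kernel applies F to whatever port the server
 * supplies and records the result. */
static bool kernel_listen(port_table_t *t, uint64_t port_from_server)
{
    assert(t != NULL);
    if (t->count == MAX_LISTENERS)
        return false;
    t->put_ports[t->count++] = toy_F(port_from_server & PORT_MASK);
    return true;
}

/* Packet arrival: compare the put-port in the header against the table. */
static bool kernel_has_listener(const port_table_t *t, uint64_t put_port)
{
    assert(t != NULL);
    for (size_t i = 0; i < t->count; i++)
        if (t->put_ports[i] == put_port)
            return true;
    return false;
}
```

An intruder who knows only the public put-port and calls get_request with it ends up registered under F(put-port), not under the put-port itself, so packets addressed to the real server never match the intruder's table entry.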
Amoeba RPC supports at-most-once semantics. In other words, when an RPC is done, the system guarantees that it will never be carried out more than once, even in the face of server crashes and rapid reboots.