Multiprogramming a Service

The easiest and most natural way to provide a shared service in Keykos is to issue a start key for a domain that obeys code to provide the service. That strategy means that one service invocation must finish before another begins; such is the nature of domains. All advantages of concurrency are lost. We consider here successively more drastic fixes to this problem.
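
To make the serial nature of a domain concrete, here is a minimal sketch in Python. The queue-based model of start and resume keys, and the name serial_service, are our illustration, not Keykos interfaces; the single thread draining one inbox is what serializes the service.

    import queue, threading, time

    def serial_service(inbox):
        # The domain obeys this loop: one invocation at a time, in arrival order.
        while True:
            request, resume = inbox.get()     # await an invocation of the start key
            time.sleep(0.1)                   # stand-in for the service computation
            resume.put(("done", request))     # reply with the requestor's resume key

    inbox = queue.Queue()                     # the start key, modeled as an inbox
    threading.Thread(target=serial_service, args=(inbox,), daemon=True).start()

    # Two clients invoke the start key at once; the second response cannot
    # arrive until the first request has been fully served.
    resumes = []
    for i in range(2):
        r = queue.Queue(maxsize=1)            # a fresh resume key per invocation
        inbox.put((i, r))
        resumes.append(r)
    for r in resumes:
        print(r.get())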

The main reason for beginning to process a request before previous requests finish is to exploit hardware systems that are capable of multiprogramming. Multiple CPUs, concurrent disk IO, and remote servers are all forms of multiprogramming.

The preferred form of multiprogramming in Keykos, for the case at hand, is to set multiple domains to work at once, obeying the same service code and perhaps sharing some amount of read-write RAM if that suits their needs. The problem is the allocation of these worker domains to the service requests.

A first-class and general approach is to create a domain for each request. This costs perhaps 10,000 instructions if the creation is via a factory. A factory ensures that requests cannot affect each other and that they remain confidential to the extent that the factory is discreet.
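
Here is the domain-per-request pattern in the same illustrative model; new_worker stands in for an invocation of the factory. Each request gets a fresh, disposable worker, so one request cannot affect another.

    import queue, threading, time

    def new_worker(request, resume):
        # Stand-in for a factory invocation: build a fresh domain for this one
        # request. The worker shares nothing mutable with its siblings.
        def run():
            time.sleep(0.1)               # the service computation
            resume.put(("done", request)) # reply directly to the requestor;
            # the worker is then dismantled (here, the thread simply exits)
        threading.Thread(target=run, daemon=True).start()

    # Requests proceed concurrently; no pooling and no dispatcher bookkeeping.
    resumes = []
    for i in range(3):
        r = queue.Queue(maxsize=1)
        new_worker(i, r)
        resumes.append(r)
    for r in resumes:
        print(r.get())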

If the nature of the service causes the requests of one client to affect the responses to another, then the service is not discreet and the worker domains must share access to some mutable object, perhaps a memory segment or a key to a single sub-service domain that responds quickly. We conjecture here that servers that must, by their nature, share slowly reacting mutable state will themselves serve serially and slowly.

The general factory overhead can be avoided by directly employing a domain creator. This saves about half the cost of creating a new domain.

To avoid the cost of creating a domain per request we might reuse a pool of worker domains, trusting the worker domains to keep successive requests private. The service key is then a start key to a domain, the dispatcher, which operates briefly at the beginning (at least) of each service invocation and dispatches the job to some worker domain. Unless the service time is remarkably constant, the dispatcher must know which of these domains are still occupied with previous requests. The easiest and cleanest way is for the worker domains to respond thru the dispatcher, via a unique start key, passing the service response along with the original resume key; the dispatcher thus explicitly learns when worker domains become available. Hopefully the dispatcher is usually available and this will seldom cause a blockage. A cost in this design is passing both the service request and the service response messages thru the dispatcher domain.
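
Here is a sketch of the pooled design in the same model; the names, and the tagged messages on dispatcher_inbox, are again our illustration. Both the request and the response pass thru the dispatcher, and that traffic is exactly what keeps the dispatcher's free list accurate.

    import queue, threading, time

    POOL = 3
    dispatcher_inbox = queue.Queue()      # the dispatcher domain's single inbox

    def worker(job_queue):
        while True:
            wid, request, resume = job_queue.get()
            time.sleep(0.1)               # the service computation
            # Respond via the dispatcher's unique start key, carrying the
            # original resume key; this is how the dispatcher learns wid is free.
            dispatcher_inbox.put(("done", wid, ("done", request), resume))

    jobs = [queue.Queue() for _ in range(POOL)]
    for q in jobs:
        threading.Thread(target=worker, args=(q,), daemon=True).start()

    def dispatcher():
        free = list(range(POOL))          # workers not occupied by prior requests
        backlog = []                      # requests awaiting a free worker
        while True:
            msg = dispatcher_inbox.get()
            if msg[0] == "request":
                backlog.append((msg[1], msg[2]))
            else:                         # a worker finished: relay its response
                _, wid, response, resume = msg
                resume.put(response)      # pass the response on with the resume key
                free.append(wid)
            while free and backlog:
                request, resume = backlog.pop(0)
                wid = free.pop()
                jobs[wid].put((wid, request, resume))

    threading.Thread(target=dispatcher, daemon=True).start()

    # Clients invoke the service key; up to POOL requests overlap.
    resumes = []
    for i in range(5):
        r = queue.Queue(maxsize=1)
        dispatcher_inbox.put(("request", i, r))
        resumes.append(r)
    for r in resumes:
        print(r.get())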

The response might be returned directly to the original requestor if the dispatcher and the workers share a writable page where a worker leaves a note for the dispatcher just before delivering its response. Some of the Keykos servers worked this way.
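
The direct-reply variant changes only the tail of the worker's loop. The shared writable page is modeled here as a list under a lock, both invented for illustration; the polling in the dispatcher is a modeling shortcut for reading the page at the start of each invocation.

    import queue, threading, time

    POOL = 3
    dispatcher_inbox = queue.Queue()
    note_lock = threading.Lock()
    free_notes = []                       # the shared writable page of notes

    def worker(wid, job_queue):
        while True:
            request, resume = job_queue.get()
            time.sleep(0.1)               # the service computation
            with note_lock:
                free_notes.append(wid)    # note to the dispatcher: wid is free ...
            resume.put(("done", request)) # ... then reply directly to the requestor

    jobs = [queue.Queue() for _ in range(POOL)]
    free = list(range(POOL))
    for i, q in enumerate(jobs):
        threading.Thread(target=worker, args=(i, q), daemon=True).start()

    def dispatcher():
        while True:
            request, resume = dispatcher_inbox.get()
            while True:
                with note_lock:           # read the notes left on the shared page
                    free.extend(free_notes)
                    free_notes.clear()
                if free:
                    break
                time.sleep(0.01)          # modeling shortcut: poll for a note
            jobs[free.pop()].put((request, resume))

    threading.Thread(target=dispatcher, daemon=True).start()

    resumes = []
    for i in range(5):
        r = queue.Queue(maxsize=1)
        dispatcher_inbox.put((i, r))
        resumes.append(r)
    for r in resumes:
        print(r.get())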

The next escalation eliminates the dispatcher domain and the extra message passing by putting the dispatcher logic in the kernel. The kernel is able to cheat and know which worker domains are available. The key to the dispatcher would actually designate a node full of ordinary start keys to the workers. Perhaps the dispatcher node could itself hold kernel dispatcher keys if one level did not suffice. This scheme suffered from requiring a kernel object that could not be implemented in user code. It also revealed facts about the availability of domains that were otherwise unavailable. We knew of no reason not to reveal this, but still it was a new source of signals that made certain security arguments more complex.
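
Since this kernel object was never built, the following model of its lookup is entirely hypothetical: each slot of the dispatcher node holds either an ordinary start key or a nested dispatcher key, and only the kernel may read a domain's availability directly.

    # Hypothetical model of the unimplemented kernel dispatcher key.
    # A node is a list of slots; each slot is either a worker start key
    # ("start", domain) or a nested dispatcher key ("dispatch", node).

    def find_available_worker(node):
        # Return an available worker domain reachable from this node, or None.
        # Only the kernel can "cheat" and read a domain's availability directly.
        for kind, target in node:
            if kind == "start" and target["available"]:
                return target
            if kind == "dispatch":            # a node of further start keys
                found = find_available_worker(target)
                if found is not None:
                    return found
        return None                           # all busy: queue the requestor

    w1 = {"name": "w1", "available": False}
    w2 = {"name": "w2", "available": False}
    w3 = {"name": "w3", "available": True}
    root = [("start", w1), ("start", w2), ("dispatch", [("start", w3)])]
    print(find_available_worker(root)["name"])    # prints: w3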

Keykos contemplated some such mechanism but did not encounter real applications for it and it was not implemented. There were a few other rabbits in the design hat, on call for rapid implementation if the demand should arise. Often a rabbit got smarter before it was reified. These on-call designs served to guide other design efforts. A design change impacting features used by real code was seldom adopted. Design changes impacting rabbits were made more freely, but not without acknowledging a dead rabbit. Checkpoint to tape was a rabbit for at least a year and was then implemented upon customer request. The above scheme, and another, are described here. There remain many rabbits in the Gnosis manual.

Atomic Kernel Ops

We must address another important server contingency. Most requests may finish so quickly as not to warrant the cost of arranging concurrency. And yet the server may sometimes discover that it cannot respond promptly to a request due to some obstacle that must first be dealt with. Meanwhile the server must remain available for further requests even as it takes steps to remove the obstacle.

When the kernel comes to this situation it pretends that the requestor had never made the request. The kernel observes a ‘dry run’ discipline, avoiding changes to mutable state until it has all the locks necessary to complete the transaction. The execution of the instruction that invoked the key to the kernel object leaves the domain in a state to reproduce the request if the instruction were reissued. If the program counter is moved back one instruction, to indicate that the domain is about to make that invocation, then the next time the domain runs it will invoke the same key with the same message. The kernel puts the domain whose request is deferred on a kernel queue of domains to be started when obstacles of this class yield. This strategy was used in DEC’s PDP-10 operating system, where the kernel was said to repop the user program.
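
Here is a model of the dry-run discipline in Python; the lock, the queue, and the retry loop are stand-ins for kernel machinery, not Keykos interfaces. The retry loop in domain plays the role of the backed-up program counter: when next run, the domain reissues the identical invocation.

    import queue, threading

    needed_lock = threading.Lock()        # the obstacle: a lock the kernel needs
    stalled = queue.Queue()               # kernel queue of domains awaiting it

    def kernel_invoke(message, wakeup):
        # Dry run: change no mutable state unless every needed lock is held.
        if not needed_lock.acquire(blocking=False):
            stalled.put(wakeup)           # remember whom to restart ...
            return "backed_up"            # ... and pretend the request never happened
        try:
            return ("done", message)      # all locks held: commit the transaction
        finally:
            needed_lock.release()

    def domain(message):
        wakeup = threading.Event()
        while True:
            result = kernel_invoke(message, wakeup)   # the invoking instruction
            if result != "backed_up":
                return result
            wakeup.wait()                 # the PC was moved back one instruction:
            wakeup.clear()                # when next run, reissue the invocation

    # Demo: hold the lock, start a domain, then let the obstacle yield.
    needed_lock.acquire()
    t = threading.Thread(target=lambda: print(domain("req")))
    t.start()
    waiter = stalled.get()                # the domain is now backed up and queued
    needed_lock.release()                 # the obstacle yields;
    waiter.set()                          # start the queued domain again
    t.join()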

Domain code is not able to do this trick. We explore some architectural modifications to let domain code do something like it. Some of these ideas were developed in discussions with Jonathan Shapiro. Suppose that the server held a kernel key with which it could operate on the resume key to the requestor, backing up the requestor’s program counter and leaving the requesting domain in the waiting state. This act consumes the resume key and returns to the server a fault key to the requesting domain, which is no longer in a position to receive a message. The server can stash the fault key while the obstacle persists, or hand it off to whoever is charged with removing the obstacle.
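
Nothing below was built; it is a hypothetical model of the proposed operation. backup_requestor stands for the new kernel key, and the token placed on the resume queue is a modeling shortcut for the kernel rearranging the requesting domain. Note that the server remains available while the fault key travels to whoever removes the obstacle.

    import queue, threading, time

    service = queue.Queue()               # the start key to the server domain
    obstacle = threading.Event()
    obstacle.set()                        # an obstacle is initially present

    def backup_requestor(resume_key):
        # Hypothetical kernel operation: consume the resume key, leave the
        # requestor about to reissue its invocation, and return a fault key.
        fault_key = threading.Event()
        resume_key.put(("backed_up", fault_key))   # modeling shortcut: see text
        return fault_key

    def remove_obstacle(fault_key):
        time.sleep(0.1)                   # the slow work of removing the obstacle
        obstacle.clear()
        fault_key.set()                   # invoke the fault key: restart the domain

    def server():
        while True:
            request, resume_key = service.get()
            if obstacle.is_set():
                fault_key = backup_requestor(resume_key)
                # Hand the fault key to whoever is charged with the obstacle;
                # the server itself is immediately free for further requests.
                threading.Thread(target=remove_obstacle, args=(fault_key,),
                                 daemon=True).start()
                continue
            resume_key.put(("done", request))

    def client(request):
        while True:                       # a backed-up domain, when restarted,
            resume_key = queue.Queue(maxsize=1)   # reissues the identical request
            service.put((request, resume_key))
            result = resume_key.get()
            if result[0] != "backed_up":
                return result
            result[1].wait()              # wait to be restarted via the fault key

    threading.Thread(target=server, daemon=True).start()
    print(client("req"))                  # prints: ('done', 'req')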