Coyotos

The Coyotos Microkernel Specification is an interesting new thrust in capability kernel design. It reviews some recent developments and arguments about these designs. These notes add my 2¢ worth.

The spec notes that “The register state of a modern Pentium-family processor occupies nearly four kilobytes ...”. Here I explore the ramifications of removing the floating point values from the state that is usually swapped on each task switch. I do not recall verifying that the substantial MMX and SSE state is separable from the smaller legacy user state as well, but I think that it is. Note that in this scheme the space for the large user state need not even be purchased from the space bank until the first instruction to use those facilities attempts to execute, if it ever does. Meanwhile the extra space exists at no level of abstraction. The time to save and restore this state may be an even bigger reason to amputate unnecessary state.
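The scheme above can be sketched as a small simulation. This is only an illustration under assumptions of mine: the names `SpaceBank`, `Task` and `Cpu`, the word counts, and the idea of modelling the first floating point instruction as a trap that buys the save area are all hypothetical and are not taken from the Coyotos spec.

```python
from dataclasses import dataclass, field
from typing import List, Optional

LEGACY_WORDS = 16      # hypothetical size of the small integer register set
EXTENDED_WORDS = 512   # hypothetical size of the FP/MMX/SSE save area


class SpaceBank:
    """Hypothetical stand-in for a space bank; it only counts words sold."""

    def __init__(self) -> None:
        self.words_sold = 0

    def buy(self, words: int) -> List[int]:
        self.words_sold += words
        return [0] * words


@dataclass
class Task:
    bank: SpaceBank
    legacy: List[int] = field(default_factory=lambda: [0] * LEGACY_WORDS)
    extended: Optional[List[int]] = None  # does not exist until first FP use

    def state_words(self) -> int:
        """Words that a task switch must save or restore for this task."""
        return LEGACY_WORDS + (EXTENDED_WORDS if self.extended is not None else 0)


class Cpu:
    def __init__(self) -> None:
        self.current: Optional[Task] = None
        self.words_copied = 0  # crude measure of task-switch cost

    def switch_to(self, task: Task) -> None:
        if self.current is not None:
            self.words_copied += self.current.state_words()  # save outgoing
        self.words_copied += task.state_words()              # restore incoming
        self.current = task

    def fp_instruction(self) -> None:
        task = self.current
        if task is None:
            raise RuntimeError("no task is running")
        if task.extended is None:
            # Simulated trap: the first FP instruction buys the save area.
            task.extended = task.bank.buy(EXTENDED_WORDS)
        task.extended[0] += 1  # stand-in for the instruction's effect
```

A task that never calls `fp_instruction` never causes a purchase from the bank and contributes only `LEGACY_WORDS` to each switch, which is the saving in both space and time that the paragraph argues for.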

There is a problem that does nag, however. The stack used by a domain (in Keykos), or a process (in Eros), or a thread (in Unix) is liable to hold spoiled stack state that must, by current logic, be studiously retained in RAM, or at best swapped to disk and then back again when the thread next acts. This may be an even bigger burden.

Time passes

I note the argument that begins “Suppose there is a server that wishes to wait simultaneously for activity on one of 1024 network connections.” Certainly this is a valid thing to do. Here are two ways to do this in what I see as a blocking IPC architecture:

Appoint one domain R to be usually available, and distribute start keys to R, each with a distinct data byte, to those who need to send to this server, which I will call a consolidator here. When a message is received, R’s domain code uses the data byte to index into an array of state blocks, each of which holds the FST state for the relevant circuit. The code then does the first-level processing for that packet and, almost by definition, usually becomes available again after having updated the circuit state. This means that the keys upon which the 1024 processes rely to deliver their payloads are not prompt in the kernel sense, and that the ‘server’ must therefore be trusted to be prompt.
Use lightweight domains, relying on the kernel hack described earlier to keep them light.
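The first of these two ways can be sketched as follows. This is a toy model, not Keykos code: `Consolidator`, `CircuitState` and the closure standing in for a start key are hypothetical names of mine, and the per-circuit processing is a placeholder.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CircuitState:
    """Hypothetical per-circuit state block."""
    phase: str = "idle"
    bytes_seen: int = 0


class Consolidator:
    """Toy model of domain R: one receiver, many distinguishable senders."""

    def __init__(self, circuits: int = 1024) -> None:
        self.circuits: List[CircuitState] = [CircuitState() for _ in range(circuits)]

    def start_key(self, key_data: int) -> Callable[[bytes], str]:
        """Return a callable standing in for a start key that carries key_data."""
        if not 0 <= key_data < len(self.circuits):
            raise IndexError("no state block for this key data")

        def invoke(payload: bytes) -> str:
            # The sender cannot choose or forge key_data; it is bound here.
            return self._receive(key_data, payload)

        return invoke

    def _receive(self, key_data: int, payload: bytes) -> str:
        # First-level processing: index the state block, update it, return.
        state = self.circuits[key_data]
        state.bytes_seen += len(payload)
        state.phase = "open" if payload else "closed"
        return state.phase
```

Each sender holds only its own callable, so R learns which circuit a message belongs to without trusting anything in the message itself. The sketch does not model blocking, which is where the concern about promptness arises.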

Regarding the delivery of Unix ‘signals’ to processes.
Keykos has been criticized by several for its lack of a primitive mechanism for interrupting the code that a domain is executing. This lack makes it difficult to emulate Unix kernel semantics. Perhaps for the strict purpose of efficiently emulating such semantics we need additional kernel logic, but let me first demur and claim that there are no other benefits. I have asked several Unix application programmers how they contend with the following infelicity; I have found few who program at this level, and they say that it is hell. Perhaps there are convenient ways to use these semantics.

Suppose the program running in a Unix process has enabled itself to be interrupted by exogenous events. When this happens the CPU begins to execute code designed to respond to the event. This code is in the same address space as the interrupted code, presumably to share its data. (Is there any other reason to run in the same address space?) The interrupted code is like normal code that temporarily violates invariants that are depended upon by other parts of the application. If the application does multithreading this may be dealt with by shared and exclusive locks. If the interrupt code needs access to this possibly invalid state, what is to be done? A lock can determine that the state is inaccessible, but then what? Does it help that the interrupt code knows that the interrupted code is not running? I don’t think so. Where is the mechanism to wait until the data structures are put back together? This is exactly what domains do for a living: they solve problems like this. It is hard doing it with domains and even harder doing it without. I think I must be missing some programming paradigm.
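The hazard can be shown without real signals. In the sketch below the interruption is simulated by a plain function call made in the middle of an update; `Ledger` and its handlers are hypothetical names of mine. The deferring handler illustrates one workaround that I believe is commonly used, namely recording the event and acting on it only after the invariant is restored. It is not something these notes propose, and it works only because the interrupted code cooperates.

```python
from typing import Callable, List, Optional, Tuple


class Ledger:
    """Invariant whenever no update is in progress: total == sum(entries)."""

    def __init__(self) -> None:
        self.entries: List[int] = []
        self.total = 0
        self.updating = False
        self.pending: List[str] = []           # events noted while inconsistent
        self.reports: List[Tuple[str, bool]] = []

    def consistent(self) -> bool:
        return self.total == sum(self.entries)

    def add(self, amount: int, interrupt: Optional[Callable[[], None]] = None) -> None:
        self.updating = True
        self.entries.append(amount)
        if interrupt is not None:
            interrupt()          # simulated exogenous event arriving mid-update
        self.total += amount     # invariant holds again after this line
        self.updating = False
        self._drain()

    def naive_handler(self) -> None:
        # Reads shared state immediately, whatever condition it is in.
        self.reports.append(("naive", self.consistent()))

    def deferring_handler(self) -> None:
        # Only notes the event if the state is mid-update.
        if self.updating:
            self.pending.append("report")
        else:
            self.reports.append(("deferred", self.consistent()))

    def _drain(self) -> None:
        # Run the postponed work now that the structures are back together.
        while self.pending:
            self.pending.pop(0)
            self.reports.append(("deferred", self.consistent()))
```

The naive handler observes the broken invariant, while the deferring handler sees consistent data only because `add` takes on the job of running the postponed work. That is the waiting mechanism the paragraph asks about, rebuilt by hand inside the application.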

Another problem, I think, is the peculiar semantics of interrupting a thread that is blocked on a kernel call. If I am not confused, the call’s function is silently suppressed. Can this be so? Where is this described?

If you really want to do this peculiar operation, the code that fronts for the exogenous event wields a domain service key. It first stops the domain, by such means as denying it its meter or memory, and then uses the key to perform whatever mayhem is necessary on the thread currently occupied by the domain.