I hate register windows (but)

I have just come up with a way of thinking about register windows that makes the kernel easier to design and perhaps easier to implement.

When a SPARC processor encounters a save instruction and has run out of register windows, it traps to privileged code that must copy at least one register window onto the stack, which is presumably in the user’s memory. It is exceedingly awkward if the stack location where these values belong is not mapped when this happens. The current SPARC domain has space to model these windows, which are thus considered part of the unprivileged CPU architecture.

The Motorola 88K did imprecise store faults, which would lead to a halted process state that could be described as:

This process must store the value X at virtual address Y as its next action.

The program counter might be several instructions beyond the store that caused the fault. Such information is presented by the hardware to the privileged code upon interrupt, and there might be more than one such pending store. There is a small known limit on the number of pairs, however. There seems, in general, to be no need to know the address of the faulting instruction; only the need to perform the store.
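The deferred store state described above can be sketched as a small bounded list of (address, value) pairs that the kernel retires before the process runs again. This is only an illustration: every name here is hypothetical, MAX_PENDING_STORES is an assumed limit rather than the 88K's actual one, and toy_store stands in for whatever the kernel really does to write into user memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed limit; the real hardware bound is small but is not given here. */
#define MAX_PENDING_STORES 4

/* One deferred store: "store value at vaddr as the next action". */
struct pending_store {
    uint64_t vaddr;
    uint64_t value;
};

/* The part of a halted process's state that lists the stores it still owes. */
struct deferred_stores {
    size_t count;
    struct pending_store slot[MAX_PENDING_STORES];
};

/* Record a store that the hardware reported as faulted.
 * Returns false if the buffer is already full. */
static bool defer_store(struct deferred_stores *d, uint64_t vaddr, uint64_t value)
{
    assert(d != NULL);
    if (d->count == MAX_PENDING_STORES)
        return false;
    d->slot[d->count].vaddr = vaddr;
    d->slot[d->count].value = value;
    d->count++;
    return true;
}

/* Hypothetical callback: attempts one store into user memory and
 * returns false if the target is still unmapped. */
typedef bool (*store_fn)(uint64_t vaddr, uint64_t value, void *ctx);

/* Retire pending stores oldest first, stopping at the first one that
 * still faults. Returns the number of stores that remain pending. */
static size_t drain_stores(struct deferred_stores *d, store_fn try_store, void *ctx)
{
    size_t done = 0;
    while (done < d->count &&
           try_store(d->slot[done].vaddr, d->slot[done].value, ctx))
        done++;
    /* Slide the unfinished stores to the front, preserving their order. */
    for (size_t i = done; i < d->count; i++)
        d->slot[i - done] = d->slot[i];
    d->count -= done;
    return d->count;
}

/* Toy stand-in for user memory: eight words that are either all mapped
 * or all unmapped. */
struct toy_memory {
    bool mapped;
    uint64_t word[8];
};

static bool toy_store(uint64_t vaddr, uint64_t value, void *ctx)
{
    struct toy_memory *m = ctx;
    if (!m->mapped || vaddr >= 8)
        return false;
    m->word[vaddr] = value;
    return true;
}
```

Note that nothing in this state records the address of the faulting instruction; the process can resume at its current program counter once drain_stores reports that no stores remain.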

This scheme seemed to be a small conceptual burden in the 88K kernel design, whereas register windows seemed vexing and heavy, beyond the issue of mere domain size. Viewed as an extension of the deferred store idea, register windows seem less vexing to me. The deferred stores would still be buffered in the domain but would not be explicitly associated with particular register windows. I think that it seems simpler because the solution is split into two familiar parts: the kernel discovering that it needs to store into the user’s virtual memory at an unmapped location, and the deferred store states.
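One way to picture the window overflow as deferred stores is sketched below: instead of writing a window to the stack directly, the trap handler queues one pending store per spilled register in the domain. This is an assumption-laden sketch, not the actual domain layout. The names, the MAX_DEFERRED buffer size, and the word_size parameter are all hypothetical; the only architectural fact relied on is that a SPARC window spill saves 16 registers (8 locals and 8 ins) at consecutive words starting at the stack pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define WINDOW_REGS  16   /* 8 locals + 8 ins spilled per window */
#define MAX_DEFERRED 64   /* hypothetical size of the domain's buffer */

/* One deferred store owed to the user's virtual memory. */
struct pending_store {
    uint64_t vaddr;
    uint64_t value;
};

/* Hypothetical slice of a domain: just the deferred store buffer.
 * Entries are not tagged with the window they came from. */
struct domain {
    size_t count;
    struct pending_store deferred[MAX_DEFERRED];
};

/* Queue one window's registers as deferred stores at consecutive words
 * starting at sp. Returns false, queuing nothing, if the buffer lacks
 * room for a whole window. */
static bool spill_window(struct domain *dom, uint64_t sp,
                         const uint64_t regs[WINDOW_REGS], size_t word_size)
{
    assert(dom != NULL && regs != NULL);
    if (MAX_DEFERRED - dom->count < WINDOW_REGS)
        return false;
    for (size_t i = 0; i < WINDOW_REGS; i++) {
        dom->deferred[dom->count].vaddr = sp + i * word_size;
        dom->deferred[dom->count].value = regs[i];
        dom->count++;
    }
    return true;
}
```

After the spill, the kernel's remaining work is the familiar problem of performing stores to addresses that may be unmapped, with no window-specific bookkeeping left.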

The IA64 has a similar scheme, but it also has an “RSE” (Register Stack Engine) that asynchronously saves and restores stack values. I have not studied the behavior of the RSE when the virtual addresses that it needs are unavailable. I think, however, that if the RSE merely declined speculative actions that would fault, such as fetching frames that are not known to be needed, then both the hardware and software would be simpler. Such actions should be declined until they are no longer speculative.