This note documents a bug in the Keykos kernel. It is also the best explanation of the kernel depend logic. See Charlie’s explanation of a similar scheme.

“PENDVAL” is the name for the fix of subtle problem with the 370 depend logic. The current code is efficient but a bit arcane. I will record here the bug so that the explanation of the fix will make more sense and also help prevent reintroducing the bug in the name of simplifying the code.

AXIOM: DEPEND keeps a mutable binary relation between key slots and table entries for memory maps. If accessing a memory location by a domain would read slot x according to segment semantics, and accessing the real memory location using the hardware memory map consults a table entry E, or a TLB entry derived therefrom, then DEPEND relates x to E.

Slots related by DEPEND are involved and reside in nodes that are prepared as segment nodes.

When a mapping table entry is built, the code that examines the memory tree is coded to invoke DEPEND to record the slots that are examined.

When code modifies a prepared slot in a segment node it invokes DEPEND to zap the related mapping entries.

DEPEND’s internal state is well hidden and of a fixed size. It resorts to load shedding when it hurts for space. It sheds load by zapping map entries and this allows it to forget those entries and related slots. The state is organized to make it fast to find slots related to a given mapping entry. The slot address is hashed to select a chain of entry locators. Whole chains are sacrificed (zapped) to respond to a change of a slot.

The very very clever reader may already see the pitfall. We didn’t and it was one of the later bugs to be removed from the kernel.

We build valid memory map entries only as a direct result of map faults. Naturally we locate the map entry whose invalidity causes a fault before we begin to examine the memory tree slots to learn what real memory the entry should represent. In the process of examining the slots we invoke depend. It may occur that while evaluating one map entry we make a DEPEND entry for slot x and subsequently, while still evaluating the same entry, hurt for space and zap the DEPEND entry that would relate x to the new incomplete map entry, even before that value has been computed. DEPEND reclaims its space and forgets the relation between some slot and that entry. The evaluation finally succeeds and the entry is placed in the map table and we are in an unsafe state that violates the above axiom.

I do not recall if there is anything wrong with the PENDVAL solution but it is not the one used in the current C code.

The 370 code and Capros, I think, put a special invalid value in the mapping table location. If this phenomenon happens then that special value will have been zapped during the evaluation of the map entry and this will be noticed and will cause the new value to be discarded and recreated in order to avoid the unsafe state.

In the current Keykos C code a list of slot address hashes are built during the evaluation of a map table entry. If the evaluation succeeds and a valid entry is produced then the entry is placed in the table and the chain is submitted to DEPEND for incorporation. If the evaluation fails the list is abandoned, and its space is reclaimed. This has the considerable advantage of not putting into the DEPEND state, slot viewings that did not result in a table entry. This happens when there is a node fault and a node must be brought in from disk. When a valid mapping entry has been formed one must ensure enough room in DEPEND to make all of the queued entries before making any, lest the same bug appear on a shorter time scale. The information is available to do this however. Where the new code abandons the build list, the PENDVAL solution would incorporate that load into DEPEND.