Caching results of computations

I will present this pattern in the Keykos context. Perhaps it can be generalized.

What do `make` and `jam` do?

The Unix utilities make and jam are designed to automate computation of a network or graph of values. There is a constant acyclic directed graph. The extreme values, the inputs, change exogenously; they are typically text files which are occasionally modified by an editor. Other values result from running known transformer programs with inferior values as inputs. The graph and transformers are fairly constant but the inputs change. These values are often cached. There are several ultimate output values whose values are required externally. The task is to run the transformer programs only when required to keep the outputs current with the inputs.

Make and jam must compare dates, which is an error prone Unix facility. Keykos can arrange for reliable change notification. The most frequent form of such arrangement is for the segment keeper of the file holding the input to leave the segment in read-only mode so as to be promptly notified of any action that modifies the input. The program that attempts the write will be delayed a few milliseconds and not notice. The cache master, notified by the segment keeper, flushes those cached values that directly or indirectly depend on the modified input. Some of the cached values may be outputs and if those outputs are also segments, the segment keepers will remain on duty to respond to segment accesses by rebuilding the values.

Note in this scenario the “user” did not have to invoke make. The output is always up to date!! Of course accessing the output segment may trigger many compilations.

There are two advantages over the Unix solution:

You need not maintain a ‘makefile’,
You need not invoke make,
You need not rely on dates of inputs for each build. The modifications themselves cause immediate cache purging.

More Detail

We illustrate with an imaginary transformer application, T, with one input file and one output file which is to consist of the even bytes of the input. When T first runs it lazily creates the output file (segment) but leaves it empty. By empty I don’t mean zeros, but in the state that fetching from the segment causes a signal to T via the segment keeper of the output file. This signal arrives via a channel established as T created the output segment. When T gets the signal it is for page 42 which causes T to write into page 42 of the output segment, the even bytes from pages 84 and 85 in the input segment. As T reads from the input segment that may, in turn, cause further keepers and transformers to do their thing.

When T started it arranged to be notified of page recalls within its input segment. If T should subsequently learn that input page 42 has changed, it will notify the output segment keeper to recall pages 84 and 85.

The Unix make utility generally deals with transformers that do much larger units of operations—entire files at once. The imaginary transformer above illustrates some deadlock pitfalls — signals coming to T from two directions.

One pattern observant to Unix directory style, is a modified directory object which holds the inputs and outputs. The compilers are run with their read access to these inputs dynamically observed. When compiling x.c causes the compiler to read file Z.h, the output file, x.o is observed by extra directory logic to depend on both x.c and Z.h. It is not necessary to maintain a ‘makefile’.