Dependent and Independent Compilation

These terms are used occasionally and I have not found a definition. Let me suggest a definition that is compatible with what I have seen.

Compilation is a process that reads the source of a program and produces a relocatable value that is used in the execution environment. (In the Unix environment such values are typically .o files.) They may contain machine instructions or byte codes for a ‘virtual machine’. Compilation does not begin until the source is fixed and execution does not begin until compilation ends.

Often, between compilation and execution, a linking process runs that joins relocatables from several compilations and combines them into larger values in the same format. Programs may thus be the result of many such compilations.

The pattern as described above is called independent compilation. With dependent compilation the compiler produces an additional output, which I will call the bind here; I know no general name for it. Depending on the source, the compiler will also require bind files from previous compilations of other sources. If a symbol is defined in source X and used in source Y then it may be necessary to compile X first, and for the compilation of Y to read the bind value from that first compilation. The most common use of this is to compile a function definition before compiling calls to the function. The type (signature) of the function is thus available as the call sites are compiled. The same works for any kind of value, however, not just functions.
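The ordering constraint can be made concrete with a toy model. The following is only a sketch in Python, not any real compiler: the function compile_unit, the dictionary layout of a "source", and the idea that a bind is just a map from function name to parameter count are all assumptions made for illustration.

```python
# Toy model of dependent compilation. Every name here is hypothetical.

def compile_unit(source, binds):
    """Compile one source unit and return (relocatable, bind).

    source: dict with "defines" (function name -> parameter count) and
            "calls" (list of (function name, argument count) pairs).
    binds:  bind values produced by earlier compilations.
    """
    # Signatures visible here: those read from binds, plus our own definitions.
    known = {}
    for bind in binds:
        known.update(bind)
    known.update(source["defines"])

    # Check every call site against the signature of the callee.
    for name, argc in source["calls"]:
        if name not in known:
            raise LookupError(f"{name}: no bind available, compile its definer first")
        if known[name] != argc:
            raise TypeError(f"{name}: expected {known[name]} arguments, got {argc}")

    relocatable = {
        "code_for": sorted(source["defines"]),
        "refs": [name for name, _ in source["calls"]],
    }
    bind = dict(source["defines"])  # what later compilations may read
    return relocatable, bind


x = {"defines": {"area": 2}, "calls": []}
y = {"defines": {"main": 0}, "calls": [("area", 2)]}

x_obj, x_bind = compile_unit(x, binds=[])        # X must be compiled first
y_obj, y_bind = compile_unit(y, binds=[x_bind])  # Y reads the bind from X
```

Compiling y with an empty list of binds raises LookupError in this model, which is the sense in which the compilation of Y depends on that of X.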

This presumes scopes that span source files. Such a possibility is explicitly built into C.

The Algol68 compiler from Cambridge University uses dependent compilation, but the language itself provides no notion of program files. A meta construct was added to the language whereby any program fragment occurring in a strong context, such as a routine definition, could be replaced by a short construct that named a file. Upon seeing this short construct the compiler emits a bind file which includes information for all variables in scope at the point of the construct. This information includes how to access the value of each variable, perhaps where it is allocated on the stack. A subsequent compilation of a source file whose name matches the one given in the construct causes the compiler to read the bind file in order to interpret the free variables found in the new file.
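A rough sketch of that two-step arrangement, again in Python and again with invented names: emit_bind, compile_fragment, the file name sum.a68, the (mode, stack offset) layout and the "load" pseudo-instruction are all assumptions of mine, not the format the Cambridge compiler actually used.

```python
# Hypothetical model of a bind file keyed by the file named in the construct.

def emit_bind(scope, file_name):
    """At the file-naming construct, record every variable currently in scope."""
    # scope maps variable name -> (mode, stack offset); the layout is assumed.
    return {"file": file_name, "vars": dict(scope)}


def compile_fragment(file_name, free_vars, bind):
    """Compile the separately held fragment, resolving free variables via the bind."""
    if bind["file"] != file_name:
        raise ValueError("bind file was emitted for a different fragment")
    # Each free variable becomes a load from the offset recorded in the bind.
    return [("load", bind["vars"][name][1]) for name in free_vars]


scope = {"n": ("int", 0), "total": ("real", 4)}
bind = emit_bind(scope, "sum.a68")
code = compile_fragment("sum.a68", ["total", "n"], bind)
```

The fragment never declares n or total itself; everything it knows about them comes from the bind emitted while compiling the enclosing program.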

I think Java works this way but I have not decoded the ramifications of the javac command. I don’t know what plays the rôle of the bind file—perhaps the .class file.

Pros & Cons

The good news is that the multi-file semantics does not increase the complexity of the language; the new logic is orthogonal to the language semantics. The new facility is very much like a simple include facility, which is easy to understand.

But when a new program file is introduced to a collection of program files, it may be necessary to mention it in the top level file, which in turn requires recompiling all of the files mentioned there, as their bind files may have changed. It might have been possible for the compiler to know which of the old files did not need to be recompiled, but I do not think that the Cambridge compiler could do that.
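One way such a check might work, offered as a guess rather than a description of any existing compiler: remember which bind entries each file actually read at its last compilation, and recompile only when one of those entries has changed. The function needs_recompile and the (mode, stack offset) entries below are hypothetical.

```python
# Hypothetical staleness check: compare only the bind entries a file used.

def needs_recompile(used, new_bind):
    """Return True if any bind entry read at the last compile has changed.

    used:     name -> entry, as read from the bind at the last compilation.
    new_bind: name -> entry, the bind emitted by the latest top level compile.
    """
    return any(new_bind.get(name) != entry for name, entry in used.items())


used_by_sum = {"n": ("int", 0), "total": ("real", 4)}

# A new variable was added after the old ones, so their entries are unchanged.
appended = {"n": ("int", 0), "total": ("real", 4), "helper": ("proc", 8)}

# A new variable was added before the old ones, so their offsets moved.
shifted = {"helper": ("proc", 0), "n": ("int", 4), "total": ("real", 8)}

unchanged_case = needs_recompile(used_by_sum, appended)  # False
shifted_case = needs_recompile(used_by_sum, shifted)     # True
```

Under this scheme a file escapes recompilation only if nothing it relied on has moved, which depends on how the compiler lays out storage when the top level file grows.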