History of Addressing

The first machines I knew used no data or code addresses except those explicitly stored in the instruction stream. Most instructions had a fixed portion that held an address, probably of data, but sometimes of code that the CPU might switch its attention to. The ‘effective address’ (modern jargon) was exactly that found in memory; there were no index registers or general registers. On the IBM 701 (1954) here is a typical loop (one machine instruction per line) to sum the numbers in locations 100 thru 199:

load (add 100) ; put an instruction into the accumulator.
store vinst ; plant the add instruction two words hence.
load (0) ; initialize sum
vinst: (add x) ; a modified instruction.
store temporary
load vinst
add (1)
store vinst
sub (add 200)
tz done ; exit if we are done
load temporary
t vinst ; loop
done:

Total size: 6 (36 bit) words; memory access time time = 12 μs; 196 μs/iteration.

There was another register, the MQ, but it was useless for this job. Each of these instructions required two sequential memory cycles (12 μsec each) except tz and t which required one. 22*12 = 264 μs / iteration. The instructions were 18 bits each. See this for more detail on instructions. The address was 12 bits long. Words, not bytes, were addressed. The 704 arrived in 1956 with index registers. The above loop was reduced to:

ldx 1,(100) ; Fetch 100 into index register 1
load (0) ; load 0 into accumulator
lp: add 200,1 ; add the contents of location (200 − content of xr1)
tix lp,1,1 ; decrement xr1, and go to lp unless it goes to 0.

Total size: 4 (36 bit) words; time = 3(12 μs) = 36 μs / iteration.

Index registers values were subtracted from the address in the instruction. The instructions were 36 bits and an address occupied 15 bits within the instruction and an index register. The control block had not been invented and a physics problem with a 1000 points and 10 variables per point would use 10 arrays of 1000 numbers. An index register would hold a small integer identifying the current point. You can probably guess the sort of code required on the 701. Fortran assumed this pattern and included no features suitable to a control block.

The first Fortrans had arrays but not structures. Early operating systems, which were written in assembler, would assign an index to a new job and Job 9’s file system root would be found in FSR[9] of the array of roots FSR. Many different arrays would each be indexed by job number. These arrays were typically fixed in size and their origins would be scattered in the instruction stream. The different arrays would have several different element sizes so that shifting or even multiplying would be required to apply a job index to the array.

See some history of segmented addressing architectures.

The Control Block

I recall that the new idea of the control block seemed awkward at first. The various pieces of information that the kernel needed to keep about a particular job would be kept in a contiguous area of memory, called a control block and the address of the block would supplant the job index. It is hard now for me to remember why I resisted the move to control block structures, but I did not resist long. Assemblers before the IBM 360, did not support a safe and convenient way of naming fields within structures.

The IBM 360 could not store large addresses directly in the code stream and I was not yet convinced that control blocks could entirely supplant arrays. When Fortran compiled array oriented code for the 360 it had to spend extra instructions loading array origins in order to compute the address of the array element. It seemed to do conceptual damage to a program to transform it from arrays to structures. One reason is that in physics element i is related to element i+1 and this is naturally expressed with arrays. In short the array index, as integer, has topological meaning. It can be naturally done with arrays of structures but arrays of structures still cause compilers problems with remembering array origins. A future side benefit of control blocks was that they were more cache efficient; but they arose before caches.

IBM’s File Control Block

When a file was opened on one of IBM’s 360 operating systems, a File Control Block (FCB) was allocated in the user’s memory for that file. Along with the addresses of several pieces of data pertaining to the new file, addresses of code for that particular file were also included in the FCB. IBM operating systems provided a wide variety of file formats and ways of mapping user concepts such as records onto such real hardware artifacts as blocks and such. Each of the mappings required different code and this code was selected largely at file open time. I recall that on some small systems several hundred disk accesses might be required to open a file in order to discover and dynamically load the various access methods necessary for that particular logical format on that particular IO device.

This is the earliest application that I am aware of, of polymorphism. The FCB included (rather than shared) the v-table. The compilers could compile polymorphic call sites that were cognizant of the layout of the FCB. Alas that compiler logic was restricted to the particular logic of IBM’s file systems.

In the beginning the only concept of multiple data in a computer was that of array. (Well before the beginning there were machines without sufficient memory to contemplate even a small array. Babbage’s proposed engine would hold only a few variables.) The first Fortran provided arrays whose size was fixed at compile time. There were no pointers or structures. For complex applications that would use structures and pointers today, several arrays would be declared of the same length and in place of a pointer to a structure an index into these arrays would be used. The program failed it the common array size was exhausted. Early operating systems used this plan for their internal state.

Digitek pioneered a scheme for growable arrays. It adopted the convention that each reference to the array consults a fixed place that describes the current size and location of that array. When the array needs to grow it and other arrays may be moved in necessary. There is only one place where the location of an array is kept. I can remember being impressed with this scheme. Using this technique Digitek produced a series of moderately fast Fortran compilers. These compilers cost about $50,000 each and drove other compiler technologies out of the market except for a few compilers known for their excellent optimizations.

See a sequel.