Having recently come across the Charles Babbage Institute web page and thence the Cray Research Virtual Museum, I decided to jot down some notes from memory here about the insides of some of those computers.
Here is more good information on the subject.
When a programmer thinks of a computer s/he has known, it is likely in terms of bits per word rather than the color of the main frame.
In those days compilers did not insulate the programmer from such details.
Here are some interesting machines from other manufacturers.
I use the ambiguous gender purposefully here.
In the 50's and 60's many important contributors to the new art were women, both in hardware and software.
Often the women were not as well known.
CDC 1604
The 1604 was the first of Cray's machines that I became aware of.
The machine had 32K of 48 bit words.
It had 48 bit floating point, two 48 bit registers in the classic style and six 15 bit index registers.
Here is Seymour's diagram.
The machine was one's complement.
There were two instructions per word.
Here is the programming manual.
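As a small illustration of what one's complement means for a 48 bit word, here is a sketch of addition with end-around carry. The helper names (`encode`, `decode`, `add`) are made up for this example, and the sketch is not claimed to match the 1604's actual adder circuit, which may have handled negative zero differently.

```python
WORD_BITS = 48
MASK = (1 << WORD_BITS) - 1          # 48 one bits; also the "negative zero" pattern


def encode(n: int) -> int:
    """Encode a Python int as a 48 bit one's complement word."""
    limit = MASK >> 1                # largest magnitude that fits
    if not -limit <= n <= limit:
        raise ValueError("value does not fit in 48 bit one's complement")
    # A negative number is the bitwise complement of its magnitude.
    return n if n >= 0 else (~(-n)) & MASK


def decode(w: int) -> int:
    """Decode a 48 bit one's complement word back to a Python int."""
    if w >> (WORD_BITS - 1):         # sign bit set
        return -((~w) & MASK)
    return w


def add(a: int, b: int) -> int:
    """Add two words, feeding the carry out of the top bit back into the bottom."""
    s = a + b
    s = (s & MASK) + (s >> WORD_BITS)   # end-around carry
    return s & MASK


print(decode(add(encode(5), encode(-3))))   # 2
print(add(encode(3), encode(-3)) == MASK)   # True: this simple adder yields negative zero
```

The second line of output shows the familiar one's complement oddity: there are two representations of zero, and a naive adder can produce the all-ones one.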
The machine supported interrupts with the following wart: Upon interrupt the hardware reported the address of the instruction being interrupted.
That address identified only the word holding the instruction.
The hardware remembered whether the first instruction of that word had been executed.
This memory was inaccessible except to the mechanism that resumed after interrupts.
This made nested interrupts or switching contexts upon interrupt excessively arcane.
This is about the only Cray wart that I recall.
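The wart can be made concrete with a toy model. It assumes nothing about the real machine beyond what is described above: two instructions per word, an interrupt address that names only the word, and a hidden flag recording whether the first instruction of that word has already run. The class and method names are invented for the sketch, and branches are ignored.

```python
from dataclasses import dataclass


@dataclass
class Cpu1604Sketch:
    word_addr: int = 0          # the only thing software sees on interrupt
    second_half: bool = False   # hidden flag: first instruction of the word already done

    def step(self):
        """Execute one instruction; return (word address, which half ran)."""
        done = (self.word_addr, "second" if self.second_half else "first")
        if self.second_half:
            self.word_addr += 1
            self.second_half = False
        else:
            self.second_half = True
        return done

    def interrupt_address(self):
        """What the hardware reports: the word address, without the half flag."""
        return self.word_addr


def demo():
    cpu = Cpu1604Sketch()
    for _ in range(3):
        cpu.step()                       # runs (0, first), (0, second), (1, first)
    saved = cpu.interrupt_address()
    hardware_next = cpu.step()           # hidden flag intact: resumes at the second half
    # Software that saved only the reported address and later rebuilt the
    # context from it has lost the flag and repeats the first instruction.
    naive = Cpu1604Sketch(word_addr=saved)
    naive_next = naive.step()
    return saved, hardware_next, naive_next


print(demo())   # (1, (1, 'second'), (1, 'first'))
```

A context switch has to save and restore everything needed to resume, and here one bit of that state cannot be saved by software, which is why nesting or switching on interrupt was awkward.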
I think that “channels” kept their state in low core.
The state was word count and word address.
CDC 160A
After the 1604 arrived we got a 160A.
(Here too.)
I subsequently learned that Seymour had used the 160 as a test bed for 1604 ideas.
Indeed there were many similarities.
The story also has it that the first 160 was built from reject transistors unsuitable for their original purpose.
In any case the machine was extremely elegant in its simplicity.
The programmer could always find some quick way to do something simple but usually there was just one such way.
This orthogonality extended to the 6600 PPUs.
The 160A had 12 bit words and a 12 bit one's complement accumulator but no multiply or divide.
There were no interrupts(?).
There were two banks of 4K words each and some sort of “small address hack”.
There was a full complement of instructions that indirected thru low core.
There were also two word instructions that included a displacement to be added to the low core content to form the effective address.
This was almost as good as index registers.
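A rough sketch of the two addressing styles just described follows. The function names are hypothetical, instruction encodings are left out entirely, and the sum is assumed to wrap modulo 4096 within one bank, which may not be how the 160A actually formed the address.

```python
WORD_MASK = 0o7777   # 12 bit words, 4K words per bank


def indirect_address(memory, low_cell):
    """Effective address is simply the content of a low core cell."""
    return memory[low_cell] & WORD_MASK


def displaced_address(memory, low_cell, displacement):
    """Two word form: the second word's displacement is added to the low core content."""
    return (memory[low_cell] + displacement) & WORD_MASK


memory = [0] * 4096
memory[0o10] = 0o1000                 # a low core cell used as a base pointer

print(oct(indirect_address(memory, 0o10)))        # 0o1000
print(oct(displaced_address(memory, 0o10, 5)))    # 0o1005

# Stepping the low core cell walks a table, much as an index register would.
memory[0o10] += 1
print(oct(displaced_address(memory, 0o10, 5)))    # 0o1006
```

The low core cell plays the role of the index register; the cost is that changing the "index" is a memory update rather than a register operation.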
The 160A had several unusual IO devices attached and served at Livermore for media conversion.
Another use of the 160A was to emulate the 6600 PPUs which also had 12 bit words.
In this case the small address hack worked to the advantage of the application as the second bank was used exactly to model the 4K words of the PPU.
There was a political aspect to the fact that the machine came in the form of a desk at which the programmer could comfortably sit and debug a program.
Other machines were so expensive as to presumably warrant a style of debugging where the programmer would spend hours at his (non-computer) desk to solve a mystery that might have been solved in minutes with hands-on access to the computer.
This was the milieu leading up to timesharing.
CDC 3600
This was not designed by Cray but it was (sort of) upwards compatible with the 1604.
The data formats and addressing patterns were compatible.
The addresses were still 15 bits with some small address hack allowing access to 2^18 words.
The machines had some sort of flexible configuration ability that allowed a pool of such machines to dynamically (with the help of an ever present operator) reallocate a bank of memory from one machine to another.
Livermore's compiler group had their first real success with the 3600 machine, as I recall.
The 3600 was a cautious extension of the 1604 but it was rather useful.
Laroy Tymes, who was then a machine operator, wrote a program that gained considerable fame inside and outside the Lab.
It was Stars and Stripes Forever, for the CDC 3600.
It used the speaker attached to the high order bits of the program counter to provide one or two voices and the noisy Anelex (Analex?) line printer to provide percussion; for a tuba-like effect the windows of the tape drives would be opened a bit.
The starting and stopping of the tape was under direct millisecond control of the software and the vacuum columns (designed to provide mechanical buffering of the tape) provided excellent sound coupling.
The 6600
The 6600 was the first machine to be delivered, that I know of, that meets the common definition of RISC.
The program specifically arranged for concurrent execution of instructions.
It had eight 60 bit X registers holding either floating or fixed (one's complement) words, 8 A registers and 8 B registers each holding 18 bit values.
When an A register was loaded with an address, the corresponding X register would be loaded from that address (or stored if it was A6 or A7).
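The coupling between A and X registers can be sketched in a few lines. The class and method names are invented, memory is a plain dictionary, and one detail is an assumption not stated in the text: A0 is treated here as causing no memory reference at all.

```python
ADDRESS_MASK = 0o777777   # 18 bit address registers


class Cpu6600Sketch:
    def __init__(self):
        self.mem = {}           # address -> 60 bit word, unset cells read as 0
        self.A = [0] * 8
        self.X = [0] * 8

    def set_a(self, i, address):
        """Set Ai; as a side effect load or store the paired Xi."""
        self.A[i] = address & ADDRESS_MASK
        if 1 <= i <= 5:
            self.X[i] = self.mem.get(self.A[i], 0)    # A1..A5: load Xi
        elif i in (6, 7):
            self.mem[self.A[i]] = self.X[i]           # A6, A7: store Xi
        # i == 0: assumed to have no side effect


cpu = Cpu6600Sketch()
cpu.mem[100] = 42
cpu.set_a(1, 100)      # loads X1 from address 100
cpu.X[6] = cpu.X[1] + 1
cpu.set_a(6, 200)      # stores X6 to address 200
print(cpu.X[1], cpu.mem[200])   # 42 43
```

There is no separate load or store opcode in this model; memory traffic falls out of address arithmetic, which is the point of the design.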
There were 15 and 30 bit instructions packed in the 60 bit memory words.
15 bit arithmetic instructions referred to three registers.
Some 15 bit instructions caused loads or stores at addresses specified by A or B registers.
The 30 bit instructions included an 18 bit displacement and caused loads or stores.
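To see how mixed 15 and 30 bit instructions fill 60 bit words, here is a small packing sketch. It assumes that a 30 bit instruction may not straddle a word boundary and that the gap is filled with 15 bit padding; the text above does not say this, so treat it as an assumption. The function name `pack` is made up.

```python
def pack(lengths, word_bits=60):
    """Group instruction lengths (15 or 30) into words, padding where one will not fit."""
    words = [[]]
    used = 0
    for n in lengths:
        if n not in (15, 30):
            raise ValueError("instructions are 15 or 30 bits")
        if used + n > word_bits:
            # Fill the rest of the current word with 15 bit padding slots.
            words[-1].extend(["pad"] * ((word_bits - used) // 15))
            words.append([])
            used = 0
        words[-1].append(n)
        used += n
    return words


print(pack([15, 30, 30, 15]))   # [[15, 30, 'pad'], [30, 15]]
print(pack([15, 15, 15, 15]))   # [[15, 15, 15, 15]]
```

Under that assumption a careless ordering of long and short instructions wastes slots, so the order of instructions affects code size as well as timing.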
I quote from the programming manual here:
- An instruction is issued to a functional unit when
  - The specified functional unit is not reserved.
  - The specified result register is not reserved for a previous result.
- Instructions are issued to functional units at minor cycle intervals when no reservation conflicts (above) are present.
- Instruction execution starts in a functional unit when both operands are available (execution is delayed when an operand(s) is a result of a previous step which is not complete).
- No delay occurs between the end of a first unit and the start of a second unit which is waiting for the results of the first.
- No instructions are issued after a branch instruction until the branch instruction has been executed. The branch unit uses
  - An increment to form the go to K+Bi and go to K if Bi ... instructions, or
  - The long add unit to perform the go to K if Xi ... instructions
  in the execution of a branch instruction. The time spent in the long add or increment units is part of the total branch time.
- Read central memory access time is computed from end of increment unit time to the time operand is available in X operand register.
Minimum time is 500 ns assuming no central memory bank conflict.
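The first few rules quoted above can be turned into a toy timing model. This is only a sketch under stated assumptions: a 100 ns minor cycle, in-order issue, a unit and its result register both held until the result is ready, and no modeling of branches, memory references, or any hazard the quoted rules do not mention. The unit names and the `Instr` type are invented, and the choice between the two multipliers is made by hand.

```python
from dataclasses import dataclass

MINOR_CYCLE_NS = 100   # assumed minor cycle


@dataclass
class Instr:
    unit: str          # functional unit name (hypothetical labels)
    dest: str          # result register
    srcs: tuple        # operand registers
    latency_ns: int


def schedule(program):
    """Return (issue, start, finish) times in ns for each instruction."""
    unit_free = {}     # unit -> time its reservation ends
    reg_ready = {}     # register -> time its pending result arrives
    next_issue = 0
    timeline = []
    for ins in program:
        # Issue waits for the unit and the result register to be unreserved.
        issue = max(next_issue,
                    unit_free.get(ins.unit, 0),
                    reg_ready.get(ins.dest, 0))
        # Execution starts only when both operands are available.
        start = max([issue] + [reg_ready.get(r, 0) for r in ins.srcs])
        finish = start + ins.latency_ns
        unit_free[ins.unit] = finish
        reg_ready[ins.dest] = finish
        next_issue = issue + MINOR_CYCLE_NS
        timeline.append((issue, start, finish))
    return timeline


program = [
    Instr("mul1", "X0", ("X1", "X2"), 1000),
    Instr("mul2", "X3", ("X4", "X5"), 1000),
    Instr("fadd", "X6", ("X0", "X3"), 400),
]
print(schedule(program))
# [(0, 0, 1000), (100, 100, 1100), (200, 1100, 1500)]
```

In this run the add issues at 200 ns but sits in its unit until both products arrive at 1100 ns, while the two multiplies overlap almost completely.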
An additional limitation was that 60 bit instruction words could be fetched at a maximum rate of one per 800 ns.
Loops confined to two instruction words (up to eight instructions) were not limited by this.
There were two multiply functional units with 1000 ns latency.
There was one divide unit.
The floating add unit was 400 ns, yielding an unnormalized sum which could be normalized by a subsequent instruction.
Livermore found that most floating adds needed to be normalized.
Central memory was 32 boxes each of 4K 60 bit words for a total of 2^17 words.
The memory cycle time was 1 microsecond and the memory bus cycle time was 100 ns, like the clock of the rest of the machine.
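The arithmetic behind these figures, and the reason many boxes matter, can be checked with a short sketch. It assumes that consecutive addresses fall in consecutive boxes (low-order address bits select the box), which the text does not state; the function names are made up.

```python
BANKS = 32
WORDS_PER_BANK = 4096
BANK_CYCLE_NS = 1000   # one box is busy this long per reference
BUS_SLOT_NS = 100      # one reference can be started per bus cycle


def total_words():
    return BANKS * WORDS_PER_BANK


def stream_time_ns(addresses):
    """Time until the last request in the stream has been accepted by the bus."""
    bank_free = [0] * BANKS
    t = 0
    for addr in addresses:
        bank = addr % BANKS                # assumed interleaving
        t = max(t, bank_free[bank])        # wait if that box is still cycling
        bank_free[bank] = t + BANK_CYCLE_NS
        t += BUS_SLOT_NS
    return t


print(total_words() == 2 ** 17)            # True
print(stream_time_ns(range(32)))           # 3200: sequential words, no conflicts
print(stream_time_ns([0, 32, 64]))         # 2100: same box every time
```

With a 1 microsecond box and a 100 ns bus, up to ten boxes can be cycling at once, so a conflict-free stream runs about ten times faster than one that keeps hitting the same box.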
The Peripheral Processing Units were novel in that the ten of them shared execution hardware.
Each processor had virtually an 18 bit accumulator, a 12 bit instruction address, and a three bit instruction phase register that was not really visible to the programmer.
Each processor had its dedicated box of 4K 12 bit words.
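The idea of ten processors sharing one set of execution hardware can be sketched as a round robin over saved register sets. The names here are invented, the "instruction" is a stand-in that just bumps the registers, and giving each processor exactly one slot per rotation is an assumption about the sharing scheme.

```python
from dataclasses import dataclass


@dataclass
class PpuState:
    a: int = 0       # 18 bit accumulator
    p: int = 0       # 12 bit instruction address
    phase: int = 0   # 3 bit instruction phase, not programmer visible


def toy_step(ppu):
    """Stand-in for one slot of work: advance the registers within their widths."""
    ppu.a = (ppu.a + 1) & 0o777777
    ppu.p = (ppu.p + 1) & 0o7777


def run_barrel(states, slots, step=toy_step):
    """Give the shared execution hardware to each processor state in turn."""
    for slot in range(slots):
        step(states[slot % len(states)])


ppus = [PpuState() for _ in range(10)]
run_barrel(ppus, 25)
print([s.a for s in ppus])   # [3, 3, 3, 3, 3, 2, 2, 2, 2, 2]
```

Only the small per-processor state is replicated; if each slot took one 100 ns clock, every processor would get a turn once per microsecond, which matches the memory cycle of its private box.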
There were 12 “channels” which were merely registers that could either drive or be driven by external cables.
One PPU instruction was able to either send or receive a block of 12 bit words over one of these channels at one word per microsecond.
There was also shared hardware allowing a PPU to move data between central memory and the PPU's private memory.
Operating systems dynamically allocated these PPUs to the tasks at hand.
IO devices were permanently attached to some cable which in turn was permanently attached to one of these channels.