Virtual IO in VM/370

The IBM 370 inherited the channel architecture from the earlier 360. A channel was a real or conceptual box, separate from the CPU, that ran asynchronously and concurrently with the other channels and with the several CPUs. Some of the more real channels had a thick cable connecting to several IO devices chained together serially, an arrangement sometimes called a daisy chain. The SCSI architecture is very similar.

The program obeyed by the CPU initiates IO by issuing a SIO instruction that designates a channel and locates a channel program in main storage for the channel to execute. When the channel program finishes, the channel signals an interrupt to some CPU.

A channel program is composed of channel control words (CCWs) which have operation codes, called orders, that are interpreted by the channel. Most orders are also sent to the currently connected device. The channel does minimal decoding of the order to distinguish the direction of the ensuing data flow. There are read orders and write orders. A CCW also holds the address and byte count of the data to be transferred. The various IO devices further discriminate the meanings of the orders.

For example, the meaning of a CCW might be:

Send read order 42 to the device and, when data start arriving, store them beginning at location 41000, but no more than 1000 bytes.

A bit in the CCW would direct the channel to go on to (chain to) the next CCW for another address and length with which to continue the transfer. The order in that next CCW is ignored.
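
The example CCW and its chaining bit can be pictured with a small Python sketch. This is only a toy model, not the real 370 CCW layout: the field names, the run_read helper, the second CCW's address and count, and the reading of 41000 as a decimal address are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class CCW:
    order: int                # operation code; ignored when reached by chaining
    address: int              # where in main storage the data go
    count: int                # maximum number of bytes for this CCW
    chain_data: bool = False  # continue the same transfer with the next CCW

def run_read(program, start, device_data, storage):
    """Toy channel: store incoming device bytes as directed by chained CCWs.

    Returns the number of bytes stored.
    """
    i, pos = start, 0
    while True:
        ccw = program[i]
        chunk = device_data[pos:pos + ccw.count]
        storage[ccw.address:ccw.address + len(chunk)] = chunk
        pos += len(chunk)
        if not ccw.chain_data or pos >= len(device_data):
            return pos
        i += 1  # chained: take only the address and count of the next CCW

# The CCW from the text, chained to a hypothetical second CCW.
program = [
    CCW(order=42, address=41000, count=1000, chain_data=True),
    CCW(order=0, address=50000, count=500),  # order ignored when chained to
]
storage = bytearray(64 * 1024)
device_data = bytes(range(256)) * 5      # 1280 bytes arriving from the device
moved = run_read(program, 0, device_data, storage)  # moved == 1280
```

The first 1000 bytes land at 41000 and the remaining 280 at 50000, with only one order ever sent to the device.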

There is a conditional transfer CCW that is interpreted just by the channel which transfers to a new sequence of CCWs if a certain recent signal from the IO device was 1. Thus a channel program is a network of read and write operations with embedded real addresses. The real channels were a bit more complex than this but this much detail will illustrate all of the points I want to talk about.

When VM/370 finds that a guest program has attempted to execute a SIO instruction, it consults the device address provided by the guest to determine the meaning of the SIO. Often the guest has been allocated a real device that behaves much like the guest imagines. In this case VM/370 translates the channel program to a real channel program which some real channel then executes. The translated channel program has real addresses where the virtual program has virtual addresses.

As the virtual channel program is translated, the pages to which it refers must be locked down, since the moment of the SIO is the last point at which delays can be tolerated: a running channel cannot wait for a page fault. Pages are brought into RAM if necessary.

When a virtual CCW describes a range of virtual addresses crossing a page boundary, the translated channel program uses a chained CCW for each further page, since pages that are adjacent in virtual storage need not be adjacent in real storage.
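
A minimal sketch of that splitting step, in Python, might look as follows. The 4096-byte page size, the split_ccw name, the tuple layout, and the dictionary used as a page table are assumptions for illustration, not a description of the actual VM/370 code.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch

def split_ccw(order, vaddr, count, page_table):
    """Split one virtual CCW into real CCWs, one per page touched.

    page_table maps a virtual page number to a real page frame number.
    Returns a list of (order, real_address, count, chain_data) tuples.
    """
    pieces = []
    while count > 0:
        page, offset = divmod(vaddr, PAGE_SIZE)
        n = min(count, PAGE_SIZE - offset)   # bytes left in this page
        real = page_table[page] * PAGE_SIZE + offset
        count -= n
        vaddr += n
        # Chain to the next piece whenever more bytes remain.
        pieces.append((order, real, n, count > 0))
    return pieces

# A 1000-byte transfer that starts 96 bytes before the end of virtual page 10.
page_table = {10: 7, 11: 3}              # hypothetical mapping
pieces = split_ccw(42, 10 * PAGE_SIZE + 4000, 1000, page_table)
# pieces == [(42, 32672, 96, True), (42, 12288, 904, False)]
```

The two real addresses are unrelated to each other, which is why a single real CCW cannot describe the transfer.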

The Bug

There was a famous bug in the translation of channel programs which is indeed the whole motivation of this note. Many people had read the code, as IBM shipped the source in those days. Still it was several years before the bug was found. It was a serious security hole. If you have imagined the logic of a translation program based on the information I have given you, your design may already have that bug.

When it comes time to translate a conditional transfer CCW the translator recalls whether the target of the transfer has already been translated and if so, compiles a conditional transfer to the already translated CCW.

Do you see the bug?

First note that read CCWs require write access to RAM while write CCWs require read access. The already translated CCW may have belonged to a chained sequence of write operations. Such operations require only that the program issuing the SIO have read access to the page. If a transfer CCW follows a read CCW then the translator will already have ensured that the program has write access to those addresses identified in the read CCW. The program must also have write access to the RAM named in the already translated CCW; otherwise the effect is to write RAM to which the program lacks legitimate write access.
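
The hole can be reproduced in a toy model. The Python sketch below is my own simplification and not the VM/370 code: each CCW names one page, every page is assumed readable, CCWs follow one another in sequence, and the names CCW, translate_buggy and pages_stored_into are hypothetical. The translator remembers only that a CCW was translated, not in which mode.

```python
from dataclasses import dataclass

@dataclass
class CCW:
    kind: str            # "read", "write", or "transfer"
    page: int = 0        # guest page named by a data CCW
    chain: bool = False  # next CCW continues this transfer (its order ignored)
    target: int = 0      # index that a transfer CCW goes to

class AccessError(Exception):
    pass

def translate_buggy(program, writable):
    """Accept or reject a program, remembering only WHICH CCWs were translated."""
    translated = set()

    def walk(i, mode):
        while i < len(program) and i not in translated:
            ccw = program[i]
            translated.add(i)
            if ccw.kind == "transfer":
                if ccw.target not in translated:
                    walk(ccw.target, mode)
                # Otherwise reuse the old translation with no recheck: the bug.
            else:
                mode = mode or ccw.kind        # a chained CCW inherits the mode
                if mode == "read" and ccw.page not in writable:
                    raise AccessError(f"no write access to page {ccw.page}")
                mode = mode if ccw.chain else None
            i += 1

    walk(0, None)
    return True

def pages_stored_into(program):
    """Every page the channel could store into, over all paths it might take."""
    seen, stack, pages = set(), [(0, None)], set()
    while stack:
        i, mode = stack.pop()
        if i >= len(program) or (i, mode) in seen:
            continue
        seen.add((i, mode))
        ccw = program[i]
        if ccw.kind == "transfer":             # conditional: both ways possible
            stack += [(ccw.target, mode), (i + 1, mode)]
        else:
            mode = mode or ccw.kind
            if mode == "read":
                pages.add(ccw.page)
            stack.append((i + 1, mode if ccw.chain else None))
    return pages

# The guest may write page 5 but only read page 9.
attack = [
    CCW("write", page=5, chain=True),
    CCW("write", page=9),             # translated in write mode: read access suffices
    CCW("read", page=5, chain=True),
    CCW("transfer", target=1),        # re-enters CCW 1 while in read mode
]
accepted = translate_buggy(attack, writable={5})   # True: no error raised
stored = pages_stored_into(attack)                 # {5, 9}: page 9 gets overwritten
```

The translator accepts the program, yet the channel can store into page 9, which the guest was never allowed to write.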

I think that the cleanest fix was to remember, for each translated CCW, whether it had been translated in read mode or in write mode. Translating a conditional transfer during a read sequence of CCWs requires that the already translated CCW have been translated in read mode. If no such translated CCW exists, begin a new translation as if no translation had been found. This new translation will check for write access to RAM.
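
In a toy model the fix amounts to keying the table of finished translations by both the CCW and the mode. The Python sketch below rests on my own simplifying assumptions, not on the actual VM/370 code: one page per CCW, every page readable, CCWs following one another in sequence, and hypothetical names throughout.

```python
from dataclasses import dataclass

@dataclass
class CCW:
    kind: str            # "read", "write", or "transfer"
    page: int = 0        # guest page named by a data CCW
    chain: bool = False  # next CCW continues this transfer (its order ignored)
    target: int = 0      # index that a transfer CCW goes to

class AccessError(Exception):
    pass

def translate_fixed(program, writable):
    """Translate, remembering the mode in which each CCW was translated."""
    translated = set()   # (index, mode) pairs rather than bare indices

    def walk(i, mode):
        while i < len(program):
            ccw = program[i]
            if ccw.kind != "transfer":
                mode = mode or ccw.kind   # a chained CCW inherits the mode
            if (i, mode) in translated:
                return                    # safe reuse: same CCW, same mode
            translated.add((i, mode))
            if ccw.kind == "transfer":
                # No translation in this mode yet means a fresh one is made,
                # and that fresh translation repeats the access checks.
                walk(ccw.target, mode)
            else:
                if mode == "read" and ccw.page not in writable:
                    raise AccessError(f"no write access to page {ccw.page}")
                mode = mode if ccw.chain else None
            i += 1

    walk(0, None)
    return translated

# A write sequence whose second CCW is later re-entered from a read sequence.
program = [
    CCW("write", page=5, chain=True),
    CCW("write", page=9),
    CCW("read", page=5, chain=True),
    CCW("transfer", target=1),
]
ok = translate_fixed(program, writable={5, 9})  # accepted: page 9 is writable
try:
    translate_fixed(program, writable={5})      # page 9 is read-only
    rejected = False
except AccessError:
    rejected = True                             # the transfer is now caught
```

When page 9 is writable, CCW 1 simply ends up translated twice, once in each mode.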