In the past, more so than now, instruction decoding logic was to a degree microprogrammed: individual bits in the instruction had similar meanings in different instructions. The PDP-10 exploited this pattern both in its hardware logic design and in the manual that defined the instructions.
The IBM 704 instruction format was not crowded: its predecessor, the 701, had used 18-bit instructions, but the 704 instruction used 36 bits. The 704 took advantage of the extra room to allocate op-codes so that the same bits meant the same thing across a family of instructions. As a result, some bit combinations that matched nothing in the manual were found to have useful functions. For instance, the instruction to store the accumulator was just like the instruction to store the MQ register, except that each had a 1 bit where the other had a 0. It turned out that if both bits were 0, the instruction stored 0 in memory. To the original designers (not yet called “architects”) that would have seemed an ungeneral and non-orthogonal command. It was, however, extremely useful in many contexts, and by count of use it came to be one of the most common commands. If both bits were 1, the logical OR of the two registers was stored. This was less often useful.
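The behavior described above can be pictured by treating the two distinguishing bits as independent gates, each placing one register on a shared path to memory. The following is a minimal Python sketch under that assumption. The names `store_family`, `gate_ac`, and `gate_mq` are hypothetical, and the sketch does not reproduce the 704's actual op-code layout or circuitry.

```python
WORD_MASK = (1 << 36) - 1  # the 704 used 36-bit words


def store_family(gate_ac: bool, gate_mq: bool, ac: int, mq: int) -> int:
    """Return the word written to memory by a store-family instruction.

    Assumption: each of the two op-code bits independently gates one
    register onto a shared path, and the path ORs whatever is gated on.
    """
    bus = 0
    if gate_ac:   # bit set: accumulator contents reach the bus
        bus |= ac
    if gate_mq:   # bit set: MQ contents reach the bus
        bus |= mq
    return bus & WORD_MASK


ac, mq = 0o5, 0o3
store_accumulator = store_family(True, False, ac, mq)   # documented: AC
store_mq = store_family(False, True, ac, mq)            # documented: MQ
store_zero = store_family(False, False, ac, mq)         # undocumented: 0
store_or = store_family(True, True, ac, mq)             # undocumented: AC OR MQ
```

In this model nothing special has to be built to get the store-zero or the OR behavior; both fall out of the regular bit assignment, which is the point the paragraph makes.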
Other instructions were discovered at the same time, and a few gained limited use, but the rest together were not as useful as the new STZ (store zero). The 709 and 7090, the next generation of this architecture, were careful to include at least STZ, which made its way into later instruction manuals.
The IBM 7040 was a low-end transistorized machine that did not claim 704 compatibility but tried to extend the successful 7090 series into a larger market. It omitted the STZ instruction. The reaction of programmers was unanimous: they demanded STZ. They got it.
The 1401 was strange in several ways, at least to modern tastes. Memory was organized into units called “words”, but the word was of variable length. The machine seems to be described in remarkable detail in a manual originally found at http://farm.co.us/~david/code/ibm1401/manual/ and since resurrected from the archive. The addressable unit was the character, and each character carried a word mark bit that delimited words. A word was addressed by the address of its first (left-most) character and extended to the first character with a word mark set. Instructions each occupied such a word.
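The word mark scheme can be illustrated with a small model of character-addressed memory. This is a simplified Python sketch of the scheme as the paragraph describes it, not of the 1401's actual circuitry or character codes. The names `Char`, `make_memory`, and `read_word` are hypothetical, and the sketch assumes the mark sits on the first character of each word, so a word runs from its address up to, but not including, the next marked character.

```python
from dataclasses import dataclass


@dataclass
class Char:
    symbol: str              # the character stored at this address
    word_mark: bool = False  # the extra bit that delimits words


def make_memory(text: str, marks: set[int]) -> list[Char]:
    """Build a memory image with word marks at the given addresses."""
    return [Char(c, i in marks) for i, c in enumerate(text)]


def read_word(memory: list[Char], address: int) -> str:
    """Collect the word starting at `address`.

    Assumption: scanning stops just before the next character that
    carries a word mark (or at the end of memory).
    """
    chars = [memory[address].symbol]
    pos = address + 1
    while pos < len(memory) and not memory[pos].word_mark:
        chars.append(memory[pos].symbol)
        pos += 1
    return "".join(chars)


# Arbitrary symbols, with words beginning at addresses 0 and 3.
memory = make_memory("ABCDEFG", marks={0, 3})
first = read_word(memory, 0)   # three characters long
second = read_word(memory, 3)  # four characters long
```

Word length here is a property of where the marks sit in memory rather than of the instruction that reads the word, which is why words of different lengths can share one addressing scheme.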