Register Renaming

This trick would be better referred to as virtual registers. We will call the registers imagined by the machine language programmer “virtual registers” or VRs. There are no real VRs; they are virtual. There is a map (or several) from the VR number to a larger set of real registers, much as virtual memory uses a map from virtual addresses to physical (real) memory. The kernel builds the VM map, whereas the hardware maintains the map for the VRs, the ‘reg map’. This deceit plays roles in speculation, hyper-threading, and TSX.

When an instruction produces a value for architectural reg 5 (VR5), some real register is allocated for that value and the number of that register is put in slot 5 of the reg map. Subsequent instructions that need the value in VR5 consult slot 5 of the reg map to find the real reg with the value. If VR5 changes often there may be several real regs allocated to those successive values. Think of the real regs as “single assignment” or Haskell-like things. They only get new values when they are reallocated. Some languages provide promises, where a value is named before it is produced. The real registers serve this purpose for microarchitectural goals.
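The reg map can be sketched as a toy model in Python. This is only an illustration of the idea, not a description of any real core: the class name `RenameMap`, the register counts, and the decision never to reclaim real regs are all assumptions made to keep the sketch short.

```python
class RenameMap:
    """Toy reg map: VR number -> number of the real register holding its value."""

    def __init__(self, num_virtual=16, num_real=64):
        # Sizes are arbitrary assumptions; the only requirement is more real regs than VRs.
        self.values = [None] * num_real    # contents of the real registers
        self.free = list(range(num_real))  # real regs not yet allocated
        self.slots = [None] * num_virtual  # the reg map itself

    def write(self, vr, value):
        """An instruction produces a value for VR `vr`: allocate a fresh real reg."""
        real = self.free.pop(0)
        self.values[real] = value          # single assignment: this reg is written once
        self.slots[vr] = real              # later readers of `vr` are sent here
        return real

    def read(self, vr):
        """A later instruction consults slot `vr` to find the real reg with the value."""
        return self.values[self.slots[vr]]
```

Writing VR5 twice allocates two different real regs, and the older value stays in its real reg even though the map no longer points at it. A real machine would eventually return such regs to the free pool; this sketch leaves that out.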

In multithreading there are two reg maps, one for each thread. The threads may share the pool of real regs. It takes another PC too and I don’t know whether that is thought of as part of the reg map. You might think of a reg map as a soul of some machine.
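A minimal sketch of two threads sharing one pool of real regs follows. The names `RealPool` and `ThreadContext` are hypothetical, and putting the PC next to the reg map in the per-thread state is an assumption of the sketch, not a claim about how the hardware organizes it.

```python
class RealPool:
    """Pool of real registers shared by all hardware threads of a core."""

    def __init__(self, size=64):
        self.values = [None] * size
        self.free = list(range(size))

    def allocate(self, value):
        real = self.free.pop(0)   # take any free real reg
        self.values[real] = value
        return real


class ThreadContext:
    """Per-thread state: its own reg map and (by assumption) its own PC."""

    def __init__(self, pool, num_virtual=16):
        self.pool = pool
        self.slots = [None] * num_virtual  # this thread's reg map
        self.pc = 0

    def write(self, vr, value):
        self.slots[vr] = self.pool.allocate(value)

    def read(self, vr):
        return self.pool.values[self.slots[vr]]
```

Both threads can write their own VR5 without interfering, because each map sends VR5 to a different real reg drawn from the common pool.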

If a branch instruction appears and the condition is unknown, a guess is made and the machine speeds ahead, but only after making a snap-shot of the reg map. A real reg is not reallocated while some such snap-shot names it. If the guess is wrong the snap-shot is restored, which returns the VRs to their old values. If the guess is right everyone cheers and the snap-shot is abandoned (and some real regs are deallocated). I think there are several places to keep snap-shots, for when another conditional branch appears while already speculating. When a condition becomes available and the guess was right, one snap-shot is discarded. I imagine a circular buffer of snap-shots.
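The snap-shot and restore steps can be sketched in the same toy style. Everything here is an assumption for illustration: the class name `SpeculativeRegMap`, resolving branches strictly oldest first, and a plain list standing in for the circular buffer. Real regs are never reclaimed in this sketch, so the rule that a reg named by a snap-shot is not reallocated holds trivially.

```python
class SpeculativeRegMap:
    """Toy reg map with snap-shots taken at predicted branches."""

    def __init__(self, num_virtual=8, num_real=32):
        self.values = [None] * num_real
        self.free = list(range(num_real))
        self.slots = [None] * num_virtual
        self.snapshots = []  # oldest first; stands in for the circular buffer

    def write(self, vr, value):
        real = self.free.pop(0)
        self.values[real] = value
        self.slots[vr] = real

    def read(self, vr):
        return self.values[self.slots[vr]]

    def predict_branch(self):
        """Guess a branch: save the reg map and the free list before speeding ahead."""
        self.snapshots.append((list(self.slots), list(self.free)))

    def resolve_oldest(self, guess_was_right):
        """The oldest unresolved condition becomes known."""
        slots, free = self.snapshots.pop(0)
        if not guess_was_right:
            # Restore the map; real regs allocated while speculating become free again.
            self.slots, self.free = slots, free
            # Younger snap-shots belong to the squashed path, so drop them too.
            self.snapshots.clear()
```

After a wrong guess, a read of VR5 sees the value it had before the branch, since the restored map points back at the old real reg, which was never overwritten.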

TSX uses the same stuff to back out of failed transactions, but such a transaction must keep dirty cache lines quarantined as well. There is a bit more hair. The MESI cache logic gets a few new states for cache lines.

This speculation is faithful enough to be taken as the truth when all the guesses were right. Effective virtual addresses are tentatively calculated, loads are performed, and cache lines are sometimes loaded. I don’t know about stores; perhaps a store into a dirty cache line first causes a write back, and then the newly cleaned line is dirtied and quarantined as well.

A quarantined cache line cannot be cleaned until the calculation becomes real, or until the TSX transaction succeeds. If the cache becomes full of dirty lines speculation stops or the transaction mysteriously aborts.

I wrote this to help me think about and understand Meltdown. There is a contingency that I ignored above. Load operations that address kernel memory must be blocked. It seems that they are indeed blocked, but not until speculation has already fetched values from the kernel, used those values to compute other kernel addresses and loaded those as well. The pipeline nature of execution calls for data paths to be fixed several clocks ahead of data passing over those paths. Still, I cannot imagine why it seems advantageous to ignore the fact that the kernel load could never become real. The conflict is noticed. What better time to notice it than when the load is speculatively issued? Perhaps it is because the CPL (Current Privilege Level) is not in the soul of the virtual machine.

While in this space, it is disturbing to imagine one thread of a core running kernel code while the other thread runs attack code. It is hard to imagine a proof that there is no leakage.