About this compact code

Routine fdiv does IEEE floating division for positive normal floating operands, if the quotient is also normal. It rounds to nearest, tie to even. It tests the code against native division 10⁸ times. This variation reports some information relevant to hardware implementation. This reports gamut of intermediate remainders. This probes some obvious corners. This note is for all these variations.

rf(): returns a random double between 0 and 1.
l2R(x): returns a double with the same bits as the 64 bit integer x.
R2l(x): returns a 64 bit integer with the same bits as the double x.
j(n, f): moves n left 52 places and puts low 52 bits of f after that.
j(1, x): extracts the fraction from the float input with the suppressed normalization bit.
nf & df: are fixed versions of the numerator and denominator fractions. For each: 2⁵² ≤ x < 2⁵³.
ad: is a Boolean, 0 or 1, that just remembers if nf < df.
N: is the strategic 10 (M+2) bit integer used to replace n/d with Nn/Nd.
Nd: is the aforementioned product for which N was chosen to make Nd as large as possible, while < 2⁶².

The remaining three variables actually vary during the division which is of a classical pattern (the 1st while): subtracting of multiples of Nd from Nn, while accumulating the multipliers to become the quotient.

r: is the running remainder, initially the modified numerator, Nn.
q: where we accumulate the quotient,
rb: how many more quotient bits are needed.

The expression for N is (1<<20)/((df>>42)+1). The “+1” can be informally explained as guarding against the possibility that the bits discarded by the right shift were all 1’s and thus ((df>>42)+1) is closer to the truth. The “+1” helps insure that N∙df < 2⁶³. This value must be large enough so that (1<<20)/((df>>42)+1) (= N) will be small enough so that (r<<8) - Nd*nq will not be negative.

In the expression “(((q>>-rb)+1)>>1)”, q carries all the computed quotient bits from the exact infinite bit stream. (q>>-rb) just discards the overrun bits due to computing M bits each cycle. The rounding bit is kept however and adding one to that performs the rounding function. If the rounding bit is 1 then it is not the rightmost 1 bit in the infinite quotient and we need not avoid rounding up. The full expression discards the rounding bit and delivers the fraction bits of the ultimate quotient. Substituting “((q>>-rb)>>1)” for “(((q>>-rb)+1)>>1)” results in rounding toward zero.

Hardware

There are steps omitted here such as reacting to either operand not being positive and normal. Extra steps in the critical path are required in case of denormal input or anticipation of denormal output. This delay is known early. Even a denormal output is seen early as we contemplate the sign of “1023 + (a>>52) - (b>>52) - ad”. Including such steps would greatly interfere with understanding the mathematical content of the idea which is what this note is about.

Computing ad simplifies things by making the output exponent known early. It may be overlapped with computing Nn and Nd. Computing N probably involves table lookup plus interpolation. One or two bits of interpolation logic should suffice. See “Table Logic” here.

We need to multiply 63 bit (or 53 bit) numbers by 10 bit numbers several times here. Curiously the low 64 bits of these products always suffice which is fortunate for this demonstration code whose current best claim for credibility is working right on 10⁸ random samples. This sort of multiplication is at the heart of the best multiply plans that I know. Even the Carry Save Add tricks used there can be largely imported but that is not shown here. This code suggests that the right 52 bits of the running remainder can be done with CSA. P is the worst error stemming from not seeing into CSA bits. The worst case in 10⁹ tests is no worse than without CSA.

Perhaps division is parasitic on multiply hardware.

The last while loop: while(r >= Nd){r -= Nd; ++q;} executes at most twice and I have fading hopes of decreasing this to at most once. I do not see a plan that eliminates it.

See this concerning a comprehensible case analysis.

Counting clocks