There is the real code but here are a few comments. The C code for the transform is merely the 10 line routine “fft”. At the end is a main routine to illustrate invocation of fft and illustrating errors.

You may not like my formatting but the algorithm is obscure with any formatting.

fft provides about 8 bits less precision than provided by the type named “c”. This is due to the way that w and b are calculated. A more complicated way of computing b and w can improve their accuracy or alternatively the computing platform might provide a type for b and w with 10 more bits of precision than needed for the entire calculation. If the type identifier “cx” in this version is bound to that type, the problem is solved. This effects the computation of b, “b = csqrtl(b);”, and w, “w *= b;” even if it is slow. Most critically w needs to carry 9 or 10 more bits of precision than the input to fft so that the 512 iterations of “w *= b;” do not corrupt the calculation. The 10 iterations of “b = csqrtl(b);” could be replaced by a table lookup for any practical application. Again each successive value of b needs one more bit than the previous and the first needs the precision of the input.

In 2006 Nov the Mac OS X csqrtl produced only 52 bits of precision. In 2007 July, it gives 64 bits as it should! Here is a complex square root.

Here is experimental code for bit reversal.