Close

The binary comb instruction

A project log for Square roots

A collection of posts related to efficient digital algorithms

yann-guidon-ygdesYann Guidon / YGDES 09/18/2022 at 23:100 Comments

I remember being "somewhat focused" on integer square roots, from about 1994 to 1996 until other pet ideas took over. I saw some algorithms from naïve/crude to more established and the Newton method seemed nice. Except that it never converges fast enough and Newton-Raphson is too computationally intensive for small/weak processors. The critical aspect was now to find the best approximation with the fewest complexity and I didn't even have my Baccalauréat when I found a way, which I still haven't seen out there. The problem is that it requires a new opcode but this one would be very simple to compute.

It didn't give it a proper name that I remember, but today I would call this opcode COMB because it acts like a comb on binary data. It would keep one bit in each pair and dump the other. As an illustration, if you have 8 bits callled ABCDEFGH respectively, the result would be 0000ACEG. It drops one bit every 2 bits, and pads the MSB with 0.

Mathematically, it selects all the odd indices of bit, which are a way to rearrange the polynomial construction of the value. The algorithm would be :

Of course, if the opcode map is large enough, one could implement two opcode versions to select the odd and even bits to save one shift instruction.

The multiply with a constant is necessary to get enough precision for somewhat large values, though the mathematical definition is clearly still not fully respected (there are extra terms to compute), but I have tested this and seen that the approximation is "good enough" to speed up Newton to satisfaction.

A cruder version would use no multiply and provide a worse approximation. In fact I believe it was my first idea :

a = COMB ( n | n>>1 )

It's faster but lousy.

Anyway, COMBE and COMBO look like opcodes that can be very easily implemented in #F-CPU  for example.

Discussions