
faster sqrt but less accurate

A project log for faster speed: how Optimize math and process tasks

How to make math and other processes faster for specific tasks on any processor, with logs of example math performance.

jamesdanielv 09/17/2019 at 15:25

I found this somewhere on the internet:

https://community.arm.com/developer/tools-software/tools/f/keil-forum/20313/speed-up-square-root-computation-approximation

And indeed it is faster. I had to correct the 1L to 127L, as explained in the thread linked above, where some users originally had issues with it. It does, however, have an error rate of up to 8%, or under 6% for whole numbers.

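#include <stdint.h>
#include <string.h>

float fastsqrt(float val) {
    int32_t tmp;                    /* fixed 32-bit width; long is 64-bit on many targets */
    memcpy(&tmp, &val, sizeof tmp); /* read the float's bits without pointer-cast aliasing */
    tmp -= 127L << 23; /* remove the IEEE exponent bias (127 << 23 = 1065353216) */
    /* tmp is now an approximation to log2(val), scaled by 2^23 */
    tmp >>= 1;         /* divide by 2 */
    tmp += 127L << 23; /* restore the IEEE exponent bias */
    memcpy(&val, &tmp, sizeof val);
    return val;
}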
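To see why subtracting the bias, shifting, and adding the bias back lands near the square root, it can help to look at the exponent and mantissa fields directly. The following is only a sketch: the helper names (`float_bits`, `exponent_of`, `mantissa_of`, `halve_exponent`) are made up for illustration, and it assumes a 32-bit IEEE 754 float, positive normal inputs, and an arithmetic right shift on signed integers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(int32_t), "sketch assumes a 32-bit float");

/* Hypothetical helper: reinterpret a float's bits as a signed 32-bit integer. */
static int32_t float_bits(float x) {
    int32_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Hypothetical helper: unbiased exponent of a positive, normal float. */
static int32_t exponent_of(float x) {
    return ((float_bits(x) >> 23) & 0xFF) - 127;
}

/* Hypothetical helper: the raw 23-bit mantissa field. */
static int32_t mantissa_of(float x) {
    return float_bits(x) & 0x7FFFFF;
}

/* The same three steps as fastsqrt, written against the raw bits. */
static float halve_exponent(float x) {
    int32_t bits = float_bits(x);
    float out;
    bits -= INT32_C(127) << 23; /* exponent field now holds the unbiased exponent */
    bits >>= 1;                 /* halves the exponent; its low bit drops into the mantissa */
    bits += INT32_C(127) << 23; /* put the bias back */
    memcpy(&out, &bits, sizeof out);
    return out;
}
```

For 2.0f the exponent is 1 and the mantissa field is 0. The shift halves the exponent to 0 and pushes its low bit into the top mantissa bit (0x400000), so the result decodes as 1.5. For 4.0f the exponent is even, nothing spills into the mantissa, and the result is exactly 2.0.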

This version brings the error rate for whole numbers below 3%:

#include <stdint.h>
#include <string.h>

float fastsqrt(float val) {
    const float invertDivide = 0.66666666666; /* ~1/1.5, rounded to single precision */
    int32_t tmp, tmp2;
    float offset;
    memcpy(&tmp, &val, sizeof tmp);   /* bits of the input */
    val *= 0.22474487139;             /* scaled copy; the constant is numerically sqrt(1.5) - 1 */
    memcpy(&tmp2, &val, sizeof tmp2); /* bits of the scaled copy */
    tmp  -= 1065353216; /* remove the IEEE exponent bias (127 << 23) */
    tmp2 -= 1065353216;
    tmp  >>= 1; /* divide by 2 */
    tmp2 >>= 1;
    /* the error rate is higher whenever tmp is negative here; it is also high for 0, -1 and -2 */
    tmp  += 1065353216; /* restore the IEEE exponent bias */
    tmp2 += 1065353216;
    memcpy(&offset, &tmp2, sizeof offset); /* approximate sqrt of the scaled copy */
    memcpy(&val, &tmp, sizeof val);        /* approximate sqrt of the input */
    return (val + offset) * invertDivide;
}

First version: variation <4.3% for numbers in the range 0 to 100

input 2: sqrt = 1.414214, fastsqrt = 1.5, variation = 4.289323%

Version with offset: variation <2.7% for numbers in the range 0 to 100

input 2: sqrt = 1.414214, fastsqrt = 1.466326, variation = 2.605647%

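But it gets tricky for numbers below 1.

For the input below, the original version has a variation of more than 24%, while the new version is at about 11.5%. Note that the variation % in these logs works out to the difference divided by the input, not by the true root; measured against the true root of 0.1, the new version's 0.101153 is off by about 1.15%.

input 0.01: sqrt = 0.100000, fastsqrt = 0.101153, variation = 11.530593%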
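A small sketch of two ways to score a log row: scaled by the input, which reproduces the variation % figures in these logs, and scaled by the true root, which is the usual relative error. The function names are hypothetical and the exact roots are passed in by the caller, so nothing here depends on a math library.

```c
#include <assert.h>

/* Absolute difference, written out to avoid needing fabs(). */
static double abs_diff(double a, double b) {
    return a > b ? a - b : b - a;
}

/* Hypothetical: difference scaled by the INPUT value.
   This matches the "variation %" numbers printed in the logs. */
static double variation_percent(double input, double exact, double approx) {
    assert(input != 0.0);
    return abs_diff(approx, exact) / input * 100.0;
}

/* Hypothetical: difference scaled by the TRUE ROOT (ordinary relative error). */
static double relative_percent(double exact, double approx) {
    assert(exact != 0.0);
    return abs_diff(approx, exact) / exact * 100.0;
}
```

With the logged values for an input of 2 (root 1.414214, first version 1.5), `variation_percent` gives about 4.29 while `relative_percent` gives about 6.07. For 0.01 (root 0.1, offset version 0.101153) the two measures are about 11.53 and about 1.15, so much of the apparent trouble below 1 comes from dividing by a small input.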

So some work is needed to make it adapt to the precision of the float.

I'll play around with it more. It seems useful for some people, but I would like the precision to be closer to 0.5% to 1%, or even better.
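One standard way to get closer to that target, which is not something this log has tried, is to follow the bit trick with a single Newton-Raphson step, y = (y + x/y) / 2. The sketch below assumes a 32-bit IEEE 754 float and positive normal inputs; `fastsqrt_bits` and `fastsqrt_newton` are made-up names. It costs one extra divide, so whether it is still a win depends on the processor.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(int32_t), "sketch assumes a 32-bit float");

/* The first version from this log, used as the initial guess. */
static float fastsqrt_bits(float val) {
    int32_t tmp;
    memcpy(&tmp, &val, sizeof tmp);
    tmp -= INT32_C(127) << 23; /* remove the IEEE exponent bias */
    tmp >>= 1;                 /* halve the approximate log2 */
    tmp += INT32_C(127) << 23; /* restore the bias */
    memcpy(&val, &tmp, sizeof val);
    return val;
}

/* Hypothetical refinement: one Newton-Raphson step on y*y = val.
   Only meaningful for positive, normal inputs (no zero, negatives, NaN). */
static float fastsqrt_newton(float val) {
    float y = fastsqrt_bits(val);
    return 0.5f * (y + val / y);
}
```

For an input of 2 the initial guess is 1.5 and one step gives 0.5 * (1.5 + 2 / 1.5), about 1.41667, against a true root of 1.414214. By my arithmetic a starting error of about 6% shrinks to roughly 0.2% after one step, but that figure should be checked on the target hardware.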

I'll update if I get something faster or better.
