less cpu hungry power approximation?
Posted: Sun Feb 06, 2022 1:12 pm
by tester
A scheme like this in stream code:
is pretty CPU hungry. Are there any faster (non-hopped, mono4) approximations that could do the job?
For the design, the base can be an integer starting from 2, or even just a power of 2; the exponent is in the open range (0, 1).
The most minimalistic design requires only base = 2 (it's for signal scaling).
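A side note on the base: since b^e = 2^(e * log2(b)), any integer base reduces to a base-2 power with the exponent pre-multiplied by a constant. The Python sketch below only illustrates that identity; the name pow_via_exp2 and its exp2 parameter are hypothetical, and the default exp2 is just Python's own 2.0 ** t, standing in for whatever fast 2^x routine gets used.

```python
import math

def pow_via_exp2(base, exponent, exp2=lambda t: 2.0 ** t):
    """Compute base**exponent as 2**(exponent * log2(base)).

    exp2 is a placeholder for any 2^x routine (hypothetical parameter).
    For a fixed base, log2(base) is a constant that can be precomputed.
    """
    scale = math.log2(base)  # equals 1.0 for base 2
    return exp2(exponent * scale)

# Example: the two values mentioned in the thread, plus a base-3 case.
for b, e in ((2, 0.432), (2, 0.456), (3, 0.432)):
    print(b, e, pow_via_exp2(b, e), b ** e)
```

For base = 2 the scale factor is 1, so the exponent passes through unchanged and only the 2^x evaluation itself costs anything.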
Re: less cpu hungry power approximation?
Posted: Sun Feb 06, 2022 6:39 pm
by nix
square: val * val
cube: val * val * val
does that simple thought from a simple soul help?
Re: less cpu hungry power approximation?
Posted: Sun Feb 06, 2022 6:48 pm
by tester
Nix, the base is an integer; the exponent is a continuous value between 0 and 1.
Like 2^0.432, 2^0.456, etc.
Re: less cpu hungry power approximation?
Posted: Sun Feb 06, 2022 7:26 pm
by juha_tp
Hmm... how much faster an approximation is depends on the accuracy you need.
Is this assembler code FS-compatible?
https://wurstcaptures.untergrund.net/as ... ricks.html
Re: less cpu hungry power approximation?
Posted: Sun Feb 06, 2022 7:38 pm
by martinvicanek
This is my fast Mono4 2^x implementation for float x, accuracy close to machine precision.
Code:
streamin x; streamout y; // 2^x
// 2^x Approximation
// Author: Martin Vicanek
// Relative Error < 1e-7
// CPU load 2% of built-in pow() function
// y = 2^x
// decompose x = int + frac
// compute I = 2^int by bit shifting
// approximate F = 2^frac by polynomial
// so y = I*F
float xmax=127.5; // yields 1.#INF
float xmin=-126.5; // yields 0
float F0P5=0.5; float a0=1;
float a1=0.693147034; float a2=0.2402295 ;
float a3=0.055484164; float a4=0.009678109;
float a5=0.001243999; float a6=0.000217193;
int I127=127;
// decompose x into int and frac parts
movaps xmm0,x; minps xmm0,xmax; maxps xmm0,xmin;
movaps xmm1,xmm0; subps xmm1,F0P5; cvtps2dq xmm1,xmm1;
cvtdq2ps xmm2,xmm1; // xmm1 is the int part
subps xmm0,xmm2; // xmm0 is the frac part
// evaluate 2^int
paddd xmm1,I127; pslld xmm1,23; // xmm1 is 2^int
// evaluate 2^frac (polynomial approx.)
movaps xmm2,a6; mulps xmm2,xmm0;
addps xmm2,a5; mulps xmm2,xmm0;
addps xmm2,a4; mulps xmm2,xmm0;
addps xmm2,a3; mulps xmm2,xmm0;
addps xmm2,a2; mulps xmm2,xmm0;
addps xmm2,a1; mulps xmm2,xmm0;
addps xmm2,a0; // xmm2 is 2^frac
mulps xmm2,xmm1; // put it together
movaps y,xmm2;
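For anyone who wants to check the steps outside FlowStone, here is a rough scalar model of the routine above in Python. It is a sketch, not the assembler itself: the function name exp2_approx is made up for illustration, Python computes in double precision rather than single, and round() is assumed to stand in for cvtps2dq in its default round-to-nearest-even mode. The coefficients and clamp limits are copied from the code above.

```python
import struct

# Polynomial coefficients a0..a6 and clamp limits, copied from the post.
COEFFS = (1.0, 0.693147034, 0.2402295, 0.055484164,
          0.009678109, 0.001243999, 0.000217193)
X_MAX = 127.5
X_MIN = -126.5

def exp2_approx(x):
    """Scalar model: y = 2^int * poly(frac), mirroring the assembler steps."""
    # Clamp x to the supported range (minps / maxps).
    x = max(min(x, X_MAX), X_MIN)
    # Integer part: round(x - 0.5), half-to-even (assumed to match cvtps2dq).
    i = round(x - 0.5)
    # Fractional part, which lands in [0, 1].
    f = x - i
    # 2^int: put the biased exponent (i + 127) into bits 23..30 of a float32.
    bits = (i + 127) << 23
    two_i = struct.unpack("<f", struct.pack("<I", bits))[0]
    # 2^frac: degree-6 polynomial evaluated with Horner's scheme.
    p = COEFFS[6]
    for c in reversed(COEFFS[:6]):
        p = p * f + c
    return two_i * p

# Compare against Python's own power operator for the values in the thread.
for x in (0.432, 0.456):
    print(x, exp2_approx(x), 2.0 ** x)
```

Because the arithmetic here is double precision, this only checks the decomposition and the polynomial; it says nothing about single-precision rounding or speed in the actual Mono4 code.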
Re: less cpu hungry power approximation?
Posted: Mon Feb 07, 2022 11:47 am
by tester
Thank you very much Martin, this should do the job for scaling cases.
Re: less cpu hungry power approximation?
Posted: Mon Feb 07, 2022 10:10 pm
by nix
Oh, sorry.
I see now that this can take fractional exponents.
Thanks, guys.