Age | Commit message (Collapse) | Author | Lines |
|
for these functions f(x)=x for small inputs, because f(0)=0 and
f'(0)=1, but for subnormal values they should raise the underflow
flag (required by annex F), if they are approximated by a polynomial
around 0 then spurious underflow should be avoided (not required by
annex F)
all these functions should raise inexact flag for small x if x!=0,
but it's not required by the standard and it does not seem a worthy
goal, so support for it is removed in some cases.
raising underflow:
- x*x may not raise underflow for subnormal x if FLT_EVAL_METHOD!=0
- x*x may raise spurious underflow for normal x if FLT_EVAL_METHOD==0
- in case of double subnormal x, store x as float
- in case of float subnormal x, store x*x as float
|
|
When FLT_EVAL_METHOD!=0 (only i386 with x87 fp) the excess
precision of an expression must be removed in an assignment.
(gcc needs -fexcess-precision=standard or -std=c99 for this)
This is done by extra load/store instructions which adds code
bloat when lot of temporaries are used and it makes the result
less precise in many cases.
Using double_t and float_t avoids these issues on i386 and
it makes no difference on other archs.
For now only a few functions are modified where the excess
precision is clearly beneficial (mostly polynomial evaluations
with temporaries).
object size differences on i386, gcc-4.8:
old new
__cosdf.o 123 95
__cos.o 199 169
__sindf.o 131 95
__sin.o 225 203
__tandf.o 207 151
__tan.o 605 499
erff.o 1470 1416
erf.o 1703 1649
j0f.o 1779 1745
j0.o 2308 2274
j1f.o 1602 1568
j1.o 2286 2252
tgamma.o 1431 1424
math/*.o 64164 63635
|
|
previously 0x1p-1000 and 0x1p1000 was used for raising inexact
exception like x+tiny (when x is big) or x+huge (when x is small)
the rational is that these float consts are large enough
(0x1p-120 + 1 raises inexact even on ld128 which has 113 mant bits)
and float consts maybe smaller or easier to load on some platforms
(on i386 this reduced the object file size by 4bytes in some cases)
|
|
modifications:
* avoid unsigned->signed conversions
* removed various volatile hacks
* use FORCE_EVAL when evaluating only for side-effects
* factor out R() rational approximation instead of manual inline
* __invtrigl.h now only provides __invtrigl_R, __pio2_hi and __pio2_lo
* use 2*pio2_hi, 2*pio2_lo instead of pi_hi, pi_lo
otherwise the logic is not changed, long double versions will
need a revisit when a genaral long double cleanup happens
|
|
zero, one, two, half are replaced by const literals
The policy was to use the f suffix for float consts (1.0f),
but don't use suffix for long double consts (these consts
can be exactly represented as double).
|
|
thanks to the hard work of Szabolcs Nagy (nsz), identifying the best
(from correctness and license standpoint) implementations from freebsd
and openbsd and cleaning them up! musl should now fully support c99
float and long double math functions, and has near-complete complex
math support. tgmath should also work (fully on gcc-compatible
compilers, and mostly on any c99 compiler).
based largely on commit 0376d44a890fea261506f1fc63833e7a686dca19 from
nsz's libm git repo, with some additions (dummy versions of a few
missing long double complex functions, etc.) by me.
various cleanups still need to be made, including re-adding (if
they're correct) some asm functions that were dropped.
|