Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

We need to provide these IEEE754-2019 functions #32877

Open
JeffreySarnoff opened this issue Aug 12, 2019 · 13 comments
Open

We need to provide these IEEE754-2019 functions #32877

JeffreySarnoff opened this issue Aug 12, 2019 · 13 comments
Labels
feature Indicates new feature / enhancement requests maths Mathematical functions

Comments

@JeffreySarnoff
Copy link
Contributor

JeffreySarnoff commented Aug 12, 2019

from IEEE 754-2019: sec 9.2 Additional mathematical operations

"Language standards should define, to be implemented according to this subclause, as many of the operations in Table 9.1 as is appropriate to the language. As with other operations of this standard, the names of the operations in Table 9.1 do not necessarily correspond to the names that any particular programming language would use. In this subclause the domain of an operation is that subset of the affinely extended reals for which the corresponding mathematical function is well defined. A conforming operation shall return results correctly rounded for the applicable rounding direction for all operands in its domain."

these functions include

rsqrt(x) ≝ 1 / sqrt(x)
compound(x, n) ≝ (1 + x)^n

rootn(x, n) ≝ x^(1/n)
pown(x,n) ≝ x^n
pow(x, y) ≝ x^y [-Inf, +Inf] x [-Inf, +Inf] 
powr(x,y) ≝ x^y [0,+Inf] x [-Inf, +Inf]

exp2m1(x)  ≝ 2^x - 1
exp10m1(x) ≝ 10^x - 1

log2p1(x)  ≝ llog2(1+x)
log10p1(x) ≝ log10(1+x)


sinpi(x) ≝ sin(π * x)
cospi(x) ≝ cos(π * x)
tanpi(x) ≝ tan(π * x)

asinpi(x) ≝ asin(x) / π
acospi(x) ≝ acos(x) / π
atanpi(x) ≝ atan(x) / π

atan2pi(y, x) is the angle subtended at the origin by the point (x, y)
    and the positive x-axis. The range of atan2pi is [−1, +1].

there are short tables of required y = f(x) where x in {±0, ±∞}
for many of these functions

getpayload(x)
    if the source operand is a NaN, the result is the payload
        as a non-negative floating-point integer.
    if the source operand is not a NaN, the result is −1.

setpayload(x)
    If the source operand is a non-negative floating-point integer whose value
        is one of an implementation-defined set of admissible payloads for
        the operation, the result is a quiet NaN with that payload. 
    Otherwise, the result is +0, with a preferred exponent of 0.

setpayloadsignaling(x)
    like setpayload, the result is a signaling NaN
@JeffBezanson JeffBezanson added the maths Mathematical functions label Aug 13, 2019
@JeffBezanson
Copy link
Member

See also #6148

@mschauer
Copy link
Contributor

mschauer commented Sep 3, 2019

Is there a corresponding section for constants which should defined?

@JeffreySarnoff
Copy link
Contributor Author

There is no such section in the current standard; however some constants should be available at their most accurate values for given representations and under the various rounding mode to comply best. These come to mind (and there are others which would be advantageous):

π, sqrt(2), exp(1), log(2), log(10), log2(10), log10(2), and their inverses.

A finite IEEE754 floating point value is constructible using ldexp(significand, exponent) where
significand, exponent = frexp(value). It would be very helpful to return the nearest binary representation to a decimal significand with a radix 10 exponent e.g. frexp(value, base=10) .. and similarly with ldexp.

@JeffreySarnoff
Copy link
Contributor Author

JeffreySarnoff commented Sep 16, 2019

also missing is negate(x)

negate(x) copies a floating-point operand x to a destination in the same format, reversing the sign bit. negate(x) is not the same as subtraction(0, x). [negate(NaN) forces msb, 0.0 - NaN leaves msb]
-- IEEE754-2019 (also in 2008 rev, maybe in original)

@mschauer
Copy link
Contributor

Should -x call 0 - x or negate?

@JeffreySarnoff
Copy link
Contributor Author

JeffreySarnoff commented Sep 16, 2019

In what follows, Float16 is used as an example. The same holds for Float32 and Float64.

-x where x::Float16 should be processed as negative(x): corrected

negate(x::Float16) = reinterpret(Float16, reinterpret(UInt16, x) | 0x8000)
changesign(x::Float16) = reinterpret(Float16, reinterpret(UInt16, x) ⊻ 0x8000) 
negative(x::Float16) = isnan(x) ? x : changesign(x)

with the above this (which is important for proper propagation of NaNs) would hold

msbit(x::Float16) = reinterpret(UInt16, x) & 0x8000
isnan(x) && (msbit(x) === msbit(-x))

[it does not in Julia v1.2].
The relationship is important. The standard speaks to it:

When either an input or result is NaN, this standard does not interpret the sign of a NaN. Note, however, that operations on bit strings—copy, negate, abs, copySign—specify the sign bit of a NaN result, sometimes based upon the sign bit of a NaN operand.

immediateNaN = NaN
arithmeticNaN = 0 / 0

signbit(immediateNaN) !== signbit(arithmeticNaN)  # otherwise the bits are identical

[this signbit distinction holds in Julia 1.2]

signbit(immediateNaN) === false
signbit(arithmeticNaN) === true

with that, proper NaN propagation over arithmetic expressions needs two things
(i) NaN1 op NaN2 --> consistently either NaN1, NaN2, or the NaN of larger payload
(Julia 1.2 has NaN1 op NaN2 --> NaN1 for +, -, *, / and others)

(ii) see following (Julia 1.2 does have this property)

msbit(x::Float16) = reinterpret(UInt16, x) & 0x8000
isnan(x) && (msbit(x) === msbit(-x))

If this hold, one may determine if a calculation which results in a value that isnan has encountered an indeterminate expression (0.0/0.0, Inf/Inf, 0.0*Inf, Inf*0.0 and those variously signed) or has encountered a sentinel NaN or perhaps used an uninitialized value preset to NaN.

@mschauer
Copy link
Contributor

So we have to fix Float16 and the others are fine? I am confused that that -x destroys
information about what went wrong during computations

julia> reinterpret(Int, (0.0/0.0)) - reinterpret(Int, NaN)
-9223372036854775808

julia> reinterpret(Int, -(0.0/0.0)) - reinterpret(Int, NaN)
0

julia> reinterpret(Int, 0-(0.0/0.0)) - reinterpret(Int, NaN)
-9223372036854775808

@JeffreySarnoff
Copy link
Contributor Author

JeffreySarnoff commented Sep 16, 2019

The use of Float16 was as an example -- Float32 and Float64 need the same treatment.

The difficulty with -x is only solved by changing things as I describe in my previous comment.

@JeffreySarnoff
Copy link
Contributor Author

JeffreySarnoff commented Sep 17, 2019

also [Julia 1.2] does not do this, although the standard specifies the following behavior
abs(x::T) where {T<:IEEEFloat} = isnan(x) ? x : flipsign(x, x)
Along with the correction to -x described above, to get usable behavior, this is needed too.

What happens with copysign(NaN, x) is not specified in the standard. To provide propagated support that preserves each of these sorts of non-numbers,

immediateNaN = NaN
arithmeticNaN = 0 / 0

in addition to the handling of -x given above

abs(x::T) where {T<:IEEEFloat} = isnan(x) ? x : flipsign(x, x)
copysign(x::T, y::T) where {T<:IEEEFloat} = 
     (signbit(x) === signbit(y) || isnan(x)) ? x : flipsign(x, x)

@JeffreySarnoff
Copy link
Contributor Author

JeffreySarnoff commented Sep 17, 2019

"Implementations shall provide the following non-computational operations for all supported [floating point] arithmetic formats "

another missing mandated function is class(x):

class(x) tells which of the following ten classes x falls into:
signalingNaN
quietNaN
negativeInfinity
negativeNormal
negativeSubnormal
negativeZero
positiveZero
positiveSubnormal
positiveNormal
positiveInfinity

and we need totalorder(x::T, y::T) where {T<:IEEEFloat}

totalOrder(x, y) imposes a total ordering on canonical members of the format of x and y:
a) If x<y, totalOrder(x, y) is true.
b) If x>y, totalOrder(x, y) is false.
c) If x=y:

  1. totalOrder(−0, +0) is true.
  2. totalOrder(+0, −0) is false.
  3. If x and y represent the same floating-point datum:
    i) If x and y have negative sign,
    totalOrder(x, y) is true if and only if the exponent of x ≥ the exponent of y
    ii) otherwise,
    totalOrder(x, y) is true if and only if the exponent of x ≤ the exponent of y.
    d) If x and y are unordered numerically because x or y is a NaN:
  4. totalOrder(−NaN, y) is true where −NaN represents a NaN with negative sign bit and y is a
    floating-point number.
  5. totalOrder(x, −NaN) is false where −NaN represents a NaN with negative sign bit and x is a
    floating-point number.
  6. totalOrder(x, +NaN) is true where +NaN represents a NaN with positive sign bit and x is a
    floating-point number.
  7. totalOrder(+NaN, y) is false where +NaN represents a NaN with positive sign bit and y is a
    floating-point number.
  8. If x and y are both NaNs, then totalOrder reflects a total ordering based on:
    i) negative sign orders below positive sign
    ii) signaling orders below quiet for +NaN, reverse for −NaN
    iii) otherwise, the order of NaNs is implementation-defined.
    Neither signaling NaNs nor quiet NaNs signal an exception. For canonical x and y, totalOrder(x, y) and
    totalOrder( y, x) are both true if x and y are bitwise identical.
    Unsigned NaNs, as may occur in non-interchange formats, should order like NaNs with positive sign bit
  1. If x and y are both NaNs, then totalOrder reflects a total ordering based on:
    i) negative sign orders below positive sign
    ii) signaling orders below quiet for +NaN, reverse for −NaN
    iii) otherwise, the order of NaNs is implementation-defined.

regarding (5.iii), in this case our implementation should order according to the constructed "signed payload" values

@nsajko
Copy link
Contributor

nsajko commented Oct 6, 2024

Should all of these go into Base? Would an interface package be more appropriate for some of the functions?

@JeffreySarnoff
Copy link
Contributor Author

regarding negate(x::AbstractFloat) where AbstractFloat is IEEE 754-2019 conformant

section 5.5.1 Sign bit operations
negate(x) copies a floating-point operand x to a destination in the same format, reversing the sign bit.
negate(x) is not the same as subtraction(0, x).

note section 6.3 The sign bit

When either an input or result is a NaN, this standard does not interpret the sign of a NaN. However, operations on bit strings — copy, negate, abs, copySign — specify the sign bit of a NaN result, sometimes based upon the sign bit of a NaN operand. The logical predicates totalOrder and isSignMinus are also affected by the sign bit of a NaN operand. For all other operations, this standard does not specify the sign bit of a NaN result, even when there is only one input NaN, or when the NaN is produced from an invalid operation.

[however, I find this to be unhelpful :)]

@JeffreySarnoff
Copy link
Contributor Author

Should all of these go into Base? Would an interface package be more appropriate for some of the functions?

Do you want to say Julia fully embraces IEEE 754-2019? If so, then they belong in Base.
otoh "Language standards should define, to be implemented according to this subclause, as many of the operations in Table 9.1 as is appropriate to the language." [see first comment for a list]. From the perspective of the Standard, the question is not where to put them .. it is whether they are to be offered as standard part of Julia.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Indicates new feature / enhancement requests maths Mathematical functions
Projects
None yet
Development

No branches or pull requests

4 participants