[LangRef] Specify NaN behavior more precisely #66579

Merged
merged 8 commits into from
Oct 4, 2023
36 changes: 30 additions & 6 deletions llvm/docs/LangRef.rst
@@ -3394,17 +3394,41 @@ Floating-Point Environment
The default LLVM floating-point environment assumes that traps are disabled and
status flags are not observable. Therefore, floating-point math operations do
not have side effects and may be speculated freely. Results assume the
round-to-nearest rounding mode.
round-to-nearest rounding mode, and subnormals are assumed to be preserved.
Running default LLVM code in an environment where these assumptions are not met
Contributor

"default LLVM code" is not a well-defined phrase :-)

You should explicitly refer to the role of the strictfp attribute here.

Contributor Author

I'm mentioning strictfp now, I hope what I am saying makes sense. :)

can lead to undefined behavior.

The representation bits of a floating-point value do not mutate arbitrarily; if
there is no floating-point operation being performed, the NaN payload (if any)
is preserved.
Contributor

floating point bits do mutate arbitrarily on 32-bit x86:
here, all I do is bitcast 0x7f800001 to float, return it from a function, then bitcast back to an integer -- that changes it to 0x7fc00001.
https://clang.godbolt.org/z/sjaq55q3G
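
For reference, the two bit patterns quoted above differ only in the quiet bit of the IEEE 754 binary32 encoding. A minimal Python sketch, working purely on integers so that the host FPU cannot touch the bits; the helper name `classify_f32_bits` is hypothetical and not part of this PR, and the quiet-bit convention assumed is the common one (not the legacy MIPS polarity):

```python
# IEEE 754 binary32 layout: 1 sign bit, 8 exponent bits, 23 mantissa bits.
EXP_MASK = 0x7F800000
MANT_MASK = 0x007FFFFF
QUIET_BIT = 0x00400000  # most significant mantissa bit (assumed convention)


def classify_f32_bits(bits: int) -> str:
    """Classify a 32-bit pattern as 'qnan', 'snan', or 'not-nan'."""
    is_nan = (bits & EXP_MASK) == EXP_MASK and (bits & MANT_MASK) != 0
    if not is_nan:
        return "not-nan"
    return "qnan" if bits & QUIET_BIT else "snan"


before = 0x7F800001  # pattern bitcast to float in the example
after = 0x7FC00001   # pattern reported after returning on 32-bit x86

print(classify_f32_bits(before))
print(classify_f32_bits(after))
print(hex(before ^ after))  # the bits that changed
```

Under these assumptions, the reported change corresponds to a signaling NaN being quieted while the rest of the payload is kept.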

Contributor Author

@RalfJung, Sep 17, 2023

Yes, that's a bug in the x86-32 backend / calling convention.

Given that this bug is hard-coded in the calling convention, I wonder if it should be called out here. But the main point of this sentence is that LLVM itself will not do anything that would mutate FP values.

In fact I assume one can get LLVM to miscompile code by assuming that return values don't change their bits, similar to #44218.

Member

I do think we should at least mention the major ways in which LLVM backends violate the spec:

  • x86-32 with SSE2 enabled may implicitly convert floating-point values returned from a function to x86_fp80 and back; such conversions may signal a floating-point exception, or modify the value.
  • x86-32 without SSE2 may do such conversions just about anywhere.
  • older MIPS versions use the opposite polarity for the sNaN vs qNaN bit, and LLVM does not correctly represent this (this is at least potentially fixable, unlike the above).

Contributor Author

I added a paragraph with backend problems and linked to relevant issues. Is there an issue for the "x86-32 return value problem"?

Member

I'm not aware of one in LLVM, only in rust, so I filed a new one, #66803.


When a floating-point math operation produces a NaN value, the result has a
Contributor

I would prepend "Unless otherwise specified here", and add notes to fneg, llvm.fabs, and llvm.copysign that they are guaranteed to not affect the NaN payload or qNaN/sNaN status.

Member

Those 3 shouldn't be considered floating point math operations (as the term is used here and before). We definitely need to make it clear in their definitions -- it has implications beyond just the NaN behavior.

In particular, they all ought to be specified as: "This operation never raises a floating-point exception, and the result is an exact copy of the input, other than the sign bit. It is not considered a floating-point math operation, but rather, bit-manipulation which operates on floating-point values."
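
To illustrate the "bit-manipulation" reading of fneg, llvm.fabs, and llvm.copysign, here is a minimal Python sketch on binary32 bit patterns. The function names are hypothetical stand-ins, and this is only a model of the proposed wording, not LLVM's implementation:

```python
SIGN_BIT = 0x80000000
MASK32 = 0xFFFFFFFF


def fneg_bits(x: int) -> int:
    """Flip the sign bit; every other bit is copied unchanged."""
    return (x ^ SIGN_BIT) & MASK32


def fabs_bits(x: int) -> int:
    """Clear the sign bit; every other bit is copied unchanged."""
    return x & ~SIGN_BIT & MASK32


def copysign_bits(mag: int, sgn: int) -> int:
    """Take all non-sign bits from mag and the sign bit from sgn."""
    return (mag & ~SIGN_BIT & MASK32) | (sgn & SIGN_BIT)


snan = 0x7F800001  # a signaling NaN with payload 1
print(hex(fneg_bits(snan)))
print(hex(fabs_bits(fneg_bits(snan))))
print(hex(copysign_bits(snan, 0x80000000)))
```

In this model the NaN payload and the quiet/signaling status pass through untouched, which is the property the comment asks the specification to guarantee.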

Contributor Author

The paragraph above says that FP exceptions cannot be observed anyway, so it seems confusing/misleading to now talk about some operations not raising FP exceptions.

Contributor Author

I am explicitly calling out bitcasts and fneg, fabs, copysign as not being affected by the NaN non-determinism now.

Member

I'd like to see some text added to the specification sections for the fneg/fabs/copysign operations. Noting there that exceptions cannot be signaled (in addition to them only touching the sign bit) is meaningful -- it means that you may use them in a "strictfp" function. Unlike other operations, they do not need a "constrained" variant, because their behavior is fully defined regardless.

Contributor

I would still add "unless otherwise specified", because you have cases like llvm.canonicalize which is an FP operation that can arbitrarily mutate NaN payloads but requires the output to be qNaN, even if the input is sNaN.

Contributor Author

Fair. That can then also cover fmin/fmax, should we need to add an exception for them.

Contributor Author

> I'd like to see some text added to the specification sections for the fneg/fabs/copysign operations.

Done.

non-deterministic sign. The payload is non-deterministically chosen from the
following set:

- The payload that is all-zero except that the ``quiet`` bit is set.
  ("Preferred NaN" case)
- The payload of any input operand that is a NaN, bit-wise ORed with a payload that has
  the ``quiet`` bit set. ("Quieting NaN propagation" case)
- The payload of any input operand that is a NaN. ("Unchanged NaN propagation" case)
- A target-specific set of further NaN payloads, that definitely all have their
  ``quiet`` bit set. The set can depend on the payloads of the input NaNs.
  This set is empty on x86 and ARM, but can be non-empty on other architectures.
  (For instance, on wasm, if any input NaN is not the preferred NaN, then
  this set contains all quiet NaNs; otherwise, it is empty.
  On SPARC, this set consists of the all-one payload.)

In particular, if all input NaNs are quiet, then the output NaN is definitely
quiet. Signaling NaN outputs can only occur if they are provided as an input
value. For example, "fmul SNaN, 1.0" may be simplified to SNaN rather than QNaN.
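
The payload rules listed above can be modeled in a few lines. The following Python sketch is an illustration under stated assumptions, not part of the LangRef text: payloads are taken to be the 23-bit binary32 mantissa field including the ``quiet`` bit, the function name `allowed_payloads` is hypothetical, and the target-specific set is passed in explicitly (empty by default, as the text states for x86 and ARM). The sign is chosen independently and is not modeled here.

```python
QUIET_BIT = 0x400000  # top bit of the 23-bit binary32 mantissa field


def allowed_payloads(input_nan_payloads, target_extra=frozenset()):
    """Return the set of payloads a NaN-producing operation may yield."""
    extra = set(target_extra)
    if not all(p & QUIET_BIT for p in extra):
        raise ValueError("target-specific payloads must have the quiet bit set")

    allowed = {QUIET_BIT}            # "Preferred NaN" case
    for p in input_nan_payloads:
        allowed.add(p | QUIET_BIT)   # "Quieting NaN propagation" case
        allowed.add(p)               # "Unchanged NaN propagation" case
    return allowed | extra


# One signaling input NaN with payload 1: the unchanged case keeps it signaling.
print(sorted(hex(p) for p in allowed_payloads([0x000001])))

# Only quiet inputs: every allowed result payload is quiet.
print(all(p & QUIET_BIT for p in allowed_payloads([0x400123, 0x400001])))
```

In this model a signaling result is possible only when a signaling payload appears among the inputs, matching the "fmul SNaN, 1.0" example.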

Floating-point math operations are allowed to treat all NaNs as if they were
quiet NaNs. For example, "pow(1.0, SNaN)" may be simplified to 1.0. This also
means that SNaN may be passed through a math operation without quieting. For
example, "fmul SNaN, 1.0" may be simplified to SNaN rather than QNaN. However,
SNaN values are never created by math operations. They may only occur when
provided as a program input value.
quiet NaNs. For example, "pow(1.0, SNaN)" may be simplified to 1.0.

Code that requires different behavior than this should use the
:ref:`Constrained Floating-Point Intrinsics <constrainedfp>`.
In particular, constrained intrinsics rule out the "Unchanged NaN propagation" case;
they are guaranteed to return a QNaN.

.. _fastmath:
