[IR] LangRef: state explicitly that floats generally behave according to IEEE-754 #102140

Merged · 7 commits · Oct 11, 2024

Conversation

RalfJung (Contributor) commented Aug 6, 2024:

Fixes #60942: IEEE semantics is likely what many frontends want (it definitely is what Rust wants), and it is what LLVM passes already assume when they use APFloat to propagate float operations.
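As a rough illustration (my own sketch, not text from the patch; the function name is made up), this is the kind of result the constant folder already produces via APFloat and that a backend is then expected to match bit-for-bit:

```llvm
define float @third() {
  ; Constant folding evaluates this division with APFloat and produces the
  ; IEEE-754 binary32 value nearest to 1/3 (about 0.33333334). Under IEEE
  ; semantics, a backend executing the unfolded division must produce the
  ; same bit pattern.
  %r = fdiv float 1.0, 3.0
  ret float %r
}
```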

This does not reflect what happens on x87, but what happens there is just plain unsound (#89885, #44218); there is no coherent specification that will describe this behavior correctly -- the backend in combination with standard LLVM passes is just fundamentally buggy in a hard-to-fix way.

There are also questions around flushing subnormals to zero, but [this discussion](https://discourse.llvm.org/t/questions-about-llvm-canonicalize/79378) seems to indicate a general stance: this is specific non-standard hardware behavior, and it generally needs LLVM to be told that basic float ops do not return the standard result. Just naively running LLVM-compiled code on hardware configured to flush subnormals will lead to #89885-like issues.

AFAIK this is also what Alive2 implements (@nunoplopes please correct me if I am wrong).

@nikic @arsenm @jcranmer-intel what do you think is the best way to word this?

llvmbot added the llvm:ir label on Aug 6, 2024
llvmbot (Member) commented Aug 6, 2024:

@llvm/pr-subscribers-llvm-ir

Author: Ralf Jung (RalfJung)

Full diff: https://github.com/llvm/llvm-project/pull/102140.diff

1 file affected:

  • (modified) llvm/docs/LangRef.rst (+6)
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index b17e3c828ed3d..7fc4e9385a1e1 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -3577,6 +3577,12 @@ seq\_cst total orderings of other operations that are not marked
 Floating-Point Environment
 --------------------------
 
+Unless noted otherwise, LLVM works with IEEE 754 floating-point semantics: LLVM
+backends assume that the CPU is configured to provide IEEE-compatible behavior,
+and LLVM frontends can assume that LLVM IR floating-point operations behave
+according to the IEEE specification (with an :ref:`exception around NaN values
+<floatnan>`).
+
 The default LLVM floating-point environment assumes that traps are disabled and
 status flags are not observable. Therefore, floating-point math operations do
 not have side effects and may be speculated freely. Results assume the

arsenm added the floating-point (Floating-point math) label on Aug 12, 2024
Comment on lines 3580 to 3584
Unless noted otherwise, LLVM works with IEEE 754 floating-point semantics: LLVM
backends assume that the CPU is configured to provide IEEE-compatible behavior,
and LLVM frontends can assume that LLVM IR floating-point operations behave
according to the IEEE specification (with an :ref:`exception around NaN values
<floatnan>`).
Contributor:
hyphenate IEEE-754? This is also worded a bit broadly, and seems to contradict some of the text later in the section.

Ideally the LangRef would not be in terms of "CPUs" or "configurations".

Maybe should specify the bitlayouts are assumed to be the IEEE pattern?

RalfJung (author):

This is also worded a bit broadly, and seems to contradict some of the text later in the section.

I assume you are talking about the exceptions related to NaN results? That's why it starts with "unless noted otherwise". I'm open to better wording here if you have any suggestions.

Maybe should specify the bitlayouts are assumed to be the IEEE pattern?

It already says that.

The goal of this PR is to say that fadd et al will use IEEE-754 rules to compute the desired results. That's a much stronger statement than just saying that the bitpatterns are interpreted as per IEEE-754.
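Roughly, a sketch of the difference (my own illustration; function names are made up):

```llvm
; Format guarantee only: the i32 bits are reinterpreted as an IEEE-754
; binary32 bit pattern; nothing is computed.
define float @format_only(i32 %bits) {
  %f = bitcast i32 %bits to float
  ret float %f
}

; Operation guarantee (what this PR spells out): the result is the exact
; product rounded once to binary32 with round-to-nearest-even, not merely
; "some float of the right type".
define float @operation_result(float %a, float %b) {
  %r = fmul float %a, %b
  ret float %r
}
```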

backends assume that the CPU is configured to provide IEEE-compatible behavior,
and LLVM frontends can assume that LLVM IR floating-point operations behave
according to the IEEE specification (with an :ref:`exception around NaN values
<floatnan>`).
Contributor:

Also the denormal exception

RalfJung (author), Aug 13, 2024:

What is the denormal exception? Is this about what happens when denormal-fp-math is set, but the default is to be IEEE-compatible?

Given that IEEE says that denormals are not flushed and LLVM assumes the same by default, I don't think this is an exception from "IR float ops behave according to IEEE".

jcranmer-intel (Contributor):

I'm broadly supportive of the idea to bring greater clarity here; I'm not entirely certain this is the best way to do it. IEEE 754 just isn't quite precise enough to say that "it's just IEEE 754 behavior!" answers the questions that people are trying to ask. Also, we have three non-IEEE 754 types: bfloat (whose behavior is completely describable with IEEE 754 format parameters, although I think the denormal flushing behavior is likely to be especially inconsistent here), x86_fp80 (which is mostly describable with IEEE 754, although it also introduces noncanonical values which don't exist with the basic IEEE 754 binary formats, but are allowed for in the standard), and ppc_fp128 (which is very pointedly not an IEEE 754 format type, and I have yet to find a document describing its semantics in any useful detail).

What I think should be explicitly mentioned are these:

  • it's not legal to store the result x87-style in a larger register without truncating it back to range (and you must ensure no double-rounding occurs when emulating in a larger type); fadd float really does mean IEEE 754 binary32 addition (as far as rounding is concerned).
  • optimizations cannot change the precision of the result unless annotated with fast-math flags (see the sketch after this comment)

(Denormal flushing is cursed for reasons beyond the scope of this PR, and I don't think the denormal-fp-math is really sufficient to capture the cursed nature.)
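A minimal LLVM IR sketch of the second bullet (my own example; function names are illustrative): without flags, the intermediate rounding is observable and must be preserved; with fast-math flags, the optimizer is allowed to cancel the pair and change the value.

```llvm
; Must stay as written: the rounded result of each step is fixed, and
; (a + b) - b is in general not bit-identical to a.
define float @strict(float %a, float %b) {
  %t = fadd float %a, %b
  %r = fsub float %t, %b
  ret float %r
}

; With fast-math flags the optimizer may treat the two operations as
; cancelling and return just %a, i.e. the precision of the result may change.
define float @relaxed(float %a, float %b) {
  %t = fadd reassoc nsz float %a, %b
  %r = fsub reassoc nsz float %t, %b
  ret float %r
}
```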

RalfJung (author) commented Aug 13, 2024:

I'm not entirely certain this is the best way to do it. IEEE 754 just isn't quite precise enough to say that "it's just IEEE 754 behavior!" answers the questions that people are trying to ask.

It answers some questions. Like, currently the LangRef does not say what the result of fadd on half, float, double, and fp128 even is. It says that the arguments and return values are interpreted using the IEEE-754 binary format, but doesn't say how the return value is computed -- so every frontend that relies on this being "the infinite-precision result, rounded to the nearest representable number, with round-ties-to-even" is technically incorrect. I think it should be possible to provide at least this basic promise.
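For a concrete instance of that rounding rule (my own worked example, not from the patch):

```llvm
define float @ties_to_even() {
  ; The exact sum 16777217 (2^24 + 1) is not representable in binary32; it
  ; lies exactly halfway between 16777216 and 16777218, and ties-to-even
  ; picks 16777216.0. Constant folding and hardware execution must agree
  ; on that bit pattern.
  %r = fadd float 16777216.0, 1.0
  ret float %r
}
```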

I will restrict the new part to only apply to those types, not all float types.

What I think should be explicitly mentioned are these:

To me these are both direct consequences of just saying unambiguously what the result is -- if the result can only be this, any optimization that changes the result is obviously wrong. But I can add clarifications to make sure this point is clear.

RalfJung (author):

I have attempted to clarify these concerns, please take another look.

llvm/docs/LangRef.rst (outdated):
This means that optimizations and backends cannot change the precision of these
operations (unless there are fast-math flags), and frontends can rely on these
operations deterministically providing perfectly rounded results as described
in the standard (except when a NaN is returned).
Contributor:

I would add "This also means that backends are not allowed to implement floating-point instructions using larger floating-point types unless they take care to consistently narrow the results back to the original range without inducing double-rounding." or some similar text that makes it clear that mapping fadd float via just an x87 FADD instruction is not legal lowering.

RalfJung (author):

IMO that is covered by "backends cannot change the precision of these operations". If we start listing all the consequences of that statement, we'll never be done...

Comment on lines 3587 to 3591
For types that do correspond to an IEEE format, LLVM IR float operations behave
like the corresponding operations in IEEE-754, with two exceptions: LLVM makes
:ref:`specific assumptions about the state of the floating-point environment
<floatenv>` and it implements :ref:`different rules for operations that return
NaN values <floatnan>`.
Contributor:

I'm struggling to come up with good wordsmithing here. It's not the case that there are two exceptions to IEEE-754 here. As @arsenm notes, we don't necessarily agree with IEEE 754 on denormal values. Also, we (rather explicitly) don't guarantee the side-effect on flags or exceptions of IEEE 754 semantics. The more we try to fine-tune the text to make "it's exactly IEEE 754 BUT", the more I think we obscure what the actual thing we're trying to state is.

This is a quick attempt I have of a direction that makes more sense to me:

Unless otherwise specified, floating-point instructions return the same value [i.e., we make no implication about flags or exceptions here, although this may be subtle] as the corresponding IEEE 754 operation for their type would when executing in the default floating-point environment [link], except that the behavior of NaN values is instead as specified here [link]. In particular, a floating-point instruction lacking fast-math flags operating on normal floating-point values is guaranteed to always return the same bit-identical result on all machines. When executing in the non-default floating-point environment, the result is typically undefined behavior unless constrained intrinsics [link] are used.

(This still doesn't cover denormal behavior, because that's non-default floating-point environment, but I'm not sure I have a firm grip on what the denormal guarantees we want to make in the first place are!)
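For completeness, a sketch of what that escape hatch looks like in IR (my understanding of the constrained-intrinsic usage pattern; the function name is made up):

```llvm
; A function that must honor the dynamic rounding mode and strict exception
; semantics opts out of the default-environment assumptions via strictfp
; and the constrained intrinsics.
define float @add_dynamic(float %a, float %b) strictfp {
  %r = call float @llvm.experimental.constrained.fadd.f32(float %a, float %b, metadata !"round.dynamic", metadata !"fpexcept.strict") strictfp
  ret float %r
}

declare float @llvm.experimental.constrained.fadd.f32(float, float, metadata, metadata)
```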

RalfJung (author):

I gave it a shot, based on your suggestion but structuring things a bit differently.

If the compiled code is executed in a non-default floating-point environment
(this includes non-standard behavior such as subnormal flushing), the result is
typically undefined behavior unless attributes like ``strictfp`` and
``denormal-fp-math`` or :ref:`constrained intrinsics <constrainedfp>` are used.
RalfJung (author):

This paragraph is basically an exact duplicate of the second paragraph in the floatenv section, so I am inclined to remove it... but your draft did include such a sentence.

The way I view it, the floatsem section is just about the IEEE float formats. This paragraph is true for all formats so it should be in the floatenv section.

the returned *value*; we make no statement about status flags or
traps/exceptions.) In particular, a floating-point instruction returning a
non-NaN value is guaranteed to always return the same bit-identical result on
all machines and optimization levels.
Contributor:

Do we need to specify "all machines that support IEEE-754 arithmetic"? I don't know if we support any targets that don't support IEEE-754, but it seems like there should be some provision for that. The C standard, for instance, talks about some transformations that are legal on "IEC 60559 machines."

Or are we saying that architectures that don't support IEEE-754 should indicate the differences in the IR or use a different type?

RalfJung (author), Aug 22, 2024:

Right now, LLVM assumes that all backends implement IEEE-754 arithmetic, and will miscompile code if the backend doesn't do that. One example of a target that does not implement IEEE-754 arithmetic is x86 without SSE, and #89885 has examples of code that gets miscompiled due to that.

The point of this PR is to make that more explicit. If instead the goal is to make LLVM work with backends and targets that do not implement IEEE-754 arithmetic, that will require changes to optimization passes.

Contributor:

We're already at the point where we expect float et al to correspond to the IEEE 754 binary32 et al formats. (This is documented, although somewhat subtly, by the current LangRef). There is also agreement at this point that excess precision (à la x87) is not correct behavior for LLVM IR, although it's not (yet) explicitly documented in the LangRef.

The only hardware deviation from IEEE 754 that we're prepared to accept at this point is denormal handling. I'm reluctant to offer too many guarantees on denormal handling because I'm not up to speed on the diversity of common FP hardware with respect to denormals, but I'm pretty sure there is hardware in use that mandates denormal flushing (e.g., the AVX512-BF16 stuff is unconditionally default RM+DAZ+FTZ, with changing MXCSR having no effect).

In short, we already require that hardware supporting LLVM be IEEE 754-ish; this is tightening up the definition in the LangRef to cover what we already agree to be the case. In the putative future that we start talking about cases where float et al are truly non-IEEE 754 types (say, Alpha machines, or perhaps posits will make it big), then we can talk about how to add support for them in LLVM IR (which, given the history of LLVM, probably means "add new types", not "float means something different depending on target triple").

RalfJung (author), Aug 27, 2024:

The only hardware deviation from IEEE 754 that we're prepared to accept at this point is denormal handling.

Even there, the pass that causes trouble in #89885 would lead to miscompilations. Analysis/ScalarEvolution will assume that float ops that don't return NaNs produce a given bit pattern (including denormals), and if codegen later generates code that produces a different bit pattern, the result is a miscompilation. If we don't accept "always return the same bit-identical result on all machines", then this pass (and possibly others) has to be changed.

So non-standard denormal handling is only supported with an explicit marker, which works very similarly to the markers required for non-default FP exception handling.

is instead :ref:`as specified here <floatnan>`. (This statement concerns only
the returned *value*; we make no statement about status flags or
traps/exceptions.) In particular, a floating-point instruction returning a
non-NaN value is guaranteed to always return the same bit-identical result on
Contributor:

What do you mean by "floating-point instruction" here? Is sqrt included?

I understand that the main point here is to say that without further IR constructs an instruction like fdiv is assumed to be correctly rounded. IEEE-754 also assumes this of sqrt. I believe the latest version specifies that other math functions should also return correctly rounded results. That's why I think it needs to be explicit here which ones you mean.

RalfJung (author), Aug 22, 2024:

I meant all the operations that have an equivalent IEEE-754 operation. So yes that would include sqrt, though I was under the impression that it does not include transcendental functions.

I am not sure what is the best way to say that. Having a list seems awkward? Should each such operation have a comment, like "This corresponds to <op> in IEEE-754, so if the argument is an IEEE float format then the :ref:`floating-point semantics <floatsem>` guarantees apply."?

Contributor:

This is something that is hard to come up with a good term for. IEEE 754 has a core list of operations in section 5 which is a good starting point, but these omit the minimum/maximum operations (which are section 9.6). Section 9 is "recommended operations", and 9.2 is the main list of transcendental functions you're thinking of; IEEE 754 requires that they be correctly rounded, but C explicitly disclaims that requirement in Annex F. There's also a few functions in C that aren't in IEEE 754, notably ldexp and frexp.

(Note too that it was recently brought up in the Discourse forums that the libm intrinsics are meant to correspond to libm semantics, not IEEE 754 semantics.)

RalfJung (author), Aug 27, 2024:

minimum/maximum don't do any rounding, and already seem to unambiguously describe their semantics in the existing docs, making this clarification much less relevant. So maybe we should just say that this is about the core operations listed in section 5?

…d floatnan section; clarify which operations it applies to
RalfJung (author):

@jcranmer-intel @andykaylor I think I resolved/answered all of your comments, please let me know if there's anything else to do. :)

RalfJung changed the title from "LangRef: state explicitly that floats generally behave according to IEEE-754" to "[IR] LangRef: state explicitly that floats generally behave according to IEEE-754" on Oct 11, 2024
RalfJung (author):

@nikic @arsenm maybe you could take a look? I am not sure who'd be a good reviewer for this. :)

:ref:`floating-point environment section <floatenv>` regarding flags and
exceptions.)

Various flags and attributes can alter the behavior of these operations and thus
Contributor:

and metadata (e.g. !fpmath)

RalfJung (author):

Ah, I didn't know about that one, thanks. I added a mention, and also used the opportunity to add links for strictfp and denormal-fp-math.
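For reference, a sketch of the !fpmath metadata mentioned above (as I understand it from the LangRef): it grants a per-operation accuracy allowance.

```llvm
; The optimizer/backend may return a result within 2.5 ULPs of the correctly
; rounded quotient for this particular fdiv.
define float @approx_div(float %a, float %b) {
  %r = fdiv float %a, %b, !fpmath !0
  ret float %r
}

!0 = !{float 2.5}
```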


This means that optimizations and backends may not change the observed bitwise
result of these operations in any way (unless NaNs are returned), and frontends
can rely on these operations providing perfectly rounded results as described in
Contributor:

The term is usually "correctly rounded", not "perfectly rounded".

RalfJung (author):

Ah, fair. It should be clear from context who is in charge of defining "correct" here (namely, IEEE-754).

I am adding these edits as new commits so it's easy to see what changed; I can squash them later or now if you prefer.

arsenm (Contributor) left a comment:

I'm not 100% sure the correctly rounded bit applies to the default sqrt expansion on all targets, but I think we should consider that a bug if it's not.

RalfJung (author):

@arsenm thanks for the review! I don't have merge rights, so may I ask you to handle that? :)

arsenm merged commit a8a6624 into llvm:main on Oct 11, 2024
6 of 8 checks passed
DanielCChen pushed a commit to DanielCChen/llvm-project that referenced this pull request on Oct 16, 2024
bricknerb pushed a commit to bricknerb/llvm-project that referenced this pull request on Oct 17, 2024