[IR] LangRef: state explicitly that floats generally behave according to IEEE-754 #102140
@llvm/pr-subscribers-llvm-ir
Author: Ralf Jung (RalfJung)
Changes
Fixes #60942: IEEE semantics is likely what many frontends want (it definitely is what Rust wants), and it is what LLVM passes already assume when they use APFloat to propagate float operations.
This does not reflect what happens on x87, but what happens there is just plain unsound (#89885, #44218); there is no coherent specification that will describe this behavior correctly -- the backend in combination with standard LLVM passes is just fundamentally buggy in a hard-to-fix way.
There are also the questions around flushing subnormals to zero, but this discussion seems to indicate that the expectation here is: this is specific non-standard hardware behavior, and LLVM generally needs to be told that basic float ops do not return the standard result. Just naively running LLVM-compiled code on FTZ hardware will lead to #89885-like issues.
AFAIK this is also what Alive2 implements (@nunoplopes please correct me if I am wrong).
@nikic @arsenm @jcranmer-intel what do you think is the best way to word this?
Full diff: https://github.com/llvm/llvm-project/pull/102140.diff
1 Files Affected:
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index b17e3c828ed3d..7fc4e9385a1e1 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -3577,6 +3577,12 @@ seq_cst total orderings of other operations that are not marked
Floating-Point Environment
--------------------------
+Unless noted otherwise, LLVM works with IEEE 754 floating-point semantics: LLVM
+backends assume that the CPU is configured to provide IEEE-compatible behavior,
+and LLVM frontends can assume that LLVM IR floating-point operations behave
+according to the IEEE specification (with an :ref:`exception around NaN values
+<floatnan>`).
+
The default LLVM floating-point environment assumes that traps are disabled and
status flags are not observable. Therefore, floating-point math operations do
not have side effects and may be speculated freely. Results assume the
llvm/docs/LangRef.rst
Outdated
Unless noted otherwise, LLVM works with IEEE 754 floating-point semantics: LLVM
backends assume that the CPU is configured to provide IEEE-compatible behavior,
and LLVM frontends can assume that LLVM IR floating-point operations behave
according to the IEEE specification (with an :ref:`exception around NaN values
<floatnan>`).
hyphenate IEEE-754? This is also worded a bit broadly, and seems to contradict some of the text later in the section.
Ideally the LangRef would not be in terms of "CPUs" or "configurations".
Maybe should specify the bitlayouts are assumed to be the IEEE pattern?
> This is also worded a bit broadly, and seems to contradict some of the text later in the section.
I assume you are talking about the exceptions related to NaN results? That's why it starts with "unless noted otherwise". I'm open to better wordings here if you have any suggestions.
> Maybe should specify the bitlayouts are assumed to be the IEEE pattern?
It already says that.
The goal of this PR is to say that `fadd` et al will use IEEE-754 rules to compute the desired results. That's a much stronger statement than just saying that the bit patterns are interpreted as per IEEE-754.
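The difference between the two statements can be sketched in a few lines of Python. This is only an illustration, not LLVM code: the helper names `f32_bits` and `decode_f32` are hypothetical, and the sketch assumes the host's `struct` module rounds to binary32 with round-to-nearest-even (the default on mainstream platforms). The layout rule says how 32 bits are read as a number; only the arithmetic rule says which of two neighbouring bit patterns an addition must return.

```python
import struct

def f32_bits(x: float) -> int:
    """Bit pattern of x after rounding to IEEE-754 binary32 (hypothetical helper)."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def decode_f32(bits: int) -> tuple:
    """Split a binary32 bit pattern into (sign, biased exponent, fraction)."""
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF

# Layout only: how the 32 bits of 1.0 are interpreted.
layout_of_one = decode_f32(f32_bits(1.0))

# Semantics: the exact sum 1 + 2**-24 lies halfway between two binary32
# values. Both neighbours are valid bit patterns; the IEEE rounding rule
# (round to nearest, ties to even) is what selects 1.0.
exact_sum = 1.0 + 2.0 ** -24          # exactly representable in binary64
fadd_result_bits = f32_bits(exact_sum)
```

A statement that only fixes the bit layout would permit either neighbour as the result; the stronger statement pins down one of them.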
backends assume that the CPU is configured to provide IEEE-compatible behavior,
and LLVM frontends can assume that LLVM IR floating-point operations behave
according to the IEEE specification (with an :ref:`exception around NaN values
<floatnan>`).
Also the denormal exception
What is the denormal exception? Is this about what happens when `denormal-fp-math` is set, but the default is to be IEEE-compatible?
Given that IEEE says that denormals are not flushed and LLVM assumes the same by default, I don't think this is an exception from "IR float ops behave according to IEEE".
I'm broadly supportive of the idea to bring greater clarity here; I'm not entirely certain this is the best way to do it. IEEE 754 just isn't quite precise enough to say that "it's just IEEE 754 behavior!" answers the questions that people are trying to ask. Also, we have three non-IEEE 754 types: What I think should be explicitly mentioned are these:
(Denormal flushing is cursed for reasons beyond the scope of this PR, and I don't think the
It answers some questions. Like, currently the LangRef does not say what the result of I will restrict the new part to only apply to those types, not all float types.
To me these are both direct consequences of just saying unambiguously what the result is -- if the result can only be this, any optimization that changes the result is obviously wrong. But I can add clarifications to make sure this point is clear.
I have attempted to clarify these concerns, please take another look.
llvm/docs/LangRef.rst
Outdated
This means that optimizations and backends cannot change the precision of these
operations (unless there are fast-math flags), and frontends can rely on these
operations deterministically providing perfectly rounded results as described
in the standard (except when a NaN is returned).
I would add "This also means that backends are not allowed to implement floating-point instructions using larger floating-point types unless they take care to consistently narrow the results back to the original range without inducing double-rounding." or some similar text that makes it clear that mapping `fadd float` via just an x87 `FADD` instruction is not a legal lowering.
IMO that is covered by "backends cannot change the precision of these operations". If we start listing all the consequences of that statement, we'll never be done...
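The double-rounding hazard mentioned above can be made concrete with exact rational arithmetic. The following is a minimal sketch, not a model of any particular backend: `round_to_precision` is a hypothetical helper that rounds a positive value to `p` significant bits with ties to even and ignores exponent range. It uses `double`-sized operands (53-bit significand) narrowed from the 64-bit x87 extended significand, an assumption made so that the effect shows up with a single addition.

```python
from fractions import Fraction

def round_to_precision(x: Fraction, p: int) -> Fraction:
    """Round positive x to p significant bits, ties to even (unbounded exponent)."""
    e = x.numerator.bit_length() - x.denominator.bit_length()
    if Fraction(2) ** e > x:
        e -= 1                        # now 2**e <= x < 2**(e + 1)
    ulp = Fraction(2) ** (e - p + 1)  # spacing of p-bit values near x
    return round(x / ulp) * ulp       # round() on a Fraction rounds half to even

a = Fraction(1)
b = Fraction(2) ** -53 + Fraction(2) ** -70   # representable in binary64
exact = a + b

# One rounding step, as IEEE-754 prescribes for a binary64 addition.
direct = round_to_precision(exact, 53)

# Two rounding steps: first to a 64-bit significand, then narrowed to 53 bits.
via_extended = round_to_precision(round_to_precision(exact, 64), 53)

print(direct == 1 + Fraction(2) ** -52)   # True
print(via_extended == 1)                  # True
```

The first rounding discards the low bit that would have broken the tie, so the narrowing step lands on a different value than the single correctly rounded operation.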
llvm/docs/LangRef.rst
Outdated
For types that do correspond to an IEEE format, LLVM IR float operations behave
like the corresponding operations in IEEE-754, with two exceptions: LLVM makes
:ref:`specific assumptions about the state of the floating-point environment
<floatenv>` and it implements :ref:`different rules for operations that return
NaN values <floatnan>`.
I'm struggling to come up with good wordsmithing here. It's not the case that there are two exceptions to IEEE-754 here. As @arsenm notes, we don't necessarily agree with IEEE 754 on denormal values. Also, we (rather explicitly) don't guarantee the side-effect on flags or exceptions of IEEE 754 semantics. The more we try to fine-tune the text to make "it's exactly IEEE 754 BUT", the more I think we obscure what the actual thing we're trying to state is.
This is a quick attempt I have of a direction that makes more sense to me:
Unless otherwise specified, floating-point instructions return the same value [i.e., we make no implication about flags or exceptions here, although this may be subtle] as the corresponding IEEE 754 operation for their type when executing in the default floating-point environment [link], except that the behavior of NaN values is instead as specified here [link]. In particular, a floating-point instruction lacking fast-math flags operating on normal floating-point values is guaranteed to always return the same bit-identical result on all machines. When executing in a non-default floating-point environment, the result is typically undefined behavior unless constrained intrinsics [link] are used.
(This still doesn't cover denormal behavior, because that's non-default floating-point environment, but I'm not sure I have a firm grip on what the denormal guarantees we want to make in the first place are!)
I gave it a shot, based on your suggestion but structuring things a bit differently.
If the compiled code is executed in a non-default floating-point environment
(this includes non-standard behavior such as subnormal flushing), the result is
typically undefined behavior unless attributes like ``strictfp`` and
``denormal-fp-math`` or :ref:`constrained intrinsics <constrainedfp>` are used.
This paragraph is basically an exact duplicate of the second paragraph in the `floatenv` section, so I am inclined to remove it... but your draft did include such a sentence.
The way I view it, the `floatsem` section is just about the IEEE float formats. This paragraph is true for all formats so it should be in the `floatenv` section.
the returned *value*; we make no statement about status flags or
traps/exceptions.) In particular, a floating-point instruction returning a
non-NaN value is guaranteed to always return the same bit-identical result on
all machines and optimization levels.
Do we need to specify "all machines that support IEEE-754 arithmetic"? I don't know if we support any targets that don't support IEEE-754, but it seems like there should be some provision for that. The C standard, for instance, talks about some transformations that are legal on "IEC 60559 machines."
Or are we saying that architectures that don't support IEEE-754 should indicate the differences in the IR or use a different type?
Right now, LLVM assumes that all backends implement IEEE-754 arithmetic, and will miscompile code if the backend doesn't do that. One example of a target that does not implement IEEE-754 arithmetic is x86 without SSE, and #89885 has examples of code that gets miscompiled due to that.
The point of this PR is to make that more explicit. If instead the goal is to make LLVM work with backends and targets that do not implement IEEE-754 arithmetic, that will require changes to optimization passes.
We're already at the point where we expect `float` et al to correspond to the IEEE 754 binary32 et al formats. (This is documented, although somewhat subtly, by the current LangRef). There is also agreement at this point that excess precision (à la x87) is not correct behavior for LLVM IR, although it's not (yet) explicitly documented in the LangRef.
The only hardware deviation from IEEE 754 that we're prepared to accept at this point is denormal handling. I'm reluctant to offer too many guarantees on denormal handling because I'm not up to speed on the diversity of common FP hardware with respect to denormals, but I'm pretty sure there is hardware in use that mandates denormal flushing (e.g., the AVX512-BF16 stuff is unconditionally default RM+DAZ+FTZ, with changing MXCSR having no effect).
In short, we already require that hardware supporting LLVM be IEEE 754-ish; this is tightening up the definition in the LangRef to cover what we already agree to be the case. In the putative future that we start talking about cases where `float` et al are truly non-IEEE 754 types (say, Alpha machines, or perhaps posits will make it big), then we can talk about how to add support for them in LLVM IR (which, given the history of LLVM, probably means "add new types", not "float means something different depending on target triple").
> The only hardware deviation from IEEE 754 that we're prepared to accept at this point is denormal handling.
Even there, the pass that causes trouble in #89885 would lead to miscompilations. `Analysis/ScalarEvolution` will assume that float ops that don't return NaNs produce a given bit pattern (including denormals), and if codegen later generates code that produces a different bit pattern, the result is a miscompilation. If we don't accept "always return the same bit-identical result on all machines", then this pass (and possibly others) has to be changed.
So non-standard denormal handling is only supported with an explicit marker, which works very similarly to the markers required for non-default FP exception handling.
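The mismatch described here can be sketched as follows. This is an illustrative model only: `flush_to_zero` and `ftz_fmul` are hypothetical stand-ins for hardware that flushes subnormal results, and `ieee_fmul` stands for whatever a compile-time evaluator following IEEE rules would compute. The sketch assumes the host running it does not itself flush subnormals.

```python
import math
import sys

SMALLEST_NORMAL = sys.float_info.min   # 2**-1022 for binary64

def flush_to_zero(x: float) -> float:
    """Hypothetical model of FTZ hardware: subnormal results become signed zero."""
    if x != 0.0 and abs(x) < SMALLEST_NORMAL:
        return math.copysign(0.0, x)
    return x

def ieee_fmul(a: float, b: float) -> float:
    """What a compile-time evaluation under IEEE rules produces."""
    return a * b

def ftz_fmul(a: float, b: float) -> float:
    """What the same multiply produces on the modelled FTZ hardware."""
    return flush_to_zero(a * b)

# The optimizer decides a branch condition at compile time...
folded_condition = ieee_fmul(SMALLEST_NORMAL, 0.5) != 0.0
# ...while the instruction that was left in the program runs on FTZ hardware.
runtime_condition = ftz_fmul(SMALLEST_NORMAL, 0.5) != 0.0

print(folded_condition, runtime_condition)   # True False
```

When part of a program has been evaluated under one rule and the rest executes under the other, the two halves disagree about the same value, which is the shape of the miscompilations discussed in this thread.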
is instead :ref:`as specified here <floatnan>`. (This statement concerns only
the returned *value*; we make no statement about status flags or
traps/exceptions.) In particular, a floating-point instruction returning a
non-NaN value is guaranteed to always return the same bit-identical result on
What do you mean by "floating-point instruction" here? Is sqrt included?
I understand that the main point here is to say that without further IR constructs an instruction like fdiv is assumed to be correctly rounded. IEEE-754 also assumes this of sqrt. I believe the latest version specifies that other math functions should also return correctly rounded results. That's why I think it needs to be explicit here which ones you mean.
I meant all the operations that have an equivalent IEEE-754 operation. So yes that would include sqrt, though I was under the impression that it does not include transcendental functions.
I am not sure what is the best way to say that. Having a list seems awkward? Should each such operation have a comment, like "This corresponds to <op> in IEEE-754, so if the argument is an IEEE float format then the :ref:`floating-point semantics <floatsem>` guarantees apply."?
This is something that is hard to come up with a good term for. IEEE 754 has a core list of operations in section 5 which is a good starting point, but these omit the minimum/maximum operations (which are section 9.6). Section 9 is "recommended operations", and 9.2 is the main list of transcendental functions you're thinking of; IEEE 754 requires that they be correctly rounded, but C explicitly disclaims that requirement in Annex F. There's also a few functions in C that aren't in IEEE 754, notably ldexp and frexp.
(Note too that it was recently brought up in the Discourse forums that the libm intrinsics are meant to correspond to libm semantics, not IEEE 754 semantics.)
minimum/maximum don't do any rounding, and already seem to unambiguously describe their semantics in the existing docs, making this clarification much less relevant. So maybe we should just say that this is about the core operations listed in section 5?
d5136f7 to 51c3ac7
…d floatnan section; clarify which operations it applies to
51c3ac7 to 7d4deb8
@jcranmer-intel @andykaylor I think I resolved/answered all of your comments, please let me know if there's anything else to do. :)
:ref:`floating-point environment section <floatenv>` regarding flags and
exceptions.)

Various flags and attributes can alter the behavior of these operations and thus
and metadata (e.g. !fpmath)
Ah, I didn't know about that one, thanks. I added a mention, and also used the opportunity to add links for `strictfp` and `denormal-fp-math`.
This means that optimizations and backends may not change the observed bitwise
result of these operations in any way (unless NaNs are returned), and frontends
can rely on these operations providing perfectly rounded results as described in
Term is usually "correctly rounded" not "perfectly rounded"
Ah, fair. It should be clear from context who is in charge of defining "correct" here (namely, IEEE-754).
I am adding these edits as new commits so it's easy to see what changed; I can squash them later or now if you prefer.
I'm not 100% sure the correctly rounded bit applies to the default sqrt expansion on all targets, but I think we should consider that a bug if it's not.
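For readers unfamiliar with the term, "correctly rounded" means that the returned value equals the exact mathematical result rounded once to the destination format. A rough way to check this for the basic operations is to compare against exact rational arithmetic. The sketch below is an assumption-laden illustration, not part of the PR: `is_correctly_rounded` is a hypothetical helper, the sample operands are arbitrary, and it relies on CPython converting a `Fraction` to `float` with round-to-nearest-even.

```python
from fractions import Fraction

def is_correctly_rounded(result: float, exact: Fraction) -> bool:
    """True if result is the binary64 value nearest to exact (ties to even)."""
    return result == float(exact)

# Arbitrary sample operands; Fraction(x) converts a float exactly.
pairs = [(0.1, 0.2), (1e16, 1.0), (3.0, 7.0)]

checks = {
    "fadd": all(is_correctly_rounded(a + b, Fraction(a) + Fraction(b)) for a, b in pairs),
    "fmul": all(is_correctly_rounded(a * b, Fraction(a) * Fraction(b)) for a, b in pairs),
    "fdiv": all(is_correctly_rounded(a / b, Fraction(a) / Fraction(b)) for a, b in pairs),
}
print(checks)
```

A check like this only samples a few inputs, so it can expose a rounding bug but cannot prove its absence; for an operation such as sqrt, a similar comparison would need an exact bound on the square root rather than a rational result.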
@arsenm thanks for the review! I don't have merge rights, so may I ask you to handle that? :)
… to IEEE-754 (llvm#102140) Fixes llvm#60942: IEEE semantics is likely what many frontends want (it definitely is what Rust wants), and it is what LLVM passes already assume when they use APFloat to propagate float operations. This does not reflect what happens on x87, but what happens there is just plain unsound (llvm#89885, llvm#44218); there is no coherent specification that will describe this behavior correctly -- the backend in combination with standard LLVM passes is just fundamentally buggy in a hard-to-fix-way. There's also the questions around flushing subnormals to zero, but [this discussion](https://discourse.llvm.org/t/questions-about-llvm-canonicalize/79378) seems to indicate a general stance of: this is specific non-standard hardware behavior, and generally needs LLVM to be told that basic float ops do not return the standard result. Just naively running LLVM-compiled code on hardware configured to flush subnormals will lead to llvm#89885-like issues. AFAIK this is also what Alive2 implements (@nunoplopes please correct me if I am wrong).