i686 floating point behavior does not agree with unit tests in debug mode #73288
This is probably caused by x87 instructions... I think we have an issue about that somewhere already, not sure where though. |
Indeed. After the mask, the value in
The call to When the return value is popped off the FPU stack in
|
Independently of x87, the intent behind the test is on shaky ground. LLVM optimizations do not generally preserve NaN bit strings, see e.g. #55131. |
Seems like we should update the docs for
and
|
The first part isn't wrong, as far as I can tell. It literally is implemented as transmute; it's just that handling float values in general does not always "preserve the exact bits". |
    fn main() {
        let snan: u32 = 0xff << 23 | 1;
        let round_trip = unsafe {
            let float: f32 = std::mem::transmute(snan);
            std::mem::transmute(float)
        };
        assert_eq!(snan, round_trip);
        // assert_eq!(f32::from_bits(snan).to_bits(), snan);
    }

@hanna-kruppe This succeeds in debug mode unless the last line is uncommented, so the docs are misleading at the very least. |
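
As a rough illustration of why the commented-out assertion can fail, here is an integer-only sketch of what quieting a signaling NaN does to its bit pattern. It assumes the x87 return path is what quiets the value, as suggested earlier in the thread; `quieted` and `F32_QUIET_BIT` are hypothetical names for illustration, not anything in `std`.

```rust
/// Bit 22 is the most significant mantissa bit of an `f32`. On x86 a NaN with
/// this bit set is quiet, and a NaN with it clear is signaling.
const F32_QUIET_BIT: u32 = 1 << 22;

/// Hypothetical helper: models, on integers only, what quieting a NaN does
/// to its bit pattern. It never touches a float register.
fn quieted(bits: u32) -> u32 {
    bits | F32_QUIET_BIT
}

fn main() {
    // Same signaling NaN pattern as in the snippet above.
    let snan: u32 = 0xff << 23 | 1;
    assert_eq!(snan, 0x7f80_0001);

    // After quieting, the payload bit is still there but the pattern differs,
    // so a bit-exact comparison against the original fails.
    assert_eq!(quieted(snan), 0x7fc0_0001);
    assert_ne!(snan, quieted(snan));

    println!("{:#010x} -> {:#010x}", snan, quieted(snan));
}
```

If this model is right, any test that compares the bits of a signaling NaN after it has been returned from a function by value can fail on the affected target.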
I believe that difference is due to the
It's not that the functions in std do something different from |
If you mean MIR const propagated, it does not. Not sure about LLVM const propagation, but adding I'm not saying that this area isn't a mess, but the docs |
@Mark-Simulacrum Besides the unresolved problems around NaN payloads, this raises another issue: Should we be running unit tests for the standard library in debug mode on CI? As far as I know, we don't distribute a non-optimized |
I am fairly certain (hopeful, I guess) that most of our builders do have debug assertions enabled... however, it sounds like what you actually want here is a debug libstd. I don't know how feasible that is from a performance perspective (I imagine running those tests is considerably slower?). We do actually have an i686-gnu-nopt builder, and I think some other nopt builders (https://github.com/rust-lang/rust/blob/master/src/ci/docker/i686-gnu-nopt/Dockerfile#L23). That's currently configured to not enable optimizations for tests, but that only affects compiletest-run tests. For tests like this one, which depend heavily on the optimization level, we can move them to run-pass tests to get the benefits of debug testing. Note that debug assertions are currently disabled on that builder due to CI time concerns. |
Basically, as with all such questions, it comes down to someone sitting down and timing how long a debug build of std/std tests + running those tests would take, then timing some current task (e.g. release std tests) and scaling that to CI time. If that's sufficiently small in CI timescales, we can add it, if not we'll have to not do so. |
In terms of the bug here, should we have a proper meta-bug (maybe even labelled I-unsound) about LLVM FP semantics around NaN payloads being a mess? This is not immediately actionable, but it is certainly something we should track in a single place. Or should we convert either this or #55131 into that meta-bug? Cc @rust-lang/lang -- the short version is, looking at the bits of NaNs in LLVM is broken. :/ |
@RalfJung I went ahead and opened #73328 to track documenting our guarantees around NaN. I think we can close this issue once we remove the tests that are relying on unspecified behavior. I can look into running |
This is a duplicate of #46948. |
Don't our existing debug CI builders already do that? EDIT: Ah I see, the point is that we do it for ui tests but not unit tests. |
So which of these would you prefer to close? |
From reading the thread, it sounds like the bits of a NaN can basically change arbitrarily, as a result of LLVM optimizations that are not guaranteed to preserve them? (And other things, I suppose) |
…r=Mark-Simulacrum Run standard library unit tests without optimizations in `nopt` CI jobs

This was discussed in rust-lang#73288 as a way to catch similar issues in the future. This builds an unoptimized standard library with the bootstrap compiler and runs the unit tests. This takes about 2 minutes on my laptop. I confirmed that this method works locally, although there may be a better way of implementing it. It would be better to use the stage 2 compiler instead of the bootstrap one.

Notably, there are currently four `libstd` unit tests that fail in debug mode on `i686-unknown-linux-gnu` (a tier one target):

```
failures:
    f32::tests::test_float_bits_conv
    f32::tests::test_total_cmp
    f64::tests::test_float_bits_conv
    f64::tests::test_total_cmp
```

These are the tests that prompted rust-lang#73288 as well as the ones added in rust-lang#72568, which is currently broken due to rust-lang#73328.
I always thought the x86-specific problems that arise due to x87 are side-stepped when SSE2 is enabled. But now @ecstatic-morse points out that this bug still remains. How is that possible, isn't SSE2 replacing the x87 mess? |
So I guess that also alleviates any question about whether LLVM could adjust codegen to not have this problem. But this also points to a way out, doesn't it? If we adopt a model where we say something like "floating-point operations may change the NaN bits in a non-deterministic way" (which is what wasm does so arguably LLVM has to support it), then we "just" have to also say that as a special exception, on x86-32, returning floating-point values from a function also is considered a "floating-point operation". (Does this also affect arguments passed to a function?) |
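
Under a model like the one proposed here, tests would have to stop comparing NaN bit patterns. A minimal sketch of such a comparison follows; `same_modulo_nan_payload` is a hypothetical helper written for this illustration, not an existing `std` or test-suite function.

```rust
/// Hypothetical test helper: two floats count as "the same" if both are NaN
/// (regardless of sign, payload, or quiet/signaling status), or if neither is
/// NaN and their bit patterns are identical.
fn same_modulo_nan_payload(a: f32, b: f32) -> bool {
    if a.is_nan() || b.is_nan() {
        a.is_nan() && b.is_nan()
    } else {
        a.to_bits() == b.to_bits()
    }
}

fn main() {
    let nan_a = f32::from_bits(0x7f80_0001);
    let nan_b = f32::NAN;

    // Any two NaNs compare as the same, even if one was quieted on the way.
    assert!(same_modulo_nan_payload(nan_a, nan_b));
    // Non-NaN values are still compared bit for bit.
    assert!(same_modulo_nan_payload(1.5, 1.5));
    assert!(!same_modulo_nan_payload(0.0, -0.0));
    // A NaN never matches a non-NaN.
    assert!(!same_modulo_nan_payload(nan_a, 1.0));
}
```

A check written this way holds whether or not the NaN was quieted in transit, because a quieted NaN is still a NaN.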
The x87 stack is only used for the return value, not for passing arguments. That formulation would be correct on i686 AFAIK. An alternative would be to use a different calling convention for |
the above is true with some important exceptions: |
The wasm spec doesn't guarantee anything like that, does it? We cannot guarantee more than our compilation targets do. Or maybe that's what the "arithmetic NaN" stuff is about, which I didn't really understand. |
arithmetic NaN is another name for quiet NaN on all but a few unusual platforms: HPPA and MIPS pre-2008. See the definitions for |
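
For reference, the quiet/signaling distinction can be read off the bit pattern without touching a float register. The sketch below assumes the common convention (x86 and IEEE 754-2008) where a set most-significant mantissa bit means quiet; as noted above, HPPA and pre-2008 MIPS use the opposite convention. `NanKind` and `classify_f32_bits` are hypothetical names.

```rust
#[derive(Debug, PartialEq)]
enum NanKind {
    NotNan,
    Quiet,
    Signaling,
}

/// Hypothetical helper: classifies an `f32` bit pattern using integer
/// operations only, assuming "quiet bit set means quiet NaN".
fn classify_f32_bits(bits: u32) -> NanKind {
    const EXP_MASK: u32 = 0x7f80_0000; // all exponent bits
    const MANT_MASK: u32 = 0x007f_ffff; // all mantissa bits
    const QUIET_BIT: u32 = 0x0040_0000; // most significant mantissa bit

    // A NaN has an all-ones exponent and a non-zero mantissa.
    if (bits & EXP_MASK) != EXP_MASK || (bits & MANT_MASK) == 0 {
        NanKind::NotNan
    } else if (bits & QUIET_BIT) != 0 {
        NanKind::Quiet
    } else {
        NanKind::Signaling
    }
}

fn main() {
    assert_eq!(classify_f32_bits(0x7f80_0001), NanKind::Signaling);
    assert_eq!(classify_f32_bits(0x7fc0_0000), NanKind::Quiet);
    // Infinity has an all-ones exponent but a zero mantissa.
    assert_eq!(classify_f32_bits(0x7f80_0000), NanKind::NotNan);
    assert_eq!(classify_f32_bits(1.0f32.to_bits()), NanKind::NotNan);
}
```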
Is it possible for us to adjust the Rust float calling convention for the pentium4-* ("i686") targets so that they always use the xmm registers for Rust code (essentially requiring they follow the x86-64 ABI conventions), never passing through the x87 stack-registers (unless a value is given to us from outside of Rust, which is of course not something we can meaningfully control)? |
It should be possible for the rust abi, though I don't know how to do this with LLVM. For the C abi I think the only option that doesn't break compatibility with existing C compilers would be to roundtrip between x87 and xmm registers to remove the excess precision. |
I think that would be fine. If we implemented that strategy, that would resolve this bug for Rust for the i686 target, and any other unusual behavior would be functionally Not A Bug: if you want to be sure you are actually moving 64 bits across an FFI barrier, then you can pass it as a u64 and remake it as a float on the other side, otherwise we should follow whatever idiosyncratic behavior a given ABI mandates. About the previous #73288 (comment) in this thread, I am actually somewhat skeptical as to how much tooling it would break (not that I think it would be zero) as opposed to how many issues would be closed (which would be at least one). For instance, any debuggers should be looking at all the registers, not just the x87 stack, and MSVC actually sets the x87 stack to a different precision than other C compilers, apparently, so we should be avoiding it in general if we want portable results: https://randomascii.wordpress.com/2012/03/21/intermediate-floating-point-precision/ |
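
The "pass it as a u64" suggestion can be sketched as follows. `negate_bits` is a hypothetical stand-in for a foreign function, defined in Rust here so the example is self-contained; the point is only that the value crosses the boundary as an integer.

```rust
/// Hypothetical stand-in for a function on the other side of an FFI boundary.
/// It takes and returns raw bits, so no float is passed or returned by value.
extern "C" fn negate_bits(bits: u64) -> u64 {
    // Flip the IEEE 754 sign bit using integer operations only.
    bits ^ (1u64 << 63)
}

fn main() {
    // Ordinary value: convert to bits, cross the boundary, rebuild the float.
    let x: f64 = 1.5;
    let y = f64::from_bits(negate_bits(x.to_bits()));
    assert_eq!(y, -1.5);

    // A signaling NaN payload survives as long as it stays in integer form.
    let snan: u64 = 0x7ff0_0000_0000_0001;
    assert_eq!(negate_bits(snan), 0xfff0_0000_0000_0001);
}
```

One caveat, going by the earlier comments in this thread: on the affected targets the final `from_bits` call is itself a function that returns a float, so code that cares about the exact NaN payload may want to stay in integer form for as long as possible.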
apparently the x87 stack can be avoided by using the
|
Belatedly: if that is the case, then that answers why this changes! Our functions wind up annotated with |
I propose that we close this issue by documenting this as a known non-compliance on x86-32 targets: #113053. |
Closing in favor of a dedicated tracking issue that summarizes what we know about the issue: #115567. |
…bilee add notes about non-compliant FP behavior on 32bit x86 targets Based on ton of prior discussion (see all the issues linked from rust-lang/unsafe-code-guidelines#237), the consensus seems to be that these targets are simply cursed and we cannot implement the desired semantics for them. I hope I properly understood what exactly the extent of the curse is here, let's make sure people with more in-depth FP knowledge take a close look! In particular for the tier 3 targets I have no clue which target is affected by which particular variant of the x86_32 FP curse. I assumed that `i686` meant SSE is used so the "floating point return value" is the only problem, while everything lower (`i586`, `i386`) meant x87 is used. I opened rust-lang#114479 to concisely describe and track the issue. Cc `@workingjubilee` `@thomcc` `@chorman0773` `@rust-lang/opsem` Fixes rust-lang#73288 Fixes rust-lang#72327
Rollup merge of rust-lang#113053 - RalfJung:x86_32-float, r=workingjubilee add notes about non-compliant FP behavior on 32bit x86 targets
Compiling and running the following code with `-C opt-level=0` for 32-bit Linux on a recent nightly (2020-06-07) results in a failing assertion.

rustc +nightly --target=i686-unknown-linux-gnu test.rs && ./test

Output:

However, this code is taken directly from a unit test for `libstd` added in #46012, so presumably it should succeed.

rust/src/libstd/f32.rs
Lines 1508 to 1514 in 5949391