
Must a const fn behave exactly the same at runtime as at compile-time? #77745

Closed

oli-obk opened this issue Oct 9, 2020 · 85 comments
Labels
A-const-eval - Area: Constant evaluation (MIR interpretation)
A-const-fn - Area: const fn foo(..) {..}. Pure functions which can be applied at compile time.
A-floating-point - Area: Floating point numbers and arithmetic
C-discussion - Category: Discussion or questions that doesn't represent real issues.
T-lang - Relevant to the language team, which will review and decide on the PR/issue.

Comments

oli-obk (Contributor) commented Oct 9, 2020

TLDR: should we allow floating point types in const fn?

Basically the question is whether the following const fn

const fn foo(a: f32, b: f32) -> f32 {
    a / b
}

must yield the same result for the same arguments whether it is invoked at runtime or at compile time:

const RES1: f32 = foo(0.0, 0.0);

fn main() {
  let res2: f32 = foo(0.0, 0.0);
  assert_eq!(RES1.to_bits(), res2.to_bits());
}

(Note that 1.0 / 0.0 is just infinity; foo(0.0, 0.0) produces a NaN.) Depending on the platform's NaN behavior, the bit pattern of the result may differ between runtime and compile-time execution of foo(0.0, 0.0). Compile-time execution is performed by the Rust port of apfloat (a soft-float implementation); runtime behavior depends on the actual NaN bit patterns produced by the hardware, which are not fully determined by the IEEE 754 specification.

Note that this is entirely independent of any optimizations; we are discussing the relationship between code that the user explicitly requests to be executed at compile time and regular run-time code. Optimizations apply to all code equally and treat fn and const fn the same, so the question of how floating-point operations can be optimized is entirely separate from, and off-topic for, this issue.
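
For a concrete illustration (observed behavior, not a guarantee; assumes a toolchain where to_bits is usable in const context): soft-float evaluation typically yields the positive canonical quiet NaN, while e.g. the x86-64 default quiet NaN has its sign bit set, so the two values below can have different bit patterns.

const CT_NAN: u32 = (0.0_f32 / 0.0_f32).to_bits(); // evaluated in software at compile time

fn main() {
    let zero = std::hint::black_box(0.0_f32); // keep the division out of const folding
    let rt_nan = (zero / zero).to_bits();     // evaluated by the target's FPU
    println!("compile-time: {CT_NAN:#010x}, runtime: {rt_nan:#010x}");
}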

cc @rust-lang/wg-const-eval

oli-obk added the A-const-fn, A-const-eval, and C-discussion labels Oct 9, 2020
RalfJung (Member) commented Oct 9, 2020

(Replying to an older version of the OP)

I think you are asking two different questions here and treat them as if they are the same. Let me explain. :)

More concretely, when we optimize the following code, are we allowed to const propagate the foo call?

Note that this is a different question from the issue title. The behavior at runtime could be non-deterministic according to the spec, so even if you see one particular runtime behavior and a different compile-time behavior, it would still be correct to do const-propagation. Floating-point operations likely are non-deterministic.

I feel rather strongly that const fn must behave in a way that is allowed to occur at runtime; that is just a different way of saying that CTFE must implement the Rust spec. It would be rather strange if that was not the case. From this alone it already follows that const propagation like you are asking is allowed. I cannot see any reasonable way (assuming a bug-free CTFE engine) in which this optimization is not allowed.

But then there is the separate question, do we want to allow non-deterministic operations in const fn? CTFE inherently has to make some choice to resolve the non-determinism(*), and that choice might be different from what codegen+LLVM happen to currently do, which could be surprising for programmers that do not expect such non-determinism to actually be observable (even though it could, in theory, be observable even without any CTFE being involved). I think this is really the question you are asking here, but it is unrelated to const propagation. Unlike "is this optimization correct", this question cannot be answered by proving a theorem; this is a judgment call we could make either way as part of language design.

(*) Actually that is not entirely true -- allocation base addresses are another example of non-determinism, and there CTFE uses a form of symbolic execution to basically track all possible non-deterministic choices at once, and halt evaluation for cases where that is not possible. This is required because the non-deterministic choice made for a certain allocation must be consistent for a given Rust program across const-time and run-time execution. For runtime code that choice is only made when the program actually starts (thanks to ASLR) and thus CTFE has to be done in a way that is compatible with every possible choice made later. We do not need such a heavy hammer for floating-point operations because their non-deterministic choice is much more local, confined to each individual operation.

bugadani (Contributor) commented Oct 9, 2020

I wonder in what real-world use case somebody would want to do complex calculations with NaNs at compile time. I also wonder if the answer is the same if a compile-time calculation results in a NaN - is that something somebody actually wants, or does it hide an error?

est31 (Member) commented Oct 9, 2020

If NaN is the only concern, then the const fn implementation could just error out every time a NaN is encountered or obtained from a computation. I also worry about floating point implementations having minuscule differences even for well-defined floats, but I can't come up with an example. I see miri already gives an error when a dangling pointer is returned in a constant (requires nightly to trigger it), so it's doable.

jonas-schievink added the A-floating-point label Oct 9, 2020
oli-obk (Contributor, Author) commented Oct 9, 2020

So... if we have a const F: f32 = A + B; and at runtime a let f: f32 = A + black_box(B);, then F.to_bits() == f.to_bits() need not hold, but it must at least be possible for them to be equal for some execution of the runtime code (even if not observable in practice due to CPU bugs).

I also worry about floating point implementations having minuscle differences even for well-defined floats, but can't come up with an example.

I always thought that even non-NaN floating point math can differ (in minuscule ways) between hardware, even for the same target triple.

the const fn implementation could just error out every time a NaN is encountered/obtained from computation

That's one way, but it may be expensive. We specifically do not validate in const eval the way we do in miri, because that validation is expensive. Doing it for floats may be cheaper, but if we also have to look for floats in large structs or arrays, it can get expensive very quickly again. We could of course just check during each operation, which is much more direct and should only impact float ops. Though if non-deterministic operations can also produce non-NaN results, I don't think that helps us a lot.
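
A minimal sketch of what such a per-operation check could look like inside a CTFE engine (all names here are hypothetical, not rustc's actual interpreter API):

// Hypothetical engine helper: evaluate a binary float op in software, then
// refuse to continue if the result is one the target could disagree on.
enum CtfeError {
    NondeterministicFloatOp,
}

fn eval_f32_binop(op: fn(f32, f32) -> f32, a: f32, b: f32) -> Result<f32, CtfeError> {
    let result = op(a, b); // soft-float evaluation (rustc uses its apfloat port)
    if result.is_nan() {
        // NaN bit patterns are where targets can disagree.
        return Err(CtfeError::NondeterministicFloatOp);
    }
    Ok(result)
}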

I think we can just make a judgment call on how we make such non-determinism deterministic (by choosing one possibility), even if that choice changes between target platforms, compiler versions, optimization levels, or other compiler flags.

The main question, then, is how to make that choice. Just put it on the const-eval roadmap (which I should really, really finish) and have T-lang sign off on it?

RalfJung (Member) commented Oct 9, 2020

So... if we have a const F: f32 = A + B; and at runtime a let f: f32 = A + black_box(B);, then F.to_bits() == f.to_bits() need not hold, but it must at least be possible for them to be equal for some execution of the runtime code (even if not observable in practice due to CPU bugs).

No CPU bugs involved. If the spec says that non-deterministically, A or B can happen, then it is completely okay for runtime execution to always do A. Or to do A on Tuesdays and B every other day of the week. Or to do A only in crates whose name starts with a vowel. Or whatever. And CTFE can make its own choice of A or B completely independently on that.

So in your case, both f.to_bits() and F.to_bits() must be results that are allowed by the spec, but since the spec might allow multiple results, the two do not have to be the same. And that's really all we can say; there is no requirement that running the program many times must eventually produce all possible results or so. (This distinguishes non-determinism from randomness.)

I always thought that even non-NaN floating point math can differ (in miniscule ways) between hardware even if it's the same target triple.

AFAIK IEEE fully specifies what happens for primitive FP operations (except for NaN bits). Basically the operation has to return the best possible approximation to the result of the computation if it were carried out on actual rational numbers. (Transcendental functions are a different game.) The remaining differences here really are down to CPU bugs -- 32bit x86 is notorious here, and I hear some architectures handle subnormals (values very close to 0) incorrectly.
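
As a concrete instance of "best possible approximation" (round-to-nearest-even; the expected bit pattern below can be verified by hand):

fn main() {
    // 1/3 is not exactly representable; IEEE 754 requires the division to
    // return the representable f32 nearest to the exact quotient.
    let third = 1.0_f32 / 3.0_f32;
    assert_eq!(third.to_bits(), 0x3eaa_aaab); // the correctly rounded result
}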

See rust-lang/unsafe-code-guidelines#237 for some more open questions about "what even are our floating-point semantics".

That's one way, but it may be expensive. We specifically do not validate in const eval like we do in miri, because that validation is expensive. Doing it for floats may be cheaper, but if we also have to look at floats in large structs or arrays it can get expensive very quickly again. We could of course just check during each operation, which is much more direct and should only impact float ops.

If the latest analysis in #73328 is correct, then only floating-point operations are non-deterministic, but copying around FP values is deterministic. I do not think it would be expensive to check at each floating point addition etc whether the result is deterministic (basically, if the result is non-NaN, but we can easily add further conditions), and raise an error if it is not. This has nothing to do with checking the validity invariant.

I think that if nondeterministic operations include non-NaN numbers, then that doesn't help us a lot though.

Why that? I think it would work just as well. Unless of course all operations are non-deterministic, then this would be equivalent to ruling out FP entirely.

oli-obk (Contributor, Author) commented Oct 9, 2020

Why that? I think it would work just as well. Unless of course all operations are non-deterministic, then this would be equivalent to ruling out FP entirely.

What I mean is: if we error whenever any operation returns a NaN but still have other non-deterministic ops, then we haven't gained anything, and we should either keep ruling out FP ops or just allow NaNs and not check anything.

If the latest analysis in #73328 is correct, then only floating-point operations are non-deterministic, but copying around FP values is deterministic.

Oooh, neat. That is an improvement on the info I had when we created min_const_fn.

RalfJung (Member) commented Oct 9, 2020

What I mean is: if we error whenever any operation returns a NaN but still have other non-deterministic ops, then we haven't gained anything, and we should either keep ruling out FP ops or just allow NaNs and not check anything.

Sure, we'd have to capture every possible form of non-determinism. Also everything I say on this topic should be checked by an FP expert, which I am not.^^

Oooh, neat. That is an improvement on the info I had when we created min_const_fn.

Yes, new info came up since then. Basically I think we should just copy whatever WebAssembly does -- it has an exhaustive and precise spec, and I am sure they involved enough FP experts to make sure the spec is also realistically implementable. It also means that if LLVM does something else, we can complain that they are incompatible with WebAssembly, which gives our complaint more weight. ;)

jyn514 added the T-lang label Oct 9, 2020
ecstatic-morse (Contributor):

Transcendental functions (e.g. trigonometry and non-integral exponentiation) can give different results between platforms. I don't think we can take a "hard-line" approach, since then functions like f32::tan, which will take advantage of hardware support where it exists, could never be const fn; users would have to opt in to a software-emulated version (either in std or in the ecosystem). I view CTFE as just another platform on which floating point is supported. Users don't (or at least shouldn't) expect cross-platform consistency at runtime, so why is CTFE any different?

Even trying to promise that the result given by CTFE is consistent between compiler versions would be foolish, since it would lock us into a particular software emulation strategy that may become outdated. I think we should guarantee that the CTFE engine for a given compiler at a given optimization level is deterministic between runs, and nothing more. We might already have this for NaN payloads, although it would be nice to have an actual set of rules for how they get handled instead of "whatever LLVM does".

workingjubilee (Member):

If NaN is the only concern, then the const fn implementation could just error out every time a NaN is encountered/obtained from computation. I also worry about floating point implementations having minuscle differences even for well-defined floats, but can't come up with an example.
-- @est31

An example of what you may be thinking about is, for instance, ARMv7 Neon (the SIMD vector unit) flushing subnormal float values to zero. A floating-point operation in a register that flushes subnormal values to zero would potentially cause an alteration of the value even when the mantissa is not supposed to change (e.g. f32x4::abs). This is no longer a concern on ARMv8 Neon (usually "aarch64"), which will not do that if you don't ask it to, as far as I'm aware (and I am reasonably sure I would have found out by now).

Part of the reason that x86_64 is well-behaved while x86 is not is that, as @thomcc told me, on x86_64 the XMM registers (introduced as SIMD registers) are fully compliant, and compilers exhibit a preference for them even for "scalar" floating point ops. That may sound like a niche oddity, but I have been finding in my reading and experiments that "the floating point registers and SIMD registers are actually the same" is a fairly common hardware implementation approach... correct FP math is expensive in transistors to implement, so I suppose one saves some surface area on the die by not doing so twice.

In fact, this got me curious enough to feed some code into Godbolt...

pub fn f32_add(a: f32, b: f32) -> f32 {
    a + b
}

pub fn f32_abs(a: f32) -> f32 {
    a.abs()
}
; Compiled with rustc -C opt-level=3 -Cdebuginfo=0 -Ccodegen-units=1 --target armv7-unknown-linux-gnueabihf
example::f32_add:
        vadd.f32        s0, s0, s1
        bx      lr

example::f32_abs:
        vabs.f32        s0, s0
        bx      lr

Yes, those are vector instructions. I had developed a slightly more thorough platform comparison for x86 and Arm targets at the link (nothing particularly exciting, mind) but I excerpted that because it does seem that at least on some ARMv7 triples there is a similar default to using the (potentially buggy, as mentioned) vector unit for floating point math... a sound choice elsewhere, an implementation concern here.

There's more extensive documentation on all the nuances of Arm's various FP implementations spread across Arm's website but I found a relatively succinct explanation of some of the details on Debian's wiki in relation to their own support choices: https://wiki.debian.org/ArmHardFloatPort#Background_information

tavianator (Contributor):

This GCC bug is possibly related to this discussion: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93681. LLVM may have similar issues, I'm not sure.

The moral is that it's important that compile-time float evaluation can't lead the optimizer to make assumptions that may be contradicted at runtime.

thomcc (Member) commented Oct 10, 2020

This GCC bug is possibly related to this discussion: gcc.gnu.org/bugzilla/show_bug.cgi?id=93681.

That's different since it's constant propagation done by the compiler as an optimization, and not compile-time function execution like const fn. It might be challenging to fix for similar reasons though, I don't know.

A floating-point operation in a register that flushes subnormal values to zero would potentially cause an alteration of the value even when the mantissa is not supposed to change (e.g. f32x4::abs)

abs/neg shouldn't, according to https://www.keil.com/support/man/docs/armasm/armasm_pge1423647771863.htm. Generally, flush-to-zero implementations just treat subnormal numbers as non-canonical encodings of zero. (This is typically more efficient for them, since it makes abs/neg a bit mask.)

Even in the case that they're vabs/vneg, according to https://www.cl.cam.ac.uk/research/srg/han/ACS-P35/zynq/ARMv7-A-R-manual.pdf:

A8.8.280 VABS Vector Absolute takes the absolute value of each element in a vector, and places the results in a second vector. The floating-point version only clears the sign bit

A8.8.355 VNEG Vector Negate negates each element in a vector, and places the results in a second vector. The floating-point version only inverts the sign bit.

Anyway, this doesn't really matter a ton, since flush-to-zero is still a problem regardless of where it's happening.
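
For reference, a minimal sketch of why abs/neg are immune to flush-to-zero: they are pure sign-bit manipulations and never re-round the value (f32::abs and unary minus already exist, of course; the bit-twiddling versions below are just illustrative):

fn abs_via_bits(x: f32) -> f32 {
    f32::from_bits(x.to_bits() & 0x7fff_ffff) // clear the sign bit
}

fn neg_via_bits(x: f32) -> f32 {
    f32::from_bits(x.to_bits() ^ 0x8000_0000) // flip the sign bit
}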


@workingjubilee Just a couple of quick bits of elaboration to prevent future confusion. Sorry if you are already aware of these things.

ARMv7 has both "vector floating point" (VFP) operations and "Advanced SIMD" (aka NEON). The instructions in that godbolt you linked are VFP, not NEON. (AFAICT the VFP operations are mostly for scalar use.)

However: "Advanced SIMD" on these machines is (apparently — I didn't know this) never able to disable flush-to-zero. VFP operations (which are what scalar float operations tend to use) will flush or not based on a status register bit that user mode can change: the FZ bit.

By default, this bit is set, although it can be changed at a performance cost. I think it's plausible that we might want to investigate turning it off before main and making it UB to turn it back on, if this is a soundness problem (note that the related DN/denormals-are-zero bit should also be turned off).

Another option would be to have floats in const fns that target this platform emulate the flush-to-zero behavior...

I'll try and dig a bit more later. Neither of these options is great.


Basically I think we should just copy whatever WebAssembly does

For CTFE? Canonicalizing would probably be better than non-determinism. The cases where canonicalization is allowed to occur are exhaustively specified in IEEE-754, and the design rationale for it in Wasm indicates this wasn't done for performance reasons.

It's also worth noting that wasm, when it was designed, was very worried about exposing more fingerprinting bits to untrusted code. Rust doesn't have nearly the same set of requirements — as a result, anywhere without 100% consistency is non-deterministic.

Another concern here is that Wasm has no issue preventing people from e.g. changing rounding modes and such. Rust can't prevent this, so... ultimately it would be good to come to terms with the fact that we probably can't match target semantics 1:1 from const fn.

Hell, in Rust as it currently is you can change rounding modes and even enable flush-to-zero: https://doc.rust-lang.org/nightly/core/arch/x86_64/fn._mm_setcsr.html I've never been clear on whether this is actually sound, but the docs don't say there are any real problems...

workingjubilee (Member):

@thomcc Ahhh. I was aware of the ASIMD/Neon behavior having "always-on" FTZ... that was what I mentioned when we first discussed this a few days ago... and I had known about there being a difference between VFP and Neon! But I had indeed gotten confused in the middle of everything re: what I was looking at... that's just my week, I suppose. And I didn't know the normal FP unit also had default FTZ! That's... unfortunately interesting. I see you have managed to leap ahead of me on doing reading on the exact details of floating point weirdness on Arm, so clearly I should catch up. :^)

The important (floating) data point: Arm has too many floating point units and they all until recently had overly interesting behavior.

Regarding x86: there has in fact been an argument in the past that writing the MXCSR register is unsound.

RalfJung (Member) commented Oct 10, 2020

Even trying to promise that the result given by CTFE is consistent between compiler versions would be foolish, since it would lock us into a particular software emulation strategy that may become outdated.

If we restrict CTFE to the deterministic subset of Rust, then there is no lock-in beyond "CTFE adheres to the Rust specification" (and I hope we agree that that is a kind of lock-in that we want).

Once we permit CTFE for non-deterministic operations, I fully agree -- CTFE should make no more promises than the Rust spec itself.

I think we might already have this for NaN payloads, although it would be nice to have an actual set of rules for how they get handled instead of "whatever LLVM does".

That is the subject of discussion in #73328. The most promising approach (I think) so far is to basically copy what wasm does, and then hope/ensure that LLVM complies with this.


This GCC bug is possibly related to this discussion: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93681. LLVM may have similar issues, I'm not sure.

Yes, this is the kind of thing I mean when I say "LLVM might violate the wasm spec" -- if they assume floating-point operations are deterministic, then we (and they) are in trouble.


Basically I think we should just copy whatever WebAssembly does

For CTFE?

No, for the Rust spec. CTFE then should choose one legal implementation of that non-determinism (or none, if we decide to not support non-deterministic operations in CTFE for now).

But I think we have to resolve the Rust spec question (#73328) before we stabilize any of this for CTFE.

Regarding rounding modes, AFAIK LLVM's stance basically is that changing them is UB, and Rust inherits that.

Also, this issue is drifting away from CTFE and towards "what even is the Rust spec". ;) That is not too surprising, as I think we have to figure out the spec before we can really say much about CTFE, but it means there is a lot of overlap with other issues such as #73328.

RalfJung (Member) commented Oct 11, 2020

I just realized another aspect of this discussion: if NaNs are non-deterministic, and if we allow computations with NaNs in const context, then we are departing from the idea that const fn must be deterministic. Centril would have been strongly opposed and I tend to agree we should tread carefully here -- there are some nice ideas out there for unsafe (runtime) code requiring const fn arguments and exploiting the purity of that computation; those plans would be much harder or might even become impossible if we permit non-determinism in const fn.

So right now I lean rather strongly towards not permitting any computation in CTFE that would be non-deterministic. This would mean adding checks to our floating-point operations and bailing out when there is a NaN, i.e., FP operations would be "unconst". Incidentally, this is also what @thomcc suggested after a long discussion we had recently, but for totally different reasons. ;)

thomcc (Member) commented Oct 12, 2020

this is also what @thomcc suggested after a long discussion we had recently, but for totally different reasons.

I mean, wanting to keep float arithmetic deterministic was part of my desire there.

Anyway my suggestion here for NaN in const fn (Note: just NaN — I'm still thinking about the issue for subnormals on ARM) is:

  1. Performing operations on NaN in const fn is an error.

    • Note: "Operations" is defined precisely by IEEE-754 2019 in clause 5, if that's a concern, but it more or less means what you'd assume — arithmetic, special functions, casting, etc.
    • Signbit ops like Neg/Abs/Copysign should probably be okay, but it also might be weird to allow some ops and not others, and it's unclear how useful these actually are.
    • Also, a handful of other operations (mostly 5.7.2's stuff, like is_nan, for example) are probably worth allowing. The precise details here feel like a libs concern, though.
  2. Producing a NaN in const fn except via {float}::from_bits is an error.

    • This is probably too late to allow for literals that define const/static items, so perhaps for them it produces the qNaN with the correct sign for the expression and all-bits-zero payload. (This is not actually what the hardware would have produced in all cases, but I doubt we do the right thing here as-is, and that ship has sailed)

At runtime the operations would behave as they do on the target; e.g., unlike some const-eval errors, these wouldn't panic at runtime.

In cases like Miri, by default these would error, with a flag you can set to say "I know that the bit pattern of these NaNs may or may not match that of the target" (Ralf informs me that my initial idea of "defer the error to when you read the bits" would be a lot of work, and doing it this way would actually turn miri into a more valuable tool for debugging numerical code).

All of this has the benefits of:

  • Keeping const fn fully deterministic.
  • Keeping floating point math fully deterministic on targets that implement it deterministically.
  • Behaving the same at runtime as at compile time (It forbids operations that would behave differently).
  • Preserving target semantics around NaN without having to explicitly model each platform.
  • Producing very similar results on all targets.
    • Not totally identical though — "qNaN with the correct sign for the expression and all-bits-zero payload" actually is different on a small number of easily enumerated targets — MIPS most notably swaps the meaning of qNaN and sNaN.
  • Miri could be used to catch when NaN is produced.
    • Note: This is an unintentional benefit (and one I'd be willing to give up if needed), but it's a huge nice-to-have — I've probably spent literal months of my life trying to track down which expression it was that produced the NaN.

Not specifically relevant to const fn, but related: for constant propagation done as a compiler optimization, I think it should just stop constant-propagating when it hits a NaN, unless it's willing to implement the correct target semantics.

It's very hard for me to imagine this is such a common case that the optimization is very profitable anyway, and it sounds way less error-prone than any other option.

Additionally, it follows the logic of "compiler optimizations should not change the observable behavior of the program". (Of course if rust ever gets "fast math" options, one might be to disable this and constant propagate despite it being wrong)

workingjubilee (Member) commented Oct 13, 2020

My understanding is that on "subnormals are zero, flush to zero" hardware, a subnormal will effectively be zero when interacted with. So if Rust produces a subnormal from const evaluation, then at runtime the value would be read as zero. The problem arises if an operation that behaves differently depending on whether it sees a non-zero subnormal or zero interacts with that number. Even then, if it is an arithmetic op and the result would also be a subnormal, then it is of no consequence to continue to const-eval float operations.

So it seems there is an option of implementing a similar set of rules around subnormals to the ones you would propose around NaN.

There is also the option of choosing to const-eval floats only with subnormal number support, on the thesis that reading the resulting number as zero at runtime is of minimal consequence. This produces a slight compile-time vs. runtime deviation, unfortunately, but is also actually pretty simple. It would also imply never advancing hosts which cannot support it (if there are any) to tier 1, unless soft floats become involved.

thomcc (Member) commented Oct 13, 2020

My understanding is that on "subnormals are zero, flush to zero" hardware, a subnormal will effectively be zero when interacted with. So if Rust produces a subnormal from const evaluation, then at runtime the value would be read as zero.

This is true (that's what the subnormals-are-zero flag does), but unfortunately there are a lot of computations which end up having very different (normal) results if an intermediate step flushes subnormals to zero vs. if it doesn't.

In general, I'd be devastated to give up subnormals (or more broadly, IEEE-754 compatibility, but subnormals are very important for floats actually behaving well in practice) because of some mistakes ARM made on one of their older architectures.

The options I see here are:

  1. assuming the option I suggested above for NaN is viable, it might be fine just to stop const eval when a subnormal is produced.
  2. emulate arm32 behavior in miri when compiling to arm32.
  3. accept that the result will be different, and compute it using proper IEEE754 semantics. (This is also arguably the closest to the spirit of IEEE754 which is that operations compute things at infinite precision and you get the rounded result).
  4. enable correct handling of subnormals on ARM32 at startup.

I suspect 4 is unrealistic because of bad performance, and people will just turn it back on (when I worked in games I did way worse things than this to fix performance bugs). Also, IDK if we can even do it in some use cases (shared libraries, etc.).

Number 2 keeps consistency between const eval and runtime code... but only if no one else messes with the floating point flags. In practice I don't know how common that is, but I know that if I had to write numerically sensitive code on Arm32, I'd probably at least try turning subnormals on to see how bad the slowdown is (and whether it only happens on computations involving subnormal numbers).

For 1 vs 3, it's tricky. I've never worked in a situation where powerful const eval tools were available for numerically sensitive code. My closest comparison is stuff that ran at build time as part of an asset pipeline. For example, when I worked in games, I maintained a program that ran in the asset pipeline and took a 3d model as input, and spat out the best possible convex hull it could find that had no more than (say) 16 vertices. I think in practice this might be a bit too slow for a const fn to do, but it would be cool, so I'm going to use it as a hypothetical example of numerically intensive code in a const fn.

I don't know for a fact, but I strongly suspect that it had computations that went through subnormals, as it normalized all vertices to be in the -1..1 range in order to operate where there's more precision (50% of all floats are in that range). (This is also valid and lossless, because the output of the convex hull function was just the triangles that make up the hull.)

In a case like that, I'd prefer number 3. Number 1 is tricky because IDK what I'd have been able to do with an input that happened to hit that in an intermediate computation. If it's in a const fn, it might be very hard to move to a build script, since it might be using a bunch of other code in internal modules. It's not like NaN, where as soon as a NaN shows up you're pretty much hosed anyway — the computation may just briefly dip into the subnormals.

That said, number 2 would have been the worst, because I'd have silent data corruption that I have no way to defend against beyond e.g. printing the result out. I'd get to wait until runtime to learn that my hulls are weird because (a - b) * something became 0 when a and b were too close (this can never happen for a != b when subnormals are enabled).
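
A small demonstration of that last claim (bit patterns chosen by hand; the assert holds under IEEE-754 gradual underflow, while a flush-to-zero unit would yield 0.0):

fn main() {
    let a = f32::from_bits(0x0080_0001); // one ulp above the smallest normal
    let b = f32::from_bits(0x0080_0000); // the smallest positive normal
    let d = a - b;                       // exact difference: the smallest subnormal
    assert!(d > 0.0); // IEEE-754: holds; under flush-to-zero, d == 0.0
}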

None of these are great, sadly.

(Damnit ARM, why didn't you just support the standard...)

Lokathor (Contributor):

Note about option 2: changing the FP state causes LLVM UB with the standard IR operations. You have to use special IR ops that account for non-standard floating point state if that's what you want. And I'm pretty sure that no part of rust does that (yet?).

RalfJung (Member) commented Oct 13, 2020

All of this has the benefits of:

The major drawback, as mentioned above, is that floating-point operations become "unconst". So far my plan for "unconst" operations was to basically make them unsafe in const context and to say that violating these conditions is CTFE UB; that would certainly not be tenable here, so we'd need to figure out a better unconst story.


The subnormal flushing story on ARM is sad, but then it is not really worse than the mess that is x86_32, is it? So one approach here would be to say that yes, these targets are supported by Rust, but their FP support is sub-par and you can unpredictably get non-IEEE754-conforming results.

Number 2 keeps consistency between const eval and runtime code... but only if someone else doesn't mess with the floating point flags.

We already assume for correct operation that the FP environment is left at its default, so e.g. changing the rounding mode is effectively UB (or at least, there is no telling which FP operations are running with which rounding mode, as LLVM will happily move them around even if that means crossing a mode change). This sounds similar. (EDIT: Ah that's what Lokathor already wrote.)

thomcc (Member) commented Oct 13, 2020

changing the rounding mode is effectively UB (or at least, there is no telling which FP operations are running with which rounding mode, as LLVM will happily move them around even if that means crossing a mode change).

I'm actually more familiar than I'd like to be with LLVM moving operations around a mode change (the only time I've ever used LLVM as a programmer was working on an LLVM JIT to accelerate interval-arithmetic expressions — the naive implementation of these changes the rounding mode once per operation. It went poorly).

Anyway, that might be "UB"... but currently nothing that bad will happen even in the wildest of cases beyond LLVM not respecting your rounding, and I suspect a compiler_fence would be enough to force the issue. Most cases of UB I'm aware of in Rust are serious issues that are major cause for concern. For this one, it seems very likely that people are going to expose Rust code to mobile in contexts where the default float env has been changed to something less pathological.

For example: wouldn't a WASM interpreter have to change the rounding mode? Or an interpreter for some other language — hell, we could just be a shared library called from code where the rounding mode has been changed.

These being UB seems like a bad outcome, since it either means that code has a serious, serious bug, or we're stretching the definition of UB very broadly. Either way — I strongly feel that even if it is UB, in practice it shouldn't behave exceptionally poorly, otherwise we've introduced a very surprising way for rust programs to have security issues that other languages don't.


All that said... emulating ARM32's default float env when compiling to that target wouldn't bother me — maybe we can even add a lint eventually to warn you if the computation goes through denormals (that said, I don't know how plausible adding a lint from inside miri is).

The subnormal flushing story on ARM is sad, but then it is not really worse than the mess that is x86_32, is it?

Ehh... x86_32 gives you your result at a higher precision than you asked for, which in some sense is great, although it comes with a lot of side effects that are not so great, and in general the x87 stack is a really idiosyncratic beast. That said, for the most part binary80 itself is a pretty natural extension to IEEE754.

So one approach here would be to say that yes these targets are supported for Rust, but their FP support is sub-par and you can unpredictably get non-IEEE754-conforming results.

That seems like a fine thing to say to me. I mean, it's true now after all. That doesn't answer what to do for const fn though.

RalfJung (Member) commented Oct 13, 2020

Anyway, that might be "UB"... but currently nothing that bad will happen even in the wildest of cases beyond LLVM not respecting your rounding, and I suspect a compiler_fence would be enough to force the issue.

Being a formal-methods person I have to note that this is not fully correct. I could totally write some code that does some FP operations, casts the result to raw bits, inspects them for matching exactly what IEEE754 says, and derefs a NULL pointer if they do not match. This is a correct, UB-free program, until changing the FP mode (or running on x86_32 or arm32) messes it up.

No sane person would write such code, of course. But to my knowledge there is unfortunately no principled way in which "misbehavior due to unexpected rounding" is bounded; it can cause arbitrary changes in program behavior.
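
A hedged sketch of such a program (0x3e99999a is the correctly rounded IEEE-754 f32 result, checkable by hand):

fn main() {
    let sum = std::hint::black_box(0.1_f32) + 0.2_f32;
    // 0x3e99999a is the correctly rounded f32 sum (it happens to equal 0.3_f32).
    if sum.to_bits() != 0x3e99999a {
        // Reachable only if an operation was not correctly rounded (e.g. x87
        // double rounding or a changed rounding mode), and then this is UB.
        unsafe { std::ptr::null_mut::<u8>().write(1) };
    }
}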

These being UB seems like a bad outcome, since it either means that code has a serious, serious bug, or we're stretching the definition of UB very broadly. Either way — I strongly feel that even if it is UB, in practice it shouldn't behave exceptionally poorly, otherwise we've introduced a very surprising way for rust programs to have security issues that other languages don't.

I am happy for any proposal that makes them not UB. :) But I think that's very non-trivial. For example, only very recently has there been the first formal work trying to define precisely what the semantics of fast-math are.

Ehh... x86_32 gives you your result at a higher precision than you asked for, which in some sense is great, although it comes with a lot of side effects that are not so great, and in general the x87 stack is a really idiosyncratic beast. That said, for the most part binary80 itself is a pretty natural extension to IEEE754.

I don't think there is any guarantee that overall precision is higher, is there? Precision of some individual operations might be higher, but that does not imply overall precision is higher. These things are not monotone. (Also see some of the discussion in rust-lang/rfcs#2686, specifically this.)

And even if it is true, "higher precision" does not imply "less UB". This can still make conforming programs cause UB, if only for notorious programs like the one I mentioned above.

That doesn't answer what to do for const fn though.

If the spec says you "unpredictably get non-IEEE754-conforming results", i.e. some operations are conforming and some are not, then const fn can just use IEEE754. Basically, what this boils down to is that on these platforms the affected operations are non-deterministic. This is not sound, of course, if the optimizer at the same time assumes them to be deterministic. But well, I'd be willing to note that down as a platform bug and move on... that's what everyone else seems to do, and it seems to work for them^^ At least x86_32 is "almost dead" as a platform; arm32 less so, I guess.

workingjubilee (Member) commented Oct 13, 2020

There is also an option of choosing to only const eval floats using subnormal number support, on the thesis that reading the resulting number at zero at runtime is of minimal consequence. This produces a slight compilation vs. runtime deviation, unfortunately, but also is actually pretty simple.
@workingjubilee

accept that the result will be different, and compute it using proper IEEE754 semantics. (This is also arguably the closest to the spirit of IEEE754 which is that operations compute things at infinite precision and you get the rounded result).
[..]
In a case like that, I'd prefer number 3.
@thomcc

If the spec says you "unpredictably get non-IEEE754-conforming results", i.e. some operations are conforming and some are not, then const fn can just use IEEE754. Basically, what this boils down to is that on these platforms the affected operations are non-deterministic. This is not sound, of course, if the optimizer at the same time assumes them to be deterministic. But well, I'd be willing to note that down as a platform bug and move on... that's what everyone else seems to do, and it seems to work for them^^
@RalfJung

So it seems there is at least some shared wisdom behind something like this (myself included, granted) regarding subnormal handling.

If I were making all the decisions re: const fn and floats, I would say we should

  1. specify CTFE as using IEEE754 floats with default rounding mode
  2. only const-eval floats in explicit const contexts, so no implicit const folding of floats
  3. specify that we accept weirdness in older platforms' float behavior
  4. introduce lints and diagnostics as needed to steer people correctly around this

This means that if the platform has what I might call a "cursed" implementation of floating point (contra "buggy", because really, it's not a bug, it's a feature), then Rust currently makes no effort to change that, but avoids introducing unintended deviance that isn't specified by the programmer... or by the LLVM optimizer, which is a barrier, but one where we can at least to some extent say "that's LLVM's problem, we're trying our best!"

I believe this is the least work for the most correctness, and the fewest limitations on future choices in Rust. Also, while setting FZ at runtime might invite programmer meddling, we can at least confidently set floating-point environment flags to enable subnormal handling during compilation without worrying whether they might be changed at runtime, so that, barring other bugs, we can even try to compile correctly on armv7 hosts.

RalfJung (Member):

only const-eval floats in explicit const contexts, so no implicit const folding of floats

So you are saying even on "well-behaved" platforms we shouldn't do any constant propagation even if we fully know what result the platform would produce? Or are you suggesting this only for "ill-behaved"/"cursed" platforms or operations where IEEE754 leaves wiggle-room?

workingjubilee (Member):

I am not entirely sure, honestly. I would be inclined to say that we want implicit folding of floating-point operations to be as invisible as possible to the programmer, and that, given that many arches explicitly support executing code compiled for different targets, we should reasonably question our assumptions about what the target "really is".

RalfJung (Member):

Optimizations, by definition, must not change program behavior. That makes them "invisible to the programmer", in a sense.

But when there is non-determinism in the spec, optimizations can still lead to different non-deterministic choices being made, which is in some sense "visible".

RalfJung (Member) commented Jun 6, 2021

That will still mean, however, that we give up on const fn being deterministic functions at runtime. There are some situations where it'd be really nice to say "this function must be pure" and have the compiler check that (DerefPure anyone?). I am not sure if it was ever realistic or a good idea to use const fn for this purpose, but it seems at least possible in principle.

RalfJung (Member) commented Jun 6, 2021

We can start out by just forbidding NaNs in the final value of a constant, just like we forbid pointers in the final value of an integral constant.

Also, this is a breaking change, so experiments need to be done to judge its impact. At the very least, the f32::NAN constant would need some special magic to still be definable.

oli-obk (Contributor, Author) commented Jun 6, 2021

At the very least, the f32::NAN constant would need some special magic to still be definable.

We can transmute one specific NaN bit pattern to define that constant. (See the sketch below.)
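
A minimal sketch of that special casing (0x7fc00000 is the common quiet-NaN encoding; which pattern to pick is exactly the judgment call under discussion, and f32::from_bits would be the modern equivalent):

// Define the constant from one fixed bit pattern instead of computing 0.0 / 0.0.
pub const NAN: f32 = unsafe { core::mem::transmute::<u32, f32>(0x7fc0_0000) };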

That will still mean, however, that we give up on const fn being deterministic functions at runtime.

Yea, ok, we shouldn't throw that out the window without giving it further thought.

carbotaniuman (Contributor) commented Jun 6, 2021

https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018

If we care about const functions being different at runtime, and not just between compile time and runtime, then the pointer machinery doesn't work either.

Wait, I just remembered unconst discussions were a thing; let me read up on that again.

RalfJung (Member) commented Jun 6, 2021

That's just a playground link without any particular example; which code did you want to share?

But yes, memory allocation is another possible source of non-determinism.

carbotaniuman (Contributor) commented Jun 6, 2021

Derp. https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=22c60739105651a93291076ee9c451db

I finished my reading on unconst, and it appears to just be about restricting actions that aren't supported in CTFE and erroring out. Deterministic and pure don't show up at all, so if the semantics of unsafe are more restrictive in const contexts, I seem to have missed that.

RalfJung (Member) commented Jun 6, 2021

Yeah, that's an example of leaking allocation non-determinism out of a const fn. Good point.
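
(The gist itself is not reproduced in this thread; the following is a hedged reconstruction of the kind of code being discussed, a const fn leaking an allocation address, and not the actual gist.)

// Hypothetical reconstruction: a const fn whose result depends on where an
// allocation happens to live.
const fn observe_addr() -> usize {
    let x = 0u8;
    // Pointer-to-integer transmute: CTFE rejects this when the value is
    // inspected, but at runtime it would expose a non-deterministic address.
    unsafe { core::mem::transmute::<&u8, usize>(&x) }
}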

RalfJung (Member) commented Jun 6, 2021

That said, we could say that this code is wrong, since it transmutes a pointer to an integer. The transmute docs in #85769 explicitly forbid this.

unconst is not a pinned-down concept yet, but making non-determinism part of it is a serious proposal and we are currently keeping the safety rules in const fn aligned with that possibility.

RalfJung (Member) commented Jun 7, 2021

That said, we could say that this code is wrong, since it transmutes a pointer to an integer

See rust-lang/unsafe-code-guidelines#286 for what's wrong with such transmutes even outside the context of CTFE.

est31 (Member) commented Jun 7, 2021

Doesn't miri already track pointer-to-int conversions and yield an error if they occur? Are these checks not enabled for CTFE?

RalfJung (Member) commented Jun 7, 2021

During CTFE, the Miri core engine should indeed detect attempts to transmute ptrs to integers.

But that does not help for ensuring that const fn are deterministic functions at runtime.

indolering commented Jun 10, 2021

That will still mean, however, that we give up on const fn being deterministic functions at runtime. There are some situations where it'd be really nice to say "this function must be pure" and have the compiler check that (DerefPure anyone?). I am not sure if it was ever realistic or a good idea to use const fn for this purpose, but it seems at least possible in principle.

AFAIK, const functions were the rationale for removing the pure keyword, as const was the only tractable definition of purity. Edit: a lot of people were expecting to use const functions for specifying contracts and formal-methods work.

At the risk of spamming this thread: couldn't you use the WASM interpretation by default, but give programmers the option of matching runtime behavior by whitelisting deterministic functions like augmentedMath and (I think) trigPi? I'm a big fan of Watt and would love to see WASM leveraged for more compile-time operations. I'll take this to the chat.

memoryruins (Contributor) commented Jun 10, 2021

@indolering pure was a keyword before and during const fn, and was unreserved as a keyword ~2 years ago with the rationale in https://github.com/rust-lang/rfcs/blob/master/text/2421-unreservations-2018.md#rationale-for-pure (but this is unrelated to this issue, so let's leave it at that).

thomcc (Member) commented Aug 5, 2022

I am going to use this space to talk about const_eval_select, because it has no tracking issue. Basically, I'm not fond of the safety requirement of const_eval_select.

Here's an example where safe code might want to use it: it seems desirable to have some kind of debug_or_const_assert! in certain cases -- in fact, the stdlib already uses this pattern all over the place to check for soundness violations. (A sketch follows below.)
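
A sketch of what such a debug_or_const_assert! could look like on top of the intrinsic (a nightly-only sketch; the macro and the midpoint example are hypothetical, and the intrinsic's exact signature and feature gating have shifted over time):

#![feature(core_intrinsics, const_eval_select)]

// Hypothetical macro: always check during CTFE, only check in debug builds at runtime.
macro_rules! debug_or_const_assert {
    ($cond:expr) => {{
        const fn compile_time(cond: bool) {
            assert!(cond); // CTFE: always verify
        }
        fn run_time(cond: bool) {
            debug_assert!(cond); // runtime: verify only in debug builds
        }
        // The safety requirement under discussion: the caller must promise
        // that both arms have the same observable behavior.
        unsafe { core::intrinsics::const_eval_select(($cond,), compile_time, run_time) }
    }};
}

// Hypothetical usage inside a const fn:
const fn midpoint(a: usize, b: usize) -> usize {
    debug_or_const_assert!(a <= b);
    a + (b - a) / 2
}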

Unfortunately, this is currently only useful for checking unsafe code, despite it being pretty reasonable for the result of violating the condition to be "incorrect behavior" rather than "unsoundness" (if it's useful, I can come up with several concrete places where I'd want this).

It also seems to create some weird incentives to make/keep invariants as safety invariants (rather than correctness invariants). For example, let's say I have an unsafe const fn that checks a safety invariant in this manner (that is, using the hypothetical debug_or_const_assert! implemented with const_eval_select).

If it turns out that, after some time, the initial reason for making the function unsafe is no longer valid (this is common in my experience; as rustc improves, we need less unsafe to get good performance), I still can't make the function safe unless I either give up performing the check at compile time or accept doing it at runtime too (which may not be acceptable). This is a big bummer, and it encourages keeping the function unsafe.


And all that is ignoring how easy it seems to be to violate the requirement, especially if the two implementations only differ in some edge case the author may not have considered, either because one of them has a bug, or because they use some platform intrinsic that is only mostly identical to the operation the const code performs (due to the author not reading the documentation precisely enough, perhaps).

To be clear: having them differ semantically would certainly result in confusing and undesirable behavior, and I'm not saying it should be encouraged -- just the opposite! What I am saying is that we have a lot of precedent in std for not considering it to be unsound for users to do confusing/undesirable things, especially in the case where they're likely to do so inadvertently.

In other words, differing const/runtime behavior with const_eval_select feels... kinda similar to having a buggy Ord impl -- it's confusing and undesirable, but it will also inevitably occur by accident.


I'll also note that this part of the documentation...

Safe code in other crates may assume that $reasonable_assumption_that_isnt_always_true

... is true for all sorts of things (including the implementations of traits like Ord). This isn't a problem, since bugs happen. And then the example goes on to show a case where unsafe code makes this assumption.

It seems totally reasonable to me to push the safety requirement onto the code that contains the unreachable_unchecked() -- it's the thing that needs the assumption to hold for soundness.

This would mean that unsafe code can't rely on this being true for external code, similar to how unsafe code shouldn't assume traits in third-party crates are correctly implemented.

RalfJung (Member) commented Aug 6, 2022

The current requirements on const_eval_select reflect an abundance of caution. I am not aware of there being anything fundamentally wrong with having functions that behave differently at compile-time and run-time, and allowing such differences would also bring float-in-const a lot closer to stabilization. However, that should be done very explicitly, not as a byproduct of introducing an intrinsic that was needed for a quick hack. That's why we went with the super paranoid version for now.

Would you like to draft an RFC to officially declare that we are okay with the same function behaving differently on the same input in runtime and const contexts? I think that is the kind of official process we need. (Possibly a T-lang FCP could also be enough, but I don't know what the FCP would be on, so an RFC seems better.)

Lokathor (Contributor) commented Aug 6, 2022

For something of this magnitude the additional publicity of an RFC would be essential.

RalfJung (Member) commented Aug 6, 2022

Another question that the RFC probably needs to settle at the same time: must a const fn, when called at runtime in a way that it could also have been called at compile-time, always behave deterministically and without side-effects (other than writing to references explicitly given to the function)?

This is currently true, and it is potentially useful for unsafe code: given a const fn, unsafe code might never actually want to call this function in const context, but it might want to do things which are only correct when that function is deterministic and side-effect-free. But overall I feel that the costs of this restriction (which effectively tries to use const as an effect system for ensuring some form of purity) outweigh the benefits. Still, abolishing that rule should be done explicitly and with intent, not accidentally.
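
A hedged sketch of the kind of unsafe code that would rely on this property (all names invented; the code is sound only if every call to the const fn on the same input returns the same value):

const fn len_for(x: u32) -> usize {
    (x % 16) as usize
}

fn fill(x: u32, buf: &mut Vec<u8>) {
    buf.clear();
    buf.reserve(len_for(x)); // capacity based on the first call...
    for i in 0..len_for(x) { // ...assumed to match the second call
        unsafe { buf.as_mut_ptr().add(i).write(0) };
    }
    // ...and the third; non-determinism here would be instant unsoundness.
    unsafe { buf.set_len(len_for(x)) };
}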

thomcc (Member) commented Aug 7, 2022

Would you like to draft an RFC to officially declare that we are okay with the same function behaving differently on the same input in runtime and const contexts...

Uh, hm. Sure, I guess I have some time for this. I'll likely need someone more familiar with const eval to actually look it over once I have something; do you mind?

(This may be in a bit since I have actually promised another RFC at rustconf)


Re: const fn purity: in general, I'm not a fan of telling unsafe code it can rely on the behavior of things like this. IMO you should be pretty careful with unsafe and only trust your own code, plus code that has made a promise to uphold the behavior you need (possibly an implicit promise, in the case of std, for example).

In particular, while it's not supported for this to happen (and it may even be UB), things like the floating point environment may change externally, which could influence the results. If Rust code is in a cdylib, this could happen externally across two separate calls. So promising that unsafe code may rely on this sounds a little negligent -- it's not a promise we can actually keep in all cases. (While float-env changes might be a form of UB, they definitely will happen, and the things unsafe code could get up to if it believes this is impossible might be worse (exploitable) UB -- who knows.)

Also this sounds like it would end up limiting the functionality that can be provided by const fn in the future, which could end up being a bummer.

(Also, people who want purity should consider what that actually looks like, and try to get it added to the language. Tacking it on to "what constant evaluation functions do at runtime" is not the way to go about it IMO)

RalfJung (Member) commented Aug 7, 2022

Uh, hm. Sure, I guess I have some time for this. I likely will need someone more familiar with const eval to actually look over it after I have something, do you mind?

Yeah just ping @rust-lang/wg-const-eval and we'll be happy to give feedback. :) AFAIK we all want something like this to happen (@oli-obk @lcnr stop me if I am wrong).

While float env changes might be a form of UB, they definitely will happen

They must happen in a way that Rust code never sees them, i.e., the environment is changed back to the previous configuration before control flows back to Rust. Then it's fine. Otherwise it's UB and you might have bigger problems.

Also this sounds like it would end up limiting the functionality that can be provided by const fn in the future, which could end up being a bummer.

Yes, indeed, that is my main reason for wanting to drop this restriction. I am not sure what const-heap support will look like, but creating a heap allocation is not pure...

(Also, people who want purity should consider what that actually looks like, and try to get it added to the language. Tacking it on to "what constant evaluation functions do at runtime" is not the way to go about it IMO)

👍

dlight commented Aug 7, 2022

I was thinking about this, and the only way this issue can ever be fully addressed is if Rust had a software floating-point implementation that perfectly emulates each supported platform (not only NaN shenanigans but also x87 weirdness, etc.). Then it would work even when cross-compiling.

On one hand this would introduce a great burden for every supported platform, to the point that many platforms would need to be marked as "const fn differs from fn". On the other hand, it's probably possible to borrow code from projects like qemu or emulators in general, so it seems at least doable.

RalfJung (Member) commented Aug 7, 2022

Not allowing float operations in const fn also "addresses" the problem. ;) Not in a way that makes anyone happy, though...

Note that there has been tons of prior discussion on this specific problem, also considering e.g. disallowing NaNs in const.

Noratrieb (Member):

I'm interested in writing an RFC to allow this (assuming no one else is working on one already). I'll reach out on Zulip once I have a first draft for feedback (feel free to reach out to me before that if you want to tell me something important about it) 🙂

823984418 (Contributor):

I have an idea:

Disallow const floating-point f32 operations, build a soft floating-point SoftF32 into the standard library, and provide const conversion between the two?

The user would perform the calculation in a const context by converting the float to a soft float, thus being made explicitly aware of possible platform differences.

Is there already a third-party crate for this?
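
A sketch of the proposed API shape (type and method names are hypothetical, the soft-float arithmetic itself is elided, and const-stable to_bits/from_bits are assumed):

// Hypothetical SoftF32: platform-independent float arithmetic in const contexts.
#[derive(Clone, Copy)]
pub struct SoftF32(u32); // IEEE-754 binary32 bits, operated on in software

impl SoftF32 {
    pub const fn from_f32(x: f32) -> Self {
        Self(x.to_bits())
    }

    pub const fn to_f32(self) -> f32 {
        f32::from_bits(self.0)
    }

    pub const fn add(self, _other: Self) -> Self {
        unimplemented!() // bit-exact soft-float addition would go here
    }
}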

823984418 (Contributor):

I simply copied the soft floating-point implementation from compiler_builtins:
const_soft_float
Is that viable?

RalfJung (Member) commented Aug 4, 2024

With rust-lang/rfcs#3514 having been accepted, part of the answer here is settled: it is okay for a const fn to return different answers for the same input at compile time and at runtime, at least as long as both invocations could have given the same answer (but the non-deterministic choice ended up being different).

What is still open is whether we want a safe stable way of accessing const_eval_select, but that's tracked at #124625.

RalfJung closed this as completed Aug 4, 2024