Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking Issue for float_minimum_maximum #91079

Open
2 of 4 tasks
Urgau opened this issue Nov 20, 2021 · 31 comments
Open
2 of 4 tasks

Tracking Issue for float_minimum_maximum #91079

Urgau opened this issue Nov 20, 2021 · 31 comments
Labels
A-floating-point Area: Floating point numbers and arithmetic C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@Urgau
Copy link
Member

Urgau commented Nov 20, 2021

Feature gate: #![feature(float_minimum_maximum)]

This is the tracking issue for the f{32,64}::{min,max}imum functions.

It provide IEEE 754-2019 minimum and maximum functions to f32/f64.

Public API

impl f32 {
    pub const fn minimum(self, other: f32) -> f32;
    pub const fn maximum(self, other: f32) -> f32;
}

impl f64 {
    pub const fn minimum(self, other: f64) -> f64;
    pub const fn maximum(self, other: f64) -> f64;
}

Steps / History

Unresolved Questions

@Urgau Urgau added C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Nov 20, 2021
@CryZe
Copy link
Contributor

CryZe commented Nov 21, 2021

Just so it's tracked here: This currently does not compile to the corresponding instruction on WASM (fmin / fmax) and I believe Aarch64 has an instruction for this too (FMIN / FMAX), which it also doesn't compile to.

Godbolt

@workingjubilee
Copy link
Member

workingjubilee commented Nov 21, 2021

I believe there is an LLVM instruction for this. We can teach Rust to emit the appropriate LLVMIR directly, at the cost of adding the weight of an intrinsic. As we have discussed doing so for the portable SIMD API anyways, it might be profitable to do this in general.

@scottmcm
Copy link
Member

scottmcm commented Nov 22, 2021

LLVM does have an intrinsic for this, but as the PR that added these showed (https://godbolt.org/z/czz71M9KT), it doesn't actually work yet, which is why rustc isn't using it to implement these.

@workingjubilee
Copy link
Member

Ah, Damn.

@rben01
Copy link

rben01 commented May 15, 2022

Everything that follows will be stated in terms of the min and minimum functions but applies equally to max and maximum.

Does the function have to be called minimum? I know that's the IEEE name but I think I'd find having both min and minimum confusing, as their names don't really say which is which. "The function that requires both inputs be NaN for its output to be NaN is the one with the shorter name" is not a natural rule to remember IMO. I think I'd prefer having minimum be called (say) partial_min to suggest that it follow the same rules as PartialEq and PartialOrd, which is that a single NaN in the input is sufficient to produce a missing output.

@m-ou-se m-ou-se added the I-libs-api-nominated Nominated for discussion during a libs-api team meeting. label Aug 16, 2022
@m-ou-se
Copy link
Member

m-ou-se commented Aug 16, 2022

Nominated at the request of @Urgau.

@m-ou-se
Copy link
Member

m-ou-se commented Aug 23, 2022

The confusion between maximum vs max is definitely a problem, so we'll need to do something about that before we can stabilize that. The max and min functions are already stable, so we have limited flexibility.

We'll need some more input: How do other languages and libraries handle this? How are these named in the standard? What are the use cases for max vs maximum?

@alistaircarscadden
Copy link

I agree that having pairs of functions such that min != minimum and max != maximum is an issue. I had to check the documentation for why there were two functions to do the same thing to realize that they were functionally different. I think whatever the most explicit possible name is should be chosen. I don't think minimum / maximum is correct.

Here are some suggestions to add to the brainstorm. Any of these can be min or minimum and apply to both min and max.

  • min_nan,
  • min_propigate_nans
  • min_any_nan
  • min_ieee_754_2019
  • min_nans_win
  • etc.

I think that min with some suffix is preferable to minimum because it shows it is an alteration of the existing min function, not a competing function. I don't know the whole history of relevance of IEEE 754 2019 but I think choosing minimum because that's how it is named in that standard is not clear to the average Rust programmer.

@rben01
Copy link

rben01 commented Aug 25, 2022

To throw some other ideas in the mix:

By analogy with the distinction between quiet NaN and signaling NaN, maybe the new minimum (maximum) function should be called signaling_min (signaling_max).

I also like lesser_of (greater_of), as it kind of implies that < (>) will be used, which obviously return NaN if at least one argument is NaN. You could imagine the following conversation: "Which is the lesser of 1 and NaN?" "That makes no sense, they can't be compared." "Exactly."

IMO min_all/min_both (max_all/max_both) also get the idea across -- all/both of the arguments will affect the output. In particular, even if one of the args is -inf (+inf) you still have to look at the other.

@CryZe
Copy link
Contributor

CryZe commented Aug 25, 2022

This might also influence the naming situation: There's more than just these two algorithms.

Rust's max is IEEE Std 754-2008 maxNum
This new one is IEEE Std 754-2019 maximum
And there's also IEEE Std 754-2019 maximumNumber
And I feel like WebAssembly's pmax / C++'s std::max is also worth considering having (ignores NaN for a fast impl)

@scottmcm
Copy link
Member

The new one is arguably the "best" one, as it's the one that behaves most similarly to everything else with respect to NANs -- the other binary functions like add and atan2 do the same "NAN if either input is NAN", same as what maximum does.

So one option might be to deprecate max under that name and offer it as max_ignoring_nans or something, letting the new one get the name maximum.

(Thought that does have the obvious downside of being different from the max method on everything that's Ord.)

@Urgau
Copy link
Member Author

Urgau commented Aug 25, 2022

How do other languages and libraries handle this?

C/C++/Swift doesn't (yet ?). C#/Python have the IEEE 754-2019 minimum and maximum semantics as their min/max function from the start. BUT there doesn't seems to have any language that have both the 2008 and 2019 semantics.

How are these named in the standard?

They are named minimum and maximum in IEEE 754-2019.
Note that the previous functions minNum and maxNum from IEEE 754-2008 (and below) are deprecated in IEEE 754-2019 in favor of the new one.

What are the use cases for max vs maximum?

(I don't really understand the question) but I would say none. Only one should exist, in IEEE 754-2019 they deprecated minNum and maxNum in favor of minimum and `maximum, maybe we should do the same.

So one option might be to deprecate max under that name and offer it as max_ignoring_nans or something, letting the new one get the name maximum.

+1


Let me know if you need more input. Happy to help.

@CryZe
Copy link
Contributor

CryZe commented Aug 25, 2022

Only one should exist, in IEEE 754-2019 they deprecated minNum and maxNum in favor of minimum and `maximum, maybe we should do the same.

I still feel like the standard is being misrepresented here. There's 4 functions now, not 2:

https://i.imgur.com/Nl3tXju.png

One pair does NaN propagation, the other does NaN absorbtion. The 2008 variant turned into the new NaN absorbtion variant, though it does differ in how 0 is being treated. maxNum considers +-0 equal (implementation defined), whereas maximumNumber considers the sign.

Imo we should just add max_nan_prop (maximum), max_nan_absorb (maximumNumber) and possibly deprecate max.

(Also lmao, I just now see that they literally say that we should have all of them as part of the image I quoted)

Also here's a list I made quite a while ago that lists the different instructions and languages: https://gist.github.com/CryZe/30cc76f4629cb0846d5a9b8d13144649

C#/Python have the IEEE 754-2019 minimum and maximum semantics as their min/max function from the start.

They don't match the semantics, they don't follow either standard.

@scottmcm
Copy link
Member

@m-ou-se

What are the use cases for max vs maximum?

When used in isolation as a normal binary function between two calculated values, I'd say that maximum is the one that does what people want and expect.

But when considered as a way to collapse a full sequence, both are usable in different cases. You can think of them kinda like .flatten().fold() vs .try_fold() on Options:

  • .fold(f32::NAN, f32::max) will give you the largest non-NAN value if there is one, otherwise you get NAN (either because they're all NAN or the iterator is empty)
  • .reduce(f32::maximum) will give you the largest value if there are no NANs in the iterator, Some(NAN) if there's one or more NANs, and None if the iterator is empty.

@m-ou-se
Copy link
Member

m-ou-se commented Sep 6, 2022

Thanks for the clarifications. :)

The names maximum and maximumNumber from the standard are also a bit confusing, although at least the "number" part clarifies a bit that it tries its best to return a number instead of NaN. ^^'

So one option might be to deprecate max under that name

I agree it's a shame that we already have our current max method doing what it does. I'm not entirely sure we should deprecate such a common method, but it is a tempting way to make space for a better solution..

Also here's a list I made quite a while ago that lists the different instructions and languages: https://gist.github.com/CryZe/30cc76f4629cb0846d5a9b8d13144649

Thanks! It's a great overview, but the data is somewhat depressing. :(

@jsimmons
Copy link

I'm not sure it's a good idea to deprecate max/min with their existing names given that the segment of code which is likely to care about the difference in semantics between 2008 and 2019 is rather small. It's not exactly an upgrade for people using the API, just different. So lots of downstream churn for no advantage.

Seems like it would be better just to flip to the new semantics with an edition bump, if that's possible. Though it will likely break anyone who was relying on the performance of min and max being acceptable on arm compiler explorer. (Won't matter for x86 because you're already forced to write your own min and max to get good codegen there)

@CryZe
Copy link
Contributor

CryZe commented Sep 10, 2022

Though it will likely break anyone who was relying on the performance of min and max being acceptable on arm

That's "not true", as ARM has instructions for both 2019 behaviors as seen here: https://gist.github.com/CryZe/30cc76f4629cb0846d5a9b8d13144649

What's interesting to me though is that LLVM compiles the current behavior to FMINNM / FMAXNM, even though that's not matching the 2008 standard. Apparently these instructions CAN follow the 2008 behavior, but only via a ARMv8.7-A FEAT_AFP (alternative floating point) extension introduced in 2020 and only when the AH register is 1. I guess they noticed the mismatch to the 2008 behavior when the 2019 standard came out, so they introduced the alternative floating point behavior mode in 2020 and LLVM somehow still compiles to "the wrong instruction" (should conditionally compile to setting the AH register and then calling the instruction or if it's generic aarch64, to manual instructions that fix the behavior).

@jsimmons
Copy link

jsimmons commented Sep 10, 2022 via email

@CryZe
Copy link
Contributor

CryZe commented Sep 10, 2022

That documentation is correct, but it doesn't disagree with my list. The difference in behavior in the old and new standard is in the handling of -0 and +0. That's where the alternative floating point extension comes in (switches from 2019 mode to 2008). The difference in NaN behavior is supported by ARM as FMAX and FMAXNM. (Though the pseudo code also hints at some changes regarding signalling and quiet NaN between both floating point modes (not instructions) that I haven't fully grasped yet)

If you check the pseudo code for FPMaxNum (FMAXNM), then you see that it does a bunch of NaN handling and then calls into FPMax with the alternative floating point mode turned off:

https://i.imgur.com/ExvSo2T.png

FPMax has special code for preserving the -0 == +0 behavior of the 2008 standard here, but only if the alternative floating point mode is on, but the function was explicitly called with it being off:

https://i.imgur.com/4K0kz0D.png

So this early return doesn't trigger and we fall through to the 2019 behavior:

https://i.imgur.com/5Q6w8zA.png

where max returns the positive sign if either zero is positive.

@CryZe
Copy link
Contributor

CryZe commented Sep 10, 2022

I ran it on both qemu and actual hardware to confirm that what I said above is indeed true and FMAXNM is indeed following the 2019 semantics and LLVM emits the "wrong instruction" (though I guess one could argue that the 2008 semantics are superseded by the 2019 ones and it's all fine anyway, in particular because the 2008 standard even "says" that it's implentation defined)

https://i.imgur.com/TxCndLb.png

#[inline(never)]
fn hardware_max(a: f32, b: f32) -> f32 {
    a.max(b)
}

#[inline(always)]
fn llvm_evaluated_max(a: f32, b: f32) -> f32 {
    a.max(b)
}

fn main() {
    dbg!(hardware_max(0.0, 0.0).is_sign_positive());
    dbg!(hardware_max(0.0, -0.0).is_sign_positive());
    dbg!(hardware_max(-0.0, 0.0).is_sign_positive());
    dbg!(hardware_max(-0.0, -0.0).is_sign_positive());
    println!();
    dbg!(llvm_evaluated_max(0.0, 0.0).is_sign_positive());
    dbg!(llvm_evaluated_max(0.0, -0.0).is_sign_positive());
    dbg!(llvm_evaluated_max(-0.0, 0.0).is_sign_positive());
    dbg!(llvm_evaluated_max(-0.0, -0.0).is_sign_positive());
}

@jsimmons
Copy link

jsimmons commented Sep 10, 2022 via email

@CryZe
Copy link
Contributor

CryZe commented Sep 10, 2022

The 2019 standard has both NaN absorbtion (maximumNumber) and NaN propagation (maximum). I'm saying that Rust's f32::max maps to FMAXNM on ARM, which does the 2019 maximumNumber instead of the 2008 maxNum (which mostly only differ in their -0/+0 behavior) (unless it gets const folded, then it never hits hardware and LLVM follows the 2008 standard when const folding).

@nikic
Copy link
Contributor

nikic commented Sep 10, 2022

@CryZe I think your comments are causing unnecessary confusion here. IEEE-2008 leaves the behavior for signed zero unspecified (either operand may be returned). An instruction that follows IEEE-2019 signed zero behavior also conforms to IEEE-2008 -- just not the other way around. LLVM is correct to use FMAXNM to lower llvm.maxnum. There is no "wrong instruction" being used.

@CryZe
Copy link
Contributor

CryZe commented Sep 10, 2022

Yeah you are right I was under the impression that it actually specified it to choose either the first or second operand (in particular because the ARM documentation explicitly supports it via the alternate floating point mode, but it looks like I was misled), but it seems like it's implementation defined in the 2008 standard, the C standard, technically in Rust and in LLVM. I adjusted my list to reflect that and updated the wrong comments.

So one option would be to upgrade the f32/f64::min/max to the 2019 behavior as that's a backwards compatible change (though LLVM doesn't have an intrinsic for it atm).

@programmerjake
Copy link
Member

So one option would be to upgrade the f32/f64::min/max to the 2019 behavior as that's a backwards compatible change (though LLVM doesn't have an intrinsic for it atm).

it isn't actually backward compatible because IEEE 754-2008 maxNum behaves differently than IEEE 754-2019 maximumNumber for signalling NaNs: maxNum is defined to return a quiet NaN if either input is a signalling NaN, whereas maximumNumber returns non-NaN if either input is non-NaN, even if the other input is a signalling NaN.

background on min/max in IEEE 754-2019: https://grouper.ieee.org/groups/msc/ANSI_IEEE-Std-754-2019/background/minNum_maxNum_Removal_Demotion_v3.pdf

@nikic
Copy link
Contributor

nikic commented Sep 11, 2022

So one option would be to upgrade the f32/f64::min/max to the 2019 behavior as that's a backwards compatible change (though LLVM doesn't have an intrinsic for it atm).

it isn't actually backward compatible because IEEE 754-2008 maxNum behaves differently than IEEE 754-2019 maximumNumber for signalling NaNs: maxNum is defined to return a quiet NaN if either input is a signalling NaN, whereas maximumNumber returns non-NaN if either input is non-NaN, even if the other input is a signalling NaN.

background on min/max in IEEE 754-2019: https://grouper.ieee.org/groups/msc/ANSI_IEEE-Std-754-2019/background/minNum_maxNum_Removal_Demotion_v3.pdf

While this is technically true, this comes with the caveat that people don't actually use the IEEE 754-2008 definition of maxNum and instead use "maxNum, but with sNaN treated the same ways as qNaN". So basically, what IEEE 754-2008 maxNum is in practice, is IEEE 754-2019 maximumNumber with unspecified signed zero behavior.

Here is Rust's own definition:

If one of the arguments is NaN, then the other argument is returned. This follows the IEEE-754 2008 semantics for maxNum, except for handling of signaling NaNs; this function handles all NaNs the same way and avoids maxNum’s problems with associativity. This also matches the behavior of libm’s fmax.

And this is LLVM's more extensive definition:

Follows the IEEE-754 semantics for minNum, except for handling of signaling NaNs. This match’s the behavior of libm’s fmin.

If either operand is a NaN, returns the other non-NaN operand. Returns NaN only if both operands are NaN. The returned NaN is always quiet. If the operands compare equal, returns a value that compares equal to both operands. This means that fmin(+/-0.0, +/-0.0) could return either -0.0 or 0.0.

Unlike the IEEE-754 2008 behavior, this does not distinguish between signaling and quiet NaN inputs. If a target’s implementation follows the standard and returns a quiet NaN if either input is a signaling NaN, the intrinsic lowering is responsible for quieting the inputs to correctly return the non-NaN input (e.g. by using the equivalent of llvm.canonicalize).

And now we are in the wonderful position where there are actually 5 different notions, all commonly accepted, of what a float maximum is supposed to be :)

@m-ou-se m-ou-se removed the I-libs-api-nominated Nominated for discussion during a libs-api team meeting. label Sep 20, 2022
@juliuskoskela
Copy link

I'd like to add to this issue that I ran into simd_min/simd_max functions destroying the performance of my algorithm where I had previously used _mm_min_ps or the likes. Replacing them with LT/GT comparisons solved the issue, but I feel like the solution was less than obvious.

It's a tricky situation, but at the least the documentations should reflect these performance considerations.

@scottmcm
Copy link
Member

scottmcm commented Nov 30, 2022

Note that Intel specifically documents _mm_min_ps as

does not follow the IEEE Standard for Floating-Point Arithmetic (IEEE 754) minimum value when inputs are NaN or signed-zero values.

Whereas f32::minimum says it does follow the IEEE 754-2019 semantics.

I don't think we should generally add "but it might be faster, on some platforms, to do something that works differently" to method documentation.

(minss is just weird; it's not even commutative.)

@juliuskoskela
Copy link

I don't think we should generally add "but it might be faster, on some platforms, to do something that works differently" to method documentation.

I think the correct place to document this could be the beginners guide or if there's gonna be a chapter in "the book" about portable simd.

@GitSplits
Copy link

GitSplits commented May 23, 2024

Rather than having alternate names for max and min, why not use attributes instead? This way the attribute would specify how max/min is applied for a certain block of code, with consistent max/min algorithms and a well understood and expected output. This of course assumes different max/min behaviors are seldom intermingled together.

Using attributes also allows for adjusting code to different max/min mechanisms for different CPU types and also for different IEEE 754 standards as they’re published and adjusted over time.

@eduardosm
Copy link
Contributor

In #130568, I'm making these methods const, keeping their constness under the float_minimum_maximum feature gate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-floating-point Area: Floating point numbers and arithmetic C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests