Floating minmax: fix negative zero handling and dedicated test coverage for arrays of +0.0 and -0.0 only #4734
Merged
AlexGuteniev
changed the title
Sedicated test coverage for floating minmax of +0.0 and -0.0 only
Dedicated test coverage for floating minmax of +0.0 and -0.0 only
Jun 18, 2024
This comment was marked as resolved.
AlexGuteniev
changed the title
Dedicated test coverage for floating minmax of +0.0 and -0.0 only
Floating minmax: fix negative zero handling and dedicated test coverage for arrays of +0.0 and -0.0 only
Jun 20, 2024
StephanTLavavej
added
bug
Something isn't working
and removed
test
Related to test code
labels
Jun 20, 2024
* `<algorithm>` for `generate`
* `<climits>` for `CHAR_BIT` (pre-existing)
* `<cmath>` for `signbit`
* `<cstddef>` for `size_t`
* `<cstdint>` for `uint32_t`
* `<cstdio>` for `printf`
* `<functional>` for `ref`
* `<random>` for `mt19937_64`
Add other point-zeros for consistency.
StephanTLavavej
approved these changes
Aug 20, 2024
I'm mirroring this to the MSVC-internal repo - please notify me if any further changes are pushed.
I had to push an additional commit to drop my eternal nemeses:
StephanTLavavej
approved these changes
Aug 25, 2024
With #162 it could have been noticed in advance
Thanks for setting the maximum number of bugs in this area to negative zero! ➖ 0️⃣ 😹
Initially I thought that this could be fixed with a careful minmax implementation that correctly selects either the first or the last value when the comparands are equivalent.

I've learned the behavior of the `[v]{min|max}{s|p}{s|d}` instructions (thanks @statementreply and @Alcaro for enlightening me on that) and figured out that it is possible to control which of the equivalent values is the result. I've also reported the compiler bug DevCom-10686775 and found a reliable workaround for it.

Unfortunately, control over the result of a single minmax instruction is not enough. The whole value-based vectorization approach does not work well with order requirements for equivalent elements. Efficient vectorization requires vertical comparisons (same elements on different vector values) to be performed first, and horizontal comparisons (different elements on the same vector value) to be performed last.
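To illustrate the problem described above, here is a minimal portable sketch, not the code from this PR. It assumes a scalar model of one `minp{s|d}` lane (`a < b ? a : b`, so the second operand is returned when neither compares less) and a hypothetical 4-lane reduction. The names `min_lane`, `scalar_min`, `lanewise_min`, and `same_value_and_sign` are made up for this example.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Scalar model of a single min lane: if neither operand compares less
// (which includes +0.0 vs -0.0), the second operand is returned.
inline double min_lane(double a, double b) {
    return a < b ? a : b;
}

// Reference behavior: left-to-right scan that keeps the leftmost of
// equivalent elements.
inline double scalar_min(const std::vector<double>& v) {
    assert(!v.empty());
    double result = v[0];
    for (std::size_t i = 1; i < v.size(); ++i) {
        if (v[i] < result) {
            result = v[i];
        }
    }
    return result;
}

// Hypothetical value-based reduction over 4 lanes: vertical comparisons
// first, horizontal comparisons last. Only values are tracked, not indices.
inline double lanewise_min(const std::vector<double>& v) {
    constexpr std::size_t lanes = 4;
    assert(!v.empty() && v.size() % lanes == 0);
    std::array<double, lanes> acc{v[0], v[1], v[2], v[3]};
    for (std::size_t i = lanes; i < v.size(); i += lanes) {
        for (std::size_t l = 0; l < lanes; ++l) {
            // Accumulator is the second operand, so it is kept on ties.
            acc[l] = min_lane(v[i + l], acc[l]);
        }
    }
    double result = acc[0];
    for (std::size_t l = 1; l < lanes; ++l) {
        // The lower lane is the second operand, so it is kept on ties.
        result = min_lane(acc[l], result);
    }
    return result;
}

// True only if both values compare equal and carry the same sign bit.
inline bool same_value_and_sign(double a, double b) {
    return a == b && std::signbit(a) == std::signbit(b);
}
```

For the input `{1.0, -0.0, 2.0, 2.0, 0.0, 5.0, 5.0, 5.0}`, the left-to-right scan yields `-0.0` (index 1), while this lane model yields `+0.0` (index 4, which landed in lane 0). Each individual comparison kept the "right" operand on ties, yet the final result differs in sign, because a lane's value does not say where in the input it came from.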
With the index-based approach, as in `minmax_element`, the changed order is fine, as we're looking for the smallest/greatest index among equal elements.

As a result, we have to resort to using the `minmax_element` approach for floating `minmax`, unless `/fp:fast` is specified. This should not be a big loss though: the benchmark results in #4659 show that smaller types benefit a lot from the `minmax` approach, but floats do not. It is definitely still way faster than scalar.

`/fp:fast` is still fine, as the compiler takes advantage of not distinguishing `+0.0` and `-0.0` and is able to emit vectorized `minmax` itself (see related issue #4453).

I decided to keep the comparison reordering for floats in. It seems to improve the handling of NaN values, which was decided to be unsupported, but why not keep something that accidentally does things better?
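The index-based idea mentioned above can be sketched in portable C++ as follows. This is only an illustration under assumptions, not the vectorized code from this PR: it models 4 lanes with a plain array, and the names `lanewise_min_index` and `reference_min_index` are hypothetical. Each lane remembers the index of its best element, so the horizontal step can break ties by picking the smallest index.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 4-lane model of an index-based minimum search.
inline std::size_t lanewise_min_index(const std::vector<double>& v) {
    constexpr std::size_t lanes = 4;
    assert(!v.empty() && v.size() % lanes == 0);
    std::array<std::size_t, lanes> best{0, 1, 2, 3};

    // Vertical step: each lane keeps the earliest index of its minimum.
    for (std::size_t i = lanes; i < v.size(); i += lanes) {
        for (std::size_t l = 0; l < lanes; ++l) {
            if (v[i + l] < v[best[l]]) {
                best[l] = i + l;
            }
        }
    }

    // Horizontal step: among lanes holding equivalent minima,
    // take the smallest index.
    std::size_t result = best[0];
    for (std::size_t l = 1; l < lanes; ++l) {
        const std::size_t candidate = best[l];
        const bool less = v[candidate] < v[result];
        const bool equivalent = !less && !(v[result] < v[candidate]);
        if (less || (equivalent && candidate < result)) {
            result = candidate;
        }
    }
    return result;
}

// Reference: std::min_element returns the first of the smallest elements.
inline std::size_t reference_min_index(const std::vector<double>& v) {
    return static_cast<std::size_t>(std::min_element(v.begin(), v.end()) - v.begin());
}
```

For `{1.0, -0.0, 2.0, 2.0, 0.0, 5.0, 5.0, 5.0}` both functions return index 1, the `-0.0` element, even though lane 0 ends up holding the `+0.0` at index 4. Carrying the index is what lets the reordered comparisons still agree with a left-to-right scan.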
⏱️ Benchmark results
/fp:fast
The reordering of `_mm[256]_{min|max}_p{s|d}` args seems a bit unfavorable for performance, but not by much; at least the difference in results is within variation.