-
Notifications
You must be signed in to change notification settings - Fork 220
math_brute_force: fmin/fmax can return a qNaN if either input is a sNaN #2285
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
For fmin/fmax, when either argument is a signaling NaN, a quiet NaN return is also acceptable, which is respect to C99, where signaling NaNs are supposed to get the same IEEE treatment.
Minor change for a 64-bit 1
|
From my perspective, the C spec does not distinguish between QNaN and SNaN in its description of fmax() and fmin(), so this PR, if it is accepted, might need a spec update as well. |
|
I haven't reviewed this CTS PR, but the description sounds consistent with the current spec, so I think we're OK there. Specifically, refer to the discussion on KhronosGroup/OpenCL-Docs#128 - this is an old PR, but I don't think anything has changed in the spec in this area. Also, note this footnote in the C spec, which applies to fmin and fmax specifically: https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_C.html#_footnotedef_43 |
|
Personally, I don't think the footnote is good enough. The spec for fmax/fmin says only NaN, and that means "any kind of NaN". If we allow fmax/fmin to return a QNaN when one of the arguments is not a NaN then we are disobeying the current spec. |
|
Sorry, I'm not following (my fault). Can you describe the concern in more detail? AFAICT, this PR is specifically about whether fmin and fmax must return a sNaN when an input is a sNaN, or whether fmin and fmax may return a qNaN when either (or both) inputs are sNaN. I think the spec is pretty clear about this, both in Section 7.2 which says "Support for signaling NaNs is not required", and then in Footnote 43, which specifically cautions that for fmin and fmax "signaling NaNs may behave as quiet NaNs". Note, if this is going to need a longer discussion we should spin it off into a separate OpenCL-Docs issue. |
|
The concern here is that the patch wants fmax/fmin to be able to return a QNaN if one or more of the arguments is SNaN. This is specifically not permitted by the spec for these functions which says that if one argument is (any kind of) NaN, and the other is not, the other argument is returned. |
bashbaug
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this change should be required. The functions that compute ULP already have special cases for NaNs, something like:
if (isnan(reference) && isnan(test))
return 0.0f; // if we are expecting a NaN, any NaN is fineIs it possible something else is going on here? Note, we just merged a fix related to fmin and fmax and NaNs: #2270
Not exactly, the result should be a quiet nan. The IEEE-754 2008 behavior for minNum and maxNum is to return a quiet nan if either input is a signaling nan. This is backwards from the treatment of a quiet nan input, in which case the numeric result argument is returned. The current conformance test behavior is checking for the result of fmin matches minNum(quiet(src0), quiet(src1)), thus avoiding the inversion of behavior. This was the buggy glibc implementation of fmin/fmax for years, but this was fixed (https://sourceware.org/bugzilla/show_bug.cgi?id=20947)
Yes. The confusion is the IEEE behavior for signaling nans is inverted. It is not a matter of whether the result is a quiet or signaling nan, but whether the result is a nan or non-nan for a signaling nan input. This PR does not discuss fmin and fmax, which are a special case.
This will not address this issue. This is still treating the signalingness as insignificant.
What the conformance test is currently checking is that signaling nan inputs must behave as quiet nans. This change is intended to relax this to allow signaling nans to behave as signaling nans |
|
I don't think we can change bruteforce without changing the C spec which specifically does not allow returning a NaN unless both arguments are NaN. |
With this commit, the CLC fmin/fmax builtins use clang's __builtin_elementwise_(min|max)imumnum which helps us generate LLVM minimumnum/maximumnum intrinsics directly. These intrinsics uniformly select the non-NaN input over the (quiet or signalling) NaN input, which corresponds to what the OpenCL CTS tests. These intrinsics maintain the vector types, as opposed to scalarizing, which was previously happening. This commit therefore helps to optimize codegen for those targets. Note that there is ongoing discussion regarding how these builtins should handle signalling NaNs in the OpenCL specification and whether they should be able to return a quiet NaN as per the IEEE behaviour. If the specification and/or CTS is ever updated to allow or mandate returning a qNAN, these builtins could/should be updated to use __builtin_elementwise_(min|max)num instead which would lower to LLVM minnum/maxnum intrinsics. The SPIR-V targets maintain the old implementations, as the LLVM -> SPIR-V translator can't currently handle the LLVM intrinsics. The implementation has been simplifies to consistently use clang builtins, as opposed to before where the half version was explicitly defined. [1] KhronosGroup/OpenCL-CTS#2285
With this commit, the CLC fmin/fmax builtins use clang's __builtin_elementwise_(min|max)imumnum which helps us generate LLVM minimumnum/maximumnum intrinsics directly. These intrinsics uniformly select the non-NaN input over the (quiet or signalling) NaN input, which corresponds to what the OpenCL CTS tests. These intrinsics maintain the vector types, as opposed to scalarizing, which was previously happening. This commit therefore helps to optimize codegen for those targets. Note that there is ongoing discussion regarding how these builtins should handle signalling NaNs in the OpenCL specification and whether they should be able to return a quiet NaN as per the IEEE behaviour. If the specification and/or CTS is ever updated to allow or mandate returning a qNAN, these builtins could/should be updated to use __builtin_elementwise_(min|max)num instead which would lower to LLVM minnum/maxnum intrinsics. The SPIR-V targets maintain the old implementations, as the LLVM -> SPIR-V translator can't currently handle the LLVM intrinsics. The implementation has been simplifies to consistently use clang builtins, as opposed to before where the half version was explicitly defined. [1] KhronosGroup/OpenCL-CTS#2285
With this commit, the CLC fmin/fmax builtins use clang's __builtin_elementwise_(min|max)imumnum which helps us generate LLVM minimumnum/maximumnum intrinsics directly. These intrinsics uniformly select the non-NaN input over the (quiet or signalling) NaN input, which corresponds to what the OpenCL CTS tests. These intrinsics maintain the vector types, as opposed to scalarizing, which was previously happening. This commit therefore helps to optimize codegen for those targets. Note that there is ongoing discussion regarding how these builtins should handle signalling NaNs in the OpenCL specification and whether they should be able to return a quiet NaN as per the IEEE behaviour. If the specification and/or CTS is ever updated to allow or mandate returning a qNAN, these builtins could/should be updated to use __builtin_elementwise_(min|max)num instead which would lower to LLVM minnum/maxnum intrinsics. The SPIR-V targets maintain the old implementations, as the LLVM -> SPIR-V translator can't currently handle the LLVM intrinsics. The implementation has been simplifies to consistently use clang builtins, as opposed to before where the half version was explicitly defined. [1] KhronosGroup/OpenCL-CTS#2285
For fmin/fmax, when either argument is a signaling NaN, a quiet NaN return is also acceptable, which is respect to C99, where signaling NaNs are supposed to get the same IEEE treatment.