[libclc] replace float remquo with amd ocml implementation by wenju-he · Pull Request #177131 · llvm/llvm-project

wenju-he · 2026-01-21T09:40:30Z

The current implementation has two issues:

* It unconditionally soft-flushes denormals.
* It can't pass the OpenCL CTS test "test_bruteforce remquo" on Intel GPUs.

This PR upstreams the remquo implementation from
https://github.com/ROCm/llvm-project/tree/amd-staging/amd/device-libs/ocml/src/remainderF_base.h. It supports denormals and passes the OpenCL CTS test.
The number of LLVM IR instructions in function _Z6remquoffPU3AS5i increased from 96 to 680.

Note: `__oclc_finite_only_opt` is set to false, as there is no dynamic dispatching for the generic implementation. (This version of the description gives the instruction count of _Z6remquoffPU3AS5i as increasing from 96 to 678.)

Copilot

Pull request overview

This PR replaces the current float remquo implementation with AMD's OCML version to fix denormal handling and OpenCL CTS test failures. The new implementation supports denormal numbers properly and passes the "test_bruteforce remquo" test on Intel GPUs.

Changes:

* Replaced the existing remquo algorithm with the AMD OCML implementation from ROCm
* Removed soft flushing of denormals that was causing issues
* Added support for proper denormal handling and edge cases

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File	Description
libclc/clc/lib/generic/math/clc_remquo.inc	Complete rewrite of remquo implementation using AMD OCML algorithm with denormal support
libclc/clc/lib/generic/math/clc_remquo.cl	Updated includes to support new implementation (added fabs, copysign, frexp, nan, native_recip, rint, isfinite, isnan)


Copilot · 2026-01-21T09:41:38Z

libclc/clc/lib/generic/math/clc_remquo.inc

+    ex = ({
+           int _exp;
+           __clc_frexp(ax, &_exp);
+           _exp;
+         }) -
+         1;
+    ax = __clc_ldexp(({
+                       int _exp;
+                       __clc_frexp(ax, &_exp);
+                     }),
+                     bits);
+    ey = ({
+           int _exp;
+           __clc_frexp(ay, &_exp);
+           _exp;
+         }) -
+         1;
+    ay = __clc_ldexp(({
+                       int _exp;
+                       __clc_frexp(ay, &_exp);
+                     }),
+                     1);


The frexp function is called twice on ax (and likewise twice on ay) with identical arguments: the first call uses only the exponent and the second uses only the mantissa. This is inefficient and duplicative. Consider storing both the mantissa and the exponent from a single frexp call and reusing them.

Suggested change

-    ex = ({
-           int _exp;
-           __clc_frexp(ax, &_exp);
-           _exp;
-         }) -
-         1;
-    ax = __clc_ldexp(({
-                       int _exp;
-                       __clc_frexp(ax, &_exp);
-                     }),
-                     bits);
-    ey = ({
-           int _exp;
-           __clc_frexp(ay, &_exp);
-           _exp;
-         }) -
-         1;
-    ay = __clc_ldexp(({
-                       int _exp;
-                       __clc_frexp(ay, &_exp);
-                     }),
-                     1);
+    int _exp_ax;
+    float mant_ax = __clc_frexp(ax, &_exp_ax);
+    ex = _exp_ax - 1;
+    ax = __clc_ldexp(mant_ax, bits);
+    int _exp_ay;
+    float mant_ay = __clc_frexp(ay, &_exp_ay);
+    ey = _exp_ay - 1;
+    ay = __clc_ldexp(mant_ay, 1);
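The single-call pattern suggested here can be tried on the host with the standard `frexpf` and `ldexpf`. This is only a sketch: `scaled_t` and `split_and_rescale` are hypothetical names, and the `__clc_` builtins are assumed to behave like their C library counterparts.

```c
#include <math.h>

/* Rescaled mantissa plus the exponent in the [1, 2) convention. */
typedef struct { float scaled; int exp; } scaled_t;

/* One frexpf call yields both pieces: a == m * 2^e with m in [0.5, 1). */
static scaled_t split_and_rescale(float a, int bits) {
  int e;
  float m = frexpf(a, &e);
  scaled_t out;
  out.exp = e - 1;              /* exponent of a when written as 1.f * 2^exp */
  out.scaled = ldexpf(m, bits); /* mantissa moved to a fixed binade */
  return out;
}
```

For example, `split_and_rescale(8.0f, 12)` yields exponent 3 and scaled value 2048.0f, and the smallest subnormal `0x1p-149f` yields exponent -149, so denormal inputs are normalized by the same call.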


Copilot · 2026-01-21T09:41:38Z

libclc/clc/lib/generic/math/clc_remquo.inc

+  bool __oclc_finite_only_opt = false;
+  if (!__oclc_finite_only_opt) {
+    ret = y == 0.0f ? __clc_nan(0) : ret;
+    q7 = y == 0.0f ? 0 : q7;
+    bool c = !__clc_isnan(y) && __clc_isfinite(x);
+    ret = c ? ret : __clc_nan(0);
+    q7 = c ? q7 : 0;
  }



The variable name __oclc_finite_only_opt uses a naming convention that suggests it's a configuration constant or macro, but it's declared as a local boolean variable that's always false. This is confusing and misleading. Consider renaming it to something like finite_only_disabled or removing it entirely if it's meant to be a temporary placeholder.

Suggested change

-  bool __oclc_finite_only_opt = false;
-  if (!__oclc_finite_only_opt) {
-    ret = y == 0.0f ? __clc_nan(0) : ret;
-    q7 = y == 0.0f ? 0 : q7;
-    bool c = !__clc_isnan(y) && __clc_isfinite(x);
-    ret = c ? ret : __clc_nan(0);
-    q7 = c ? q7 : 0;
-  }
+  ret = y == 0.0f ? __clc_nan(0) : ret;
+  q7 = y == 0.0f ? 0 : q7;
+  bool c = !__clc_isnan(y) && __clc_isfinite(x);
+  ret = c ? ret : __clc_nan(0);
+  q7 = c ? q7 : 0;
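The unconditional fix-up in this suggestion can be mirrored on the host as below. It is a sketch, not the libclc source: `remquo_out` and `fixup_special_cases` are hypothetical names, and `NAN`, `isnan` and `isfinite` from `<math.h>` stand in for the `__clc_` helpers.

```c
#include <math.h>
#include <stdbool.h>

/* Remainder and low quotient bits produced by the main algorithm. */
typedef struct { float rem; int quo; } remquo_out;

/* Overwrite the result for inputs where remquo has no finite answer. */
static remquo_out fixup_special_cases(float x, float y, remquo_out r) {
  if (y == 0.0f) {          /* division by zero: NaN remainder, zero quotient */
    r.rem = NAN;
    r.quo = 0;
  }
  bool ok = !isnan(y) && isfinite(x); /* false for NaN y, NaN x, infinite x */
  if (!ok) {
    r.rem = NAN;
    r.quo = 0;
  }
  return r;
}
```

Note that an infinite y with a finite x passes through untouched, which matches the code above: only a NaN y or a non-finite x triggers the override.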


arsenm · 2026-01-21T09:43:03Z

libclc/clc/lib/generic/math/clc_remquo.inc

+    int qsgn = 1 + (((__clc_as_int(x) ^ __clc_as_int(y)) >> 31) << 1);
+    float t = __clc_fma(y, -(float)qsgn, x);
+    ret = c ? t
+            : (__builtin_isfpclass(__builtin_canonicalizef(0x1p-149f), 0x0040)


This should definitely not be inlining the DAZ_OPT hack. Either preserve it, or unconditionally canonicalize.

Done, changed to unconditionally canonicalize.


github-actions · 2026-01-21T11:25:59Z

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:

git-clang-format --diff origin/main HEAD --extensions cl,inc -- libclc/clc/lib/generic/math/clc_remquo.cl libclc/clc/lib/generic/math/clc_remquo.inc --diff_from_common_commit

⚠️
The reproduction instructions above might return results for more than one PR
in a stack if you are using a stacked PR workflow. You can limit the results by
changing origin/main to the base branch/commit you want to compare against.
⚠️

View the diff from clang-format here.

diff --git a/libclc/clc/lib/generic/math/clc_remquo.inc b/libclc/clc/lib/generic/math/clc_remquo.inc
index 79eef077b..836dc703c 100644
--- a/libclc/clc/lib/generic/math/clc_remquo.inc
+++ b/libclc/clc/lib/generic/math/clc_remquo.inc
@@ -66,7 +66,7 @@ _CLC_DEF _CLC_OVERLOAD float __clc_remquo(float x, float y,
   } else {
     ret = x;
     q7 = 0;
-    bool c = (ay < 0x1.0p+127f & 2.0f * ax > ay) | (ax > 0.5f * ay);
+    bool c = (ay<0x1.0p+127f & 2.0f * ax> ay) | (ax > 0.5f * ay);
 
     int qsgn = 1 + (((__clc_as_int(x) ^ __clc_as_int(y)) >> 31) << 1);
     float t = __clc_fma(y, -(float)qsgn, x);
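The condition on the changed line tests whether ax exceeds half of ay in two different ways. The PR does not explain why; one plausible reading, offered here as an assumption, is that `0.5f * ay` can round when ay is subnormal, while `2.0f * ax` is only relied on when ay is below 2^127. The host-side sketch below compares the combined test with a naive one; both function names are hypothetical.

```c
/* Combined test, copied in shape from the line in the diff. */
static int exceeds_half(float ax, float ay) {
  return ((ay < 0x1.0p+127f) & (2.0f * ax > ay)) | (ax > 0.5f * ay);
}

/* Naive test: halving a subnormal ay may round to the nearest even value. */
static int exceeds_half_naive(float ax, float ay) {
  return ax > 0.5f * ay;
}
```

For ax = 2 * 2^-149 and ay = 3 * 2^-149, half of ay is 1.5 * 2^-149, which is not representable and rounds to 2 * 2^-149 under round-to-nearest-even. The naive test therefore reports false, whereas the combined test reports true through its first clause.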



kwk · 2026-02-11T16:55:05Z

I ran into this issue.

cd /home/fedora/src/llvm-project/main/build/runtimes/runtimes-bins/libclc && /home/fedora/src/llvm-project/main/build/bin/clang-23 -c --target=spirv32-- -x ir -o /home/fedora/src/llvm-project/main/build/./lib/clang/23/lib/spirv32--/libclc.spv /home/fedora/src/llvm-project/main/build/runtimes/runtimes-bins/libclc/obj.libclc.dir/spirv-mesa3d-/builtins.link.spirv-mesa3d-.bc
fatal error: error in backend: unable to legalize instruction: %88:fid(s32) = G_FCANONICALIZE %87:fid (in function: _Z12__clc_remquoffPi)
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: /home/fedora/src/llvm-project/main/build/bin/clang-23 -c --target=spirv32-- -x ir -o /home/fedora/src/llvm-project/main/build/./lib/clang/23/lib/spirv32--/libclc.spv /home/fedora/src/llvm-project/main/build/runtimes/runtimes-bins/libclc/obj.libclc.dir/spirv-mesa3d-/builtins.link.spirv-mesa3d-.bc
1.      Code generation
2.      Running pass 'Function Pass Manager' on module '/home/fedora/src/llvm-project/main/build/runtimes/runtimes-bins/libclc/obj.libclc.dir/spirv-mesa3d-/builtins.link.spirv-mesa3d-.bc'.
3.      Running pass 'Legalizer' on function '@_Z12__clc_remquoffPi'

The line `ret = c ? t : __builtin_elementwise_canonicalize(x);` was added in this PR.

jhuber6 · 2026-02-12T17:06:38Z

> ret = c ? t : __builtin_elementwise_canonicalize(x); was added in this PR.

Either tell the SPIR-V backend people to support the canonicalize node or put #ifdef __SPIRV__ around this usage.
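The second option, guarding the canonicalize call, might look roughly like the sketch below. It is an illustration only and was not adopted in the thread. `REMQUO_USE_CANONICALIZE` is a hypothetical opt-in macro, added so the sketch also builds with host compilers that lack the builtin, and `canonicalize_if_supported` and `select_result` are made-up names.

```c
/* Canonicalize x where the backend can lower it; pass it through otherwise. */
static inline float canonicalize_if_supported(float x) {
#if !defined(__SPIRV__) && defined(REMQUO_USE_CANONICALIZE)
  /* Clang builtin used by the PR; lowers to a canonicalize operation. */
  return __builtin_elementwise_canonicalize(x);
#else
  /* SPIR-V (or builtin not enabled): skip canonicalization entirely. */
  return x;
#endif
}

/* Shape of the statement from the PR: ret = c ? t : canonicalize(x). */
static float select_result(int c, float t, float x) {
  return c ? t : canonicalize_if_supported(x);
}
```

The trade-off is that on the guarded path a denormal x is returned as-is, so behavior would differ between targets.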

arsenm · 2026-02-12T20:39:26Z

SPIRV must implement canonicalize

tstellar · 2026-02-13T13:35:18Z

> SPIRV must implement canonicalize

So should we revert this until that happens?


wenju-he · 2026-02-14T01:35:17Z

> SPIRV must implement canonicalize
>
> So should we revert this until that happens?

Reverted in #181443.

I discussed the issue with Ben Ashbaugh. We can propose a SPIR-V extension to add a canonicalize instruction to SPIR-V. What do you think?


wenju-he requested review from arsenm, Copilot and frasercrmck January 21, 2026 09:40

Copilot AI reviewed Jan 21, 2026


arsenm reviewed Jan 21, 2026


llvmbot added the libclc libclc OpenCL library label Jan 21, 2026

wenju-he and others added 3 commits January 21, 2026 11:20

always canonicalize

84e91fe

Update libclc/clc/lib/generic/math/clc_remquo.inc

c8fd881

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update libclc/clc/lib/generic/math/clc_remquo.inc

792ca57

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

wenju-he requested a review from arsenm January 21, 2026 10:37

arsenm reviewed Jan 21, 2026


delete __oclc_finite_only_opt check

f0c7641

wenju-he requested a review from arsenm January 22, 2026 02:01

arsenm approved these changes Jan 24, 2026


wenju-he merged commit 20c15c7 into llvm:main Jan 26, 2026
10 of 11 checks passed

wenju-he deleted the remquo-use-amdgpu-ocml-remquo-float branch January 26, 2026 00:11

wenju-he added a commit that referenced this pull request Feb 14, 2026

Revert "[libclc] replace float remquo with amd ocml implementation (#…

41d2a0d

…177131)" This reverts commit 20c15c7.

wenju-he mentioned this pull request Feb 14, 2026

Revert "[libclc] replace float remquo with amd ocml implementation" #181443

Merged

wenju-he added a commit that referenced this pull request Feb 14, 2026

Revert "[libclc] replace float remquo with amd ocml implementation" (#…

560e229

…181443) Reverts #177131 It broke SPIRV target: error in backend: unable to legalize instruction: %88:fid(s32) = G_FCANONICALIZE

Maetveis mentioned this pull request Feb 27, 2026

[Question / Feature Request] Translation of floating point canonicalize (@llvm.canonicalize) KhronosGroup/SPIRV-LLVM-Translator#3559

Open
