
[AutoSparkUT] Re-enable 'normalize special floating numbers in subquery' test (issue #14116)#14400

Merged
wjxiz1992 merged 5 commits into NVIDIA:main from wjxiz1992:fix/14116-normalize-float-subquery
Mar 24, 2026

Conversation

Collaborator

@wjxiz1992 wjxiz1992 commented Mar 11, 2026

Summary

  • Root cause fixed upstream: The -0.0 normalization bug was in RMM's device_uvector::set_element_async, which used cudaMemsetAsync for zero values — clearing the sign bit of IEEE 754 -0.0. This has been fixed in rapidsai/rmm#2302.
  • This PR: Simply removes the .exclude() for "normalize special floating numbers in subquery" in RapidsSQLQuerySuite, re-enabling the test now that the upstream fix has landed in the spark-rapids-jni nightly SNAPSHOT.
  • No spark-rapids code changes needed: The original workaround in GpuScalar (using ColumnVector path to bypass Scalar.fromDouble) has been removed — the direct Scalar.fromDouble/Scalar.fromFloat calls now preserve -0.0 correctly.
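The failure mode hinges on a basic IEEE 754 fact: `-0.0 == 0.0` compares equal even though the two values have different bit patterns, so any `value == 0` fast path (like the `cudaMemsetAsync` special case in RMM) silently erases the sign bit. A minimal Java illustration:

```java
public class NegativeZeroBits {
  public static void main(String[] args) {
    double negZero = -0.0d;
    double posZero = 0.0d;
    // IEEE 754 equality treats the two zeros as equal...
    System.out.println(negZero == posZero); // true
    // ...but their raw bit patterns differ: only -0.0 has the sign bit set.
    System.out.println(Long.toHexString(Double.doubleToRawLongBits(negZero))); // 8000000000000000
    System.out.println(Long.toHexString(Double.doubleToRawLongBits(posZero))); // 0
  }
}
```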

Upstream fix chain

rapidsai/rmm#2302 (06c3562)  — remove zero-value cudaMemsetAsync special-casing
  → spark-rapids-jni cudf-pins updated (RMM pin now includes the fix)
    → spark-rapids-jni nightly SNAPSHOT rebuilt with fixed librmm.so
      → this test now passes on GPU without any spark-rapids workaround

RAPIDS test to Spark original mapping

| RAPIDS test | Spark original | Spark file | Lines |
| --- | --- | --- | --- |
| normalize special floating numbers in subquery (inherited) | normalize special floating numbers in subquery | sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala | 3620-3636 (permalink) |

Test plan

  • mvn package -pl tests -am -Dbuildver=330 -DwildcardSuites=RapidsSQLQuerySuite — 234 tests, 0 failures, 0 errors
  • The previously excluded test "normalize special floating numbers in subquery" now passes
  • No new test failures introduced
  • Verified on latest origin/main (commit b74f7f7) with upstream RMM fix in spark-rapids-jni SNAPSHOT

Closes #14116

Checklists

  • This PR has added documentation for new or modified features or
    behaviors.
  • This PR has added new tests or modified existing tests to cover
    new code paths.
    (Re-enabled the inherited Spark test "normalize special floating numbers in subquery" by removing its .exclude() entry. The test validates GPU-CPU parity for -0.0 in scalar subqueries.)
  • Performance testing has been performed and its results are added
    in the PR description. Or, an issue has been filed with a link in the PR
    description.

Made with Cursor

…VIDIA#14116)

cuDF's Scalar.fromDouble(-0.0) normalizes -0.0 to 0.0, losing the sign
bit. This caused GPU scalar subqueries to return 0.0 where CPU correctly
returns -0.0, violating GPU-CPU parity.

Root cause: the JNI path Scalar.fromDouble -> makeFloat64Scalar drops
the IEEE 754 sign bit of negative zero during scalar creation.

Fix: in GpuScalar.from(), create float/double scalars via a 1-element
ColumnVector + getScalarElement(0) instead of Scalar.fromDouble/fromFloat.
The column-based path preserves the exact bit pattern.

This re-enables the previously excluded test "normalize special floating
numbers in subquery" in RapidsSQLQuerySuite.

Closes NVIDIA#14116

Signed-off-by: Allen Xu <allxu@nvidia.com>
Made-with: Cursor
Copilot AI review requested due to automatic review settings March 11, 2026 09:18
Contributor

greptile-apps bot commented Mar 11, 2026

Greptile Summary

This PR re-enables the previously excluded "normalize special floating numbers in subquery" test in RapidsSQLQuerySuite (Spark 3.3.0) by removing its .exclude() entry from RapidsTestSettings.scala. The root cause — cudaMemsetAsync in RMM's device_uvector::set_element_async silently clearing the sign bit of IEEE 754 -0.0 — has been fixed upstream in rapidsai/rmm#2302 and is now available in the spark-rapids-jni nightly SNAPSHOT.

  • Removes the single .exclude("normalize special floating numbers in subquery", KNOWN_ISSUE(...)) line from RapidsTestSettings.scala
  • No spark-rapids production code changes required; the fix is entirely upstream in librmm
  • The test was verified to pass (234 tests, 0 failures) with the updated SNAPSHOT dependency

Confidence Score: 5/5

  • This PR is safe to merge — it is a minimal, low-risk change that only removes a test exclusion entry.
  • The change is a single-line deletion of a .exclude() call in a test settings file. There are no production code changes, no new logic introduced, and no other Spark version directories in the repository that would require a matching update. The PR description clearly documents the upstream fix chain and provides passing test evidence (234 tests, 0 failures).
  • No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| tests/src/test/spark330/scala/org/apache/spark/sql/rapids/utils/RapidsTestSettings.scala | Removes the .exclude() entry for "normalize special floating numbers in subquery" from RapidsSQLQuerySuite, re-enabling the test now that the upstream RMM bug (cudaMemsetAsync zeroing the sign bit of -0.0) has been fixed. |

Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["rapidsai/rmm#2302\nRemove zero-value cudaMemsetAsync\nspecial-casing for device_uvector"] --> B["spark-rapids-jni cudf-pins updated\nRMM pin now includes fix"]
    B --> C["spark-rapids-jni nightly SNAPSHOT rebuilt\nwith fixed librmm.so"]
    C --> D["GPU -0.0 sign bit preserved\nin Scalar.fromDouble / Scalar.fromFloat"]
    D --> E["RapidsSQLQuerySuite\n'normalize special floating numbers\nin subquery' now passes on GPU"]
    E --> F["Remove .exclude() entry\nfrom RapidsTestSettings.scala\n(this PR)"]
```

Last reviewed commit: "Merge branch 'main' ..."

Contributor

Copilot AI left a comment


Pull request overview

Fixes GPU/CPU parity for scalar subquery results involving signed zero by ensuring GpuScalar.from preserves the IEEE-754 -0.0 bit pattern for float/double values, and re-enables the previously excluded Spark-derived test that covers this case.

Changes:

  • Update GpuScalar.from to create float/double cuDF scalars via a 1-element ColumnVector + getScalarElement(0) to preserve -0.0.
  • Remove the exclusion for "normalize special floating numbers in subquery" from Spark 3.3.0 test settings.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| sql-plugin/src/main/scala/com/nvidia/spark/rapids/literals.scala | Changes float/double scalar creation path to preserve -0.0 bit pattern. |
| tests/src/test/spark330/scala/org/apache/spark/sql/rapids/utils/RapidsTestSettings.scala | Re-enables the previously excluded subquery normalization test. |


Comment on lines 283 to 294
case d: Double =>
  // cuDF Scalar.fromDouble normalizes -0.0 to 0.0 (see #14116).
  // Create via a 1-element column to preserve the exact bit pattern.
  withResource(ColumnVector.fromDoubles(d)) { cv =>
    cv.getScalarElement(0)
  }
case f: Float =>
  withResource(ColumnVector.fromDoubles(f.toDouble)) { cv =>
    cv.getScalarElement(0)
  }
case _ => throw new IllegalArgumentException(s"'$v: ${v.getClass}' is not supported" +
  s" for DoubleType, expecting Double or Float.")

Copilot AI Mar 11, 2026


Creating float/double scalars by materializing a 1-element ColumnVector will allocate/copy device memory on every scalar creation, which can be a noticeable perf/regression vs Scalar.fromDouble/fromFloat (GpuScalar.from is used in many code paths). Consider using the ColumnVector+getScalarElement workaround only for the specific problematic value (signed zero), and keep the fast Scalar.fromDouble path for all other doubles/floats.

Suggested change — from:

case d: Double =>
  // cuDF Scalar.fromDouble normalizes -0.0 to 0.0 (see #14116).
  // Create via a 1-element column to preserve the exact bit pattern.
  withResource(ColumnVector.fromDoubles(d)) { cv =>
    cv.getScalarElement(0)
  }
case f: Float =>
  withResource(ColumnVector.fromDoubles(f.toDouble)) { cv =>
    cv.getScalarElement(0)
  }
case _ => throw new IllegalArgumentException(s"'$v: ${v.getClass}' is not supported" +
  s" for DoubleType, expecting Double or Float.")

to:

case d: Double =>
  // cuDF Scalar.fromDouble normalizes -0.0 to 0.0 (see #14116).
  // Use a 1-element column only for negative zero to preserve the exact bit pattern.
  if (d == 0.0 && JDouble.doubleToRawLongBits(d) == JDouble.doubleToRawLongBits(-0.0d)) {
    withResource(ColumnVector.fromDoubles(d)) { cv =>
      cv.getScalarElement(0)
    }
  } else {
    Scalar.fromDouble(d)
  }
case f: Float =>
  // Preserve negative zero for floats as well.
  if (f == 0.0f && JFloat.floatToRawIntBits(f) == JFloat.floatToRawIntBits(-0.0f)) {
    withResource(ColumnVector.fromDoubles(f.toDouble)) { cv =>
      cv.getScalarElement(0)
    }
  } else {
    Scalar.fromDouble(f.toDouble)
  }
case _ => throw new IllegalArgumentException(s"'$v: ${v.getClass}' is not supported" +
  s" for DoubleType, expecting Double or Float.")

Collaborator

@revans2 revans2 left a comment


Why? I get that Spark has a test for this, but what value is there in keeping -0.0 instead of normalizing it to 0.0? IEEE treats them both as the same, and Spark has had bugs where -0.0 can cause issues. Does creating a column so that we can pull out a scalar from it cost any performance? It will cost some memory at least. Do we know where cuDF is normalizing this? I just want these questions answered before we put in a change like this.

Address review feedback: instead of routing all float/double scalar
creation through ColumnVector (which allocates device memory), detect
-0.0 via raw bit comparison and only use the slow path for that specific
value. All other values continue to use the fast Scalar.fromDouble/
Scalar.fromFloat path, making the common-case cost zero.

Signed-off-by: Allen Xu <allxu@nvidia.com>
Made-with: Cursor
Signed-off-by: Allen Xu <allxu@nvidia.com>
Collaborator Author

wjxiz1992 commented Mar 12, 2026

@revans2 Let me add more background on the UT fix work:

  1. We want to increase test coverage.

  2. The method is to port Spark unit tests directly (by extending them) via the rapidsTest framework, so they run on GPU.

  3. We want the GPU to produce exactly the same behavior as the CPU. For this case, even though we know 0.0 == -0.0, if the CPU produces -0.0 the GPU should also produce -0.0. Another check in Spark is this compare, which intentionally distinguishes them.

  4. The sign bit is lost at Java_ai_rapids_cudf_Scalar_makeFloat64Scalar.
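The distinction matters because, while primitive `==` treats the zeros as equal, total-ordering comparisons (as used by sorts and comparators) tell them apart. A small Java illustration of that distinction (not Spark's actual comparator code):

```java
public class ZeroOrdering {
  public static void main(String[] args) {
    // Primitive == follows IEEE 754: the two zeros compare equal.
    System.out.println(-0.0d == 0.0d); // true
    // Total ordering distinguishes them: -0.0 sorts strictly before 0.0.
    System.out.println(Double.compare(-0.0d, 0.0d)); // -1
    // Boxed equality also distinguishes them.
    System.out.println(Double.valueOf(-0.0d).equals(Double.valueOf(0.0d))); // false
  }
}
```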

The debug code used in the suite was:

testRapids("DEBUG #14116 - trace negative zero through join") {
  val negZeroRow = java.util.Arrays.asList(Row(-0.0))
  val posZeroRow = java.util.Arrays.asList(Row(0.0))
  val schema = org.apache.spark.sql.types.StructType(Seq(
    org.apache.spark.sql.types.StructField(
      "d", org.apache.spark.sql.types.DoubleType)))
  withTempView("v1", "v2") {
    spark.createDataFrame(negZeroRow, schema)
      .createTempView("v1")
    spark.createDataFrame(posZeroRow, schema)
      .createTempView("v2")

    def bits(d: Double): String =
      java.lang.Double.doubleToRawLongBits(d).toHexString

    val sb = new StringBuilder
    import com.nvidia.spark.rapids.{GpuScalar, Arm}
    Arm.withResource(
      GpuScalar((-0.0).asInstanceOf[Any],
        org.apache.spark.sql.types.DoubleType)
    ) { gs =>
      val s = gs.getBase
      sb.append(s"gpuscalar=${bits(s.getDouble)} ")
    }

    val tests = Seq(
      "plain_v1" -> "SELECT d FROM v1",
      "subq_no_join" -> "SELECT (SELECT d FROM v1)",
      "join_no_subq" ->
        "SELECT v1.d FROM v1 JOIN v2 ON v1.d = v2.d",
      "subq_with_join" -> ("SELECT (SELECT v1.d " +
        "FROM v1 JOIN v2 ON v1.d = v2.d)")
    )
    tests.foreach { case (name, query) =>
      val d = sql(query).collect().head.getDouble(0)
      sb.append(s"$name=${bits(d)} ")
    }

    val msg = sb.toString
    assert(msg.contains("gpuscalar=8000000000000000"),
      s"GpuScalar lost -0.0. All results: $msg")
    assert(msg.contains("subq_no_join=8000000000000000"),
      s"subq_no_join lost -0.0. All results: $msg")
    assert(msg.contains("subq_with_join=8000000000000000"),
      s"subq_with_join lost -0.0. All results: $msg")
  }
}

Test output was:

cudf_scalar=0 cudf_col=0 plain_v1=8000000000000000 subq_no_join=0 join_no_subq=8000000000000000 subq_with_join=0
| Test | Bits | Meaning |
| --- | --- | --- |
| cudf_scalar | 0 | Scalar.fromDouble(-0.0) → 0.0, sign bit lost |
| plain_v1 | 8000000000000000 | Direct SELECT via column path, -0.0 preserved |
| subq_no_join | 0 | Scalar subquery (no join), goes through GpuScalar → sign bit lost |
| join_no_subq | 8000000000000000 | Join without scalar subquery, -0.0 preserved |
| subq_with_join | 0 | Scalar subquery + join, same GpuScalar path → sign bit lost |

==================================

A more dedicated test:

NegativeZeroScalarRepro.java

import ai.rapids.cudf.ColumnVector;
import ai.rapids.cudf.Scalar;

/**
 * Standalone reproducer: cuDF Scalar.fromDouble(-0.0) loses the IEEE 754 sign bit.
 *
 * Run with:
 *   javac -cp <rapids-uber-jar> NegativeZeroScalarRepro.java
 *   java  -cp .:<rapids-uber-jar> NegativeZeroScalarRepro
 *
 * Expected: Scalar path should preserve -0.0 (sign bit = 0x8000000000000000)
 * Actual:   Scalar path normalizes -0.0 to 0.0 (sign bit lost)
 *
 * See: https://github.com/NVIDIA/spark-rapids/issues/14116
 */
public class NegativeZeroScalarRepro {
  public static void main(String[] args) {
    System.out.println("=== cuDF Scalar -0.0 sign bit reproducer ===\n");
    int failures = 0;

    // --- Double tests ---
    System.out.println("--- Double (-0.0d) ---");

    // Path 1: Scalar.fromDouble — this is the buggy path
    try (Scalar s = Scalar.fromDouble(-0.0d)) {
      long bits = Double.doubleToRawLongBits(s.getDouble());
      boolean preserved = bits == 0x8000000000000000L;
      System.out.printf("  Scalar.fromDouble(-0.0):        bits=0x%016x  %s%n",
          bits, preserved ? "PASS (sign bit preserved)" : "FAIL (sign bit lost)");
      if (!preserved) failures++;
    }

    // Path 2: ColumnVector — this preserves the bit pattern
    try (ColumnVector cv = ColumnVector.fromDoubles(-0.0d);
         Scalar s = cv.getScalarElement(0)) {
      long bits = Double.doubleToRawLongBits(s.getDouble());
      boolean preserved = bits == 0x8000000000000000L;
      System.out.printf("  ColumnVector(-0.0).getScalar:   bits=0x%016x  %s%n",
          bits, preserved ? "PASS (sign bit preserved)" : "FAIL (sign bit lost)");
      if (!preserved) failures++;
    }

    // --- Float tests ---
    System.out.println("\n--- Float (-0.0f) ---");

    // Path 1: Scalar.fromFloat
    try (Scalar s = Scalar.fromFloat(-0.0f)) {
      int bits = Float.floatToRawIntBits(s.getFloat());
      boolean preserved = bits == 0x80000000;
      System.out.printf("  Scalar.fromFloat(-0.0f):        bits=0x%08x  %s%n",
          bits, preserved ? "PASS (sign bit preserved)" : "FAIL (sign bit lost)");
      if (!preserved) failures++;
    }

    // Path 2: ColumnVector
    try (ColumnVector cv = ColumnVector.fromFloats(-0.0f);
         Scalar s = cv.getScalarElement(0)) {
      int bits = Float.floatToRawIntBits(s.getFloat());
      boolean preserved = bits == 0x80000000;
      System.out.printf("  ColumnVector(-0.0f).getScalar:  bits=0x%08x  %s%n",
          bits, preserved ? "PASS (sign bit preserved)" : "FAIL (sign bit lost)");
      if (!preserved) failures++;
    }

    // --- Positive zero sanity check ---
    System.out.println("\n--- Positive zero (sanity check) ---");
    try (Scalar s = Scalar.fromDouble(0.0d)) {
      long bits = Double.doubleToRawLongBits(s.getDouble());
      boolean ok = bits == 0L;
      System.out.printf("  Scalar.fromDouble(0.0):         bits=0x%016x  %s%n",
          bits, ok ? "PASS" : "FAIL");
      if (!ok) failures++;
    }
    try (Scalar s = Scalar.fromFloat(0.0f)) {
      int bits = Float.floatToRawIntBits(s.getFloat());
      boolean ok = bits == 0;
      System.out.printf("  Scalar.fromFloat(0.0f):         bits=0x%08x  %s%n",
          bits, ok ? "PASS" : "FAIL");
      if (!ok) failures++;
    }

    // --- User-visible impact ---
    System.out.println("\n--- User-visible impact (string representation) ---");
    System.out.printf("  Java Double.toString(-0.0): \"%s\"%n", Double.toString(-0.0));
    System.out.printf("  Java Double.toString(0.0):  \"%s\"%n", Double.toString(0.0));
    System.out.println("  => cast(-0.0 as string) would return \"0.0\" on GPU"
        + " instead of \"-0.0\" on CPU");

    System.out.printf("%n=== Result: %d failure(s) ===%n", failures);
    System.exit(failures > 0 ? 1 : 0);
  }
}

Repro:

# first compile
javac -cp <PATH_TO>/rapids-4-spark_2.12-26.04.0-SNAPSHOT-cuda12.jar NegativeZeroScalarRepro.java
...
# run
RAPIDS_JAR=<PATH_TO>/rapids-4-spark_2.12-26.04.0-SNAPSHOT-cuda12.jar && SLF4J_JAR=<PATH_TO>/org/slf4j/slf4j-api/1.7.36/slf4j-api-1.7.36.jar && java -cp .:${RAPIDS_JAR}:${SLF4J_JAR} NegativeZeroScalarRepro

==================================================

=== cuDF Scalar -0.0 sign bit reproducer ===

--- Double (-0.0d) ---
  Scalar.fromDouble(-0.0):        bits=0x0000000000000000  FAIL (sign bit lost)
  ColumnVector(-0.0).getScalar:   bits=0x8000000000000000  PASS (sign bit preserved)

--- Float (-0.0f) ---
  Scalar.fromFloat(-0.0f):        bits=0x00000000  FAIL (sign bit lost)
  ColumnVector(-0.0f).getScalar:  bits=0x80000000  PASS (sign bit preserved)

--- Positive zero (sanity check) ---
  Scalar.fromDouble(0.0):         bits=0x0000000000000000  PASS
  Scalar.fromFloat(0.0f):         bits=0x00000000  PASS

--- User-visible impact (string representation) ---
  Java Double.toString(-0.0): "-0.0"
  Java Double.toString(0.0):  "0.0"
  => cast(-0.0 as string) would return "0.0" on GPU instead of "-0.0" on CPU

=== Result: 2 failure(s) ===
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.

I didn't file an issue against cuDF, as this 0.0 issue in this use case is more specific to Spark, but I can file one if you think it's needed.

  1. I updated the logic to use the ColumnVector workaround only for -0.0, for performance. All other values continue to use the fast Scalar.fromDouble/Scalar.fromFloat path.

@wjxiz1992 wjxiz1992 self-assigned this Mar 12, 2026
Collaborator

revans2 commented Mar 12, 2026

@wjxiz1992 I think you might have misunderstood my comments. -0.0 vs 0.0 is a very minor thing; it feels very inconsequential to have a scalar value be different. I get that we want to match all of Spark's unit tests, but if it is going to make the code less maintainable and possibly slower, then we need to think about the cost of being 100% compatible with Spark and decide if it is worth it. We also need to think about the change we are making and ask ourselves if this is the right place to make it. Here we are making multiple tiny memory allocations and copies so that we can preserve the -0.0. That is not worth it to me, which is why I asked where the error came from. We are not explicitly normalizing -0.0 to 0.0 in the code you pointed to, so I want to understand if there is something we can do there that would let us fix this in a much more efficient way. I also want to understand who added the test and why they added it. What possible case exists that we need to preserve this, especially when Spark strips it out in so many situations?

For me I am +1 if we can do this in a way that does not cause any more GPU memory allocations or GPU data movement, or make the code more difficult to work with. I am -1 in all other cases unless we can prove that this is critical to some real world situation.

Collaborator Author

wjxiz1992 commented Mar 13, 2026

@revans2 I see, thanks for the nice explanation!

  1. I've corrected my earlier assumption that "correctness (consistency with Spark) > performance" always holds; it really depends.
  2. Yes, the JNI code doesn't explicitly normalize it (the sign bit loss is just a side effect). makeFloat64Scalar creates a cudf::scalar_type_t<double> and calls set_value(static_cast<double>(value)). No explicit normalization here — the jdouble → double cast preserves the sign bit. Then in the cuDF C++ layer (cudf/cpp/src/scalar/scalar.cpp), fixed_width_scalar::set_value calls _data.set_value_async(value, stream), where _data is an rmm::device_scalar. This performs a host-to-device copy. The sign bit is lost somewhere in this RMM path — neither the JNI nor the cuDF C++ code explicitly normalizes -0.0.
  3. Let me file a cuDF issue for more investigation and discussion; this tiny unit test doesn't justify the GPU memory costs added in this PR.
  4. All [AutoSparkUT]-prefixed PRs originate from test-coverage concerns raised from up high; Mahone then introduced this "RapidsTest-extends" approach to migrate all Spark unit tests. And no, it's not from any actual customer use case — you can regard it as just a KPI ("how many Spark unit tests have been migrated"). All problematic tests are currently excluded in this file, and my job is to remove them from exclusion. I'll be more cautious so this doesn't hurt the spark-rapids project.

@wjxiz1992
Collaborator Author

@revans2 Update: I traced the root cause down to the RMM layer.

Root cause: rmm::device_uvector::set_element_async (device_uvector.hpp L222-226) has a zero-optimization that uses cudaMemsetAsync when value == value_type{0}. Since IEEE 754 defines -0.0 == 0.0 as true, -0.0 triggers the memset path, which clears all bits including the sign bit.

cudaMemcpy(-0.0):   bits=0x8000000000000000  PASS (sign bit preserved)
cudaMemset(0):      bits=0x0000000000000000  FAIL (sign bit lost)
-0.0 == 0.0 ? TRUE (IEEE 754)

The call chain is: Scalar.fromDouble → JNI makeFloat64Scalar → cudf::fixed_width_scalar::set_value → rmm::device_scalar::set_value_async → device_uvector::set_element_async → hits the == 0 optimization → cudaMemsetAsync → sign bit gone.

Filed as rapidsai/rmm#2298 with a CUDA reproducer and suggested fix (use memcmp instead of == for floating-point zero detection).

For this PR: since the proper fix belongs in RMM (zero GPU memory overhead once fixed there), I'll keep the exclude in place and revisit after the RMM fix lands.
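The suggested fix (detect zero by comparing raw bytes, memcmp-style, instead of using ==) can be sketched in Java terms — isZeroByBits below is a hypothetical helper mirroring the approach, not the actual RMM code:

```java
public class ZeroDetection {
  // Buggy-style check: -0.0 also matches, so a memset-style fast path
  // would wrongly clear the sign bit.
  static boolean isZeroByEquality(double v) {
    return v == 0.0d;
  }

  // memcmp-style check: compares the raw bit pattern, so only +0.0 matches
  // and -0.0 falls through to the ordinary copy path.
  static boolean isZeroByBits(double v) {
    return Double.doubleToRawLongBits(v) == 0L;
  }

  public static void main(String[] args) {
    System.out.println(isZeroByEquality(-0.0d)); // true  -> -0.0 takes the fast path
    System.out.println(isZeroByBits(-0.0d));     // false -> -0.0 is copied verbatim
    System.out.println(isZeroByBits(0.0d));      // true
  }
}
```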

The root cause (RMM set_element_async normalizing -0.0 to +0.0 via
cudaMemsetAsync) has been fixed upstream in rapidsai/rmm#2302
(commit 06c3562). The fix is now included in the spark-rapids-jni
nightly SNAPSHOT via the updated cudf-pins RMM pin.

Remove the ColumnVector-based workaround in GpuScalar and restore the
original direct Scalar.fromDouble/fromFloat calls. The exclusion
removal in RapidsTestSettings (from the prior commit) is retained —
the test now passes with the upstream fix alone.

Verified: RapidsSQLQuerySuite 234 tests, 0 failures, 0 errors.
Signed-off-by: Allen Xu <allxu@nvidia.com>
Made-with: Cursor
@wjxiz1992 wjxiz1992 changed the title [AutoSparkUT] Fix GpuScalar to preserve -0.0 for float/double (issue #14116) [AutoSparkUT] Re-enable 'normalize special floating numbers in subquery' test (issue #14116) Mar 18, 2026
@wjxiz1992
Collaborator Author

build

@wjxiz1992
Collaborator Author

@revans2 Hi Bobby, this PR has been updated based on your feedback:

  • The workaround is completely removed. The root cause was fixed upstream in Remove zero-value special casing in set_element_async to preserve IEEE 754 -0.0 rapidsai/rmm#2302 (the cudaMemsetAsync zero-optimization in device_uvector::set_element_async that was clearing the sign bit of -0.0). That fix is now included in the spark-rapids-jni nightly SNAPSHOT.
  • Zero production code changes — the literals.scala file is byte-for-byte identical to the base branch. No extra GPU memory allocations, no ColumnVector workaround, no performance cost.
  • The only change is removing one .exclude() line in RapidsTestSettings.scala to re-enable the test, which now passes with the upstream fix alone.
  • Test verified: RapidsSQLQuerySuite 234 tests, 0 failures, 0 errors.

Could you take another look when you get a chance? Thanks!

@wjxiz1992
Collaborator Author

build

@wjxiz1992 wjxiz1992 merged commit eb541ca into NVIDIA:main Mar 24, 2026
47 checks passed
@sameerz sameerz added the test Only impacts tests label Mar 25, 2026
wjxiz1992 added a commit to wjxiz1992/spark-rapids that referenced this pull request Mar 30, 2026
The stash pop three-way merge re-introduced exclusions for NVIDIA#14098,
NVIDIA#14110, and NVIDIA#14116 that were already removed by merged PRs NVIDIA#14446,
NVIDIA#14398, and NVIDIA#14400. Remove them to match origin/main.

Signed-off-by: Allen Xu <allxu@nvidia.com>
Made-with: Cursor

Labels

test Only impacts tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[AutoSparkUT]"normalize special floating numbers in subquery" in SQLQuerySuite failed

5 participants