
Build and test with CUDA 13.0.0 #19768

Merged

rapids-bot[bot] merged 15 commits into rapidsai:branch-25.10 from jameslamb:cuda-13.0.0 on Aug 28, 2025
Conversation

@jameslamb (Member) commented Aug 21, 2025

Contributes to rapidsai/build-planning#208

Contributes to rapidsai/build-planning#68

  • updates CUDA 13 dependencies in the fallback entries of dependencies.yaml matrices (i.e., the ones that get written to pyproject.toml in source control)

Notes for Reviewers

This switches GitHub Actions workflows to the cuda13.0 branch from here: rapidsai/shared-workflows#413

A future round of PRs will revert that back to branch-25.10, once all of RAPIDS supports CUDA 13.

This has dependencies; these need to be merged first:

@jameslamb added the non-breaking (Non-breaking change) and improvement (Improvement / enhancement to an existing function) labels Aug 21, 2025
@copy-pr-bot (bot) commented Aug 21, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@jameslamb (Member Author)

Expecting this to fail until we have dask-cuda packages (rapidsai/dask-cuda#1536), but triggering CI to get some early feedback on the other changes.

@jameslamb (Member Author)

/ok to test

@jameslamb (Member Author)

Builds are failing with the issue described here: #19710

  [55/542] Building CXX object cudf-cpp/CMakeFiles/cudf.dir/src/binaryop/binaryop.cpp.o
  FAILED: [code=1] cudf-cpp/CMakeFiles/cudf.dir/src/binaryop/binaryop.cpp.o
  sccache /opt/rh/gcc-toolset-14/root/usr/bin/g++  -pthread -DBS_THREAD_POOL_ENABLE_PAUSE=1 -DCCCL_AVOID_SORT_UNROLL=1 -DCCCL_DISABLE_PDL -DCUB_DISABLE_NAMESPACE_MAGIC -DCUB_IGNORE_NAMESPACE_MAGIC_ERROR -DCUDF_KVIKIO_REMOTE_IO -DCUDF_LOG_ACTIVE_LEVEL=RAPIDS_LOGGER_LOG_LEVEL_INFO -DJITIFY_PRINT_LOG=0 -DKVIKIO_CUDA_FOUND -DKVIKIO_CUFILE_BATCH_API_FOUND -DKVIKIO_CUFILE_FOUND -DKVIKIO_CUFILE_STREAM_API_FOUND -DKVIKIO_CUFILE_VERSION_API_FOUND -DKVIKIO_LIBCURL_FOUND -DLIBCUDACXX_ENABLE_EXPERIMENTAL_MEMORY_RESOURCE -DTHRUST_DEVICE_SYSTEM=THRUST_DEVICE_SYSTEM_CUDA -DTHRUST_DISABLE_ABI_NAMESPACE -DTHRUST_FORCE_32_BIT_OFFSET_TYPE=1 -DTHRUST_HOST_SYSTEM=THRUST_HOST_SYSTEM_CPP -DTHRUST_IGNORE_ABI_NAMESPACE_ERROR -DZSTD_STATIC_LINKING_ONLY=0N -Dcudf_EXPORTS -D_FILE_OFFSET_BITS=64 -I/__w/cudf/cudf/python/libcudf/build/py3-none-linux_aarch64/_deps/dlpack-src/include -I/__w/cudf/cudf/python/libcudf/build/py3-none-linux_aarch64/_deps/jitify-src -I/__w/cudf/cudf/cpp/include -I/__w/cudf/cudf/python/libcudf/build/py3-none-linux_aarch64/cudf-cpp/include -I/__w/cudf/cudf/cpp/src -I/__w/cudf/cudf/python/libcudf/build/py3-none-linux_aarch64/_deps/nanoarrow-src/src -I/__w/cudf/cudf/python/libcudf/build/py3-none-linux_aarch64/_deps/flatbuffers-src/include -I/__w/cudf/cudf/python/libcudf/build/py3-none-linux_aarch64/_deps/zstd-src/lib -I/__w/cudf/cudf/python/libcudf/build/py3-none-linux_aarch64/_deps/cccl-src/lib/cmake/thrust/../../../thrust -I/__w/cudf/cudf/python/libcudf/build/py3-none-linux_aarch64/_deps/cccl-src/lib/cmake/libcudacxx/../../../libcudacxx/include -I/__w/cudf/cudf/python/libcudf/build/py3-none-linux_aarch64/_deps/cccl-src/lib/cmake/cub/../../../cub -I/__w/cudf/cudf/python/libcudf/build/py3-none-linux_aarch64/_deps/cuco-src/include -I/__w/cudf/cudf/python/libcudf/build/py3-none-linux_aarch64/_deps/nanoarrow-build/src -I/__w/cudf/cudf/python/libcudf/build/py3-none-linux_aarch64/_deps/zstd-src/build/cmake/../../lib -isystem /pyenv/versions/3.13.6/lib/python3.13/site-packages/rapids_logger/include -isystem /pyenv/versions/3.13.6/lib/python3.13/site-packages/librmm/include -isystem /usr/local/cuda/targets/sbsa-linux/include -isystem /usr/local/cuda/targets/sbsa-linux/include/cccl -isystem /pyenv/versions/3.13.6/lib/python3.13/site-packages/libkvikio/include -isystem /__w/cudf/cudf/python/libcudf/build/py3-none-linux_aarch64/_deps/nvcomp_proprietary_binary-src/include -O3 -DNDEBUG -std=gnu++20 -fPIC -fvisibility=hidden -Wall -Werror -Wno-unknown-pragmas -Wno-error=deprecated-declarations -MD -MT cudf-cpp/CMakeFiles/cudf.dir/src/binaryop/binaryop.cpp.o -MF cudf-cpp/CMakeFiles/cudf.dir/src/binaryop/binaryop.cpp.o.d -o cudf-cpp/CMakeFiles/cudf.dir/src/binaryop/binaryop.cpp.o -c /__w/cudf/cudf/cpp/src/binaryop/binaryop.cpp
  In file included from /__w/cudf/cudf/cpp/src/jit/cache.hpp:23,
                   from /__w/cudf/cudf/cpp/src/binaryop/binaryop.cpp:21:
  /__w/cudf/cudf/python/libcudf/build/py3-none-linux_aarch64/_deps/jitify-src/jitify2.hpp: In member function ‘const char* jitify2::LibNvJitLink::get_error_string(nvJitLinkResult) const’:
  /__w/cudf/cudf/python/libcudf/build/py3-none-linux_aarch64/_deps/jitify-src/jitify2.hpp:1877:12: error: enumeration value ‘NVJITLINK_ERROR_NULL_INPUT’ not handled in switch [-Werror=switch]
   1877 |     switch (result) {
        |            ^
  /__w/cudf/cudf/python/libcudf/build/py3-none-linux_aarch64/_deps/jitify-src/jitify2.hpp:1877:12: error: enumeration value ‘NVJITLINK_ERROR_INCOMPATIBLE_OPTIONS’ not handled in switch [-Werror=switch]
  /__w/cudf/cudf/python/libcudf/build/py3-none-linux_aarch64/_deps/jitify-src/jitify2.hpp:1877:12: error: enumeration value ‘NVJITLINK_ERROR_INCORRECT_INPUT_TYPE’ not handled in switch [-Werror=switch]
  /__w/cudf/cudf/python/libcudf/build/py3-none-linux_aarch64/_deps/jitify-src/jitify2.hpp:1877:12: error: enumeration value ‘NVJITLINK_ERROR_ARCH_MISMATCH’ not handled in switch [-Werror=switch]
  /__w/cudf/cudf/python/libcudf/build/py3-none-linux_aarch64/_deps/jitify-src/jitify2.hpp:1877:12: error: enumeration value ‘NVJITLINK_ERROR_OUTDATED_LIBRARY’ not handled in switch [-Werror=switch]
  /__w/cudf/cudf/python/libcudf/build/py3-none-linux_aarch64/_deps/jitify-src/jitify2.hpp:1877:12: error: enumeration value ‘NVJITLINK_ERROR_MISSING_FATBIN’ not handled in switch [-Werror=switch]
  /__w/cudf/cudf/python/libcudf/build/py3-none-linux_aarch64/_deps/jitify-src/jitify2.hpp:1877:12: error: enumeration value ‘NVJITLINK_ERROR_UNRECOGNIZED_ARCH’ not handled in switch [-Werror=switch]
  /__w/cudf/cudf/python/libcudf/build/py3-none-linux_aarch64/_deps/jitify-src/jitify2.hpp:1877:12: error: enumeration value ‘NVJITLINK_ERROR_UNSUPPORTED_ARCH’ not handled in switch [-Werror=switch]
  /__w/cudf/cudf/python/libcudf/build/py3-none-linux_aarch64/_deps/jitify-src/jitify2.hpp:1877:12: error: enumeration value ‘NVJITLINK_ERROR_LTO_NOT_ENABLED’ not handled in switch [-Werror=switch]
  cc1plus: all warnings being treated as errors

(build link)
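
For reference, this is the standard -Wswitch failure mode: jitify2.hpp switches exhaustively over nvJitLinkResult with no `default:` label, the CUDA 13 headers add new error enumerators, and because the build uses -Wall -Werror the unhandled values become hard errors. A minimal sketch of the same pattern, using a hypothetical enum rather than the real nvJitLink header:

```cpp
// Minimal sketch (hypothetical enum, not the real nvJitLink header) of the
// failure above. The switch has no `default:` label, so g++'s -Wswitch checks
// that every enumerator is handled; with -Werror, an enumerator added by a
// newer header turns into a build failure.
enum class LinkResult { Success, OutOfMemory, NullInput };  // NullInput stands in for a value new in CUDA 13

const char* to_string(LinkResult r) {
  switch (r) {
    // g++ -Wall -Werror: "enumeration value 'NullInput' not handled in switch [-Werror=switch]"
    case LinkResult::Success: return "success";
    case LinkResult::OutOfMemory: return "out of memory";
  }
  return "unknown";  // fallback so the function still returns for unhandled values
}
```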

rapids-bot bot pushed a commit that referenced this pull request Aug 26, 2025
Splitting some changes off of the CUDA 13 support PR (#19768) ... that has gotten too large to review.

Contributes to rapidsai/build-planning#208

* uses the new `[cu12, cu13]` extras added to `dask-cuda` for wheels: rapidsai/dask-cuda#1536
* replaces hard-coding of CUDA major version in `pandas` diff script
* moves `numba-cuda` floor from `>=0.19.0` to `>=0.19.1`
* consolidates some dependency lists with unnecessary `cuda: "12.*"` filters

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Kyle Edwards (https://github.com/KyleFromNVIDIA)

URL: #19794
@jameslamb (Member Author) commented Aug 26, 2025

Problem 1: failing C++ test


The STREAM_IDENTIFICATION_TEST C++ test is failing with CUDA 13 (and only CUDA 13) like this:

 54/110 Test  #73: STREAM_IDENTIFICATION_TEST .......***Failed    0.77 sec
'./../../..//bin/gtests/libcudf/STREAM_IDENTIFICATION_TEST'
The kernel ran!
terminate called after throwing an instance of 'std::runtime_error'
  what():  No exception raised for kernel on default stream!
CMake Error at run_gpu_test.cmake:35 (execute_process):
  execute_process failed command indexes:

    1: "Abnormal exit with child return code: Subprocess aborted"
...
The following tests FAILED:
	 73 - STREAM_IDENTIFICATION_TEST (Failed)
Errors while running CTest

(conda-cpp-tests link)

Fixed by #19807

Problem 2: conda-python-other-tests failing with dask-cuda-related issues


CUDA 13 Python tests (and only CUDA 13) are failing like this:

Unable to start CUDA Context
Traceback (most recent call last):
  File "/opt/conda/envs/test/lib/python3.12/site-packages/dask_cuda/initialize.py", line 104, in _create_cuda_context
    ucx_implementation = _get_active_ucx_implementation_name(protocol)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/test/lib/python3.12/site-packages/dask_cuda/utils.py", line 1028, in _get_active_ucx_implementation_name
    raise ValueError("Protocol is neither UCXX nor UCX-Py")
ValueError: Protocol is neither UCXX nor UCX-Py

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/envs/test/lib/python3.12/site-packages/dask_cuda/initialize.py", line 48, in _warn_generic
    if not distributed.comm.ucx.cuda_context_created.has_context:
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'bool' object has no attribute 'has_context'

(conda-python-other-tests)

It looks like this might be a known issue in dask-cuda, but I'm not sure.

The dask-cuda issues weren't actually causing the test failures, but they were fixed anyway by rapidsai/dask-cuda#1541

Problem 3: conda-python-cudf-tests failing somewhere in numba JIT


Seeing lots of Python tests failing like this:

______________________________ test_transform_udf ______________________________
[gw7] linux -- Python 3.13.5 /opt/conda/envs/test/bin/python3.13
Traceback (most recent call last):
  File "/opt/conda/envs/test/lib/python3.13/site-packages/_pytest/runner.py", line 344, in from_call
    result: TResult | None = func()
                             ~~~~^^
...
  File "/opt/conda/envs/test/lib/python3.13/site-packages/numba_cuda/numba/cuda/cudadrv/libs.py", line 44, in open_libdevice
    with open(get_libdevice(), "rb") as bcfile:
         ~~~~^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/opt/conda/envs/test/nvvm/libdevice/libdevice.10.bc'
Full traceback:
______________________________ test_transform_udf ______________________________
[gw7] linux -- Python 3.13.5 /opt/conda/envs/test/bin/python3.13
Traceback (most recent call last):
  File "/opt/conda/envs/test/lib/python3.13/site-packages/_pytest/runner.py", line 344, in from_call
    result: TResult | None = func()
                             ~~~~^^
  File "/opt/conda/envs/test/lib/python3.13/site-packages/_pytest/runner.py", line 246, in <lambda>
    lambda: runtest_hook(item=item, **kwds), when=when, reraise=reraise
            ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/test/lib/python3.13/site-packages/pluggy/_hooks.py", line 512, in __call__
    return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
           ~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/test/lib/python3.13/site-packages/pluggy/_manager.py", line 120, in _hookexec
    return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
           ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/test/lib/python3.13/site-packages/pluggy/_callers.py", line 167, in _multicall
    raise exception
  File "/opt/conda/envs/test/lib/python3.13/site-packages/pluggy/_callers.py", line 139, in _multicall
    teardown.throw(exception)
    ~~~~~~~~~~~~~~^^^^^^^^^^^
  File "/opt/conda/envs/test/lib/python3.13/site-packages/_pytest/logging.py", line 850, in pytest_runtest_call
    yield
  File "/opt/conda/envs/test/lib/python3.13/site-packages/pluggy/_callers.py", line 139, in _multicall
    teardown.throw(exception)
    ~~~~~~~~~~~~~~^^^^^^^^^^^
  File "/opt/conda/envs/test/lib/python3.13/site-packages/pluggy/_callers.py", line 53, in run_old_style_hookwrapper
    return result.get_result()
           ~~~~~~~~~~~~~~~~~^^
  File "/opt/conda/envs/test/lib/python3.13/site-packages/pluggy/_result.py", line 103, in get_result
    raise exc.with_traceback(tb)
  File "/opt/conda/envs/test/lib/python3.13/site-packages/pluggy/_callers.py", line 38, in run_old_style_hookwrapper
    res = yield
          ^^^^^
  File "/opt/conda/envs/test/lib/python3.13/site-packages/pluggy/_callers.py", line 139, in _multicall
    teardown.throw(exception)
    ~~~~~~~~~~~~~~^^^^^^^^^^^
  File "/opt/conda/envs/test/lib/python3.13/site-packages/_pytest/capture.py", line 900, in pytest_runtest_call
    return (yield)
            ^^^^^
  File "/opt/conda/envs/test/lib/python3.13/site-packages/pluggy/_callers.py", line 139, in _multicall
    teardown.throw(exception)
    ~~~~~~~~~~~~~~^^^^^^^^^^^
  File "/opt/conda/envs/test/lib/python3.13/site-packages/pluggy/_callers.py", line 53, in run_old_style_hookwrapper
    return result.get_result()
           ~~~~~~~~~~~~~~~~~^^
  File "/opt/conda/envs/test/lib/python3.13/site-packages/pluggy/_result.py", line 103, in get_result
    raise exc.with_traceback(tb)
  File "/opt/conda/envs/test/lib/python3.13/site-packages/pluggy/_callers.py", line 38, in run_old_style_hookwrapper
    res = yield
          ^^^^^
  File "/opt/conda/envs/test/lib/python3.13/site-packages/pluggy/_callers.py", line 139, in _multicall
    teardown.throw(exception)
    ~~~~~~~~~~~~~~^^^^^^^^^^^
  File "/opt/conda/envs/test/lib/python3.13/site-packages/_pytest/skipping.py", line 263, in pytest_runtest_call
    return (yield)
            ^^^^^
  File "/opt/conda/envs/test/lib/python3.13/site-packages/pluggy/_callers.py", line 121, in _multicall
    res = hook_impl.function(*args)
  File "/opt/conda/envs/test/lib/python3.13/site-packages/_pytest/runner.py", line 178, in pytest_runtest_call
    item.runtest()
    ~~~~~~~~~~~~^^
  File "/opt/conda/envs/test/lib/python3.13/site-packages/_pytest/python.py", line 1671, in runtest
    self.ihook.pytest_pyfunc_call(pyfuncitem=self)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/test/lib/python3.13/site-packages/pluggy/_hooks.py", line 512, in __call__
    return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
           ~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/test/lib/python3.13/site-packages/pluggy/_manager.py", line 120, in _hookexec
    return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
           ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/test/lib/python3.13/site-packages/pluggy/_callers.py", line 167, in _multicall
    raise exception
  File "/opt/conda/envs/test/lib/python3.13/site-packages/pluggy/_callers.py", line 121, in _multicall
    res = hook_impl.function(*args)
  File "/opt/conda/envs/test/lib/python3.13/site-packages/_pytest/python.py", line 157, in pytest_pyfunc_call
    result = testfunction(**testargs)
  File "/__w/cudf/cudf/python/pylibcudf/tests/test_transform.py", line 87, in test_transform_udf
    ptx, _ = cuda.compile_ptx_for_current_device(
             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        op, (numba.float64, numba.float64, numba.float64), device=True
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/opt/conda/envs/test/lib/python3.13/site-packages/numba_cuda/numba/cuda/compiler.py", line 880, in compile_ptx_for_current_device
    return compile_ptx(
        pyfunc,
    ...<10 lines>...
        launch_bounds=launch_bounds,
    )
  File "/opt/conda/envs/test/lib/python3.13/site-packages/numba_cuda/numba/cuda/compiler.py", line 847, in compile_ptx
    return compile(
        pyfunc,
    ...<11 lines>...
        launch_bounds=launch_bounds,
    )
  File "/opt/conda/envs/test/lib/python3.13/site-packages/numba/core/compiler_lock.py", line 35, in _acquire_compile_lock
    return func(*args, **kwargs)
  File "/opt/conda/envs/test/lib/python3.13/site-packages/numba_cuda/numba/cuda/compiler.py", line 790, in compile
    code = lib.get_asm_str(cc=cc)
  File "/opt/conda/envs/test/lib/python3.13/site-packages/numba_cuda/numba/cuda/codegen.py", line 224, in get_asm_str
    ptx = nvvm.compile_ir(irs, **options)
  File "/opt/conda/envs/test/lib/python3.13/site-packages/numba_cuda/numba/cuda/cudadrv/nvvm.py", line 639, in compile_ir
    libdevice = LibDevice()
  File "/opt/conda/envs/test/lib/python3.13/site-packages/numba_cuda/numba/cuda/cudadrv/nvvm.py", line 355, in __init__
    self._cache_ = open_libdevice()
                   ~~~~~~~~~~~~~~^^
  File "/opt/conda/envs/test/lib/python3.13/site-packages/numba_cuda/numba/cuda/cudadrv/libs.py", line 44, in open_libdevice
    with open(get_libdevice(), "rb") as bcfile:
         ~~~~^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/opt/conda/envs/test/nvvm/libdevice/libdevice.10.bc'

The logs were too large to render in my browser... you might want to look at the raw logs

Temporarily fixed by adding a dependency on cuda-nvvm-tools for the cudf conda packages here. When NVIDIA/numba-cuda#430 is fixed, we can remove that dependency and update our numba-cuda floors across RAPIDS.

Problem 4: dask-cudf failing with numpy issues


Seeing CUDA 12 and CUDA 13 dask-cudf tests failing like this:

...
ERROR io/tests/test_csv.py - ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
ERROR io/tests/test_json.py - ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
ERROR io/tests/test_orc.py - ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
ERROR io/tests/test_parquet.py - ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
ERROR io/tests/test_s3.py - ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
... many more ...

(wheel-test-dask-cudf link)

Note that both wheel-test-dask-cudf jobs use the "oldest" dependency set. Maybe that's related.

We were constraining pandas but not numpy for the oldest-deps jobs for dask-cudf tests, and ended up with an incompatible mix (numpy 2.0.2, pandas 2.0.3). Fixed in #19806, whose CI will be passing once #19821 is in.
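
For context (a hypothetical sketch, not numpy's actual code), the "numpy.dtype size changed" message is the classic symptom of a compiled extension baking in a struct size from the headers it was built against and comparing it with the size reported by the library installed at runtime, roughly like this:

```cpp
// Hypothetical sketch of the class of check behind "numpy.dtype size changed":
// the extension records sizeof() from the headers it was compiled against and
// compares it to the size reported by the library loaded at runtime.
#include <cstddef>
#include <cstdio>

struct DTypeOld { long a; long b; };          // layout in the older installed library
struct DTypeNew { long a; long b; long c; };  // layout in the newer build-time headers

// The extension was compiled against the newer headers...
constexpr std::size_t kExpectedSize = sizeof(DTypeNew);

// ...but the library actually installed at runtime still has the old layout.
std::size_t runtime_reported_size() { return sizeof(DTypeOld); }

int main() {
  if (runtime_reported_size() != kExpectedSize) {
    std::fprintf(stderr, "dtype size changed: expected %zu from C header, got %zu at runtime\n",
                 kExpectedSize, runtime_reported_size());
    return 1;  // refuse to load, like the import errors above
  }
  return 0;
}
```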

@jameslamb added the 2 - In Progress (Currently a work in progress) and DO NOT MERGE (Hold off on merging; see PR for details) labels Aug 27, 2025
galipremsagar pushed a commit to galipremsagar/cudf that referenced this pull request Aug 27, 2025
@jameslamb removed the 2 - In Progress and DO NOT MERGE labels Aug 27, 2025
@gforsyth (Contributor) left a comment


This looks solid to me @jameslamb -- I'll leave you an approval so you can merge it in once #19806 lands

rapids-bot bot pushed a commit that referenced this pull request Aug 28, 2025
…ython 12.9.2, cupy 13.6.0, numba 0.60.0) (#19806)

Contributes to rapidsai/build-planning#208

* updates dependency pins:
  - `cuda-python`: >=12.9.2 (CUDA 12)
  - `cupy`: >=13.6.0
  - `numba`: >=0.60.0 (now that NVIDIA/numba-cuda#403 is done)
* ensures that "oldest" `numpy` is pinned in `dask-cudf` tests
  - _the "oldest" pin for `numpy` was previously not used in `dask-cudf` wheel tests, allowing an incompatible mix of packages (`pandas 2.0.3, numpy 2.0.2`) to be installed together_

## Notes for Reviewers

### Why a separate PR?

In #19768 (comment), we saw that this set of dependency changes caused failures like this in CUDA 12 and CUDA 13 environments:

```text
...
ERROR io/tests/test_csv.py - ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
ERROR io/tests/test_json.py - ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
ERROR io/tests/test_orc.py - ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
ERROR io/tests/test_parquet.py - ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
ERROR io/tests/test_s3.py - ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
... many more ...
```

([wheel-test-dask-cudf link](https://github.com/rapidsai/cudf/actions/runs/17249655997/job/48950898976?pr=19768#step:11:11795))

Opening this more narrowly-scoped PR to investigate that.

### How I tested this

First commit here contained some of the dependency changes from #19768 , and those were enough to reproduce the test failures!

https://github.com/rapidsai/cudf/actions/runs/17271893124/job/49021534507?pr=19806#step:11:11928


Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Matthew Murray (https://github.com/Matt711)
  - Gil Forsyth (https://github.com/gforsyth)

URL: #19806
@jameslamb (Member Author)

Thanks very much! I just merged #19806; hopefully that'll be the last thing and we'll be able to merge this as soon as CI passes.

@jameslamb (Member Author)

gahhhh we're so close! There's a new issue here.

Problem 5: JSON wheel test failing (only on CUDA 13 + arm)

One wheel test is failing as follows:

_________ test_write_json_basic[100-source_or_sink1-False-100-stream1] _________
[gw3] linux -- Python 3.13.7 /__w/cudf/cudf/env/bin/python
Traceback (most recent call last):
  File "/__w/cudf/cudf/env/lib/python3.13/site-packages/_pytest/runner.py", line 344, in from_call
    result: TResult | None = func()
                             ~~~~^^
  File "/__w/cudf/cudf/env/lib/python3.13/site-packages/_pytest/runner.py", line 246, in <lambda>
    lambda: runtest_hook(item=item, **kwds), when=when, reraise=reraise
            ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^
  File "/__w/cudf/cudf/env/lib/python3.13/site-packages/pluggy/_hooks.py", line 512, in __call__
    return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
           ~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/cudf/cudf/env/lib/python3.13/site-packages/pluggy/_manager.py", line 120, in _hookexec
    return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
           ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/cudf/cudf/env/lib/python3.13/site-packages/pluggy/_callers.py", line 167, in _multicall
    raise exception
  File "/__w/cudf/cudf/env/lib/python3.13/site-packages/pluggy/_callers.py", line 139, in _multicall
    teardown.throw(exception)
    ~~~~~~~~~~~~~~^^^^^^^^^^^
  File "/__w/cudf/cudf/env/lib/python3.13/site-packages/_pytest/logging.py", line 850, in pytest_runtest_call
    yield
  File "/__w/cudf/cudf/env/lib/python3.13/site-packages/pluggy/_callers.py", line 139, in _multicall
    teardown.throw(exception)
    ~~~~~~~~~~~~~~^^^^^^^^^^^
  File "/__w/cudf/cudf/env/lib/python3.13/site-packages/_pytest/capture.py", line 900, in pytest_runtest_call
    return (yield)
            ^^^^^
  File "/__w/cudf/cudf/env/lib/python3.13/site-packages/pluggy/_callers.py", line 139, in _multicall
    teardown.throw(exception)
    ~~~~~~~~~~~~~~^^^^^^^^^^^
  File "/__w/cudf/cudf/env/lib/python3.13/site-packages/pluggy/_callers.py", line 53, in run_old_style_hookwrapper
    return result.get_result()
           ~~~~~~~~~~~~~~~~~^^
  File "/__w/cudf/cudf/env/lib/python3.13/site-packages/pluggy/_result.py", line 103, in get_result
    raise exc.with_traceback(tb)
  File "/__w/cudf/cudf/env/lib/python3.13/site-packages/pluggy/_callers.py", line 38, in run_old_style_hookwrapper
    res = yield
          ^^^^^
  File "/__w/cudf/cudf/env/lib/python3.13/site-packages/pluggy/_callers.py", line 139, in _multicall
    teardown.throw(exception)
    ~~~~~~~~~~~~~~^^^^^^^^^^^
  File "/__w/cudf/cudf/env/lib/python3.13/site-packages/_pytest/skipping.py", line 263, in pytest_runtest_call
    return (yield)
            ^^^^^
  File "/__w/cudf/cudf/env/lib/python3.13/site-packages/pluggy/_callers.py", line 121, in _multicall
    res = hook_impl.function(*args)
  File "/__w/cudf/cudf/env/lib/python3.13/site-packages/_pytest/runner.py", line 178, in pytest_runtest_call
    item.runtest()
    ~~~~~~~~~~~~^^
  File "/__w/cudf/cudf/env/lib/python3.13/site-packages/_pytest/python.py", line 1671, in runtest
    self.ihook.pytest_pyfunc_call(pyfuncitem=self)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "/__w/cudf/cudf/env/lib/python3.13/site-packages/pluggy/_hooks.py", line 512, in __call__
    return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
           ~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/cudf/cudf/env/lib/python3.13/site-packages/pluggy/_manager.py", line 120, in _hookexec
    return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
           ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/cudf/cudf/env/lib/python3.13/site-packages/pluggy/_callers.py", line 167, in _multicall
    raise exception
  File "/__w/cudf/cudf/env/lib/python3.13/site-packages/pluggy/_callers.py", line 121, in _multicall
    res = hook_impl.function(*args)
  File "/__w/cudf/cudf/env/lib/python3.13/site-packages/_pytest/python.py", line 157, in pytest_pyfunc_call
    result = testfunction(**testargs)
  File "/__w/cudf/cudf/python/pylibcudf/tests/io/test_json.py", line 54, in test_write_json_basic
    assert str_result == pd_result
AssertionError: assert '\x01{"col_in...92.379533}}}]' == '[{"col_int64...92.379533}}}]'

(build link)

I'll go ask for some help, but if we can't fix it quickly I think we should just skip it, write up an issue, and move on.

@jameslamb (Member Author)

Problem 5: JSON wheel test failing (only on CUDA 13 + arm)

@Matt711 informed me that this is a known-to-be-flaky test and sure enough... it passed on a re-run.

@jameslamb (Member Author)

/merge

@rapids-bot rapids-bot bot merged commit 94e0f92 into rapidsai:branch-25.10 Aug 28, 2025
223 of 249 checks passed
@jameslamb (Member Author)

This was a tough one, thanks so much to everyone for helping get this in!!!

@Matt711 @mroeschke @robertmaynard @gforsyth @bdice @brandon-b-miller @davidwendt

@jameslamb jameslamb deleted the cuda-13.0.0 branch August 28, 2025 16:36
rapids-bot bot pushed a commit that referenced this pull request Aug 29, 2025
CUDA 13 support was initially added here in #19768 

During that work, we faced some runtime issues with conda packages that @brandon-b-miller diagnosed as a missing dependency in `numba-cuda` (NVIDIA/numba-cuda#430).

To get past that, we temporarily introduced a runtime dependency on `cuda-nvvm-tools` in this project. That's no longer necessary, thanks to these:

* conda-forge/numba-cuda-feedstock#47
* conda-forge/numba-cuda-feedstock#46

This removes that workaround.

## Notes for Reviewers

### Don't we need to change the `numba-cuda` pin?

No, the fixes are just in new builds of 0.19.1.


Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - https://github.com/brandon-b-miller
  - Jake Awe (https://github.com/AyodeAwe)

URL: #19842
rapids-bot bot pushed a commit to rapidsai/ucxx that referenced this pull request Aug 29, 2025
Contributes to rapidsai/build-planning#208

#489 temporarily removed the `cudf` test-time dependency here, because there weren't yet CUDA 13 `cudf` packages.

Those now exist (rapidsai/cudf#19768), so this restores that dependency.

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Peter Andreas Entschev (https://github.com/pentschev)
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #493
rapids-bot bot pushed a commit to rapidsai/ucx-py that referenced this pull request Aug 29, 2025
Contributes to rapidsai/build-planning#208

#1162 temporarily removed the `cudf` test-time dependency here, because there weren't yet CUDA 13 `cudf` packages.

Those now exist (rapidsai/cudf#19768), so this restores that dependency.

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Peter Andreas Entschev (https://github.com/pentschev)
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #1164
rapids-bot bot pushed a commit to rapidsai/dask-cuda that referenced this pull request Aug 29, 2025
Contributes to rapidsai/build-planning#208

#1536 temporarily removed the `cudf` test-time dependency here, because there weren't yet CUDA 13 `cudf` packages.

Those now exist (rapidsai/cudf#19768), so this restores that dependency.

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - GALI PREM SAGAR (https://github.com/galipremsagar)
  - Vyas Ramasubramani (https://github.com/vyasr)
  - Peter Andreas Entschev (https://github.com/pentschev)

URL: #1544