Build and test with CUDA 13.0.0 #1536

Merged
rapids-bot[bot] merged 11 commits into rapidsai:branch-25.10 from jameslamb:cuda-13.0.0 on Aug 26, 2025

Conversation

@jameslamb
Member

@jameslamb jameslamb commented Aug 19, 2025

Contributes to rapidsai/build-planning#208

Contributes to rapidsai/build-planning#68

  • updates the fallback entries in dependencies.yaml matrices (i.e., the ones that get written to pyproject.toml in source control) to CUDA 13 dependencies; a sketch of the shape of such an entry follows below
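For readers less familiar with that file, here is a minimal hedged sketch of the shape of such an entry (the list key and package names below are placeholders, not the actual diff): entries keyed on `cuda:` serve CUDA-specific CI jobs, while the keyless fallback `matrix:` is what gets rendered into the checked-in pyproject.toml, so bumping the fallback is what moves the in-repo metadata to CUDA 13.

```yaml
# hypothetical dependencies.yaml fragment -- illustrative only
dependencies:
  example_cuda_specific_deps:            # placeholder list name
    specific:
      - output_types: [requirements, pyproject]
        matrices:
          - matrix:
              cuda: "12.*"
            packages:
              - some-dep-cu12==25.10.*   # placeholder packages
          - matrix:
              cuda: "13.*"
            packages:
              - some-dep-cu13==25.10.*
          - matrix:  # fallback for no matrix: this is what lands in pyproject.toml
            packages:
              - some-dep-cu13==25.10.*   # bumping fallbacks like this is the change described above
```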

Notes for Reviewers

This switches GitHub Actions workflows to the cuda13.0 branch from here: rapidsai/shared-workflows#413

A future round of PRs will revert that back to branch-25.10, once all of RAPIDS supports CUDA 13.

@copy-pr-bot

copy-pr-bot bot commented Aug 19, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@jameslamb jameslamb added the non-breaking (Non-breaking change) and improvement (Improvement / enhancement to an existing function) labels Aug 19, 2025
@jameslamb
Member Author

/ok to test

@jameslamb
Member Author

We'll need new ucxx packages to move forward here:

error    libmamba Could not solve for environment specs
    The following packages are incompatible
    ├─ cuda-version =13.0 * is requested and can be installed;
    └─ ucxx =0.46,>=0.0.0a0 * is not installable because it requires
       └─ cuda-version >=12,<13.0a0 *, which conflicts with any installable versions previously reported.

(build link)

Those ucxx packages will in turn need new ucx conda packages on conda-forge and libucx-cu13 wheels from https://github.com/rapidsai/ucx-wheels. I've added tracking items for all of those to rapidsai/build-planning#68.

@bdice
Contributor

bdice commented Aug 21, 2025

/ok to test

Contributor
@bdice bdice left a comment

All seems fine. We may have to skip cuDF tests or request an admin-merge to avoid a circular dependency.

@jameslamb jameslamb changed the title from "WIP: Build and test with CUDA 13.0.0" to "Build and test with CUDA 13.0.0" Aug 21, 2025
@jameslamb jameslamb marked this pull request as ready for review August 21, 2025 19:36
@jameslamb jameslamb requested review from a team as code owners August 21, 2025 19:36
@jameslamb jameslamb requested a review from AyodeAwe August 21, 2025 19:36
@jameslamb
Member Author

Thanks for looking!

> We may have to skip cuDF tests or request an admin-merge to avoid a circular dependency.

I was expecting we'd admin-merge for this one, as I think that's what we usually do for dask-cuda.

      - matrix:  # Fallback for no matrix
        packages:
          - *numba_cuda_cu12
          - *numba_cuda_cu13
Member Author
@jameslamb jameslamb Aug 21, 2025

Blegh, ok we have to do something here.

CUDA 12.9 wheel tests are failing like this:

ERROR: Cannot install -r /tmp/requirements-test.txt (line 3) and numba-cuda[cu13]==0.19.0 because these package versions have conflicting dependencies.

The conflict is caused by:
    numba-cuda[cu13] 0.19.0 depends on cuda-python==13.*; extra == "cu13"
    cudf-cu12 25.10.0a261 depends on cuda-python<13.0a0 and >=12.9.1
    cudf-cu12 25.10.0a260 depends on cuda-python<13.0a0 and >=12.9.1

(build link)

The fallback matrix is used because of this:

disable-cuda = true

We distribute unsuffixed packages here (dask-cuda, not dask-cuda-cu{12,13}). That has been fine, because this project's code isn't directly CUDA-major-version-dependent and neither was its list of dependencies... until #1531

That PR introduced a CUDA-major-version-specific dependency like numba-cuda[cu12]. When I reviewed that, I forgot dask-cuda isn't suffixed 😭

This is a problem because now we want to depend on numba-cuda[cu12] when targeting CUDA 12 and numba-cuda[cu13] when targeting CUDA 13.
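To make that concrete, here is a hedged sketch of the matrix this would require (version pins omitted; not the actual file): CI jobs that set a `cuda:` key can pick the matching extra, but the keyless fallback is the only entry an unsuffixed dask-cuda wheel can ship, and it cannot carry both extras without reproducing the solver conflict above.

```yaml
# illustrative sketch only -- not the actual dependencies.yaml
- output_types: [requirements, pyproject]
  matrices:
    - matrix:
        cuda: "12.*"
      packages:
        - numba-cuda[cu12]
    - matrix:
        cuda: "13.*"
      packages:
        - numba-cuda[cu13]
    - matrix:  # fallback for no matrix: what the unsuffixed wheel actually publishes
      packages:
        - numba-cuda[cu12]   # presumably pins cuda-python==12.* (see the linked pyproject.toml)
        - numba-cuda[cu13]   # pins cuda-python==13.* per the resolver output above,
                             # which conflicts with cudf-cu12's cuda-python<13 pin
```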

The options I see are:

  • depend on numba-cuda without extras
  • start distributing suffixed packages like dask-cuda-cu{12,13}

It looks like those extras are probably necessary for correctness:

https://github.com/NVIDIA/numba-cuda/blob/d2827b7701289932ba390cdbf90642dcc7b2eeab/pyproject.toml#L26-L44

But I'm not sure. What do you think @bdice @brandon-b-miller @pentschev @TomAugspurger?

Member Author

We talked offline and @bdice mentioned it might be possible to factor the numba.cuda usage here out (or push it down) in a way that allows us to continue publishing unsuffixed dask-cuda wheels.

Member Author

Trying with this change: #1537

Member Author

After #1537, in more offline discussion we agreed to try adding [cu12] and [cu13] extras for wheels, to ensure that users can get an environment with a compatible set of packages based on CUDA version.
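For context, a hedged sketch of roughly how such an extra can be wired up in dependencies.yaml (the file key and include name are illustrative, not the actual change); rapids-dependency-file-generator writes each one into [project.optional-dependencies] in pyproject.toml, so that `pip install 'dask-cuda[cu13]'` pulls a CUDA-13-consistent set of packages:

```yaml
# illustrative sketch only -- names are placeholders
files:
  py_optional_cu13:
    output: pyproject
    pyproject_dir: .
    extras:
      table: project.optional-dependencies
      key: cu13
    includes:
      - run_python_cu13   # e.g. numba-cuda[cu13] and other CUDA-13-specific requirements
```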

@jakirkham
Member

jakirkham commented Aug 21, 2025

> We may have to skip cuDF tests or request an admin-merge to avoid a circular dependency.

> I was expecting we'd admin-merge for this one, as I think that's what we usually do for dask-cuda.

Typically we have disabled the relevant tests and then come back with a follow-up PR to re-enable them.

That way, any relevant test failures that come up are scoped to a PR that is addressing them (and are not broadly affecting development).

@jameslamb
Member Author

> Typically we have disabled the relevant tests and then come back with a follow-up PR to re-enable them.

OK, I have definitely seen dask-cuda PRs just admin-merged, but we can try this. I do think what you're suggesting is preferable to every PR needing an admin merge until cudf is done.

@jameslamb jameslamb requested a review from a team as a code owner August 25, 2025 15:27
# TODO: add 'cudf' and 'dask-cudf' back to this dependency list once there are CUDA 13 packages for those
# ref: https://github.com/rapidsai/dask-cuda/pull/1536#issuecomment-3212474898
# - cudf-cu12==25.10.*,>=0.0.0a0
# - dask-cudf-cu12==25.10.*,>=0.0.0a0
Member Author

Making cudf an optional dependency for tests revealed something... dask.dataframe codepaths used here require pyarrow, and cudf must have been pulling it in for us.

/pyenv/versions/3.10.18/lib/python3.10/site-packages/dask/_compatibility.py:117: in import_optional_dependency
    raise ImportError(msg) from err
E   ImportError: Missing optional dependency 'pyarrow'.  Use pip or conda to install pyarrow.
Full traceback:
________________________ ERROR collecting test_proxy.py ________________________
ImportError while importing test module '/__w/dask-cuda/dask-cuda/dask_cuda/tests/test_proxy.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/pyenv/versions/3.10.18/lib/python3.10/site-packages/dask/_compatibility.py:114: in import_optional_dependency
    module = importlib.import_module(name)
/pyenv/versions/3.10.18/lib/python3.10/importlib/__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
<frozen importlib._bootstrap>:1050: in _gcd_import
    ???
<frozen importlib._bootstrap>:1027: in _find_and_load
    ???
<frozen importlib._bootstrap>:1004: in _find_and_load_unlocked
    ???
E   ModuleNotFoundError: No module named 'pyarrow'

The above exception was the direct cause of the following exception:
/pyenv/versions/3.10.18/lib/python3.10/site-packages/_pytest/python.py:498: in importtestmodule
    mod = import_path(
/pyenv/versions/3.10.18/lib/python3.10/site-packages/_pytest/pathlib.py:587: in import_path
    importlib.import_module(module_name)
/pyenv/versions/3.10.18/lib/python3.10/importlib/__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
<frozen importlib._bootstrap>:1050: in _gcd_import
    ???
<frozen importlib._bootstrap>:1027: in _find_and_load
    ???
<frozen importlib._bootstrap>:1006: in _find_and_load_unlocked
    ???
<frozen importlib._bootstrap>:688: in _load_unlocked
    ???
/pyenv/versions/3.10.18/lib/python3.10/site-packages/_pytest/assertion/rewrite.py:186: in exec_module
    exec(co, module.__dict__)
tests/test_proxy.py:18: in <module>
    from dask.dataframe.core import has_parallel_type
/pyenv/versions/3.10.18/lib/python3.10/site-packages/rapids_dask_dependency/dask_loader.py:36: in create_module
    return importlib.import_module(spec.name)
/pyenv/versions/3.10.18/lib/python3.10/importlib/__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
/pyenv/versions/3.10.18/lib/python3.10/site-packages/dask/dataframe/__init__.py:24: in <module>
    from dask.dataframe import backends, dispatch
/pyenv/versions/3.10.18/lib/python3.10/site-packages/rapids_dask_dependency/dask_loader.py:36: in create_module
    return importlib.import_module(spec.name)
/pyenv/versions/3.10.18/lib/python3.10/importlib/__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
/pyenv/versions/3.10.18/lib/python3.10/site-packages/dask/dataframe/backends.py:14: in <module>
    from dask.dataframe._compat import PANDAS_GE_220, is_any_real_numeric_dtype
/pyenv/versions/3.10.18/lib/python3.10/site-packages/rapids_dask_dependency/dask_loader.py:36: in create_module
    return importlib.import_module(spec.name)
/pyenv/versions/3.10.18/lib/python3.10/importlib/__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
/pyenv/versions/3.10.18/lib/python3.10/site-packages/dask/dataframe/_compat.py:11: in <module>
    import_optional_dependency("pyarrow")
/pyenv/versions/3.10.18/lib/python3.10/site-packages/dask/_compatibility.py:117: in import_optional_dependency
    raise ImportError(msg) from err
E   ImportError: Missing optional dependency 'pyarrow'.  Use pip or conda to install pyarrow.

(build link)

Notice that nothing in those import paths involves dask-cuda code, so I guess this is really just a test-time dependency.

I see a few options for this:

There's also an old private thread about this; I'll go revive that.

Member

Generally speaking, dask.dataframe is NOT a dependency of dask_cuda; using dask.array is a completely valid use case that doesn't require dask.dataframe at all. Therefore, I think it's fine for us to add it for now as a dependency of Dask-CUDA's tests only, meaning we can add it now and remove it later when we roll back to the original cudf-cu*/dask-cudf-cu* dependency.

Member Author

Thank you!

Posting for the record here on GitHub... in an offline conversation, we decided to do the following:

  • add a temporary test-time-only dependency on dask[dataframe] (sketched below)
  • remove that in the future once there are cudf CUDA 13 packages
  • make dask-cudf depend on dask[dataframe]
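A minimal sketch of the first item, assuming it lands in the test-time dependency list in dependencies.yaml (the list name and placement are illustrative):

```yaml
# illustrative sketch of a temporary test-only dependency (fragment under the top-level dependencies: key)
test_python:
  common:
    - output_types: [requirements, pyproject]
      packages:
        # TODO: drop this once cudf/dask-cudf CUDA 13 packages exist and are restored as
        # test dependencies; dask-cuda's own runtime code does not need dask.dataframe
        - dask[dataframe]
```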

@jameslamb jameslamb requested a review from pentschev August 25, 2025 19:11
Member
@pentschev pentschev left a comment

Left one comment, otherwise LGTM. Thanks for all the effort here @jameslamb !

Member
@pentschev pentschev left a comment

Everything passing, great work! Thanks @jameslamb !

@jameslamb
Member Author

🎉 I've asked for one more ci-codeowners / packaging-codeowners review and will merge this as soon as we get that. We have enough with @bdice's approval, but a lot of changes have been made since that was first given.

Contributor
@gforsyth gforsyth left a comment

Nice work, @jameslamb ! This looks good to go.

@jameslamb
Member Author

Thanks everyone!

@jameslamb
Member Author

/merge

@rapids-bot rapids-bot bot merged commit b48fe2b into rapidsai:branch-25.10 Aug 26, 2025
40 checks passed
@jameslamb jameslamb deleted the cuda-13.0.0 branch August 26, 2025 14:12
rapids-bot bot pushed a commit to rapidsai/cudf that referenced this pull request Aug 26, 2025
Splitting some changes off of the CUDA 13 support PR (#19768) ... that has gotten too large to review.

Contributes to rapidsai/build-planning#208

* uses the new `[cu12, cu13]` extras added to `dask-cuda` for wheels: rapidsai/dask-cuda#1536
* replaces hard-coding of CUDA major version in `pandas` diff script
* moves `numba-cuda` floor from `>=0.19.0` to `>=0.19.1`
* consolidates some dependency lists with unnecessary `cuda: "12.*"` filters

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Kyle Edwards (https://github.com/KyleFromNVIDIA)

URL: #19794
galipremsagar pushed a commit to galipremsagar/cudf that referenced this pull request Aug 27, 2025
Splitting some changes off of the CUDA 13 support PR (rapidsai#19768) ... that has gotten too large to review.

Contributes to rapidsai/build-planning#208

* uses the new `[cu12, cu13]` extras added to `dask-cuda` for wheels: rapidsai/dask-cuda#1536
* replaces hard-coding of CUDA major version in `pandas` diff script
* moves `numba-cuda` floor from `>=0.19.0` to `>=0.19.1`
* consolidates some dependency lists with unnecessary `cuda: "12.*"` filters

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Kyle Edwards (https://github.com/KyleFromNVIDIA)

URL: rapidsai#19794
rapids-bot bot pushed a commit that referenced this pull request Aug 29, 2025
Contributes to rapidsai/build-planning#208

#1536 temporarily removed the `cudf` test-time dependency here, because there weren't yet CUDA 13 `cudf` packages.

Those now exist (rapidsai/cudf#19768), so this restores that dependency.

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - GALI PREM SAGAR (https://github.com/galipremsagar)
  - Vyas Ramasubramani (https://github.com/vyasr)
  - Peter Andreas Entschev (https://github.com/pentschev)

URL: #1544
rapids-bot bot pushed a commit to rapidsai/raft that referenced this pull request Sep 2, 2025
Follow-up to #2787

Starting with rapidsai/dask-cuda#1536, `dask-cuda` wheels now have extras like `[cu12]` and `[cu13]` to ensure a consistent set of CUDA-major-version-specific dependencies are installed.

This proposes using those extras in this project's wheel dependencies on `dask-cuda`.

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Jake Awe (https://github.com/AyodeAwe)

URL: #2797
rapids-bot bot pushed a commit to rapidsai/cugraph that referenced this pull request Sep 2, 2025
Follow-up to #5236

Starting with rapidsai/dask-cuda#1536, `dask-cuda` wheels now have extras like `[cu12]` and `[cu13]` to ensure a consistent set of CUDA-major-version-specific dependencies are installed.

This proposes using those extras in this project's wheel dependencies on `dask-cuda`.

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Brad Rees (https://github.com/BradReesWork)
  - Jake Awe (https://github.com/AyodeAwe)

URL: #5243
rapids-bot bot pushed a commit to rapidsai/cuml that referenced this pull request Sep 3, 2025
…cy pins (#7164)

Contributes to rapidsai/build-planning#208 (breaking some changes off of #7128 to help with review and debugging there)

* switches to using `dask-cuda[cu12]` extra for wheels (added in rapidsai/dask-cuda#1536)
* bumps pins on some dependencies to match the rest of RAPIDS
  - `cuda-python`: >=12.9.2 (CUDA 12)
  - `cupy`: >=13.6.0
  - `numba`: >=0.60.0
* adds explicit runtime dependency on `numba-cuda`
  - *`cuml` uses this unconditionally but does not declare a runtime dependency on it today*

Contributes to rapidsai/build-infra#293

* replaces dependency on `pynvml` package with `nvidia-ml-py` package (see that issue for details)

## Notes for Reviewers

### These dependency pin changes should be low-risk

All of these pins and requirements are already coming through `cuml`'s dependencies, e.g. `cudf` carries most of them via rapidsai/cudf#19806

So they shouldn't change much about the test environments in CI.

Authors:
  - James Lamb (https://github.com/jameslamb)
  - Simon Adorf (https://github.com/csadorf)

Approvers:
  - Simon Adorf (https://github.com/csadorf)
  - Gil Forsyth (https://github.com/gforsyth)

URL: #7164
rapids-bot bot pushed a commit that referenced this pull request Sep 5, 2025
#1536 introduced `[cu12]` and `[cu13]` extras for `dask-cuda` wheels, so that you could do something like

```shell
pip install 'dask-cuda[cu13]'
```

to ensure that `numba-cuda[cu13]` (and appropriately-pinned dependencies of it) are installed. In that PR, I'd also added RAPIDS CUDA-version-suffixed optional runtime dependencies to those extras, thinking:

* this is an extra... it's already optional
* one (and in my view, the main) reason to keep those out of the runtime dependencies before was to avoid needing a `-cu{12,13}` suffix here, but with the introduction of these extras that constraint was removed

However, that caused some problems for RAPIDS devcontainers. For example, `cudf` repo-specific devcontainers are now installing `cudf-cu{12,13}` from remote indices instead of building it from source (rapidsai/devcontainers#568 was an earlier attempt to fix that).

To avoid these circular-dependency problems, this proposes removing RAPIDS libraries from the `[cu12]` / `[cu13]` extras. That restores `dask-cuda` to the state it was in before, where using optional dependencies like `cudf` will require installing those separately.
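Roughly (a hedged sketch, not the actual diff), the extras keep the CUDA-toolkit-adjacent pieces and drop the RAPIDS libraries that were causing the circular dependency:

```yaml
# illustrative sketch of the cu13 extra's include list -- names are placeholders
files:
  py_optional_cu13:
    extras:
      table: project.optional-dependencies
      key: cu13
    includes:
      - run_python_cu13   # numba-cuda[cu13] and related CUDA-13 pins: kept
      # - rapids_cu13     # cudf-cu13, dask-cudf-cu13, ...: removed; install these separately
```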

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Vyas Ramasubramani (https://github.com/vyasr)
  - Peter Andreas Entschev (https://github.com/pentschev)

URL: #1549