Build and test with CUDA 13.0.0 #1536
rapids-bot[bot] merged 11 commits into rapidsai:branch-25.10 from jameslamb:cuda-13.0.0
Conversation
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here.
/ok to test
We'll need new …, which will need …
/ok to test
bdice left a comment
All seems fine. We may have to skip cuDF tests or request an admin-merge to avoid a circular dependency.
Thanks for looking!
I was expecting we'd admin-merge for this one, as I think that's what we usually do for …
dependencies.yaml
Outdated
- matrix:  # Fallback for no matrix
  packages:
    - *numba_cuda_cu12
    - *numba_cuda_cu13
Blegh, ok we have to do something here.
CUDA 12.9 wheel tests are failing like this:
ERROR: Cannot install -r /tmp/requirements-test.txt (line 3) and numba-cuda[cu13]==0.19.0 because these package versions have conflicting dependencies.
The conflict is caused by:
numba-cuda[cu13] 0.19.0 depends on cuda-python==13.*; extra == "cu13"
cudf-cu12 25.10.0a261 depends on cuda-python<13.0a0 and >=12.9.1
cudf-cu12 25.10.0a260 depends on cuda-python<13.0a0 and >=12.9.1
The fallback matrix is used because of this:
Line 127 in 0c83e51
We distribute unsuffixed packages here (`dask-cuda`, not `dask-cuda-cu{12,13}`). That has been fine, because this project's code isn't directly CUDA-major-version dependent, and neither was its list of dependencies... until #1531.
That PR introduced a CUDA-major-version-specific dependency, `numba-cuda[cu12]`. When I reviewed it, I forgot that `dask-cuda` isn't suffixed 😭
This is a problem because now we want to depend on `numba-cuda[cu12]` when targeting CUDA 12 and `numba-cuda[cu13]` when targeting CUDA 13.
The options I see are:
- depend on `numba-cuda` without extras
- start distributing suffixed packages like `dask-cuda-cu{12,13}`

It looks like those extras are probably necessary for correctness:
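(The original comment embedded a link to `numba-cuda`'s packaging metadata here. Inferring only from the resolver error above, the point is that each extra pins `cuda-python` to the matching CUDA major version, roughly like the following rough, unverified sketch; the real metadata may list more packages.)

```toml
# Rough sketch of numba-cuda's cu12/cu13 extras, inferred from the resolver
# error above; shown only to illustrate why the extras matter for correctness.
[project.optional-dependencies]
cu12 = [
    "cuda-python==12.*",  # CUDA-12-compatible bindings
]
cu13 = [
    "cuda-python==13.*",  # CUDA-13-compatible bindings
]
```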
But I'm not sure. What do you think @bdice @brandon-b-miller @pentschev @TomAugspurger ?
We talked offline and @bdice mentioned it might be possible to factor the numba.cuda usage here out / down in a way that allows us to continue publishing unsuffixed dask-cuda wheels.
After #1537, in more offline discussion we agreed to try adding [cu12] and [cu13] extras for wheels, to ensure that users can get an environment with a compatible set of packages based on CUDA version.
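Roughly, the idea for the generated `pyproject.toml` is something like the following (a sketch only; the real extras are generated from `dependencies.yaml`, and the exact package list and pins here are illustrative):

```toml
# Sketch of the proposed CUDA-major-version extras for dask-cuda wheels.
# Illustrative only: the real extras are generated from dependencies.yaml.
[project.optional-dependencies]
cu12 = [
    "numba-cuda[cu12]>=0.19.1,<0.20.0a0",  # pulls in CUDA-12-pinned cuda-python
]
cu13 = [
    "numba-cuda[cu13]>=0.19.1,<0.20.0a0",  # pulls in CUDA-13-pinned cuda-python
]
```

A user targeting CUDA 13 would then install `dask-cuda[cu13]` and get a consistent, CUDA-13-specific set of dependencies.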
Typically we have disabled the relevant tests and then come back with a follow-up PR to re-enable them. That way any relevant test failures that come up are scoped to a PR that is addressing them (and are not broadly affecting development).
Ok, I have definitely seen …
dependencies.yaml
Outdated
# TODO: add 'cudf' and 'dask-cudf' back to this dependency list once there are CUDA 13 packages for those
# ref: https://github.com/rapidsai/dask-cuda/pull/1536#issuecomment-3212474898
# - cudf-cu12==25.10.*,>=0.0.0a0
# - dask-cudf-cu12==25.10.*,>=0.0.0a0
Making cudf an optional dependency for tests revealed something... dask.dataframe codepaths used here require pyarrow, and cudf must have been pulling it in for us.
/pyenv/versions/3.10.18/lib/python3.10/site-packages/dask/_compatibility.py:117: in import_optional_dependency
raise ImportError(msg) from err
E ImportError: Missing optional dependency 'pyarrow'. Use pip or conda to install pyarrow.
full traceback (click me)
________________________ ERROR collecting test_proxy.py ________________________
ImportError while importing test module '/__w/dask-cuda/dask-cuda/dask_cuda/tests/test_proxy.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/pyenv/versions/3.10.18/lib/python3.10/site-packages/dask/_compatibility.py:114: in import_optional_dependency
module = importlib.import_module(name)
/pyenv/versions/3.10.18/lib/python3.10/importlib/__init__.py:126: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
<frozen importlib._bootstrap>:1050: in _gcd_import
???
<frozen importlib._bootstrap>:1027: in _find_and_load
???
<frozen importlib._bootstrap>:1004: in _find_and_load_unlocked
???
E ModuleNotFoundError: No module named 'pyarrow'
The above exception was the direct cause of the following exception:
/pyenv/versions/3.10.18/lib/python3.10/site-packages/_pytest/python.py:498: in importtestmodule
mod = import_path(
/pyenv/versions/3.10.18/lib/python3.10/site-packages/_pytest/pathlib.py:587: in import_path
importlib.import_module(module_name)
/pyenv/versions/3.10.18/lib/python3.10/importlib/__init__.py:126: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
<frozen importlib._bootstrap>:1050: in _gcd_import
???
<frozen importlib._bootstrap>:1027: in _find_and_load
???
<frozen importlib._bootstrap>:1006: in _find_and_load_unlocked
???
<frozen importlib._bootstrap>:688: in _load_unlocked
???
/pyenv/versions/3.10.18/lib/python3.10/site-packages/_pytest/assertion/rewrite.py:186: in exec_module
exec(co, module.__dict__)
tests/test_proxy.py:18: in <module>
from dask.dataframe.core import has_parallel_type
/pyenv/versions/3.10.18/lib/python3.10/site-packages/rapids_dask_dependency/dask_loader.py:36: in create_module
return importlib.import_module(spec.name)
/pyenv/versions/3.10.18/lib/python3.10/importlib/__init__.py:126: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
/pyenv/versions/3.10.18/lib/python3.10/site-packages/dask/dataframe/__init__.py:24: in <module>
from dask.dataframe import backends, dispatch
/pyenv/versions/3.10.18/lib/python3.10/site-packages/rapids_dask_dependency/dask_loader.py:36: in create_module
return importlib.import_module(spec.name)
/pyenv/versions/3.10.18/lib/python3.10/importlib/__init__.py:126: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
/pyenv/versions/3.10.18/lib/python3.10/site-packages/dask/dataframe/backends.py:14: in <module>
from dask.dataframe._compat import PANDAS_GE_220, is_any_real_numeric_dtype
/pyenv/versions/3.10.18/lib/python3.10/site-packages/rapids_dask_dependency/dask_loader.py:36: in create_module
return importlib.import_module(spec.name)
/pyenv/versions/3.10.18/lib/python3.10/importlib/__init__.py:126: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
/pyenv/versions/3.10.18/lib/python3.10/site-packages/dask/dataframe/_compat.py:11: in <module>
import_optional_dependency("pyarrow")
/pyenv/versions/3.10.18/lib/python3.10/site-packages/dask/_compatibility.py:117: in import_optional_dependency
raise ImportError(msg) from err
E ImportError: Missing optional dependency 'pyarrow'. Use pip or conda to install pyarrow.
Notice that nothing in those import paths involves dask-cuda code, so I guess this is really just a test-time dependency.
I see a few options for this:
- have `rapids-dask-dependency` wheels depend on `dask[dataframe]` (link to what that pulls in)
- depend on `dask[dataframe]` here in `dask-cuda` (relying on `rapids-dask-dependency` to pin to specific versions of `dask` and `distributed`)
- make `dask.dataframe` optional again (as it was as of "Make `dask.dataframe` optional" #1439)
- manually add `pyarrow` to testing dependencies here in `dask-cuda`
- other Python code changes in the `rapids-dask-dependency` loading mechanism
There's also an old private thread about this, I'll go revive that.
Generally speaking, `dask.dataframe` is NOT a dependency of `dask_cuda`; using `dask.array` is a completely valid use case that doesn't require `dask.dataframe` at all. Therefore, I think it's fine to add it for now as a dependency of Dask-CUDA's tests only, meaning we can add it and remove it later when we roll back to the original `cudf-cu*`/`dask-cudf-cu*` dependency.
Thank you!
Posting for the record here on GitHub... in an offline conversation, we decided to do the following:
- add a temporary test-time-only dependency on `dask[dataframe]` (sketched below)
- remove that in the future when there are `cudf` CUDA 13 packages
- make `dask-cudf` depend on `dask[dataframe]`
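As a sketch of that first item (illustrative only; the actual change goes through `dependencies.yaml`, and the other entries shown here are placeholders), the rendered test requirements would gain something like:

```toml
# Sketch: temporary, test-time-only dependency on dask[dataframe], so that
# pyarrow is present for the dask.dataframe code paths exercised by the tests.
# "pytest" is just a placeholder for the existing test dependencies.
[project.optional-dependencies]
test = [
    "dask[dataframe]",  # TODO: drop once cudf/dask-cudf CUDA 13 packages exist
    "pytest",
]
```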
pentschev left a comment
Left one comment, otherwise LGTM. Thanks for all the effort here @jameslamb !
pentschev left a comment
Everything passing, great work! Thanks @jameslamb !
🎉 I've asked for one more …
gforsyth left a comment
Nice work, @jameslamb ! This looks good to go.
Thanks everyone!

/merge
Splitting some changes off of the CUDA 13 support PR (#19768) ... that has gotten too large to review. Contributes to rapidsai/build-planning#208

* uses the new `[cu12, cu13]` extras added to `dask-cuda` for wheels: rapidsai/dask-cuda#1536
* replaces hard-coding of CUDA major version in `pandas` diff script
* moves `numba-cuda` floor from `>=0.19.0` to `>=0.19.1`
* consolidates some dependency lists with unnecessary `cuda: "12.*"` filters

Authors:
- James Lamb (https://github.com/jameslamb)

Approvers:
- Kyle Edwards (https://github.com/KyleFromNVIDIA)

URL: #19794
Contributes to rapidsai/build-planning#208

#1536 temporarily removed the `cudf` test-time dependency here, because there weren't yet CUDA 13 `cudf` packages. Those now exist (rapidsai/cudf#19768), so this restores that dependency.

Authors:
- James Lamb (https://github.com/jameslamb)

Approvers:
- GALI PREM SAGAR (https://github.com/galipremsagar)
- Vyas Ramasubramani (https://github.com/vyasr)
- Peter Andreas Entschev (https://github.com/pentschev)

URL: #1544
Follow-up to #2787

Starting with rapidsai/dask-cuda#1536, `dask-cuda` wheels now have extras like `[cu12]` and `[cu13]` to ensure a consistent set of CUDA-major-version-specific dependencies are installed. This proposes using those extras in this project's wheel dependencies on `dask-cuda`.

Authors:
- James Lamb (https://github.com/jameslamb)

Approvers:
- Jake Awe (https://github.com/AyodeAwe)

URL: #2797
Follow-up to #5236

Starting with rapidsai/dask-cuda#1536, `dask-cuda` wheels now have extras like `[cu12]` and `[cu13]` to ensure a consistent set of CUDA-major-version-specific dependencies are installed. This proposes using those extras in this project's wheel dependencies on `dask-cuda`.

Authors:
- James Lamb (https://github.com/jameslamb)

Approvers:
- Brad Rees (https://github.com/BradReesWork)
- Jake Awe (https://github.com/AyodeAwe)

URL: #5243
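(In both of the downstream PRs above, the change amounts to roughly the following in that project's generated `pyproject.toml`; this is a sketch and the version pins are illustrative.)

```toml
# Sketch of a downstream wheel dependency switching to the new extra.
# Fragment of a generated pyproject.toml; version pins are illustrative.
[project]
dependencies = [
    "dask-cuda[cu12]==25.10.*,>=0.0.0a0",  # previously: "dask-cuda==25.10.*,>=0.0.0a0"
]
```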
…cy pins (#7164)

Contributes to rapidsai/build-planning#208 (breaking some changes off of #7128 to help with review and debugging there)

* switches to using `dask-cuda[cu12]` extra for wheels (added in rapidsai/dask-cuda#1536)
* bumps pins on some dependencies to match the rest of RAPIDS
  - `cuda-python`: >=12.9.2 (CUDA 12)
  - `cupy`: >=13.6.0
  - `numba`: >=0.60.0
* adds explicit runtime dependency on `numba-cuda`
  - *`cuml` uses this unconditionally but does not declare a runtime dependency on it today*

Contributes to rapidsai/build-infra#293

* replaces dependency on `pynvml` package with `nvidia-ml-py` package (see that issue for details)

## Notes for Reviewers

### These dependency pin changes should be low-risk

All of these pins and requirements are already coming through `cuml`'s dependencies, e.g. `cudf` carries most of them via rapidsai/cudf#19806. So they shouldn't change much about the test environments in CI.

Authors:
- James Lamb (https://github.com/jameslamb)
- Simon Adorf (https://github.com/csadorf)

Approvers:
- Simon Adorf (https://github.com/csadorf)
- Gil Forsyth (https://github.com/gforsyth)

URL: #7164
#1536 introduced `[cu12]` and `[cu13]` extras for `dask-cuda` wheels, so that you could do something like

```shell
pip install 'dask-cuda[cu13]'
```

to ensure that `numba-cuda[cu13]` (and appropriately-pinned dependencies of it) are installed.

In that PR, I'd also added RAPIDS CUDA-version-suffixed optional runtime dependencies to those extras, thinking:

* this is an extra... it's already optional
* one (and in my view, the main) reason to keep those out of the runtime dependencies before was to avoid needing a `-cu{12,13}` suffix here, but with the introduction of these extras that constraint was removed

However, that caused some problems for RAPIDS devcontainers. For example, `cudf` repo-specific devcontainers are now installing `cudf-cu{12,13}` from remote indices instead of building it from source (rapidsai/devcontainers#568 was an earlier attempt to fix that).

To avoid these circular-dependency problems, this proposes removing RAPIDS libraries from the `[cu12]` / `[cu13]` extras. That restores `dask-cuda` to the state it was in before, where using optional dependencies like `cudf` will require installing those separately.

Authors:
- James Lamb (https://github.com/jameslamb)

Approvers:
- Vyas Ramasubramani (https://github.com/vyasr)
- Peter Andreas Entschev (https://github.com/pentschev)

URL: #1549
Contributes to rapidsai/build-planning#208
* new `numba-cuda` floor: `>=0.19.1,<0.20.0a0`
* `[cu12]` and `[cu13]` extras for wheels (Build and test with CUDA 13.0.0 #1536 (comment))
* makes `cudf` and `dask_cudf` optional dependencies in tests (tests that need them are skipped if they're not present), and temporarily removes them from test environments for CUDA 13 CI (Build and test with CUDA 13.0.0 #1536 (comment))
* `dask[dataframe]` for wheels (Build and test with CUDA 13.0.0 #1536 (comment))

Contributes to rapidsai/build-planning#68

* `dependencies.yaml` matrices (i.e., the ones that get written to `pyproject.toml` in source control)

Notes for Reviewers

This switches GitHub Actions workflows to the `cuda13.0` branch from here: rapidsai/shared-workflows#413

A future round of PRs will revert that back to `branch-25.10`, once all of RAPIDS supports CUDA 13.