Build and test with CUDA 13.0.0#7128

Merged
rapids-bot[bot] merged 9 commits into rapidsai:branch-25.10 from jameslamb:cuda-13.0.0
Sep 4, 2025
Conversation

@jameslamb
Member

@jameslamb jameslamb commented Aug 22, 2025

Contributes to rapidsai/build-planning#208

Contributes to rapidsai/build-planning#68

  • updates to CUDA 13 dependencies in fallback entries in dependencies.yaml matrices (i.e., the ones that get written to pyproject.toml in source control)

Notes for Reviewers

This switches GitHub Actions workflows to the cuda13.0 branch from here: rapidsai/shared-workflows#413

A future round of PRs will revert that back to branch-25.10, once all of RAPIDS supports CUDA 13.

@jameslamb jameslamb added non-breaking Non-breaking change improvement Improvement / enhancement to an existing function labels Aug 22, 2025

@robertmaynard
Contributor

#6823 was merged yesterday and added a dependency on pynvml constrained to the 12.x series. We will need to update this PR with the latest 25.10 and update that logic for CTK 12 and 13 @jameslamb

@jameslamb
Member Author

jameslamb commented Sep 2, 2025

Problem 1: missing "treebank" dataset

update: seemed to be a network error, not seen on re-runs.

details (click me)

One CUDA 12.0.1 conda-python-tests-singlegpu job failed like this:

[gw1] linux -- Python 3.12.11 /opt/conda/envs/test/bin/python
Traceback (most recent call last):
  File "/opt/conda/envs/test/lib/python3.12/site-packages/nltk/corpus/util.py", line 84, in __load
    root = nltk.data.find(f"{self.subdir}/{zip_name}")
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/test/lib/python3.12/site-packages/nltk/data.py", line 579, in find
    raise LookupError(resource_not_found)
LookupError: 
**********************************************************************
  Resource treebank not found.
  Please use the NLTK Downloader to obtain the resource:

  >>> import nltk

(conda-python-tests-singlegpu build link)

Hopefully just a temporary issue that'll be resolved by a re-run.
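For context, `nltk` raises `LookupError` when a corpus such as `treebank` is missing locally, and `nltk.download("treebank")` is the usual way to fetch it. Since this failure turned out to be transient network flakiness, a minimal sketch of a generic retry wrapper for flaky downloads might look like this (hypothetical helper, not part of the test suite):

```python
import time

def retry_flaky(fn, attempts=3, delay=1.0, exc_types=(LookupError, OSError)):
    """Re-run a flaky callable (e.g. a corpus download) a few times before giving up."""
    for attempt in range(attempts):
        try:
            return fn()
        except exc_types:
            # Re-raise on the final attempt; otherwise wait and retry.
            if attempt == attempts - 1:
                raise
            time.sleep(delay)

# Hypothetical usage: retry_flaky(lambda: nltk.download("treebank"))
```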

Problem 2: Failing Naive Bayes tests (conda)

update: related to scikit-learn dataset downloads interacting in an unexpected way with tests run in parallel. Fixed by #7169

details (click me)

CUDA 13 conda-python-tests jobs both failed like this:

FAILED test_naive_bayes.py::test_gaussian_parameters[1e-05-balanced] - ValueError: Number of priors must match number of classes.
FAILED test_naive_bayes.py::test_gaussian_parameters[1e-05-unbalanced] - ValueError: Number of priors must match number of classes.
..
FAILED test_naive_bayes.py::test_categorical_partial_fit[True-int32-float32] - assert 0.1452 <= (0.104 + 0.0001)
FAILED test_naive_bayes.py::test_categorical_partial_fit[True-int32-float64] - assert 0.1452 <= (0.104 + 0.0001)
..
FAILED test_naive_bayes.py::test_categorical_parameters[False-False-0.1-balanced] - ValueError: Number of classes must match number of priors
FAILED test_naive_bayes.py::test_categorical_parameters[False-False-0.1-unbalanced] - ValueError: Number of classes must match number of priors
..
= 40 failed, 14374 passed, 6142 skipped, 1225 xfailed, 24 xpassed, 1225 warnings in 4949.54s (1:22:29) =

(conda-python-tests build link)

tracked in #7152
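The root cause here was multiple parallel test workers downloading scikit-learn datasets into the same cache directory at once. One common mitigation is to give each worker its own cache: sklearn's fetchers honor the `SCIKIT_LEARN_DATA` environment variable (or a `data_home` argument), and pytest-xdist exposes the worker id via `PYTEST_XDIST_WORKER`. A sketch under those assumptions (not necessarily what #7169 did):

```python
import os
import tempfile

def per_worker_data_home(base=None):
    """Give each pytest-xdist worker its own scikit-learn dataset cache directory."""
    base = base or os.path.join(tempfile.gettempdir(), "sklearn_data")
    # pytest-xdist sets PYTEST_XDIST_WORKER to the worker id (e.g. "gw1");
    # fall back to "main" when tests run without xdist.
    worker = os.environ.get("PYTEST_XDIST_WORKER", "main")
    path = os.path.join(base, worker)
    os.makedirs(path, exist_ok=True)
    # sklearn dataset fetchers read this env var to locate their cache.
    os.environ["SCIKIT_LEARN_DATA"] = path
    return path
```

Wiring this into a session-scoped conftest fixture would keep workers from racing on the same on-disk files.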

Problem 3: Failing Naive Bayes and Logistic Regression tests (wheels)

update: same as above, fixed by #7169

details (click me)

All of the amd64 CUDA 13 wheel tests passed, but on arm64 there were some test failures.

FAILED test_naive_bayes.py::test_multinomial_partial_fit[int32-float32] - assert 0.9238375200427579 >= 0.924
 +  where 0.9238375200427579 = accuracy_score(array([ 5, 16,  6, ..., 14,  7,  3], shape=(11226,), dtype=int32), array([ 5, 16,  6, ..., 14,  7,  3], shape=(11226,), dtype=int32))
FAILED test_naive_bayes.py::test_multinomial_partial_fit[int32-float64] - assert 0.9238375200427579 >= 0.924
 +  where 0.9238375200427579 = accuracy_score(array([ 5, 16,  6, ..., 14,  7,  3], shape=(11226,), dtype=int32), array([ 5, 16,  6, ..., 14,  7,  3], shape=(11226,)))
...
FAILED test_naive_bayes.py::test_multinomial[int32-float32] - assert 0.9238375200427579 >= 0.924
 +  where 0.9238375200427579 = accuracy_score(array([ 5, 16,  6, ..., 14,  7,  3], shape=(11226,)), array([ 5, 16,  6, ..., 14,  7,  3], shape=(11226,), dtype=int32))
FAILED test_naive_bayes.py::test_multinomial[int32-float64] - assert 0.9238375200427579 >= 0.924
 +  where 0.9238375200427579 = accuracy_score(array([ 5, 16,  6, ..., 14,  7,  3], shape=(11226,)), array([ 5, 16,  6, ..., 14,  7,  3], shape=(11226,)))
...
FAILED test_naive_bayes.py::test_gaussian_partial_fit - assert 0.988 >= 0.99
 +  where 0.988 = accuracy_score(array([ 5, 16,  6, ...,  8, 13, 11], shape=(1500,), dtype=int32), array([ 5, 16,  6, ...,  8, 13, 11], shape=(1500,), dtype=int32))
...
FAILED test_naive_bayes.py::test_categorical_partial_fit[False-int64-float32] - assert 0.1082 <= (0.104 + 0.0001)
FAILED test_naive_bayes.py::test_categorical_partial_fit[False-int64-float64] - assert 0.1082 <= (0.104 + 0.0001)
FAILED test_linear_model.py::test_logistic_regression_sparse_only - ExceptionGroup: Hypothesis found 2 distinct failures in explicit examples. (2 sub-exceptions)
= 22 failed, 14354 passed, 6150 skipped, 1223 xfailed, 24 xpassed, 1217 warnings in 1611.55s (0:26:51) =

(wheel-tests-cuml build link)

tracked in #7152 and #7162
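Notice that these arm64 failures are hard accuracy cutoffs missed by tiny margins (e.g. `0.9238375... >= 0.924`), i.e. platform-to-platform numeric noise rather than real regressions. A hedged sketch of a tolerance-based check that would absorb such noise (a hypothetical helper, not the fix that landed in #7169):

```python
import math

def close_enough(observed, target, rel_tol=1e-3):
    """Pass if `observed` meets `target`, or falls within a small relative tolerance of it."""
    return observed >= target or math.isclose(observed, target, rel_tol=rel_tol)
```

Whether loosening thresholds is appropriate depends on the test; a genuinely degraded model should still fail.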

@jameslamb
Member Author

/ok to test

rapids-bot bot pushed a commit that referenced this pull request Sep 3, 2025
…cy pins (#7164)

Contributes to rapidsai/build-planning#208 (breaking some changes off of #7128 to help with review and debugging there)

* switches to using `dask-cuda[cu12]` extra for wheels (added in rapidsai/dask-cuda#1536)
* bumps pins on some dependencies to match the rest of RAPIDS
  - `cuda-python`: >=12.9.2 (CUDA 12)
  - `cupy`: >=13.6.0
  - `numba`: >=0.60.0
* adds explicit runtime dependency on `numba-cuda`
  - *`cuml` uses this unconditionally but does not declare runtime dependency on it today*

Contributes to rapidsai/build-infra#293

* replaces dependency on `pynvml` package with `nvidia-ml-py` package (see that issue for details)

## Notes for Reviewers

### These dependency pin changes should be low-risk

All of these pins and requirements are already coming through `cuml`'s dependencies, e.g. `cudf` carries most of them via rapidsai/cudf#19806

So they shouldn't change much about the test environments in CI.

Authors:
  - James Lamb (https://github.com/jameslamb)
  - Simon Adorf (https://github.com/csadorf)

Approvers:
  - Simon Adorf (https://github.com/csadorf)
  - Gil Forsyth (https://github.com/gforsyth)

URL: #7164
@jameslamb
Member Author

/ok to test

@jameslamb jameslamb changed the title WIP: Build and test with CUDA 13.0.0 Build and test with CUDA 13.0.0 Sep 3, 2025
@jameslamb jameslamb marked this pull request as ready for review September 3, 2025 21:44
@jameslamb jameslamb requested review from a team as code owners September 3, 2025 21:44
@jameslamb jameslamb requested review from gforsyth and jcrist September 3, 2025 21:44
Contributor

@csadorf csadorf left a comment

LGTM! Thanks a lot!

@jameslamb
Member Author

/merge

@rapids-bot rapids-bot bot merged commit 0fa871d into rapidsai:branch-25.10 Sep 4, 2025
124 checks passed
@jameslamb jameslamb deleted the cuda-13.0.0 branch September 4, 2025 14:53

Labels

conda (conda issue), Cython / Python (Cython or Python issue), improvement (Improvement / enhancement to an existing function), non-breaking (Non-breaking change)


5 participants