
Conversation

brandon-b-miller (Contributor) commented Jun 25, 2025

numba-cuda 0.15.0 switched numba to use the NVIDIA bindings by default, and 0.15.2 includes a few patches that ameliorated knock-on effects. We should update cuDF to leverage these changes and fix anything that breaks. This PR now bumps the version requirement to >=0.16.0 to include the latest release.
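For context on the binding switch: Numba-style config flags are read from the environment as `NUMBA_<FLAG>`, and `NUMBA_CUDA_USE_NVIDIA_BINDING` is the knob that controls the choice between the NVIDIA cuda-python bindings and the legacy ctypes bindings. A minimal sketch of the selection logic, assuming the 0.15+ default is "on" (the helper below is illustrative, not the library's own code):

```python
import os

def using_nvidia_binding(environ=None):
    """Report whether the NVIDIA cuda-python binding would be selected.

    Illustrative sketch only: numba-cuda >= 0.15 defaults to the NVIDIA
    binding, so an unset NUMBA_CUDA_USE_NVIDIA_BINDING is treated as "on".
    """
    env = os.environ if environ is None else environ
    return env.get("NUMBA_CUDA_USE_NVIDIA_BINDING", "1") != "0"
```

Setting `NUMBA_CUDA_USE_NVIDIA_BINDING=0` before importing numba is one way to fall back to the old bindings when debugging knock-on effects like the ones this PR addresses.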

@brandon-b-miller brandon-b-miller requested a review from a team as a code owner June 25, 2025 15:08
@brandon-b-miller brandon-b-miller requested a review from bdice June 25, 2025 15:08
@github-actions github-actions bot added the Python Affects Python cuDF API. label Jun 25, 2025
@brandon-b-miller brandon-b-miller added feature request New feature or request non-breaking Non-breaking change numba Numba issue labels Jun 25, 2025
@GPUtester GPUtester moved this to In Progress in cuDF Python Jun 25, 2025

bdice commented Jun 25, 2025

Seems like this change will require cuda-version>=12.3 at runtime, in order to have a supported libnvjitlink version. Perhaps there is a way we could statically link some newer libnvjitlink libraries that would work for users with cuda-version>=12.0,<12.3?


bdice commented Jun 25, 2025

Here's the solution we need: conda-forge/libnvjitlink-feedstock#28

@brandon-b-miller brandon-b-miller changed the title Require numba-cuda>=0.15.2 Require numba-cuda>=0.16.0 Jul 7, 2025
@brandon-b-miller

Tests currently fail because cuda.core isn't present in the testing environment. This makes sense: we're installing base numba-cuda here. Currently we manually install its runtime dependencies, but we could get everything we need by adding the extra [cu11]/[cu12] from here.

@brandon-b-miller

@bdice is this a rapids-dependency-file-generator bug or user error on my side? I think we might need to bifurcate the numba-cuda dependency in our dependencies.yaml file to handle the cu11/cu12 extras. Here's the patch I'm trying to apply:

diff --git a/dependencies.yaml b/dependencies.yaml
index 4f23e2ebcb..965e569cc4 100644
--- a/dependencies.yaml
+++ b/dependencies.yaml
@@ -665,7 +665,6 @@ dependencies:
       - output_types: [conda, requirements, pyproject]
         packages:
           - cachetools
-          - &numba-cuda-dep numba-cuda>=0.16.0,<0.17.0a0
           - &numba-dep numba>=0.59.1,<0.62.0a0
           - nvtx>=0.2.1
           - packaging
@@ -689,8 +688,12 @@ dependencies:
               cuda: "12.*"
               cuda_suffixed: "true"
             packages:
-              - nvidia-cuda-nvcc-cu12
-              - nvidia-cuda-nvrtc-cu12
+              - &numba-cuda-cu12-dep numba-cuda[cu12]>=0.16.0,<0.17.0a0
+          - matrix:
+              cuda: "11.*"
+              cuda_suffixed: "true"
+            packages:
+              - &numba-cuda-cu11-dep numba-cuda[cu11]>=0.16.0,<0.17.0a0
           - {matrix: null, packages: []}
   run_cudf_polars:
     common:
@@ -772,12 +775,10 @@ dependencies:
         matrices:
           - matrix: {dependencies: "oldest"}
             packages:
-              - numba-cuda==0.16.0
               - numba==0.59.1
               - pandas==2.0.*
           - matrix: {dependencies: "latest"}
             packages:
-              - *numba-cuda-dep
               - *numba-dep
               - pandas==2.3.0
           - matrix:
@@ -787,6 +788,7 @@ dependencies:
           - matrix: {dependencies: "oldest", arch: "aarch64", cuda: "12.*"}
             packages:
               - cupy==12.2.0 # cupy 12.2.0 is the earliest with CUDA 12 ARM packages.
+              - numba-cuda==0.16.0
           - matrix: {dependencies: "oldest"}
             packages:
               - cupy==12.0.0
@@ -800,6 +802,18 @@ dependencies:
               - cupy-cuda12x==12.0.0
           - matrix:
             packages:
+      - output_types: [requirements, pyproject]
+        matrices:
+          - matrix:
+              cuda: "12.*"
+              packages:
+                - *numba-cuda-cu12-dep
+          - matrix:
+              cuda: "11.*"
+              packages:
+                - *numba-cuda-cu11-dep
+          - matrix:
+            packages:
   test_python_pylibcudf:
     common:
       - output_types: [conda, requirements, pyproject]
@@ -864,7 +878,6 @@ dependencies:
       - output_types: [conda, requirements, pyproject]
         packages:
           - dask-cuda==25.8.*,>=0.0.0a0
-          - *numba-cuda-dep
           - *numba-dep
     specific:
       - output_types: [conda, requirements]
@@ -878,6 +891,18 @@ dependencies:
               - pyarrow==15.*
           - matrix:
             packages:
+      - output_types: [requirements, pyproject]
+        matrices:
+          - matrix:
+              cuda: "12.*"
+              packages:
+                - *numba-cuda-cu12-dep
+          - matrix:
+              cuda: "11.*"
+              packages:
+                - *numba-cuda-cu11-dep
+          - matrix:
+            packages:
   test_python_cudf_polars:
     common:
       - output_types: [conda, requirements, pyproject]

But I seem to be hitting:

    The provided dependency file contains schema errors.

        ['numba-cuda[cu11]>=0.16.0,<0.17.0a0'] is not of type 'string'
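Reading the error together with the patch, the `['numba-cuda[cu11]>=0.16.0,<0.17.0a0'] is not of type 'string'` message is what the schema validator emits when a package list ends up nested inside the `matrix:` mapping, whose values must be strings. In the hunks above, `packages` is indented one level too far under `matrix:`. A sketch of the likely-intended nesting, with `packages` as a sibling of `matrix:` (the alias names are carried over from the patch; this is an inference, not a confirmed fix):

```yaml
- output_types: [requirements, pyproject]
  matrices:
    - matrix:
        cuda: "12.*"
      packages:            # sibling of `matrix:`, not nested inside it
        - *numba-cuda-cu12-dep
    - matrix:
        cuda: "11.*"
      packages:
        - *numba-cuda-cu11-dep
    - matrix:
      packages:
```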


bdice commented Jul 9, 2025

I don't see any cu11 code here. We dropped CUDA 11 for the 25.08 release.

The failures I see in CI are like ModuleNotFoundError: No module named 'cuda.core' because numba-cuda doesn't have a dependency on cuda.core but seems to need it.


bdice commented Jul 9, 2025

Never mind. I see that numba-cuda requires an extras specification like numba-cuda[cu12]. The errors with cu11 aren't relevant, but we do need some tweaking in dependencies.yaml. I am pushing a change for that.

@bdice bdice requested a review from a team as a code owner July 9, 2025 19:15
Comment on lines -878 to -881
- *numba-cuda-dep
- *numba-dep

bdice commented Jul 9, 2025


We can eliminate these test dependencies because dask-cudf tests no longer need numba / numba-cuda. I did a very small change in dask_cudf/tests/test_distributed.py to eliminate the sole use of numba there.

@rapidsai/cudf-dask-codeowners, can you approve this?


TomAugspurger left a comment


Approving for the dask-cudf changes.

Note that dask-cuda still has a dependency on numba (https://github.com/rapidsai/dask-cuda/blob/7d9e28af0b08d9819c254eb70ef3a3b16dc7e448/dependencies.yaml#L145), so if you were hoping this would avoid the need for it in the test environment, we aren't quite there yet, unfortunately.

@bdice bdice requested a review from Matt711 July 10, 2025 04:43
@github-actions github-actions bot added the cudf.pandas Issues specific to cudf.pandas label Jul 10, 2025

bdice commented Jul 10, 2025

I misunderstood this at first. I thought the third-party tests were defined externally (upstream in stumpy). The source code for that third-party test is in this repo. I went ahead and fixed it in 746ede2.

@brandon-b-miller

There are a couple of pandas tests still failing here. I haven't managed to reproduce them locally yet, but I'm going to try with local CI next.


bdice commented Jul 10, 2025

@brandon-b-miller These pandas tests are failing on the nightly run, too: https://github.com/rapidsai/cudf/actions/runs/16187163471

This will probably block other PRs. We can xfail them, perhaps?


bdice commented Jul 10, 2025

Comparing to a recently successful job, the main difference I would suspect is xarray 2025.7.0 vs. xarray 2025.7.1.


Matt711 commented Jul 10, 2025

> @brandon-b-miller These pandas tests are failing on the nightly run, too: https://github.com/rapidsai/cudf/actions/runs/16187163471
>
> This will probably block other PRs. We can xfail them, perhaps?

They're XPASS'ing, so I think we should remove them from the list of xfails.

@brandon-b-miller

That seems right. If the issue is truly between third-party libraries, it shouldn't block this PR.

@brandon-b-miller

NVVM now seems to be missing on the conda side. Looking into why.

@brandon-b-miller

@bdice I think the compiler dependencies might still be needed on the conda side here.


bdice commented Jul 10, 2025

@brandon-b-miller There seems to be a bug in cuda-python packaging. Tracking it here: conda-forge/cuda-python-feedstock#129

I will push a fix shortly.

Comment on changed lines:

    all_gpu_devices = [
    -    device.id for device in cuda.list_devices()
    +    int(device.id) for device in cuda.list_devices()

Suggested change:

    # TODO: Revert `int(device.id)` to `device.id` once
    # https://github.com/NVIDIA/numba-cuda/pull/319 is released and
    # included in the minimum required numba-cuda version.
    int(device.id) for device in cuda.list_devices()
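To illustrate why the explicit `int()` matters: with the NVIDIA bindings, `device.id` may come back as a binding-level integer wrapper rather than a plain `int`, which breaks code that compares or serializes raw ints. A standalone sketch with stand-in classes (`DeviceId`, `FakeDevice`, and `list_devices` here are hypothetical, purely for illustration):

```python
class DeviceId:
    """Hypothetical integer-like wrapper, standing in for a binding-level ID."""
    def __init__(self, value):
        self._value = value

    def __int__(self):
        return self._value


class FakeDevice:
    """Hypothetical stand-in for a numba-cuda device object."""
    def __init__(self, handle):
        self._handle = handle

    @property
    def id(self):
        # With the NVIDIA bindings, the ID may support __int__ without
        # actually being a plain Python int.
        return DeviceId(self._handle)


def list_devices():
    # Hypothetical: pretend two GPUs are visible.
    return [FakeDevice(0), FakeDevice(1)]


# Coercing through int() normalizes the type regardless of which
# binding produced the ID.
all_gpu_devices = [int(device.id) for device in list_devices()]
```

Once the upstream fix lands and `device.id` returns a plain `int` again, the coercion becomes a harmless no-op, which is why the TODO suggests reverting it later rather than immediately.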

@brandon-b-miller

/merge

@rapids-bot rapids-bot bot merged commit a9c395d into rapidsai:branch-25.08 Jul 15, 2025
102 checks passed
@github-project-automation github-project-automation bot moved this from In Progress to Done in cuDF Python Jul 15, 2025
rapids-bot bot pushed a commit to rapidsai/ucxx that referenced this pull request Jul 15, 2025
Update to numba-cuda>=0.16 [to match cuDF](rapidsai/cudf#19213), and update stream synchronization code in distributed-ucxx.

Closes rapidsai/dask-upstream-testing#71.

Authors:
  - Peter Andreas Entschev (https://github.com/pentschev)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Tom Augspurger (https://github.com/TomAugspurger)

URL: #462
bdice added a commit to bdice/cudf that referenced this pull request Jul 17, 2025
@bdice bdice mentioned this pull request Jul 17, 2025
3 tasks
rapids-bot bot pushed a commit that referenced this pull request Jul 17, 2025
- Partially reverts #19213
- Use numba-cuda>=0.15.2,<0.16

We are investigating an issue where the new CUDA bindings in numba-cuda 0.16 cause segfaults, blocking our CI. This temporarily downgrades to numba-cuda >=0.15.2,<0.16 until a fix can be made.

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - https://github.com/brandon-b-miller
  - Peter Andreas Entschev (https://github.com/pentschev)
  - Gil Forsyth (https://github.com/gforsyth)

URL: #19413

Labels: cudf.pandas, feature request, non-breaking, numba, Python

4 participants