@leofang
Member

@leofang leofang commented Nov 20, 2025

TODO: Rebase and write up descriptions

See WIP report here #604 (comment) and here #604 (comment).

UPDATE: This PR brings in a new CI infra that is a clone of what cuda-python uses today. The new CI is fully VM-based instead of container-based, with two exceptions:

  • cibuildwheel launches a manylinux container
  • we launch a vanilla Ubuntu container for testing, as required by nv-gha-runners

This is desirable because containers are a major bottleneck that we should seriously consider moving away from:

  • they take time to pull on every PR
  • they block us from performing Day 1 rollout (to support new CUDA or new Python versions), which is now a requirement for CUDA Python
    • related: maintaining our own containers takes a nontrivial amount of effort

Furthermore, my opinion is that we really need to make sure the CUDA Python CI infrastructure is "copy-paste-able" (with some caveats discussed internally, which I am not going to repeat here). It was designed with future application to CuPy and numba-cuda in mind, since at the Python level many of our projects have a similar or identical support matrix, and there is no reason for each project to rebuild the wheel from scratch and suffer from maintenance issues.

Currently this PR is set up so that the new CI runs in parallel with the old CI. We can follow up in a separate PR to hook more of the old CI pieces into the new CI if we decide to move forward.

Below is a detailed breakdown of what this PR entails.

  • c1f1cec: copy/paste minimal CI infra from cuda-python, with zero change
  • 85566c7: CI changes needed to tailor for numba-cuda needs
  • 1b939c3: Enable the cibuildwheel GHA
  • 950078e: Disable Python 3.14 for now
    • This shows how easy it is to add support when a new Python version is out. For example, once "Support python 3.14" (#599) is merged, we can revert this commit to start testing.
  • 6fa44fc: a drive-by fix to update the warning when cuda-bindings is not installed
  • 593fcf3: Ensure Linux executables can be found when installed by our custom fetch_ctk action
  • f183ebe: Ensure cuobjdump is installed by fetch_ctk (whose default does not include it)
  • 72da422: Fixes to ensure the Makefile can be run in the new CI env (fully Bash-based on both Linux and Windows)
  • a973726: Suppress NVRTC warnings on V100 + CUDA 12 (otherwise they are turned into errors by pytest)
  • df8f583: Fix to ensure libcudadevrt.a installed by fetch_ctk can be found
  • 861eede: Ensure the tests that need cuobjdump can be skipped in a pure-wheel test env.
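
For readers unfamiliar with the last item, here is a minimal sketch of how a test that depends on cuobjdump could be skipped when the binary is absent, as in a pure-wheel environment. It is not taken from the commit: the helper `has_tool`, the decorator name `skip_without_cuobjdump`, and the test class are hypothetical, and the check used in numba-cuda may look different.

```python
import shutil
import unittest


def has_tool(name: str) -> bool:
    """Return True if an executable called `name` can be found on PATH."""
    return shutil.which(name) is not None


# Hypothetical decorator name; the helper used in numba-cuda may differ.
skip_without_cuobjdump = unittest.skipUnless(
    has_tool("cuobjdump"), "cuobjdump not found on PATH"
)


class TestDisassembly(unittest.TestCase):
    @skip_without_cuobjdump
    def test_needs_cuobjdump(self):
        # This body only runs when the binary is available.
        self.assertTrue(has_tool("cuobjdump"))
```

With this pattern the test is reported as skipped rather than failed when only wheels are installed, and it runs normally once fetch_ctk has put cuobjdump on PATH.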

Commits 593fcf3 and df8f583 are bug fixes to fetch_ctk that we should backport to cuda-python.

@copy-pr-bot

This comment was marked as outdated.

@leofang

This comment was marked as outdated.

@brandon-b-miller
Contributor

/ok to test 8e96a09

@brandon-b-miller
Contributor

/ok to test 64068b7

@brandon-b-miller
Contributor

/ok to test 48cca14

@brandon-b-miller
Contributor

/ok to test ccd8382

@brandon-b-miller
Contributor

/ok to test 476082b

@brandon-b-miller
Contributor

/ok to test 822b37d

@brandon-b-miller brandon-b-miller marked this pull request as ready for review December 17, 2025 18:20
@brandon-b-miller brandon-b-miller added 3 - Ready for Review Ready for review by team and removed 2 - In Progress Currently a work in progress labels Dec 17, 2025
@brandon-b-miller
Contributor

/ok to test cafdf68

@greptile-apps
Contributor

greptile-apps bot commented Dec 17, 2025

Greptile Summary

This PR introduces a new VM-based CI infrastructure cloned from cuda-python, replacing the container-based approach to enable faster builds and Day 1 rollout support for new CUDA/Python versions. The new CI runs in parallel with the existing infrastructure.

Key Changes:

  • Added new GitHub Actions workflows (ci-new.yaml, build-wheel.yml, test-wheel-linux.yml, test-wheel-windows.yml) with VM-based runners
  • Implemented test matrix configuration in JSON for flexible platform/Python/CUDA version combinations
  • Created custom actions for fetching CUDA toolkit, installing dependencies, and extracting PR numbers
  • Added CI utility scripts for environment setup, test execution, and artifact management
  • Updated pyproject.toml to use cuda-toolkit metapackage instead of individual CUDA components
  • Fixed test skips for wheel-only environments and NVVM bugs on newer compute capabilities
  • Updated error messages to reference correct package names (cuda-bindings)
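
To illustrate the JSON-driven test matrix mentioned above, the following is a rough sketch of how entries could be selected from such a file and emitted in the `{"include": [...]}` shape that a GitHub Actions job can load as its strategy matrix. The key names, event names, and version numbers are assumptions made for illustration; the real ci/test-matrix.json may be laid out differently.

```python
import json

# Hypothetical layout and values; the real ci/test-matrix.json may differ.
MATRIX_JSON = """
{
  "pull-request": [
    {"arch": "amd64", "py": "3.12", "cuda": "12.9.1", "gpu": "l4"},
    {"arch": "arm64", "py": "3.13", "cuda": "13.0.2", "gpu": "a100"}
  ],
  "nightly": [
    {"arch": "amd64", "py": "3.10", "cuda": "12.9.1", "gpu": "v100"}
  ]
}
"""


def select_entries(config: dict, event: str, arch: str) -> list[dict]:
    """Pick the entries for one event type and one CPU architecture."""
    return [e for e in config.get(event, []) if e["arch"] == arch]


def to_matrix(entries: list[dict]) -> str:
    """Serialize entries as {"include": [...]} for a job's strategy matrix."""
    return json.dumps({"include": entries})


config = json.loads(MATRIX_JSON)
pr_amd64 = select_entries(config, "pull-request", "amd64")
print(to_matrix(pr_amd64))
```

Keeping the combinations in data rather than in the workflow file means that adding a new Python or CUDA version is, under this sketch, a one-line change to the JSON.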

Issues Found:

  • Line 202 in ci-new.yaml references non-existent needs.doc job that will cause workflow failure
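
As a sketch of how this class of problem could be caught mechanically, the snippet below scans an already-parsed workflow (represented as a plain dict) for `needs.<id>` expressions that name a job which either does not exist or is not listed in the referencing job's own `needs:`. The function names and the example workflow are hypothetical and do not reproduce the contents of ci-new.yaml.

```python
import re

# Matches expressions such as `needs.doc.result` and captures the job id.
NEEDS_REF = re.compile(r"\bneeds\.([A-Za-z_][A-Za-z0-9_-]*)")


def iter_strings(node):
    """Yield every string value nested anywhere inside dicts and lists."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)
    elif isinstance(node, list):
        for value in node:
            yield from iter_strings(value)


def undeclared_needs(workflow: dict) -> dict[str, set[str]]:
    """Map each job id to the `needs.<id>` references it cannot resolve."""
    jobs = workflow.get("jobs", {})
    problems = {}
    for job_id, job in jobs.items():
        declared = job.get("needs", [])
        if isinstance(declared, str):  # `needs:` may be a single string
            declared = [declared]
        allowed = set(declared) & set(jobs)
        referenced = set()
        for text in iter_strings(job):
            referenced.update(NEEDS_REF.findall(text))
        missing = referenced - allowed
        if missing:
            problems[job_id] = missing
    return problems


# Hypothetical workflow: `checks` refers to a `doc` job that is not defined.
workflow = {
    "jobs": {
        "build": {"steps": [{"run": "make"}]},
        "test": {"needs": "build", "steps": [{"run": "make test"}]},
        "checks": {
            "needs": ["build", "test"],
            "steps": [
                {"run": "echo ${{ needs.test.result }} ${{ needs.doc.result }}"}
            ],
        },
    }
}
print(undeclared_needs(workflow))
```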

Confidence Score: 4/5

  • Safe to merge after fixing the needs.doc reference bug in the checks job
  • The PR is well-structured with comprehensive CI infrastructure changes. However, there's a critical bug on line 202 of ci-new.yaml that references a non-existent job, which will cause the checks job to fail. Once fixed, the infrastructure appears solid with proper error handling, caching, and test coverage.
  • .github/workflows/ci-new.yaml requires immediate attention to fix the needs.doc reference

Important Files Changed

| Filename | Overview |
|---|---|
| .github/workflows/ci-new.yaml | New main CI workflow added - references non-existent needs.doc job on line 202 |
| .github/workflows/build-wheel.yml | Reusable workflow for building wheels across platforms with sccache support |
| .github/workflows/test-wheel-linux.yml | Linux test workflow with dynamic matrix computation from JSON config |
| .github/workflows/test-wheel-windows.yml | Windows test workflow with GPU driver mode switching and verification |
| .github/actions/fetch_ctk/action.yml | Action for fetching mini CUDA toolkit with caching and PATH setup |
| ci/test-matrix.json | Test matrix configuration defining Python/CUDA/GPU combinations for PR and nightly tests |
| pyproject.toml | Switched to cuda-toolkit metapackage and added cibuildwheel configuration |

@greptile-apps greptile-apps bot left a comment

Additional Comments (1)

  1. .github/workflows/ci-new.yaml, line 202 (link)

    logic: references non-existent job needs.doc

23 files reviewed, 1 comment

@cpcloud cpcloud merged commit 43284df into NVIDIA:main Dec 17, 2025
121 checks passed
@leofang leofang deleted the new_ci branch December 17, 2025 19:54
gmarkall added a commit to gmarkall/numba-cuda that referenced this pull request Jan 12, 2026
- Add arch specific target support (NVIDIA#549)
- chore: disable `locked` flag to bypass prefix-dev/pixi#5256 (NVIDIA#714)
- ci: relock pixi (NVIDIA#712)
- ci: remove redundant conda build in ci (NVIDIA#711)
- chore(deps): bump numba-cuda version and relock pixi (NVIDIA#707)
- Dropping bits in the old CI & Propagating recent changes from cuda-python (NVIDIA#683)
- Fix `test_wheel_deps_wheels.sh` to actually uninstall `nvvm` and `nvrtc` packages for CUDA 13 (NVIDIA#701)
- perf: remove some exception control flow and buffer-exception penalization for arrays (NVIDIA#700)
- perf: let CAI fall through instead of calling from_cuda_array_interface (NVIDIA#694)
- chore: perf lint (NVIDIA#697)
- chore(deps): bump deps in pixi lockfile (NVIDIA#693)
- fix: use freethreading-supported `_PySet_NextItemRef` where possible (NVIDIA#682)
- Support python `3.14` (NVIDIA#599)
- Remove customized address space tracking and address class emission in debug info (NVIDIA#669)
- Drop `experimental` from cuda.core namespace imports (NVIDIA#676)
- Remove dangling references to NUMBA_CUDA_ENABLE_MINOR_VERSION_COMPATIBILITY (NVIDIA#675)
- Use `rapidsai/sccache` in CI (NVIDIA#674)
- chore(dev-deps): remove ipython and pyinstrument (NVIDIA#670)
- Set up a new VM-based CI infrastructure  (NVIDIA#604)
@gmarkall gmarkall mentioned this pull request Jan 12, 2026
gmarkall added a commit that referenced this pull request Jan 12, 2026
- Add arch specific target support (#549)
- chore: disable `locked` flag to bypass
prefix-dev/pixi#5256 (#714)
- ci: relock pixi (#712)
- ci: remove redundant conda build in ci (#711)
- chore(deps): bump numba-cuda version and relock pixi (#707)
- Dropping bits in the old CI & Propagating recent changes from
cuda-python (#683)
- Fix `test_wheel_deps_wheels.sh` to actually uninstall `nvvm` and
`nvrtc` packages for CUDA 13 (#701)
- perf: remove some exception control flow and buffer-exception
penalization for arrays (#700)
- perf: let CAI fall through instead of calling
from_cuda_array_interface (#694)
- chore: perf lint (#697)
- chore(deps): bump deps in pixi lockfile (#693)
- fix: use freethreading-supported `_PySet_NextItemRef` where possible
(#682)
- Support python `3.14` (#599)
- Remove customized address space tracking and address class emission in
debug info (#669)
- Drop `experimental` from cuda.core namespace imports (#676)
- Remove dangling references to
NUMBA_CUDA_ENABLE_MINOR_VERSION_COMPATIBILITY (#675)
- Use `rapidsai/sccache` in CI (#674)
- chore(dev-deps): remove ipython and pyinstrument (#670)
- Set up a new VM-based CI infrastructure  (#604)