Set up a new VM-based CI infrastructure #604
This comment was marked as outdated.
/ok to test 8e96a09
/ok to test 64068b7
/ok to test 48cca14
/ok to test ccd8382
/ok to test 476082b
/ok to test 822b37d
/ok to test cafdf68
Greptile Summary
This PR introduces a new VM-based CI infrastructure cloned from cuda-python, replacing the container-based approach to enable faster builds and Day 1 rollout support for new CUDA/Python versions. The new CI runs in parallel with the existing infrastructure.
Key Changes:
Issues Found:
Confidence Score: 4/5
Important Files Changed
Additional Comments (1)
- `.github/workflows/ci-new.yaml`, line 202: logic: references non-existent job `needs.doc`
23 files reviewed, 1 comment
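The kind of problem flagged above (an expression referring to `needs.<job>` when that job is not in the job's own `needs` list) can be caught mechanically. Below is a minimal sketch, assuming the workflow's `jobs` mapping has already been parsed into plain Python dicts. The function names and the example jobs are hypothetical and are not taken from `ci-new.yaml`.

```python
import re

# Matches expressions such as "needs.build.result" and captures the job ID ("build").
NEEDS_REF = re.compile(r"\bneeds\.([A-Za-z_][A-Za-z0-9_-]*)")


def _strings(node):
    """Yield every string value found anywhere inside a nested dict/list structure."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from _strings(value)
    elif isinstance(node, list):
        for value in node:
            yield from _strings(value)


def undeclared_needs(jobs):
    """Map each job name to the job IDs it references via needs.<id> without declaring them."""
    problems = {}
    for name, job in jobs.items():
        declared = job.get("needs", [])
        if isinstance(declared, str):  # a single dependency may be written as a bare string
            declared = [declared]
        referenced = set()
        for text in _strings(job):
            referenced.update(NEEDS_REF.findall(text))
        missing = sorted(referenced - set(declared))
        if missing:
            problems[name] = missing
    return problems


# Hypothetical workflow fragment: "checks" refers to needs.doc but never declares "doc".
example_jobs = {
    "build": {"runs-on": "ubuntu-latest"},
    "test": {"needs": ["build"]},
    "checks": {
        "needs": ["build", "test"],
        "if": "always()",
        "steps": [
            {"run": "echo ${{ needs.build.result }} ${{ needs.test.result }} ${{ needs.doc.result }}"}
        ],
    },
}

print(undeclared_needs(example_jobs))
```

For the hypothetical fragment this reports `doc` as undeclared for the `checks` job. A real check would first load the workflow file with a YAML parser, which is left out here to keep the sketch dependency-free.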
- Add arch specific target support (NVIDIA#549)
- chore: disable `locked` flag to bypass prefix-dev/pixi#5256 (NVIDIA#714)
- ci: relock pixi (NVIDIA#712)
- ci: remove redundant conda build in ci (NVIDIA#711)
- chore(deps): bump numba-cuda version and relock pixi (NVIDIA#707)
- Dropping bits in the old CI & Propagating recent changes from cuda-python (NVIDIA#683)
- Fix `test_wheel_deps_wheels.sh` to actually uninstall `nvvm` and `nvrtc` packages for CUDA 13 (NVIDIA#701)
- perf: remove some exception control flow and buffer-exception penalization for arrays (NVIDIA#700)
- perf: let CAI fall through instead of calling from_cuda_array_interface (NVIDIA#694)
- chore: perf lint (NVIDIA#697)
- chore(deps): bump deps in pixi lockfile (NVIDIA#693)
- fix: use freethreading-supported `_PySet_NextItemRef` where possible (NVIDIA#682)
- Support python `3.14` (NVIDIA#599)
- Remove customized address space tracking and address class emission in debug info (NVIDIA#669)
- Drop `experimental` from cuda.core namespace imports (NVIDIA#676)
- Remove dangling references to NUMBA_CUDA_ENABLE_MINOR_VERSION_COMPATIBILITY (NVIDIA#675)
- Use `rapidsai/sccache` in CI (NVIDIA#674)
- chore(dev-deps): remove ipython and pyinstrument (NVIDIA#670)
- Set up a new VM-based CI infrastructure (NVIDIA#604)
TODO: Rebase and write up descriptions
See WIP report here #604 (comment) and here #604 (comment).
UPDATE: This PR brings in a new CI infra that is a clone of what cuda-python uses today. The new CI is fully VM-based instead of container-based, except for `cibuildwheel` launching a manylinux container.
This is desirable because containers are the major bottleneck that we should seriously consider moving away from:
Furthermore, my opinion is that we really need to make sure the CUDA Python CI infrastructure is "copy-paste-able" (with some caveats discussed internally, which I am not going to repeat here). It was designed with future application to CuPy and numba-cuda in mind, since at the Python level many of our projects have similar or identical support matrices, and there is no reason for each project to rebuild the wheel from scratch and suffer from maintenance issues.
Currently, this PR is set up so that the new CI runs in parallel with the old CI. If we decide to move forward, a separate follow-up PR can hook more pieces of the old CI into the new CI.
Below is a detailed breakdown of what this PR entails.
- 3.14 #599 is merged we can revert this commit to start testing.
- `cuda-bindings` is not installed
- `fetch_ctk` action
- `cuobjdump` is installed by `fetch_ctk` (whose default does not include it)
- `libcudadevrt.a` installed by `fetch_ctk` can be found
- `cuobjdump` can be skipped in a pure-wheel test env.

Commits 593fcf3 and df8f583 are bug fixes to `fetch_ctk` that we should backport to cuda-python.
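As an illustration of the `cuobjdump` and `libcudadevrt.a` items above, here is a minimal sketch of how a test environment could check whether those pieces are discoverable under a toolkit root. It is not how `fetch_ctk` works. The function name `locate_ctk_pieces` and the assumption that the toolkit root has a `bin` directory are hypothetical.

```python
import shutil
from pathlib import Path


def locate_ctk_pieces(cuda_home):
    """Return the paths of cuobjdump and libcudadevrt.a, or None for each piece not found.

    Assumes (hypothetically) that executables live in <cuda_home>/bin; the static
    library is searched for recursively because its directory varies by layout.
    """
    root = Path(cuda_home)
    # Prefer the toolkit's own bin directory, then fall back to the ambient PATH.
    cuobjdump = shutil.which("cuobjdump", path=str(root / "bin")) or shutil.which("cuobjdump")
    devrt = next((str(p) for p in sorted(root.rglob("libcudadevrt.a"))), None)
    return {"cuobjdump": cuobjdump, "libcudadevrt.a": devrt}
```

A test suite could call this once at collection time and skip the tests that need `cuobjdump` when its entry is `None`, which matches the pure-wheel case where the tool is not available.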