Merged
Maintaining an explicit list of supported compute capabilities in nvvm.py means the support matrix has to be updated for every new CUDA version. NVRTC can report the supported list programmatically, so we switch to using that instead. This assumes that the user's environment has a consistent set of components (NVVM, NVRTC, etc.). That is generally expected with recent developments in package management, and there is little we can do about an inconsistent environment anyway. Changes outside of nvvm.py / nvrtc.py accommodate the movement of this functionality. A major side effect is that we no longer need to initialize the list of supported CCs prior to forking, because the CUDA runtime is no longer needed to populate it.
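As a rough illustration of consuming a programmatically retrieved list: NVRTC reports supported architectures as integers such as 75 or 90, which have to be turned into (major, minor) pairs. The helper name `archs_to_ccs` below is hypothetical and not taken from this PR; it is a minimal sketch that takes the integers as plain input rather than calling NVRTC.

```python
def archs_to_ccs(archs):
    """Convert NVRTC-style arch integers (e.g. 75) to (major, minor) tuples.

    Hypothetical helper for illustration only; the input would come from
    NVRTC's supported-architectures query in a real environment.
    """
    # divmod(75, 10) == (7, 5); divmod(120, 10) == (12, 0)
    return tuple(divmod(arch, 10) for arch in sorted(archs))


supported = archs_to_ccs([90, 75, 80, 120])
```

Sorting first keeps the result in ascending order, so the lowest and highest supported compute capabilities are simply the first and last elements.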
We only used the CUDA runtime library to get the runtime version so that we could populate the list of supported compute capabilities in nvvm.py. Now that we no longer do this, and NVRTC provides the CUDA toolkit version, there is no need to use the CUDA runtime API at all. The Numba API for the runtime version is not deleted in case external code uses it; instead, it obtains the toolkit version from NVRTC. The NVRTC binding previously used the runtime version to decide which prototypes to bind, which would now create a circular dependency / deadlock. Instead of checking the runtime version and building the list of prototypes from it, we try to add all known prototypes and ignore errors for those related to LTOIR, which are not present in CUDA 11. The `runtime.is_supported_version()` API and its test are removed: it would always have (incorrectly) returned `False` on CUDA 12, and this has never been reported as an issue, so it seems very unlikely that anyone was using it.
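The "bind everything, tolerate missing LTOIR symbols" approach might look roughly like the sketch below. It is not the code from this PR: the `PROTOTYPES` table and `bind_prototypes` are hypothetical, and a `SimpleNamespace` stands in for the loaded library so the example runs without CUDA.

```python
from types import SimpleNamespace

# Hypothetical table: symbol name -> True if the symbol may be absent.
PROTOTYPES = {
    "nvrtcVersion": False,
    "nvrtcCompileProgram": False,
    "nvrtcGetLTOIRSize": True,  # LTOIR entry points are missing on CUDA 11
    "nvrtcGetLTOIR": True,
}


def bind_prototypes(lib, prototypes=PROTOTYPES):
    """Look up every known symbol, skipping optional ones that are missing."""
    bound = {}
    for name, optional in prototypes.items():
        try:
            bound[name] = getattr(lib, name)
        except AttributeError:
            if not optional:
                raise  # a required symbol is missing: surface the error
    return bound


# Stand-in for a CUDA 11 NVRTC library without the LTOIR functions.
cuda11_lib = SimpleNamespace(
    nvrtcVersion=lambda: 0,
    nvrtcCompileProgram=lambda: 0,
)
bound = bind_prototypes(cuda11_lib)
```

Because nothing here consults a version number, binding no longer depends on anything that itself needs NVRTC to be loaded first.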
Recent toolkits move the CCCL headers into their own subdirectory, so we add this subdirectory to the include path so that headers such as `cuda/atomic` can be located in all cases.
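A minimal sketch of the include-path adjustment follows. The function name `cuda_include_dirs` is hypothetical, and the subdirectory name `cccl` is an assumption about the toolkit layout rather than something stated in this commit message.

```python
import os


def cuda_include_dirs(include_root, subdir="cccl"):
    """Return include directories, adding the CCCL subdirectory if present.

    The default subdirectory name is an assumption for illustration.
    """
    dirs = [include_root]
    cccl_dir = os.path.join(include_root, subdir)
    # Older toolkits keep the CCCL headers directly under include_root,
    # so the extra entry is only added when the subdirectory exists.
    if os.path.isdir(cccl_dir):
        dirs.append(cccl_dir)
    return dirs
```

Checking for the directory rather than the toolkit version keeps the same code path working for both layouts.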
The most recent `cuCtxCreate()` API in the CUDA bindings will require an additional optional parameter. We don't have to supply a value for it (other than `None`), but we do need to provide the argument on binding versions where it is required.
Contributor
Author
|
/ok to test |
Switching to NVRTC for the supported compute capabilities also had the implicit effect of making the default compute capability the lowest one supported by the installed NVRTC version. It needs to default to at least 7.5 (unless the user specifies a higher value) to maintain the behaviour of the compute capability logic in nvvm.py that was replaced.
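The intended defaulting rule can be sketched as follows. This is an illustration of the behaviour described above, not the actual implementation; `default_cc`, `MIN_DEFAULT_CC`, and the `user_cc` parameter are hypothetical names.

```python
MIN_DEFAULT_CC = (7, 5)  # floor described in the commit message


def default_cc(supported_ccs, user_cc=None):
    """Pick a default compute capability as a (major, minor) tuple."""
    # Start from the lowest CC that NVRTC reports, but never go below 7.5.
    default = max(min(supported_ccs), MIN_DEFAULT_CC)
    # A user-specified value only takes effect when it is higher.
    if user_cc is not None:
        default = max(default, user_cc)
    return default
```

Tuples compare element by element, so `(7, 5) < (8, 0)` holds and `max` selects the higher compute capability directly.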
Contributor
Author
|
/ok to test |
kkraus14
approved these changes
Jul 2, 2025
Contributor
Author
|
The wheels-deps-wheels timeout is due to a deadlock: |
Contributor
Author
|
Simple reproducer: |
Contributor
Author
|
Unfortunately, I'm relying on NVRTC to get the toolkit version to determine the name of the library to use to load NVRTC. 🤦 |
Contributor
Author
|
/ok to test |
We use NVRTC to get the CUDA version, so we can't use the CUDA version to determine the NVRTC DLL / SO anymore. Instead, check for the presence of each version, preferring the highest.
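Probing for each version and preferring the highest could be sketched like this. The names `find_newest_library` and `fake_loader` are hypothetical; in a real setting the loader might wrap something like `ctypes.CDLL` with a versioned library name, which is an assumption and not shown in this PR.

```python
def find_newest_library(candidate_majors, try_load):
    """Return (major, handle) for the highest major version that loads.

    try_load is a hypothetical callable that raises OSError when the
    library for the given major version is not installed.
    """
    for major in sorted(candidate_majors, reverse=True):
        try:
            return major, try_load(major)
        except OSError:
            continue  # not installed: fall through to the next lower version
    raise OSError("no candidate library could be loaded")


def fake_loader(major):
    # Stand-in loader for illustration; pretends only version 12 is installed.
    if major != 12:
        raise OSError(f"version {major} not found")
    return f"handle-{major}"


found = find_newest_library([11, 12, 13], fake_loader)
```

Since the search does not need to know the toolkit version up front, it avoids the circular dependency in which NVRTC was needed to find NVRTC.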
Contributor
Author
|
/ok to test |
kkraus14
approved these changes
Jul 2, 2025
gmarkall
added a commit
to gmarkall/numba-cuda
that referenced
this pull request
Jul 2, 2025
- Updates for recent API changes (NVIDIA#313)
- Fix lineinfo generation when compile_internal used (NVIDIA#271) (NVIDIA#287)
- Build docs with NVIDIA Sphinx theme (NVIDIA#312)
- Don't skip debug tests when LTO enabled by default (NVIDIA#311)
- Use `cuda.bindings` and `cuda.core` for `Linker` (NVIDIA#133)
- Enable LTO by default when pynvjitlink is available (NVIDIA#310)
Merged
gmarkall
added a commit
that referenced
this pull request
Jul 2, 2025
- Updates for recent API changes (#313)
- Fix lineinfo generation when compile_internal used (#271) (#287)
- Build docs with NVIDIA Sphinx theme (#312)
- Don't skip debug tests when LTO enabled by default (#311)
- Use `cuda.bindings` and `cuda.core` for `Linker` (#133)
- Enable LTO by default when pynvjitlink is available (#310)
gmarkall
added a commit
to gmarkall/numba-cuda
that referenced
this pull request
Nov 3, 2025
PR NVIDIA#313 removed the `runtime.is_supported_version()` API, but it is used by the `cuda.is_supported_version()` public API. This commit restores the `cuda.is_supported_version()` API by checking whether the CUDA runtime major version is 12 or 13. The version check will need updating when future toolkit major versions are added and existing toolkit major versions are dropped. The test added to exercise this API will catch that situation.
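The restored check amounts to a membership test on the major version. The sketch below is illustrative only: `SUPPORTED_MAJOR_VERSIONS` is a hypothetical name, and the function takes the version as an argument rather than querying it as the real API would.

```python
# Major versions named in the commit message; needs bumping over time.
SUPPORTED_MAJOR_VERSIONS = (12, 13)


def is_supported_version(runtime_version):
    """Return True if the (major, minor) version has a supported major."""
    major, _minor = runtime_version
    return major in SUPPORTED_MAJOR_VERSIONS
```

Only the major version is compared, so any minor release within a supported major version is treated as supported.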
This PR updates Numba-CUDA for recent driver API changes. Some related changes and simplifications are required; these are detailed in individual commit messages.
Fixes #281.