Conversation

@VijayKandiah
Contributor

@VijayKandiah VijayKandiah commented Oct 16, 2025

This is a large PR that vendors in the types and datamodel modules from Numba for future CUDA-specific customizations. It also adds a convert_to_cuda_type API in numba.cuda.core.sigutils, which normalize_signature calls by default to map any numba.core.types to their equivalent numba.cuda.types where possible. This mapping enables the typing system in numba-cuda to work on numba.core.types.
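
To illustrate the intent, a minimal sketch of such a mapping (illustrative only; the name-based lookup here is an assumption, not the PR's actual implementation):

from numba.core import types as core_types
from numba.cuda import types as cuda_types


def convert_to_cuda_type(ty):
    # Look up a CUDA type with the same name as the core type instance
    # (e.g. core_types.int32 -> cuda_types.int32) and fall back to the
    # original type when no equivalent exists. A real implementation
    # would also need to recurse into compound types such as arrays.
    return getattr(cuda_types, str(ty), ty)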

@VijayKandiah VijayKandiah self-assigned this Oct 16, 2025
@VijayKandiah VijayKandiah added the 3 - Ready for Review Ready for review by team label Oct 16, 2025
@copy-pr-bot

copy-pr-bot bot commented Oct 16, 2025

Auto-sync is disabled for ready for review pull requests in this repository. Workflows must be run manually.


@VijayKandiah
Contributor Author

/ok to test

@VijayKandiah
Contributor Author

/ok to test

@VijayKandiah VijayKandiah requested a review from gmarkall October 16, 2025 23:23
@VijayKandiah
Contributor Author

Working on the cuDF failures.

@VijayKandiah
Contributor Author

/ok to test

@VijayKandiah
Contributor Author

/ok to test

@VijayKandiah
Contributor Author

/ok to test

@VijayKandiah
Contributor Author

/ok to test

@VijayKandiah VijayKandiah changed the title Vendor in types, datamodel, target_extension for CUDA-specific changes Vendor in types and datamodel for CUDA-specific changes Oct 21, 2025
@VijayKandiah
Contributor Author

/ok to test


try:
    from numba.core.typing import Signature as CoreSignature
    from numba.core import types as core_types
Contributor

@brandon-b-miller brandon-b-miller Oct 22, 2025


Not blocking for this PR, but have we considered determining the presence of numba in a central way up front that we can just reference, instead of proliferating try/excepts? e.g. if HAVE_NUMBA: etc.

Contributor Author


I like that idea. Yes, we should be doing that. A centralized HAVE_NUMBA would be much cleaner than all the try-imports we have in typing and other modules. This can be done in a future PR.
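
A minimal sketch of what that could look like (the module name numba.cuda._compat is hypothetical):

# numba/cuda/_compat.py (hypothetical location)
from importlib.util import find_spec

# numba.core only exists when upstream Numba is installed; computed once
# at import time so call sites just reference the flag.
HAVE_NUMBA = find_spec("numba.core") is not None

Call sites would then read:

from numba.cuda._compat import HAVE_NUMBA

if HAVE_NUMBA:
    from numba.core import types as core_types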

Contributor Author


Also, since we have the redirector in place, this type conversion/mapping in sigutils is not really needed. If numba is in the environment, the redirector ensures that numba.core.types is used everywhere, so we would never encounter a mix of numba.cuda.types and numba.core.types needing this conversion. Should we just remove it?
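
A quick way to see this in an environment with numba installed (assuming the redirector behaves as described):

import numba.core.types as core_types
import numba.cuda.types as cuda_types

# With the redirector active, numba.cuda.types resolves to
# numba.core.types, so both names yield the same type singletons and
# there is nothing left to convert.
assert cuda_types.int32 is core_types.int32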

Contributor Author


I removed the type mapping.

@VijayKandiah
Contributor Author

/ok to test

import importlib
import sys
from numba.cuda.utils import _RedirectSubpackage

if importlib.util.find_spec("numba.core.types"):
Contributor

@gmarkall gmarkall Oct 24, 2025


This redirection method does not work. If I change this to simulate what would happen if Numba were not found, like:

diff --git a/numba_cuda/numba/cuda/types.py b/numba_cuda/numba/cuda/types.py
index 94e30c17..ba81757e 100644
--- a/numba_cuda/numba/cuda/types.py
+++ b/numba_cuda/numba/cuda/types.py
@@ -5,7 +5,7 @@ import importlib
 import sys
 from numba.cuda.utils import _RedirectSubpackage
 
-if importlib.util.find_spec("numba.core.types"):
+if importlib.util.find_spec("numba.core.types") and False:
     sys.modules[__name__] = _RedirectSubpackage(locals(), "numba.core.types")
 else:
     sys.modules[__name__] = _RedirectSubpackage(

then importing cuda from numba no longer works:

$ python -c "from numba import cuda"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/gmarkall/numbadev/numba-cuda/numba_cuda/numba/cuda/__init__.py", line 11, in <module>
    import numba.cuda.types as types
  File "/home/gmarkall/numbadev/numba-cuda/numba_cuda/numba/cuda/types.py", line 11, in <module>
    sys.modules[__name__] = _RedirectSubpackage(
                            ^^^^^^^^^^^^^^^^^^^^
  File "/home/gmarkall/numbadev/numba-cuda/numba_cuda/numba/cuda/utils.py", line 592, in __init__
    new_mod_obj = import_module(new_module)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gmarkall/miniforge3/envs/numbadev/lib/python3.12/importlib/__init__.py", line 98, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gmarkall/numbadev/numba-cuda/numba_cuda/numba/cuda/cuda_types/__init__.py", line 10, in <module>
    from .containers import *
  File "/home/gmarkall/numbadev/numba-cuda/numba_cuda/numba/cuda/cuda_types/containers.py", line 26, in <module>
    from .misc import Undefined, unliteral, Optional, NoneType
  File "/home/gmarkall/numbadev/numba-cuda/numba_cuda/numba/cuda/cuda_types/misc.py", line 4, in <module>
    from numba.cuda.types.abstract import Callable, Literal, Type, Hashable
ModuleNotFoundError: No module named 'numba.cuda.types.abstract'; 'numba.cuda.types' is not a package

Contributor


Just realised I forgot to clarify: the problem is that you can't use a module to redirect a package, only another module. This is why, when I prototyped this, I had individual redirectors like cuda_abstract.py -> abstract.py, etc.
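
A sketch of that per-module pattern using the current helper (the vendored target path numba.cuda.cuda_types.abstract follows the layout visible in the traceback above; the file placement is an assumption):

# numba/cuda/types/abstract.py -- one redirector per submodule
import importlib
import sys

from numba.cuda.utils import _RedirectSubpackage

if importlib.util.find_spec("numba.core.types"):
    # Upstream Numba is present: stand in for numba.core.types.abstract
    sys.modules[__name__] = _RedirectSubpackage(
        locals(), "numba.core.types.abstract"
    )
else:
    # Otherwise redirect to the vendored copy
    sys.modules[__name__] = _RedirectSubpackage(
        locals(), "numba.cuda.cuda_types.abstract"
    )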

Contributor


I don't think any of the globals in here (e.g. Dim3, GridGroup) have been used as public API under numba.cuda.types before, so this should be a safe move. If that turns out to be incorrect, we could always move the ones that have been used publicly into types/__init__.py.

Contributor

@gmarkall gmarkall left a comment


I think in general the changes here are good, apart from the specific issue with the redirector not being able to support using a module to redirect to a package. I believe I have a fix for this in #545; with that change, the CI seems to pass, apart from a couple of runners getting stuck with network connection issues. When I locally apply:

diff --git a/numba_cuda/numba/cuda/utils.py b/numba_cuda/numba/cuda/utils.py
index 13203103..b5bf83ca 100644
--- a/numba_cuda/numba/cuda/utils.py
+++ b/numba_cuda/numba/cuda/utils.py
@@ -615,7 +615,7 @@ class _RedirectSubpackage(ModuleType):
 
 
 def redirect_numba_module(old_module_locals, numba_module, numba_cuda_module):
-    if find_spec("numba"):
+    if find_spec("numba") and False:
         return _RedirectSubpackage(old_module_locals, numba_module)
     else:
         return _RedirectSubpackage(old_module_locals, numba_cuda_module)

to that, to simulate Numba not being found, the redirection works in the sense that the tests run, but they don't all pass, because we still aren't fully isolated between Numba types and Numba-CUDA types: the errors we've seen before, like TypeError: cannot augment Function(<built-in function truth>) with Function(<built-in function truth>), appear.

For the purposes of this PR I think that is OK. If the code were exclusively using Numba-CUDA code and types, rather than also pulling in Numba types, there would be no duplication, but we won't reach that point until subsequent PRs have gone in.
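
For reference, with the redirect_numba_module helper from the diff above, a vendored module would set itself up along these lines (the module paths are illustrative, not the PR's actual layout):

# e.g. at the bottom of a vendored module
import sys

from numba.cuda.utils import redirect_numba_module

# Use upstream numba.core.datamodel when Numba is installed; otherwise
# fall back to the vendored copy under numba.cuda.
sys.modules[__name__] = redirect_numba_module(
    locals(), "numba.core.datamodel", "numba.cuda.cuda_datamodel"
)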

@VijayKandiah
Contributor Author

/ok to test

@gmarkall gmarkall merged commit 3282e93 into NVIDIA:main Oct 27, 2025
139 of 140 checks passed
@gmarkall gmarkall added 5 - Ready to merge Testing and reviews complete, ready to merge and removed 3 - Ready for Review Ready for review by team labels Oct 27, 2025
gmarkall added a commit to gmarkall/numba-cuda that referenced this pull request Nov 20, 2025
- Add support for cache-hinted load and store operations (NVIDIA#587)
- Add more thirdparty tests (NVIDIA#586)
- Add sphinx-lint to pre-commit and fix errors (NVIDIA#597)
- Add DWARF variant part support for polymorphic variables in CUDA debug info (NVIDIA#544)
- chore: clean up dead workaround for unavailable `lru_cache` (NVIDIA#598)
- chore(docs): format types docs (NVIDIA#596)
- refactor: decouple `Context` from `Stream` and `Event` objects (NVIDIA#579)
- Fix freezing in of constant arrays with negative strides (NVIDIA#589)
- Update tests to accept variants of generated PTX (NVIDIA#585)
- refactor: replace device functionality with `cuda.core` APIs (NVIDIA#581)
- Move frontend tests to `cudapy` namespace (NVIDIA#558)
- Generalize the concurrency group for main merges (NVIDIA#582)
- ci: move pre-commit checks to pre commit action (NVIDIA#577)
- chore(pixi): set up doc builds; remove most `build-conda` dependencies (NVIDIA#574)
- ci: ensure that python version in ci matches matrix (NVIDIA#575)
- Fix the `cuda.is_supported_version()` API (NVIDIA#571)
- Fix checks on main (NVIDIA#576)
- feat: add `math.nextafter` (NVIDIA#543)
- ci: replace conda testing with pixi (NVIDIA#554)
- [CI] Run PR workflow on merge to main (NVIDIA#572)
- Propose Alternative Module Path for `ext_types` and Maintain `numba.cuda.types.bfloat16` Import API (NVIDIA#569)
- test: enable fail-on-warn and clean up resulting failures (NVIDIA#529)
- [Refactor][NFC] Vendor-in compiler_lock for future CUDA-specific changes (NVIDIA#565)
- Fix registration with Numba, vendor MakeFunctionToJITFunction tests (NVIDIA#566)
- [Refactor][NFC][Cleanups] Update imports to upstream numba to use the numba.cuda modules (NVIDIA#561)
- test: refactor process-based tests to use concurrent futures in order to simplify tests (NVIDIA#550)
- test: revert back to ipc futures that await each iteration (NVIDIA#564)
- chore(deps): move to self-contained pixi.toml to avoid mixed-pypi-pixi environments (NVIDIA#551)
- [Refactor][NFC] Vendor-in errors for future CUDA-specific changes (NVIDIA#534)
- Remove dependencies on target_extension for CUDA target (NVIDIA#555)
- Relax the pinning to `cuda-core` to allow it floating across minor releases (NVIDIA#559)
- [WIP] Port numpy reduction tests to CUDA (NVIDIA#523)
- ci: add timeout to avoid blocking the job queue (NVIDIA#556)
- Handle `cuda.core.Stream` in driver operations (NVIDIA#401)
- feat: add support for `math.exp2` (NVIDIA#541)
- Vendor in types and datamodel for CUDA-specific changes (NVIDIA#533)
- refactor: cleanup device constructor (NVIDIA#548)
- bench: add cupy to array constructor kernel launch benchmarks (NVIDIA#547)
- perf: cache dimension computations (NVIDIA#542)
- perf: remove duplicated size computation (NVIDIA#537)
- chore(perf): add torch to benchmark (NVIDIA#539)
- test: speed up ipc tests by ~6.5x (NVIDIA#527)
- perf: speed up kernel launch (NVIDIA#510)
- perf: remove context threading in various pointer abstractions (NVIDIA#536)
- perf: reduce the number of `__cuda_array_interface__` accesses (NVIDIA#538)
- refactor: remove unnecessary custom map and set implementations (NVIDIA#530)
- [Refactor][NFC] Vendor-in vectorize decorators for future CUDA-specific changes (NVIDIA#513)
- test: add benchmarks for kernel launch for reproducibility (NVIDIA#528)
- test(pixi): update pixi testing command to work with the new `testing` directory (NVIDIA#522)
- refactor: fully remove `USE_NV_BINDING` (NVIDIA#525)
- Draft: Vendor in the IR module (NVIDIA#439)
- pyproject.toml: add search path for Pyrefly (NVIDIA#524)
- Vendor in numba.core.typing for CUDA-specific changes (NVIDIA#473)
- Use numba.config when available, otherwise use numba.cuda.config (NVIDIA#497)
- [MNT] Drop NUMBA_CUDA_USE_NVIDIA_BINDING; always use cuda.core and cuda.bindings as fallback (NVIDIA#479)
- Vendor in dispatcher, entrypoints, pretty_annotate for CUDA-specific changes (NVIDIA#502)
- build: allow parallelization of nvcc testing builds (NVIDIA#521)
- chore(dev-deps): add pixi (NVIDIA#505)
- Vendor the imputils module for CUDA refactoring (NVIDIA#448)
- Don't use `MemoryLeakMixin` for tests that don't use NRT (NVIDIA#519)
- Switch back to stable cuDF release in thirdparty tests (NVIDIA#518)
- Updating .gitignore with binaries in the `testing` folder (NVIDIA#516)
- Remove some unnecessary uses of ContextResettingTestCase (NVIDIA#507)
- Vendor in _helperlib cext for CUDA-specific changes (NVIDIA#512)
- Vendor in typeconv for future CUDA-specific changes (NVIDIA#499)
- [Refactor][NFC] Vendor-in numba.cpython modules for future CUDA-specific changes (NVIDIA#493)
- [Refactor][NFC] Vendor-in numba.np modules for future CUDA-specific changes (NVIDIA#494)
- Make the CUDA target the default for CUDA overload decorators (NVIDIA#511)
- Remove C extension loading hacks (NVIDIA#506)
- Ensure NUMBA can manipulate memory from CUDA graphs before the graph is launched (NVIDIA#437)
- [Refactor][NFC] Vendor-in core Numba analysis utils for CUDA-specific changes (NVIDIA#433)
- Fix Bf16 Test OB Error (NVIDIA#509)
- Vendor in components from numba.core.runtime for CUDA-specific changes (NVIDIA#498)
- [Refactor] Vendor in _dispatcher, _devicearray, mviewbuf C extension for CUDA-specific customization (NVIDIA#373)
- [MNT] Managed UM memset fallback and skip CUDA IPC tests on WSL2 (NVIDIA#488)
- Improve debug value range coverage (NVIDIA#461)
- Add `compile_all` API (NVIDIA#484)
- Vendor in core.registry for CUDA-specific changes (NVIDIA#485)
- [Refactor][NFC] Vendor in numba.misc for CUDA-specific changes (NVIDIA#457)
- Vendor in optional, boxing for CUDA-specific changes, fix dangling imports (NVIDIA#476)
- [test] Remove dependency on cpu_target (NVIDIA#490)
- Change dangling imports of numba.core.lowering to numba.cuda.lowering (NVIDIA#475)
- [test] Use numpy's tolerance for float16 (NVIDIA#491)
- [Refactor][NFC] Vendor-in numba.extending for future CUDA-specific changes (NVIDIA#466)
- [Refactor][NFC] Vendor-in more cpython registries for future CUDA-specific changes (NVIDIA#478)
@gmarkall gmarkall mentioned this pull request Nov 20, 2025
gmarkall added a commit that referenced this pull request Nov 20, 2025