
[NFC] Vendor in Dispatcher as CUDADispatcher to allow for future CUDA-specific customization #338

Merged
gmarkall merged 3 commits into NVIDIA:main from VijayKandiah:vk/cuda_dispatcher
Jul 29, 2025
Conversation

@VijayKandiah
Contributor

This PR vendors in the necessary pieces of the Dispatcher class into CUDADispatcher to allow for future CUDA-specific changes, making the latter inherit directly from _DispatcherBase. It also vendors in supporting classes such as _MemoMixin and _FunctionCompiler for future customizations. This is an NFC (no functional change), so no additional unit tests are needed.
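
As a rough orientation, here is a minimal sketch of the layout this ends up with. The base-class order and the class bodies are illustrative assumptions, not the exact code in the PR:

```python
# Hedged sketch of the vendored structure described above; bodies elided.
from numba.core import serialize
from numba.core.dispatcher import _DispatcherBase


class _MemoMixin:
    """Vendored supporting mixin, copied in for future CUDA-specific changes."""


class _FunctionCompiler:
    """Vendored compiler helper used by the dispatcher."""


class CUDADispatcher(_MemoMixin, serialize.ReduceMixin, _DispatcherBase):
    """Inherits directly from _DispatcherBase instead of the core Dispatcher,
    leaving room for CUDA-specific customization."""
```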

@copy-pr-bot

copy-pr-bot bot commented Jul 22, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@VijayKandiah
Contributor Author

/ok to test 04efabd

@VijayKandiah VijayKandiah added the 2 - In Progress Currently a work in progress label Jul 22, 2025
@copy-pr-bot

copy-pr-bot bot commented Jul 24, 2025

Auto-sync is disabled for ready for review pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@VijayKandiah
Contributor Author

/ok to test c288169

@gmarkall gmarkall added 3 - Ready for Review Ready for review by team and removed 2 - In Progress Currently a work in progress labels Jul 28, 2025


class CUDADispatcher(Dispatcher, serialize.ReduceMixin):
class _MemoMixin:
Contributor


I don't think this is used / relevant to the CUDA target - if it is removed, does everything still appear to work?

Contributor


Actually, don't worry about this comment too much - I think it will show up when we run coverage if it's not needed, so let's not think about it now.

Contributor Author


Sounds good!

Comment on lines 893 to 894
locals: dict, optional
Mapping of local variable names to Numba types. Used to override
Contributor


There is no locals parameter.

functools.update_wrapper(self, py_func)

self.targetoptions = targetoptions
self.locals = locals
Contributor


It took me a while to figure out how this could possibly be working; then I realised this is assigning the built-in locals function to self.locals. Although it doesn't seem to have disrupted the tests, this seems dangerous - can you remove the locals assignment, please?

(locals is an option that's not supported by the CUDA target).
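
A minimal self-contained illustration of the pitfall (not project code): inside a method where nothing shadows the name, a bare locals refers to the builtin function rather than a dictionary of options.

```python
# Illustrative only - shows why `self.locals = locals` is dangerous.
class Example:
    def __init__(self, targetoptions):
        self.targetoptions = targetoptions
        self.locals = locals  # binds the builtin locals() function, not a dict


e = Example({})
print(callable(e.locals))  # True: it is the builtin function
print(e.locals is locals)  # True: not a mapping of names to Numba types
```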

Contributor Author


I removed locals from CUDADispatcher and _FunctionCompiler. We are now passing locals={} to compiler.compile_extra, similar to how we invoke compiler.compile_extra in numba/cuda/compiler.py.
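
For reference, a hedged sketch of what the fixed call might look like. The method shape and attribute names follow numba.core.dispatcher._FunctionCompiler and numba/cuda/compiler.py and are assumptions about the vendored copy, not the PR's exact code:

```python
from numba.core import compiler


class _FunctionCompiler:
    # ... vendored initialisation elided ...

    def _compile_core(self, impl, args, return_type, flags):
        # Pass an explicit empty locals mapping instead of a self.locals
        # attribute; the locals option is not supported by the CUDA target.
        return compiler.compile_extra(
            self.targetdescr.typing_context,
            self.targetdescr.target_context,
            impl,
            args=args,
            return_type=return_type,
            flags=flags,
            locals={},
            pipeline_class=self.pipeline_class,
        )
```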

Contributor

@gmarkall gmarkall left a comment


There is some stuff I think is not needed in this PR, but I think it will be easier to have a large-scale cleanup following a run of the tests with coverage reporting, after other PRs have gone in too.

The only thing that really needs to change here is the assignment of the CUDADispatcher's locals to the locals builtin function, which seems likely to have unintended and hard-to-diagnose consequences.

@gmarkall gmarkall added 4 - Waiting on reviewer Waiting for reviewer to respond to author 4 - Waiting on author Waiting for author to respond to review and removed 3 - Ready for Review Ready for review by team 4 - Waiting on reviewer Waiting for reviewer to respond to author labels Jul 28, 2025
@VijayKandiah
Contributor Author

/ok to test b02356d

@VijayKandiah VijayKandiah added 4 - Waiting on reviewer Waiting for reviewer to respond to author and removed 4 - Waiting on author Waiting for author to respond to review labels Jul 28, 2025
@VijayKandiah
Contributor Author

/ok to test fbcd2d9

@gmarkall gmarkall merged commit 4a781e1 into NVIDIA:main Jul 29, 2025
39 checks passed
@VijayKandiah VijayKandiah self-assigned this Jul 29, 2025
gmarkall added a commit to gmarkall/numba-cuda that referenced this pull request Jul 31, 2025
- [NFC] FileCheck tests check all overloads (NVIDIA#354)
- [REVIEW][NFC] Vendor in serialize to allow for future CUDA-specific refactoring and changes (NVIDIA#349)
- Vendor in usecases used in testing (NVIDIA#359)
- Add thirdparty tests of numba extensions (NVIDIA#348)
- Support running tests in parallel (NVIDIA#350)
- Add more debuginfo tests (NVIDIA#358)
- [REVIEW][NFC] Vendor in the Cache, CacheImpl used by CUDACache and CUDACacheImpl to allow for future CUDA-specific refactoring and changes (NVIDIA#334)
- [NFC] Vendor in Dispatcher as CUDADispatcher to allow for future CUDA-specific customization (NVIDIA#338)
- Vendor in BaseNativeLowering and BaseLower for CUDA-specific customizations (NVIDIA#329)
- [REVIEW] Vendor in the CompilerBase used by CUDACompiler to allow for future CUDA-specific refactoring and changes (NVIDIA#322)
- Vendor in Codegen and CodeLibrary for CUDA-specific customization (NVIDIA#327)
- Disable tests that deadlock due to NVIDIA#317 (NVIDIA#356)
- FIX: Add type check for shape elements in DeviceNDArrayBase constructor (NVIDIA#352)
- Merge pull request NVIDIA#265 from lakshayg/fp16-support
- Add performance warning
- Fix tests
- Create and register low++ bindings for float16
- Create typing/target registries for float16
- Replace Numbast generated lower_casts
- Replace Numbast generated operators
- Alias __half to numba.core.types.float16
- Generate fp16 bindings using numbast
- Remove existing fp16 logic
- [REVIEW][NFC] Vendor in the utils and cgutils to allow for future CUDA-specific refactoring and changes (NVIDIA#340)
- [RFC,TESTING] Add filecheck test infrastructure (NVIDIA#342)
- Migrate test infra to pytest (NVIDIA#347)
- Add .vscode to gitignore (NVIDIA#344)
- [NFC] Add dev dependencies to project config (NVIDIA#341)
- Allow Inspection of Link-Time Optimized PTX (NVIDIA#326)
- [NFC] Vendor in DIBuilder used by CUDADIBuilder (NVIDIA#332)
- Add guidance on setting up pre-commit (NVIDIA#339)
- [Refactor][NFC] Vendor in MinimalCallConv (NVIDIA#333)
- [Refactor][NFC] Vendor in BaseCallConv (NVIDIA#324)
- [REVIEW] Vendor in CompileResult as CUDACompileResult to allow for future CUDA-specific customizations (NVIDIA#325)
@gmarkall gmarkall mentioned this pull request Jul 31, 2025
gmarkall added a commit that referenced this pull request Jul 31, 2025
- [NFC] FileCheck tests check all overloads (#354)
- [REVIEW][NFC] Vendor in serialize to allow for future CUDA-specific refactoring and changes (#349)
- Vendor in usecases used in testing (#359)
- Add thirdparty tests of numba extensions (#348)
- Support running tests in parallel (#350)
- Add more debuginfo tests (#358)
- [REVIEW][NFC] Vendor in the Cache, CacheImpl used by CUDACache and CUDACacheImpl to allow for future CUDA-specific refactoring and changes (#334)
- [NFC] Vendor in Dispatcher as CUDADispatcher to allow for future CUDA-specific customization (#338)
- Vendor in BaseNativeLowering and BaseLower for CUDA-specific customizations (#329)
- [REVIEW] Vendor in the CompilerBase used by CUDACompiler to allow for future CUDA-specific refactoring and changes (#322)
- Vendor in Codegen and CodeLibrary for CUDA-specific customization (#327)
- Disable tests that deadlock due to #317 (#356)
- FIX: Add type check for shape elements in DeviceNDArrayBase constructor (#352)
- Merge pull request #265 from lakshayg/fp16-support
- Add performance warning
- Fix tests
- Create and register low++ bindings for float16
- Create typing/target registries for float16
- Replace Numbast generated lower_casts
- Replace Numbast generated operators
- Alias __half to numba.core.types.float16
- Generate fp16 bindings using numbast
- Remove existing fp16 logic
- [REVIEW][NFC] Vendor in the utils and cgutils to allow for future CUDA-specific refactoring and changes (#340)
- [RFC,TESTING] Add filecheck test infrastructure (#342)
- Migrate test infra to pytest (#347)
- Add .vscode to gitignore (#344)
- [NFC] Add dev dependencies to project config (#341)
- Allow Inspection of Link-Time Optimized PTX (#326)
- [NFC] Vendor in DIBuilder used by CUDADIBuilder (#332)
- Add guidance on setting up pre-commit (#339)
- [Refactor][NFC] Vendor in MinimalCallConv (#333)
- [Refactor][NFC] Vendor in BaseCallConv (#324)
- [REVIEW] Vendor in CompileResult as CUDACompileResult to allow for future CUDA-specific customizations (#325)
gmarkall added a commit that referenced this pull request Aug 7, 2025
We had initially removed the `locals` attribute from `_FunctionCompiler` while vendoring it in for CUDA-specific changes in #338. This seemed to have caused an `AttributeError` in an nvmath-python test:
https://github.com/NVIDIA/nvmath-python/blob/main/tests/nvmath_tests/device/test_cublasdx_numba.py.
This PR restores the `locals` attribute and initializes it to `{}`.

---------

Co-authored-by: Graham Markall <gmarkall@nvidia.com>
gmarkall added a commit that referenced this pull request Aug 8, 2025
We had initially removed the `locals` attribute from `_FunctionCompiler` while vendoring it in for CUDA-specific changes in #338. This seemed to have caused an `AttributeError` in an nvmath-python test:
https://github.com/NVIDIA/nvmath-python/blob/main/tests/nvmath_tests/device/test_cublasdx_numba.py.
This PR restores the `locals` attribute and initializes it to `{}`.

---------

Co-authored-by: Graham Markall <gmarkall@nvidia.com>
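
A minimal sketch of the restoration described in these commits, assuming the vendored _FunctionCompiler's constructor mirrors numba.core.dispatcher._FunctionCompiler (the parameter names here are illustrative):

```python
class _FunctionCompiler:
    def __init__(self, py_func, targetdescr, targetoptions, pipeline_class):
        self.py_func = py_func
        self.targetdescr = targetdescr
        self.targetoptions = targetoptions
        self.pipeline_class = pipeline_class
        # Restored: downstream consumers such as the nvmath-python tests
        # expect this attribute to exist, so keep it present but empty.
        self.locals = {}
```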