[NFC] Vendor in Dispatcher as CUDADispatcher to allow for future CUDA-specific customization (#338)
Conversation
/ok to test 04efabd
Force-pushed from 04efabd to c288169
Auto-sync is disabled for ready-for-review pull requests in this repository. Workflows must be run manually.
/ok to test c288169
class CUDADispatcher(Dispatcher, serialize.ReduceMixin):
class _MemoMixin:
I don't think this is used / relevant to the CUDA target - if it is removed, does everything still appear to work?
Actually, don't worry about this comment too much - I think it will show up when we run coverage if it's not needed, so let's not think about it now.
numba_cuda/numba/cuda/dispatcher.py (Outdated)
locals: dict, optional
    Mapping of local variable names to Numba types. Used to override
There is no locals parameter.
numba_cuda/numba/cuda/dispatcher.py (Outdated)
functools.update_wrapper(self, py_func)

self.targetoptions = targetoptions
self.locals = locals
It took me a while to figure out how this could possibly be working, and then I realised it is assigning the built-in locals function to self.locals. Although it doesn't seem to have disrupted the tests, this seems dangerous - can you remove the locals assignment please?
(locals is an option that's not supported by the CUDA target).
I removed locals from CUDADispatcher and _FunctionCompiler. We now pass locals={} to compiler.compile_extra, similarly to how we invoke compiler.compile_extra in numba/cuda/compiler.py.
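A minimal, self-contained sketch of the pitfall and the fix discussed above (the class names and constructor signatures here are illustrative, not the actual numba-cuda code):

```python
# Illustrative only: shows why writing `self.locals = locals` in a scope with
# no `locals` parameter silently captures Python's built-in locals() function.

class BuggyDispatcher:
    def __init__(self, py_func, targetoptions):
        self.targetoptions = targetoptions
        # No parameter named `locals` exists here, so this stores the builtin.
        self.locals = locals


class FixedDispatcher:
    def __init__(self, py_func, targetoptions):
        self.targetoptions = targetoptions
        # Fix adopted in the PR: no `locals` attribute on the dispatcher; an
        # explicit empty mapping (locals={}) is passed at the compile_extra
        # call site instead, since the CUDA target does not support locals.


print(callable(BuggyDispatcher(None, {}).locals))  # True: it is the builtin
```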
gmarkall left a comment
There is some stuff I think is not needed in this PR, but I think it will be easier to have a large-scale cleanup following a run of the tests with coverage reporting, after other PRs have gone in too.
The only thing that really needs to change here is the assignment of the CUDADispatcher's locals to the locals builtin function, which seems likely to have unintended and hard-to-diagnose consequences.
/ok to test b02356d
Force-pushed from b02356d to 9ed1d65
Force-pushed from 9ed1d65 to fbcd2d9
/ok to test fbcd2d9
- [NFC] FileCheck tests check all overloads (#354)
- [REVIEW][NFC] Vendor in serialize to allow for future CUDA-specific refactoring and changes (#349)
- Vendor in usecases used in testing (#359)
- Add thirdparty tests of numba extensions (#348)
- Support running tests in parallel (#350)
- Add more debuginfo tests (#358)
- [REVIEW][NFC] Vendor in the Cache, CacheImpl used by CUDACache and CUDACacheImpl to allow for future CUDA-specific refactoring and changes (#334)
- [NFC] Vendor in Dispatcher as CUDADispatcher to allow for future CUDA-specific customization (#338)
- Vendor in BaseNativeLowering and BaseLower for CUDA-specific customizations (#329)
- [REVIEW] Vendor in the CompilerBase used by CUDACompiler to allow for future CUDA-specific refactoring and changes (#322)
- Vendor in Codegen and CodeLibrary for CUDA-specific customization (#327)
- Disable tests that deadlock due to #317 (#356)
- FIX: Add type check for shape elements in DeviceNDArrayBase constructor (#352)
- Merge pull request #265 from lakshayg/fp16-support
- Add performance warning
- Fix tests
- Create and register low++ bindings for float16
- Create typing/target registries for float16
- Replace Numbast generated lower_casts
- Replace Numbast generated operators
- Alias __half to numba.core.types.float16
- Generate fp16 bindings using numbast
- Remove existing fp16 logic
- [REVIEW][NFC] Vendor in the utils and cgutils to allow for future CUDA-specific refactoring and changes (#340)
- [RFC,TESTING] Add filecheck test infrastructure (#342)
- Migrate test infra to pytest (#347)
- Add .vscode to gitignore (#344)
- [NFC] Add dev dependencies to project config (#341)
- Allow Inspection of Link-Time Optimized PTX (#326)
- [NFC] Vendor in DIBuilder used by CUDADIBuilder (#332)
- Add guidance on setting up pre-commit (#339)
- [Refactor][NFC] Vendor in MinimalCallConv (#333)
- [Refactor][NFC] Vendor in BaseCallConv (#324)
- [REVIEW] Vendor in CompileResult as CUDACompileResult to allow for future CUDA-specific customizations (#325)
We had initially removed the `locals` attribute from `_FunctionCompiler` while vendoring it in for CUDA-specific changes in #338. This appears to have caused an `AttributeError` in an nvmath-python test (https://github.com/NVIDIA/nvmath-python/blob/main/tests/nvmath_tests/device/test_cublasdx_numba.py). This PR restores the `locals` attribute and initializes it to `{}`.

---------

Co-authored-by: Graham Markall <gmarkall@nvidia.com>
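A rough sketch of the restoration, assuming a simplified constructor (the real `_FunctionCompiler` in `numba_cuda/numba/cuda/dispatcher.py` takes more arguments and holds more state than shown here):

```python
class _FunctionCompiler:
    def __init__(self, py_func, targetdescr, targetoptions):
        # Hypothetical, trimmed-down constructor for illustration only.
        self.py_func = py_func
        self.targetdescr = targetdescr
        self.targetoptions = targetoptions
        # Restored attribute: consumers such as the nvmath-python test above
        # read compiler.locals, so keep it present but empty, since the CUDA
        # target does not support the `locals` option.
        self.locals = {}
```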
This PR vendors in some necessary pieces of the `Dispatcher` class into `CUDADispatcher` for future CUDA-specific changes, making the latter inherit directly from `_DispatcherBase`. This change also vendors in supporting classes like `_MemoMixin` and `_FunctionCompiler` for future customizations. This is an NFC, so no additional unit tests are needed.
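A rough sketch of the resulting class layout (stub bases and elided bodies only; the exact base classes and their order are assumptions for illustration, not the vendored implementation):

```python
# Stand-ins for the real imports: _DispatcherBase lives in
# numba.core.dispatcher and ReduceMixin in the vendored serialize module.
class _DispatcherBase: ...
class ReduceMixin: ...

# Vendored support classes (bodies elided).
class _MemoMixin: ...
class _FunctionCompiler: ...

# Vendored Dispatcher, now defined alongside the CUDA target so it can be
# customized, built directly on _DispatcherBase.
class Dispatcher(_DispatcherBase, _MemoMixin): ...

# CUDADispatcher keeps its public shape but derives from the vendored copy.
class CUDADispatcher(Dispatcher, ReduceMixin): ...
```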