Vendor in BaseNativeLowering and BaseLower for CUDA-specific customizations #329

Merged
gmarkall merged 5 commits into NVIDIA:main from VijayKandiah:vk/cuda_native_lowering on Jul 29, 2025

Conversation

@VijayKandiah
Contributor

This PR vendors in BaseNativeLowering, to be inherited by CUDANativeLowering, and BaseLower and Lower, to be inherited by CUDALower. This is a refactoring change that allows for future CUDA-specific customizations. No new unit tests are needed.
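A minimal sketch of the inheritance pattern this refactor enables is below; the class bodies and hooks are illustrative assumptions, not numba-cuda's actual implementation:

```python
# Illustrative sketch only: toy stand-ins for the vendored classes, not
# numba-cuda's real module layout or APIs.

class BaseLower:
    """Vendored copy of the upstream lowering base class."""

    def pre_lower(self):
        # Shared hook that upstream Numba runs before lowering a function.
        pass


class CUDALower(BaseLower):
    """CUDA target lowering, now free to diverge from upstream Numba."""

    def pre_lower(self):
        super().pre_lower()
        # Future CUDA-specific customizations slot in here.
```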

@copy-pr-bot

copy-pr-bot bot commented Jul 17, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@VijayKandiah
Contributor Author

/ok to test 4cc0d01

gmarkall added the '2 - In Progress' (Currently a work in progress) label on Jul 18, 2025
@gmarkall
Contributor

/ok to test 4cc0d01

VijayKandiah force-pushed the vk/cuda_native_lowering branch from 4cc0d01 to bb7163b on July 24, 2025 at 16:42
@VijayKandiah
Contributor Author

/ok to test bb7163b

@gmarkall
Contributor

I think this is probably making Numba-CUDA incompatible with Numba 0.60, which is why the CUDA 11 / Python 3.9 tests are failing (Numba 0.61 is not available for Python 3.9).

gmarkall added the '4 - Waiting on author' (Waiting for author to respond to review) label and removed the '2 - In Progress' (Currently a work in progress) label on Jul 25, 2025
@copy-pr-bot

copy-pr-bot bot commented Jul 25, 2025

Auto-sync is disabled for ready for review pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@VijayKandiah
Contributor Author

/ok to test b412a4e

@VijayKandiah
Contributor Author

This PR also raises the minimum Numba version required to 0.60.0 in pyproject.toml:
`dependencies = ["numba>=0.60.0"]`

gmarkall added the '4 - Waiting on reviewer' (Waiting for reviewer to respond to author) label and removed the '4 - Waiting on author' (Waiting for author to respond to review) label on Jul 28, 2025
Comment on lines 36 to 37:

```python
numba_version = get_versions()["version"].split(".")
numba_minor_version = int(numba_version[1])
```
Contributor

For future reference, we can get the version info with numba.version_info.major, numba.version_info.minor, etc.
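A minimal sketch of that suggestion, using only the numba.version_info fields named above:

```python
import numba

# version_info exposes the parsed Numba version, avoiding manual string
# splitting of get_versions()["version"].
if (numba.version_info.major, numba.version_info.minor) >= (0, 61):
    pass  # code paths that need Numba 0.61+
else:
    pass  # fallbacks that keep Numba 0.60 supported
```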

@VijayKandiah
Contributor Author

/ok to test c11c73d

VijayKandiah force-pushed the vk/cuda_native_lowering branch from c11c73d to 9ed1d65 on July 28, 2025 at 22:07
@VijayKandiah
Contributor Author

/ok to test 9ed1d65

gmarkall merged commit 2a46811 into NVIDIA:main on Jul 29, 2025
39 checks passed
gmarkall added a commit to atmnp/numba-cuda that referenced this pull request Jul 29, 2025
…nager

There were conflicts in:

- `numba_cuda/numba/cuda/compiler.py`
- `numba_cuda/numba/cuda/core/typed_passes.py`

There was some overlap with the changes in NVIDIA#329, which I tried to
resolve.

Now a couple of debug tests are failing.
VijayKandiah self-assigned this on Jul 29, 2025
gmarkall added a commit to gmarkall/numba-cuda that referenced this pull request Jul 31, 2025
- [NFC] FileCheck tests check all overloads (NVIDIA#354)
- [REVIEW][NFC] Vendor in serialize to allow for future CUDA-specific refactoring and changes (NVIDIA#349)
- Vendor in usecases used in testing (NVIDIA#359)
- Add thirdparty tests of numba extensions (NVIDIA#348)
- Support running tests in parallel (NVIDIA#350)
- Add more debuginfo tests (NVIDIA#358)
- [REVIEW][NFC] Vendor in the Cache, CacheImpl used by CUDACache and CUDACacheImpl to allow for future CUDA-specific refactoring and changes (NVIDIA#334)
- [NFC] Vendor in Dispatcher as CUDADispatcher to allow for future CUDA-specific customization (NVIDIA#338)
- Vendor in BaseNativeLowering and BaseLower for CUDA-specific customizations (NVIDIA#329)
- [REVIEW] Vendor in the CompilerBase used by CUDACompiler to allow for future CUDA-specific refactoring and changes (NVIDIA#322)
- Vendor in Codegen and CodeLibrary for CUDA-specific customization (NVIDIA#327)
- Disable tests that deadlock due to NVIDIA#317 (NVIDIA#356)
- FIX: Add type check for shape elements in DeviceNDArrayBase constructor (NVIDIA#352)
- Merge pull request NVIDIA#265 from lakshayg/fp16-support
- Add performance warning
- Fix tests
- Create and register low++ bindings for float16
- Create typing/target registries for float16
- Replace Numbast generated lower_casts
- Replace Numbast generated operators
- Alias __half to numba.core.types.float16
- Generate fp16 bindings using numbast
- Remove existing fp16 logic
- [REVIEW][NFC] Vendor in the utils and cgutils to allow for future CUDA-specific refactoring and changes (NVIDIA#340)
- [RFC,TESTING] Add filecheck test infrastructure (NVIDIA#342)
- Migrate test infra to pytest (NVIDIA#347)
- Add .vscode to gitignore (NVIDIA#344)
- [NFC] Add dev dependencies to project config (NVIDIA#341)
- Allow Inspection of Link-Time Optimized PTX (NVIDIA#326)
- [NFC] Vendor in DIBuilder used by CUDADIBuilder (NVIDIA#332)
- Add guidance on setting up pre-commit (NVIDIA#339)
- [Refactor][NFC] Vendor in MinimalCallConv (NVIDIA#333)
- [Refactor][NFC] Vendor in BaseCallConv (NVIDIA#324)
- [REVIEW] Vendor in CompileResult as CUDACompileResult to allow for future CUDA-specific customizations (NVIDIA#325)
gmarkall mentioned this pull request on Jul 31, 2025
gmarkall added a commit that referenced this pull request Jul 31, 2025