[Refactor][NFC] Vendor-in core Numba analysis utils for CUDA-specific changes #433

atmnp · 2025-08-27T21:01:25Z

There are a few core utilities from Numba that we use during analysis to compute use-def chains, live maps, dead maps, etc. This PR vendors those utilities into numba-cuda so that CUDA-specific changes can be made to them.
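For readers unfamiliar with the terms above, here is a minimal sketch of use-def, live-in, and dead-variable computation on a toy control-flow graph. It is not the vendored code: the block layout, the function names, and the statement encoding are all hypothetical and chosen only to illustrate the idea.

```python
# Toy CFG (hypothetical encoding): label -> (statements, successor labels).
# Each statement is (defined_variable_or_None, tuple_of_used_variables).
blocks = {
    0: ([("a", ()), ("b", ())], [1]),
    1: ([("c", ("a", "b"))], [2, 3]),
    2: ([("d", ("c",))], [3]),
    3: ([(None, ("a",))], []),
}


def compute_use_defs(blocks):
    """Per block: variables used before being defined, and variables defined."""
    usemap, defmap = {}, {}
    for label, (stmts, _) in blocks.items():
        uses, defs = set(), set()
        for target, operands in stmts:
            # Only uses not preceded by a definition in this block count.
            uses.update(v for v in operands if v not in defs)
            if target is not None:
                defs.add(target)
        usemap[label], defmap[label] = uses, defs
    return usemap, defmap


def live_out_of(label, blocks, live_in):
    """Variables live on exit: union of live-in sets of all successors."""
    return set().union(*(live_in[s] for s in blocks[label][1]))


def compute_live_in(blocks, usemap, defmap):
    """Backward dataflow to a fixed point: in = use | (out - def)."""
    live_in = {label: set() for label in blocks}
    changed = True
    while changed:
        changed = False
        for label in blocks:
            out = live_out_of(label, blocks, live_in)
            new_in = usemap[label] | (out - defmap[label])
            if new_in != live_in[label]:
                live_in[label] = new_in
                changed = True
    return live_in


def compute_dead(blocks, live_in, defmap):
    """Variables available in a block that no successor needs."""
    return {
        label: (live_in[label] | defmap[label])
        - live_out_of(label, blocks, live_in)
        for label in blocks
    }


usemap, defmap = compute_use_defs(blocks)
live_in = compute_live_in(blocks, usemap, defmap)
dead = compute_dead(blocks, live_in, defmap)
```

In this toy graph, `b` dies in block 1 because neither successor reads it, while `a` stays live until block 3.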

copy-pr-bot · 2025-08-27T21:01:28Z

Auto-sync is disabled for ready for review pull requests in this repository. Workflows must be run manually.

atmnp · 2025-08-27T21:01:33Z

/ok to test

atmnp · 2025-08-27T21:05:54Z

/ok to test

atmnp · 2025-08-27T21:42:10Z

/ok to test

atmnp · 2025-08-27T21:55:08Z

/ok to test

atmnp · 2025-08-27T23:56:05Z

/ok to test

atmnp · 2025-08-29T16:19:57Z

/ok to test

gmarkall · 2025-09-02T18:47:59Z

/ok to test

atmnp · 2025-09-02T19:07:11Z

/ok to test

atmnp · 2025-09-02T20:04:55Z

/ok to test

atmnp · 2025-09-02T21:17:39Z

/ok to test

gmarkall

I started reviewing this and ended up with a few comments on the diff. I wanted to post the review now so that they become visible, and also to enquire about the tests in test_analysis - these are using Numba's CPU target for the tests. Did you plan to modify them to use the CUDA target?

numba_cuda/numba/cuda/tests/support.py

numba_cuda/numba/cuda/tests/test_analysis.py

atmnp · 2025-09-23T19:46:33Z

/ok to test

atmnp · 2025-09-24T17:37:08Z

/ok to test

atmnp · 2025-09-25T15:31:51Z

The only tests currently failing in CI are on Python 3.9, and we intend to remove support for that Python version.

atmnp · 2025-10-03T18:00:00Z

/ok to test

atmnp · 2025-10-07T16:41:57Z

/ok to test

brandon-b-miller · 2025-10-07T18:29:21Z

numba_cuda/numba/cuda/core/byteflow.py

+# SPDX-License-Identifier: BSD-2-Clause
+
+"""
+Implement python 3.8+ bytecode analysis


I noticed that Numba main has some Python 3.14-specific handling that isn't included in the version of this file being vendored. Should we add it now to avoid having to do so later?

atmnp

We're pinned to v0.61. I'd prefer to avoid partially vendoring from main, both to keep the break clean and to make future rebases easier.

brandon-b-miller · 2025-10-07T18:32:40Z

numba_cuda/numba/cuda/core/controlflow.py

+        self._curblock.terminating = True
+        self._force_new_block = True
+
+    if PYVERSION in ((3, 12), (3, 13)):


Same question here about the 3.14 updates.

atmnp

Same as above.
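To make the point of the question concrete: a gate like the one in the quoted hunk compares a `(major, minor)` tuple against an explicit list, so a newer interpreter simply does not match it. The sketch below is illustrative only. `handler_for`, `GATED_VERSIONS`, and the returned labels are hypothetical names, and the fallback behaviour is an assumption, not what the vendored module does.

```python
import sys

# Assumption: PYVERSION is a (major, minor) tuple, as the quoted hunk suggests.
PYVERSION = sys.version_info[:2]

# The versions named in the quoted condition.
GATED_VERSIONS = ((3, 12), (3, 13))


def handler_for(pyversion):
    """Pick a (hypothetical) handler label for a given interpreter version."""
    if pyversion in GATED_VERSIONS:
        return "py312_313"
    elif pyversion < (3, 12):
        return "legacy"
    else:
        # A version newer than the gate, e.g. (3, 14), needs its own branch.
        raise NotImplementedError(
            f"no handler for Python {pyversion[0]}.{pyversion[1]}"
        )
```

Under this sketch, supporting 3.14 later means adding a branch or extending the tuple, which is the follow-up work the reply defers until the vendored baseline moves past v0.61.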

numba_cuda/numba/cuda/tests/test_flow_control.py

numba_cuda/numba/cuda/tests/test_analysis.py

brandon-b-miller

A few questions; otherwise LGTM.

atmnp · 2025-10-07T19:27:21Z

/ok to test

brandon-b-miller

LGTM

- Add support for cache-hinted load and store operations (NVIDIA#587)
- Add more thirdparty tests (NVIDIA#586)
- Add sphinx-lint to pre-commit and fix errors (NVIDIA#597)
- Add DWARF variant part support for polymorphic variables in CUDA debug info (NVIDIA#544)
- chore: clean up dead workaround for unavailable `lru_cache` (NVIDIA#598)
- chore(docs): format types docs (NVIDIA#596)
- refactor: decouple `Context` from `Stream` and `Event` objects (NVIDIA#579)
- Fix freezing in of constant arrays with negative strides (NVIDIA#589)
- Update tests to accept variants of generated PTX (NVIDIA#585)
- refactor: replace device functionality with `cuda.core` APIs (NVIDIA#581)
- Move frontend tests to `cudapy` namespace (NVIDIA#558)
- Generalize the concurrency group for main merges (NVIDIA#582)
- ci: move pre-commit checks to pre commit action (NVIDIA#577)
- chore(pixi): set up doc builds; remove most `build-conda` dependencies (NVIDIA#574)
- ci: ensure that python version in ci matches matrix (NVIDIA#575)
- Fix the `cuda.is_supported_version()` API (NVIDIA#571)
- Fix checks on main (NVIDIA#576)
- feat: add `math.nextafter` (NVIDIA#543)
- ci: replace conda testing with pixi (NVIDIA#554)
- [CI] Run PR workflow on merge to main (NVIDIA#572)
- Propose Alternative Module Path for `ext_types` and Maintain `numba.cuda.types.bfloat16` Import API (NVIDIA#569)
- test: enable fail-on-warn and clean up resulting failures (NVIDIA#529)
- [Refactor][NFC] Vendor-in compiler_lock for future CUDA-specific changes (NVIDIA#565)
- Fix registration with Numba, vendor MakeFunctionToJITFunction tests (NVIDIA#566)
- [Refactor][NFC][Cleanups] Update imports to upstream numba to use the numba.cuda modules (NVIDIA#561)
- test: refactor process-based tests to use concurrent futures in order to simplify tests (NVIDIA#550)
- test: revert back to ipc futures that await each iteration (NVIDIA#564)
- chore(deps): move to self-contained pixi.toml to avoid mixed-pypi-pixi environments (NVIDIA#551)
- [Refactor][NFC] Vendor-in errors for future CUDA-specific changes (NVIDIA#534)
- Remove dependencies on target_extension for CUDA target (NVIDIA#555)
- Relax the pinning to `cuda-core` to allow it floating across minor releases (NVIDIA#559)
- [WIP] Port numpy reduction tests to CUDA (NVIDIA#523)
- ci: add timeout to avoid blocking the job queue (NVIDIA#556)
- Handle `cuda.core.Stream` in driver operations (NVIDIA#401)
- feat: add support for `math.exp2` (NVIDIA#541)
- Vendor in types and datamodel for CUDA-specific changes (NVIDIA#533)
- refactor: cleanup device constructor (NVIDIA#548)
- bench: add cupy to array constructor kernel launch benchmarks (NVIDIA#547)
- perf: cache dimension computations (NVIDIA#542)
- perf: remove duplicated size computation (NVIDIA#537)
- chore(perf): add torch to benchmark (NVIDIA#539)
- test: speed up ipc tests by ~6.5x (NVIDIA#527)
- perf: speed up kernel launch (NVIDIA#510)
- perf: remove context threading in various pointer abstractions (NVIDIA#536)
- perf: reduce the number of `__cuda_array_interface__` accesses (NVIDIA#538)
- refactor: remove unnecessary custom map and set implementations (NVIDIA#530)
- [Refactor][NFC] Vendor-in vectorize decorators for future CUDA-specific changes (NVIDIA#513)
- test: add benchmarks for kernel launch for reproducibility (NVIDIA#528)
- test(pixi): update pixi testing command to work with the new `testing` directory (NVIDIA#522)
- refactor: fully remove `USE_NV_BINDING` (NVIDIA#525)
- Draft: Vendor in the IR module (NVIDIA#439)
- pyproject.toml: add search path for Pyrefly (NVIDIA#524)
- Vendor in numba.core.typing for CUDA-specific changes (NVIDIA#473)
- Use numba.config when available, otherwise use numba.cuda.config (NVIDIA#497)
- [MNT] Drop NUMBA_CUDA_USE_NVIDIA_BINDING; always use cuda.core and cuda.bindings as fallback (NVIDIA#479)
- Vendor in dispatcher, entrypoints, pretty_annotate for CUDA-specific changes (NVIDIA#502)
- build: allow parallelization of nvcc testing builds (NVIDIA#521)
- chore(dev-deps): add pixi (NVIDIA#505)
- Vendor the imputils module for CUDA refactoring (NVIDIA#448)
- Don't use `MemoryLeakMixin` for tests that don't use NRT (NVIDIA#519)
- Switch back to stable cuDF release in thirdparty tests (NVIDIA#518)
- Updating .gitignore with binaries in the `testing` folder (NVIDIA#516)
- Remove some unnecessary uses of ContextResettingTestCase (NVIDIA#507)
- Vendor in _helperlib cext for CUDA-specific changes (NVIDIA#512)
- Vendor in typeconv for future CUDA-specific changes (NVIDIA#499)
- [Refactor][NFC] Vendor-in numba.cpython modules for future CUDA-specific changes (NVIDIA#493)
- [Refactor][NFC] Vendor-in numba.np modules for future CUDA-specific changes (NVIDIA#494)
- Make the CUDA target the default for CUDA overload decorators (NVIDIA#511)
- Remove C extension loading hacks (NVIDIA#506)
- Ensure NUMBA can manipulate memory from CUDA graphs before the graph is launched (NVIDIA#437)
- [Refactor][NFC] Vendor-in core Numba analysis utils for CUDA-specific changes (NVIDIA#433)
- Fix Bf16 Test OB Error (NVIDIA#509)
- Vendor in components from numba.core.runtime for CUDA-specific changes (NVIDIA#498)
- [Refactor] Vendor in _dispatcher, _devicearray, mviewbuf C extension for CUDA-specific customization (NVIDIA#373)
- [MNT] Managed UM memset fallback and skip CUDA IPC tests on WSL2 (NVIDIA#488)
- Improve debug value range coverage (NVIDIA#461)
- Add `compile_all` API (NVIDIA#484)
- Vendor in core.registry for CUDA-specific changes (NVIDIA#485)
- [Refactor][NFC] Vendor in numba.misc for CUDA-specific changes (NVIDIA#457)
- Vendor in optional, boxing for CUDA-specific changes, fix dangling imports (NVIDIA#476)
- [test] Remove dependency on cpu_target (NVIDIA#490)
- Change dangling imports of numba.core.lowering to numba.cuda.lowering (NVIDIA#475)
- [test] Use numpy's tolerance for float16 (NVIDIA#491)
- [Refactor][NFC] Vendor-in numba.extending for future CUDA-specific changes (NVIDIA#466)
- [Refactor][NFC] Vendor-in more cpython registries for future CUDA-specific changes (NVIDIA#478)

atmnp self-assigned this Aug 27, 2025

atmnp added the "3 - Ready for Review" label Aug 27, 2025

atmnp added the "4 - Waiting on author" label and removed the "3 - Ready for Review" label Aug 28, 2025

atmnp added the "3 - Ready for Review" label and removed the "4 - Waiting on author" label Sep 2, 2025

gmarkall reviewed Sep 4, 2025

gmarkall added the "4 - Waiting on author" label and removed the "3 - Ready for Review" label Sep 4, 2025

atmnp added 10 commits September 23, 2025 08:45

[Refactor][NFC] Vendor-in core Numba analysis utils for CUDA-specific changes

67759e9

[Refactor][NFC] Vendor-in core Numba byteflow utils for CUDA-specific changes

2daa3bf

add python 3.9 support back, there's no functional change that needs to be made

e8ade49

fix dangling import to upstream utils

c4b09f1

fix import of unsupported bytecode error

fcf02fc

add support back for python 3.9

49615ee

adds tests

48adce8

update tests to target cuda

4f0e444

revert tests to run on python 3.9

510c8b4

don't run analysis pass tests in sim + remove phi tests that test return-related func

9d45d13

atmnp added 2 commits September 23, 2025 08:56

remove unsupported/untested tests

93ade5c

port tests to gpu

516d306

atmnp force-pushed the atmn/vendor-in-analysis branch from cfeca44 to 516d306 on September 23, 2025 19:44

atmnp requested review from brandon-b-miller and removed request for brandon-b-miller September 23, 2025 21:25

Merge branch 'main' into atmn/vendor-in-analysis

34dbcd7

atmnp added the "3 - Ready for Review" label and removed the "4 - Waiting on author" label Sep 25, 2025

atmnp requested a review from brandon-b-miller September 25, 2025 15:32

atmnp added 2 commits October 3, 2025 10:59

Merge branch 'main' into atmn/vendor-in-analysis

3cd6e3a

Merge branch 'main' into atmn/vendor-in-analysis

a0c1c47

Merge branch 'main' into atmn/vendor-in-analysis

349b43a

brandon-b-miller reviewed Oct 7, 2025

numba_cuda/numba/cuda/tests/test_flow_control.py (outdated, resolved)

brandon-b-miller reviewed Oct 7, 2025

numba_cuda/numba/cuda/tests/test_analysis.py (outdated, resolved)

brandon-b-miller previously approved these changes Oct 7, 2025

xfail two tests, add global skips on analysis tests for cudasim

877684c

atmnp dismissed brandon-b-miller’s stale review via 877684c October 7, 2025 19:26

brandon-b-miller approved these changes Oct 7, 2025

atmnp merged commit 3468e9a into NVIDIA:main Oct 7, 2025
76 checks passed

gmarkall mentioned this pull request Nov 20, 2025

Bump version to 0.21.0 #602

Merged

3 participants