Fix #624: Accept Numba IR nodes in all places Numba-CUDA IR nodes are expected #643
Conversation
Some isinstance checks in `ir.py` had previously been modified to accept Numba IR nodes, which can be generated when transformations in Numba (or other code outside Numba-CUDA) produce IR. Those specific locations turned out to be insufficient, leading to NVIDIA#624. To ensure that Numba IR nodes are accepted in general, all isinstance checks are updated to accept either the Numba-CUDA type or the Numba type. Rather than implementing a lot of switching in the implementations, the sets of acceptable IR nodes are defined at module global scope based on the presence of Numba, and the instance checks refer to these globals (a rough sketch of the pattern follows below). Note that:

- Checks on function types are not modified, as Numba and Numba-CUDA share function types when co-installed.
- Some checks also accepted `None`; these have been slightly modified so that the assertion is only applied when the variable is not `None`.
- Other checks, e.g. for `int` and `str` types, require no modification.
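A rough sketch of that module-global pattern (the `arg_types` name follows this PR's naming scheme; the stand-in `Arg` class and `check_arg` helper are illustrative only, not the actual code in `ir.py`):

```python
# Minimal sketch of the pattern described above, not the code from this PR:
# the tuple of acceptable node classes is built once at module scope,
# depending on whether Numba is importable, and instance checks use it.
try:
    import numba.core.ir as _numba_ir
except ImportError:
    _numba_ir = None


class Arg:
    """Stand-in for the Numba-CUDA IR Arg node defined in ir.py."""


# Accept the Numba node class as well when Numba is co-installed.
arg_types = (Arg,) if _numba_ir is None else (Arg, _numba_ir.Arg)


def check_arg(node):
    # Checks that previously also accepted None now only assert when a
    # value is actually present.
    if node is not None:
        assert isinstance(node, arg_types)
```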
Auditing script generated with
```
grep "^class [A-Z]" numba_cuda/numba/cuda/core/ir.py | awk '{print $2}' | awk 'BEGIN { FS="(" } { print "grep --color \"ir\\." $1 "\" numba_cuda -R" }' > audit.sh
```
and run with
```
bash audit.sh | grep -v "= ir\.[A-Z]" | sort
```
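For example, for the `Var` node class the generated `audit.sh` contains a line like `grep --color "ir\.Var" numba_cuda -R`; the `grep -v "= ir\.[A-Z]"` filter then drops lines that construct or assign nodes, leaving the remaining uses (such as instance checks) to review.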
Greptile Summary

This PR fixes issue #624 by enabling Numba-CUDA to accept IR nodes from both `numba.core.ir` and `numba.cuda.core.ir`.

Key Changes

Implementation Quality

The changes are highly systematic and comprehensive. The script-based approach with manual verification ensures consistency. The normalization logic in …

Confidence Score: 5/5

Important Files Changed
Additional Comments (4)

- numba_cuda/numba/cuda/core/ir.py, line 571: logic: inconsistent type checking - uses `Var` directly instead of `var_types`
- numba_cuda/numba/cuda/core/ir.py, line 795: logic: inconsistent type checking - uses `Var` directly instead of `var_types`
- numba_cuda/numba/cuda/core/ir.py, line 877: logic: inconsistent type checking - uses `Var` directly instead of `var_types`
- numba_cuda/numba/cuda/core/analysis.py, lines 46-51: style: the multiple `isinstance` checks could be simplified using the same tuple concatenation pattern used elsewhere in the file (a generic sketch of that pattern follows after this list). Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
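A generic sketch of the tuple concatenation idea referred to above (purely illustrative; `const_types` and the stub classes below are hypothetical stand-ins, not the actual contents of `analysis.py`):

```python
# Illustrative only: instead of chaining several isinstance checks, the
# per-node type tuples can be concatenated and checked in one call.
class Var: ...
class Const: ...

var_types = (Var,)      # in ir.py these tuples also include the Numba classes
const_types = (Const,)  # hypothetical second tuple, for illustration


def is_var_or_const(node):
    # Before: isinstance(node, Var) or isinstance(node, Const)
    return isinstance(node, var_types + const_types)
```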
19 files reviewed, 4 comments
Also use the Numba/Numba-CUDA types within ir.py
Additional Comments (2)

- numba_cuda/numba/cuda/serialize.py, line 73: logic: This file imports `ir` from `llvmlite` (line 16), not from `numba.cuda.core`. The `llvmlite.ir` module has a `Value` class but does not have a `value_types` tuple. This will cause `AttributeError: module 'llvmlite.ir' has no attribute 'value_types'` at runtime. This change should be reverted since it's operating on a different `ir` module than the other files in this PR.
- numba_cuda/numba/cuda/core/postproc.py, line 10: logic: `block_types` is not defined in `ir.py`. The file only defines `popblock_types` for the `PopBlock` class. This will cause `AttributeError: module 'numba.cuda.core.ir' has no attribute 'block_types'`. You need to either:
  - Add `block_types = (Block, numba.core.ir.Block)` to `ir.py`, or
  - Keep the original `ir.Block` check here
21 files reviewed, 2 comments
replace_checks.py (Outdated)
Was this intended to be checked in?
It's not intended to be in the code that gets merged. I did have it checked in whilst I was working on the branch but it needs removing.
@gmarkall changes LGTM other than checking in the script to automate the replacement
- In `serialize`, the `ir` module is from llvmlite, so it should not have been altered.
- I never added `block_types` as I mistakenly thought `ir.Block` was never instance-checked in the code base. There is a usage of it like this. It may be dead code in Numba-CUDA, but I am adding `block_types` just in case it is not.
20 files reviewed, 3 comments
rparolin left a comment
lgtm, as long as the script that @kkraus14 highlighted is removed before submitting.
v0.23.0

- Capture global device arrays in kernels and device functions (NVIDIA#666)
- Fix NVIDIA#624: Accept Numba IR nodes in all places Numba-CUDA IR nodes are expected (NVIDIA#643)
- Fix Issue NVIDIA#588: separate compilation of NVVM IR modules when generating debuginfo (NVIDIA#591)
- feat: allow printing nested tuples (NVIDIA#667)
- build(deps): bump actions/setup-python from 5.6.0 to 6.1.0 (NVIDIA#655)
- build(deps): bump actions/upload-artifact from 4 to 5 (NVIDIA#652)
- Test RAPIDS 25.12 (NVIDIA#661)
- Do not manually set DUMP_ASSEMBLY in `nvjitlink` tests (NVIDIA#662)
- feat: add print support for int64 tuples (NVIDIA#663)
- Only run dependabot monthly and open fewer PRs (NVIDIA#658)
- test: fix bogus `self` argument to `Context` (NVIDIA#656)
- Fix false negative NRT link decision when NRT was previously toggled on (NVIDIA#650)
- Add support for dependabot (NVIDIA#647)
- refactor: cull dead linker objects (NVIDIA#649)
- Migrate numba-cuda driver to use cuda.core.launch API (NVIDIA#609)
- feat: add set_shared_memory_carveout (NVIDIA#629)
- chore: bump version in pixi.toml (NVIDIA#641)
- refactor: remove devicearray code to reduce complexity (NVIDIA#600)
This PR attempts to fix Issue #624. The root cause there is that Numba's inline pass is used when `numba.extending.overload(inline="always")` is used, and it creates IR nodes from `numba.core.ir` instead of `numba.cuda.core.ir`, which are not recognised by instance checks in Numba-CUDA.

Rather than redirecting the IR module to Numba, this PR aims to modify all instance checks to accept both Numba and Numba-CUDA IR nodes, allowing flexibility in editing the `numba.cuda.core.ir` module while co-existing with uses of the `numba.core.ir` module.

These changes are mostly performed by a script, which replaces uses like `isinstance(<thing>, ir.<Classname>)` with `isinstance(<thing>, ir.<classname>_types)`. The `<classname>_types` tuples are initialized based on whether Numba is present - if it is, then e.g. `(Arg, numba.core.ir.Arg)` would be in `arg_types`, and similarly for all the other node types.

The script did not change everything - some manual edits were made after the automated changes; the remaining sites were detected using the auditing commands shown earlier.
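A small sketch of the substitution on a single check, assuming numba-cuda is installed so that `numba.cuda.core.ir` and its `var_types` tuple are importable (the real diff touches many node classes, and `targets_a_variable` is an illustrative helper, not code from the PR):

```python
from numba.cuda.core import ir


def targets_a_variable(node):
    # Before this PR the check named the Numba-CUDA class directly:
    #     isinstance(node, ir.Var)
    # After this PR the tuple also admits numba.core.ir.Var when Numba is
    # co-installed, so IR emitted by Numba's inline pass passes the check.
    return isinstance(node, ir.var_types)
```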
The bash script above can also be used to check all the remaining uses of the nodes, to ensure that none that need changing were missed.
A test based on the reproducer in #624 is also added. The test did not work initially - this is because the test case adds an implementation of `get_42` to Numba's builtin registry, which was ignored by the CUDA target. The CUDA target would normally install registrations from the Numba builtin registry, but skips those that appear to be internal, as determined by the module name beginning with `"numba."`. Implementations defined in test code should not be considered internal (they provide a model of the use of Numba-CUDA from outside), so the `is_external()` method is updated to consider implementations in the `numba.cuda.tests` module to be external.
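For reference, a hedged reconstruction of the kind of reproducer described above, based only on this description - the actual test added in the PR may differ, and `get_42`, `ol_get_42`, and the kernel below are illustrative:

```python
# Illustrative reproducer pattern for #624: an @overload with inline="always"
# routes calls through Numba's inline pass, which emits numba.core.ir nodes -
# the nodes this PR teaches Numba-CUDA's instance checks to accept.
# Running this requires a CUDA-capable GPU.
import numpy as np
from numba import cuda
from numba.extending import overload


def get_42():
    pass  # placeholder; the implementation is supplied by the overload below


@overload(get_42, inline="always")
def ol_get_42():
    def impl():
        return 42
    return impl


@cuda.jit
def kernel(out):
    out[0] = get_42()


out = np.zeros(1, dtype=np.int64)
kernel[1, 1](out)
assert out[0] == 42
```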