-
Notifications
You must be signed in to change notification settings - Fork 54
Implement alignment support for local and shared arrays. #143
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
24a02ef to
b44dc91
Compare
|
I've removed the dependency on altering the underlying types.Array in numba, it wasn't necessary, as @gmarkall pointed out. |
gmarkall
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR! I think this is a good start with some functionality working as expected - however there are some other cases to cover and a few observations on the diff. We might need another iteration afterwards once things have shaped up a bit (and the docs might need an update if they didn't get generated from the source, I will have to check).
5703a87 to
833a9ce
Compare
gmarkall
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the fixups! I think there are a couple of changes that are needed:
- The
ptx_lmem_alloc_array()function needs deleting now as it is duplicated - I think there's still not a test of passing an invalid type for the alignment (e.g. 1.0, or "1") - it would be good to check we correctly error out rather than silently doing the wrong thing.
The other comments are thoughts / informational.
|
@gmarkall added some invalid type alignment tests, as well as tweaking the tests to use a common set of DTYPES that also include a bunch of record types (with and without alignment). |
384a505 to
9eaca8b
Compare
|
Hi @gmarkall, I believe this one is ready to go with all requested changes made. |
|
These changes conflicts with #176 , but these changes are more important for nvmath-python. I'll fix my branch after this one is merged. |
gmarkall
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the updates - there are a couple of minor questions on the added test - I don't think they necessarily need to be addressed for this to be merged, so just let me know what you want to do (i.e. merge as-is or modify based on the comments).
Remove erroneous alignment= kwarg to types.Array(). Cosmetic: fix docstring typo. PR Feedback: Clarify pointer size. We don't support 32-bit x86. PR Feedback: Improve alignment handling in Cuda_array_decl. Co-authored-by: Graham Markall <gmarkall@nvidia.com> PR Feedback: Improve alignment handling in cudaimpl. - Reduce fiddly boilerplate code in each array helper routine with a single call to `_try_extract_and_validate_alignment`. - Simplify the decorators using `types.BaseTuple` where possible. Add missing `cuda_local_array_tuple` implementation. This ensures multi-dim shapes can be handled by `cuda.local.array`. PR Feedback: Improve comment. Co-authored-by: Graham Markall <gmarkall@nvidia.com> PR Feedback: Improve alignment tests. - Verify the align attribute in the LLVM IR. - Add multi-dimensional tests. - Remove dead code. PR Feedback: Remove ptx_lmem_alloc_array. This functionality is now provided by cuda_local_array_tuple, whose name better fits with the other three cuda_(local|shared)_array_(tuple|integer) routines. COSMETIC: Relocate `test_invalid_alignments()`. No code changes are in this commit. I'm relocating the function in anticipation of some refactoring in the next commit. It makes sense to have the `_do_test()` implementation come immediately after the three test functions that use it (`test_array_alignment_[123]d()`). PR Feedback: Add tests for invalid alignment types. Add some record dtypes to the alignment tests. PR Feedback: Improve _try_extract_and_validate_alignment() docstring. Fix test failures on CI. CI was showing error messages containing ANSI color codes, e.g.: RequireLiteralValue: \x1b[1malignment must be a constant integer\x1b[0m\x1b[0m\n
Fixes were easy, all good on my side for the merge! |
This PR adds support for specifying an
alignment=Nkeyword argument to thecuda.local.array()andcuda.shared.array()helpers that can be used within JIT'd CUDA kernels (i.e. functions annotated with@numba.cuda.jit.