Conversation

@tpn tpn commented Mar 3, 2025

This PR adds support for specifying an alignment=N keyword argument to the cuda.local.array() and cuda.shared.array() helpers that can be used within JIT'd CUDA kernels (i.e. functions annotated with @numba.cuda.jit).
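As a rough illustration of what the new keyword looks like from the user's side, here is a minimal sketch. The kernel, the array sizes, and the 16-byte alignment value are all made up for this example and do not come from the PR's tests. The helpers `cuda_ready()` and `run_demo()` are hypothetical names, and the sketch only launches the kernel when numba's CUDA target and a GPU are actually available.

```python
import numpy as np

try:
    from numba import cuda, float32
except ImportError:  # numba / numba-cuda is not installed in this environment
    cuda = None

N = 128  # threads per block and shared-array length (illustrative value)


def cuda_ready():
    """Return True only if numba's CUDA target and a GPU are usable."""
    return cuda is not None and cuda.is_available()


def run_demo():
    """Stage values through aligned local and shared arrays, then copy out."""

    @cuda.jit
    def kernel(out):
        # alignment= is the keyword described in this PR; 16 bytes is an
        # arbitrary choice for the sketch, not a recommended value.
        tile = cuda.shared.array(N, float32, alignment=16)
        scratch = cuda.local.array(4, float32, alignment=16)

        i = cuda.threadIdx.x
        scratch[0] = i * 2.0
        tile[i] = scratch[0]
        cuda.syncthreads()
        out[i] = tile[i]

    out = np.zeros(N, dtype=np.float32)
    kernel[1, N](out)  # one block of N threads
    return out


if cuda_ready():
    print(run_demo()[:4])
```

Omitting `alignment=` should leave the existing behaviour of both helpers unchanged, since the argument is described as optional.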

@tpn tpn force-pushed the 140-array-alignment branch from 721f88c to 12ea962 Compare March 3, 2025 03:09
@tpn tpn force-pushed the 140-array-alignment branch 3 times, most recently from 24a02ef to b44dc91 Compare March 3, 2025 21:55

tpn commented Mar 3, 2025

I've removed the dependency on altering the underlying types.Array in numba; it wasn't necessary, as @gmarkall pointed out.

@gmarkall gmarkall added the "2 - In Progress" (Currently a work in progress) label Mar 4, 2025
@tpn tpn added this to CCCL Mar 5, 2025
@github-project-automation github-project-automation bot moved this to Todo in CCCL Mar 5, 2025
@tpn tpn force-pushed the 140-array-alignment branch from b44dc91 to 84d10c0 Compare March 6, 2025 19:55
@gmarkall gmarkall added the "4 - Waiting on author" (Waiting for author to respond to review) label and removed the "2 - In Progress" (Currently a work in progress) label Mar 7, 2025
@tpn tpn force-pushed the 140-array-alignment branch from 84d10c0 to 96e0d81 Compare March 9, 2025 23:26
@gmarkall gmarkall added the "4 - Waiting on reviewer" (Waiting for reviewer to respond to author) label and removed the "4 - Waiting on author" (Waiting for author to respond to review) label Mar 10, 2025

@gmarkall gmarkall left a comment

Thanks for the PR! I think this is a good start with some functionality working as expected - however there are some other cases to cover and a few observations on the diff. We might need another iteration afterwards once things have shaped up a bit (and the docs might need an update if they didn't get generated from the source, I will have to check).

@github-project-automation github-project-automation bot moved this from Todo to In Progress in CCCL Mar 18, 2025
@gmarkall gmarkall added the "4 - Waiting on author" (Waiting for author to respond to review) label and removed the "4 - Waiting on reviewer" (Waiting for reviewer to respond to author) label Mar 18, 2025
@tpn tpn force-pushed the 140-array-alignment branch from 5703a87 to 833a9ce Compare March 21, 2025 17:28

@gmarkall gmarkall left a comment

Thanks for the fixups! I think there are a couple of changes that are needed:

  • The ptx_lmem_alloc_array() function needs deleting now as it is duplicated
  • I think there's still not a test of passing an invalid type for the alignment (e.g. 1.0, or "1") - it would be good to check we correctly error out rather than silently doing the wrong thing.

The other comments are thoughts / informational.
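To make the invalid-type case concrete, here is a minimal pure-Python sketch of the kind of check being asked for. The function name `validate_alignment` is hypothetical and this is not the code in the PR. The power-of-two requirement is also an assumption made for the sketch, since the conversation only mentions rejecting values such as 1.0 or "1".

```python
def validate_alignment(alignment):
    """Hypothetical check: accept None or a positive power-of-two int.

    Raises TypeError for non-integer values (e.g. 1.0 or "1") and
    ValueError for integers that are not positive powers of two
    (the power-of-two rule is an assumption of this sketch).
    """
    if alignment is None:
        return None  # no explicit alignment requested

    # bool is a subclass of int, so reject it explicitly.
    if isinstance(alignment, bool) or not isinstance(alignment, int):
        raise TypeError(
            f"alignment must be an integer, got {type(alignment).__name__}"
        )

    # A positive power of two has exactly one bit set.
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError(
            f"alignment must be a positive power of two, got {alignment}"
        )

    return alignment


for bad in (1.0, "1", 3, 0):
    try:
        validate_alignment(bad)
    except (TypeError, ValueError) as exc:
        print(f"{bad!r}: {type(exc).__name__}: {exc}")
```

Raising early like this is what turns "silently doing the wrong thing" into a visible error at the call site.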

tpn commented Mar 24, 2025

@gmarkall, I've added some tests for invalid alignment types, and tweaked the tests to use a common set of DTYPES that also includes a bunch of record types (with and without alignment).

@gmarkall gmarkall added the "4 - Waiting on reviewer" (Waiting for reviewer to respond to author) label and removed the "4 - Waiting on author" (Waiting for author to respond to review) label Mar 24, 2025
@tpn tpn force-pushed the 140-array-alignment branch from 8fd0b1a to c33931d Compare April 1, 2025 22:26
@tpn tpn requested a review from gmarkall April 1, 2025 22:26
@tpn tpn force-pushed the 140-array-alignment branch from c33931d to fa9ecb9 Compare April 2, 2025 23:30
@tpn tpn force-pushed the 140-array-alignment branch 2 times, most recently from 384a505 to 9eaca8b Compare April 29, 2025 00:15

tpn commented Apr 29, 2025

Hi @gmarkall, I believe this one is ready to go with all requested changes made.

ZzEeKkAa commented Apr 30, 2025

These changes conflict with #176, but they are more important for nvmath-python. I'll fix my branch after this one is merged.

@gmarkall gmarkall left a comment

Thanks for the updates - there are a couple of minor questions on the added test - I don't think they necessarily need to be addressed for this to be merged, so just let me know what you want to do (i.e. merge as-is or modify based on the comments).

@gmarkall gmarkall added the "4 - Waiting on author" (Waiting for author to respond to review) label and removed the "4 - Waiting on reviewer" (Waiting for reviewer to respond to author) label May 1, 2025
tpn added 2 commits May 5, 2025 14:59
Remove erroneous alignment= kwarg to types.Array().

Cosmetic: fix docstring typo.

PR Feedback: Clarify pointer size.

We don't support 32-bit x86.

PR Feedback: Improve alignment handling in Cuda_array_decl.

Co-authored-by: Graham Markall <gmarkall@nvidia.com>

PR Feedback: Improve alignment handling in cudaimpl.

 - Reduce fiddly boilerplate code in each array helper routine with a
   single call to `_try_extract_and_validate_alignment`.

 - Simplify the decorators using `types.BaseTuple` where possible.

Add missing `cuda_local_array_tuple` implementation.

This ensures multi-dim shapes can be handled by `cuda.local.array`.

PR Feedback: Improve comment.

Co-authored-by: Graham Markall <gmarkall@nvidia.com>

PR Feedback: Improve alignment tests.

 - Verify the align attribute in the LLVM IR.

 - Add multi-dimensional tests.

 - Remove dead code.
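As a sketch of what "verify the align attribute in the LLVM IR" could look like, the snippet below pulls every `align N` value out of a piece of IR text with a regular expression. The helper name `alignments_in_ir` and the sample IR are hypothetical: the two IR lines are hand-written for illustration and are not output captured from numba or from this PR's tests.

```python
import re

# Matches the "align <N>" attribute on globals, allocas, loads and stores.
ALIGN_RE = re.compile(r"\balign\s+(\d+)\b")


def alignments_in_ir(ir_text):
    """Return every alignment value found in the IR text, in order."""
    return [int(value) for value in ALIGN_RE.findall(ir_text)]


# Hand-written sample IR (illustrative only): a shared-memory global and
# a local alloca, each carrying an explicit alignment.
SAMPLE_IR = """\
@"smem" = addrspace(3) global [128 x float] undef, align 16
%"lmem" = alloca [4 x float], align 8
"""

print(alignments_in_ir(SAMPLE_IR))
```

A test built this way could assert that the requested alignment appears among the extracted values, rather than comparing the whole IR string.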

PR Feedback: Remove ptx_lmem_alloc_array.

This functionality is now provided by cuda_local_array_tuple, whose name
better fits with the other three cuda_(local|shared)_array_(tuple|integer)
routines.

COSMETIC: Relocate `test_invalid_alignments()`.

No code changes are in this commit.  I'm relocating the function in
anticipation of some refactoring in the next commit.  It makes sense to
have the `_do_test()` implementation come immediately after the three test
functions that use it (`test_array_alignment_[123]d()`).

PR Feedback: Add tests for invalid alignment types.

Add some record dtypes to the alignment tests.

PR Feedback: Improve _try_extract_and_validate_alignment() docstring.

Fix test failures on CI.

CI was showing error messages containing ANSI color codes, e.g.:

 RequireLiteralValue: \x1b[1malignment must be a constant integer\x1b[0m\x1b[0m\n
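
One way to make message assertions robust to this kind of highlighting is to strip the ANSI color sequences before comparing. The sketch below is only an illustration of that idea and is not necessarily the fix applied in this commit; `strip_ansi` is a hypothetical helper name.

```python
import re

# Matches ANSI SGR (color/style) sequences such as "\x1b[1m" and "\x1b[0m".
ANSI_SGR_RE = re.compile(r"\x1b\[[0-9;]*m")


def strip_ansi(text):
    """Remove ANSI SGR escape sequences from text."""
    return ANSI_SGR_RE.sub("", text)


# The message as it appeared in the CI log quoted above.
raw = (
    "RequireLiteralValue: "
    "\x1b[1malignment must be a constant integer\x1b[0m\x1b[0m\n"
)

clean = strip_ansi(raw)
assert "alignment must be a constant integer" in clean
```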
@tpn tpn force-pushed the 140-array-alignment branch from 9eaca8b to faebec3 Compare May 5, 2025 22:33

tpn commented May 5, 2025

> Thanks for the updates - there are a couple of minor questions on the added test - I don't think they necessarily need to be addressed for this to be merged, so just let me know what you want to do (i.e. merge as-is or modify based on the comments).

Fixes were easy, all good on my side for the merge!

@gmarkall gmarkall added the "5 - Ready to merge" (Testing and reviews complete, ready to merge) label and removed the "4 - Waiting on author" (Waiting for author to respond to review) label May 6, 2025
@gmarkall gmarkall merged commit 2653de2 into NVIDIA:main May 6, 2025
37 checks passed
@github-project-automation github-project-automation bot moved this from In Progress to Done in CCCL May 6, 2025
@tpn tpn deleted the 140-array-alignment branch May 6, 2025 20:43
Labels

"5 - Ready to merge" (Testing and reviews complete, ready to merge)

Projects

Archived in project

3 participants