Contributor

adstraw commented on Sep 1, 2023

Adds CUDA codegen support for bulk asynchronous copy, a set of instructions new in Hopper. Also includes some cleanup of PR #15616 in the form of comments and tests. Notably, this PR does not include any TIR transform work for lowering to the new bulk asynchronous copy instructions; that will come in a future PR. Also note the "workaround" and TODO regarding the lack of CUDA codegen support for allocation alignment.

masahi merged commit d26fdcf into apache:main on Sep 5, 2023
adstraw deleted the straw-cp-async-bulk branch on September 6, 2023 11:39