[CK_TILE] Update CK and add RDNA build support #178
Merged
rocking5566 merged 10 commits into ck_improve_main on Mar 26, 2026
Pull request overview
Updates the ROCm CK backend to a newer composable_kernel revision and extends build/runtime support for RDNA (gfx11/gfx12), including updated FMHA argument wiring and targeted backward guards.
Changes:
- Bump `composable_kernel` submodule and adjust CK FMHA args to match the updated interface.
- Add gfx11/gfx12 build targeting and default `CK_TILE_FLOAT_TO_BFLOAT16_DEFAULT` behavior for gfx11 targets.
- Gate/skip CK backward in tests and runtime for unsupported/unstable gfx1x backward paths; add optional "LLC Head Grouping" forward dispatch.
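As a rough illustration of the gfx11 default mentioned in the second bullet, a build script could derive the extra define from the requested target list along these lines. This is a minimal sketch, not the code in `setup.py`: the helper names `is_gfx11` and `ck_tile_defines` are hypothetical, and the rule "apply the override when any requested target is gfx11" is an assumption. Only the macro name and the value `0` come from this PR.

```python
from typing import Iterable, List


def is_gfx11(arch: str) -> bool:
    """True for gfx11xx targets such as "gfx1100" (hypothetical helper)."""
    # Drop optional feature suffixes such as ":xnack-" before matching.
    return arch.split(":")[0].startswith("gfx11")


def ck_tile_defines(gpu_archs: Iterable[str]) -> List[str]:
    """Return extra compiler defines for the given targets (hypothetical helper)."""
    defines: List[str] = []
    # Assumption: the override applies when any requested target is gfx11.
    if any(is_gfx11(arch) for arch in gpu_archs):
        defines.append("-DCK_TILE_FLOAT_TO_BFLOAT16_DEFAULT=0")
    return defines


print(ck_tile_defines(["gfx942"]))
print(ck_tile_defines(["gfx1100", "gfx1201"]))
```

Keeping the decision in a pure function of the target list makes it easy to unit-test without a GPU present.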
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| tests/test_flash_attn_ck.py | Adds gfx11/gfx12 detection helpers and skips/guards for unsupported CK backward on gfx1x. |
| setup.py | Extends supported GPU_ARCHS, adds target detection / forwarding to CK kernel generator, and adjusts CK_TILE defaults for gfx11. |
| csrc/flash_attn_ck/mha_varlen_fwd.cpp | Updates CK args and adds optional head-grouped forward dispatch with logging. |
| csrc/flash_attn_ck/mha_varlen_bwd.cpp | Adds gfx1x backward support checks before allocating/intermediate work. |
| csrc/flash_attn_ck/mha_fwd.cpp | Updates CK args and adds optional head-grouped forward dispatch with logging. |
| csrc/flash_attn_ck/mha_bwd.cpp | Adds gfx1x backward support checks before allocating/intermediate work. |
| csrc/flash_attn_ck/flash_common.hpp | Adds ROCm arch detection helpers and a gfx1x deterministic/backward guard. |
| csrc/composable_kernel | Updates submodule pointer to the newer CK commit. |
| README.md | Documents RDNA 3/4 support and current backward limitations. |
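The detection helpers and backward guard summarized in the table might look roughly like the following on the Python test side. This is a sketch under stated assumptions, not the PR's implementation: all function names are hypothetical, "gfx1x" is taken to mean gfx11 or gfx12 only, and the guard skips backward on every such device, whereas the actual condition in the PR may be narrower. On ROCm builds of PyTorch the architecture string is typically available from `torch.cuda.get_device_properties(i).gcnArchName`; the sketch takes the string as a plain argument so it runs without a GPU.

```python
def base_arch(arch_name: str) -> str:
    """Strip feature suffixes: "gfx1100:sramecc+:xnack-" -> "gfx1100"."""
    return arch_name.split(":")[0].strip().lower()


def is_gfx11(arch_name: str) -> bool:
    """True for RDNA 3 style targets (gfx11xx)."""
    return base_arch(arch_name).startswith("gfx11")


def is_gfx12(arch_name: str) -> bool:
    """True for RDNA 4 style targets (gfx12xx)."""
    return base_arch(arch_name).startswith("gfx12")


def is_gfx1x(arch_name: str) -> bool:
    """Assumption: "gfx1x" here means gfx11 or gfx12 only."""
    return is_gfx11(arch_name) or is_gfx12(arch_name)


def should_skip_ck_backward(arch_name: str) -> bool:
    """Hypothetical guard: skip CK backward on every gfx11/gfx12 device."""
    return is_gfx1x(arch_name)


for name in ("gfx942:sramecc+:xnack-", "gfx1100", "gfx1201"):
    print(name, should_skip_ck_backward(name))
```

A test module could feed the result into a `pytest.mark.skipif` condition so that backward tests are reported as skipped rather than failed on these devices.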
rocking5566 (Collaborator) reviewed on Mar 25, 2026 and left a comment:
Overall looks good — the RDNA support and backward guards are well-structured. A few items worth addressing beyond what Copilot already flagged (I've replied on those threads separately).
Motivation
Update CK to a gfx1x FMHA-capable version and align FlashAttention CK argument wiring accordingly.
Technical Details
- Update `composable_kernel` to `e5683e2`
- Set `CK_TILE_FLOAT_TO_BFLOAT16_DEFAULT=0` for gfx11 targets
Test Plan
pytest tests/test_flash_attn_ck.py
Test Result
No failures on either gfx11 or gfx12.