[ROCm] Track latest `composable_kernel` by a4lg · Pull Request #2038 · Dao-AILab/flash-attention

a4lg · 2025-12-02T04:50:22Z

The latest version of composable_kernel supports a wider range of architectures and no longer assumes a wavefront size of 64 (the source of many compiler errors).

This commit updates the composable_kernel submodule along with necessary interface changes.

This is part of my attempt to support Flash Attention on Strix Halo (gfx1151), and it seems the CK portions used by Flash Attention already support this architecture (though not all of CK does). I believe my interface changes are safe, since they only add or remove defaults.
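To illustrate why a hard-coded wavefront size of 64 is a problem for targets such as gfx1151, here is a minimal Python sketch. It assumes that gfx9-family targets (GCN/CDNA, e.g. AMD Instinct) default to 64-wide wavefronts while gfx10 and later (RDNA) default to 32-wide wavefronts. The helper name `default_wavefront_size` is hypothetical and is not part of composable_kernel or Flash Attention.

```python
import re


def default_wavefront_size(arch: str) -> int:
    """Return the assumed default wavefront width for a gfx target name.

    Hypothetical helper for illustration only. Assumption: gfx9-family
    targets default to wave64, gfx10 and later default to wave32.
    """
    # Target names look like gfx<major><minor><stepping>, where minor and
    # stepping are one hex digit each (e.g. gfx90a, gfx942, gfx1151).
    match = re.fullmatch(r"gfx(\d{1,2})[0-9a-f]{2}", arch)
    if match is None:
        raise ValueError(f"unrecognized gfx target: {arch!r}")
    major = int(match.group(1))
    return 64 if major < 10 else 32


for arch in ("gfx90a", "gfx942", "gfx1151"):
    print(arch, default_wavefront_size(arch))
```

Kernel code that bakes in 64 lanes per wavefront would be sized incorrectly for the last target in that loop, which is consistent with the compiler errors mentioned above.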

P.S.
If someone has AMD hardware already supported by Flash Attention (i.e. AMD Instinct), could you compare the test results before and after this PR? In my Strix Halo environment, about half of the tests fail due to large numerical errors, and I'd like to know whether this behavior is Strix Halo-specific.
If it is not Strix Halo-specific, I'll submit a follow-up PR to setup.py that allows a wider range of AMD GPU architectures (possibly RDNA 2 or later?).
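As a rough sketch of what such a follow-up to setup.py might involve, the snippet below checks requested GPU targets against an allowlist and then widens that allowlist. The names `validate_archs`, `INSTINCT_ARCHS`, and `EXTENDED_ARCHS`, as well as the specific entries, are assumptions made for illustration and are not taken from the actual setup.py.

```python
# Hypothetical allowlists; the real setup.py may use different names and entries.
INSTINCT_ARCHS = frozenset({"native", "gfx90a", "gfx942", "gfx950"})
EXTENDED_ARCHS = INSTINCT_ARCHS | {"gfx1151"}  # assumed addition for Strix Halo


def validate_archs(archs, allowed=INSTINCT_ARCHS):
    """Return archs as a list, or raise if any target is not in the allowlist."""
    unknown = [arch for arch in archs if arch not in allowed]
    if unknown:
        raise ValueError(f"unsupported GPU architectures: {unknown}")
    return list(archs)


# With the narrow allowlist, gfx1151 is rejected; with the extended one it passes.
print(validate_archs(["gfx942"]))
print(validate_archs(["gfx1151"], allowed=EXTENDED_ARCHS))
```

Keeping the allowlist as data makes it easy to widen support one architecture at a time once the test results for that architecture are understood.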


tridao · 2025-12-02T21:34:03Z

@rocking5566 does this interface change affect any existing case?

rocking5566 · 2025-12-06T12:24:49Z

@a4lg could you also change the C++ version in setup.py from c++17 to c++20?
The latest CK uses c++20 by default.
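A minimal sketch of the requested change, assuming the compiler flags are kept in a plain Python list inside setup.py. The helper name `set_cxx_standard` and the sample flags are hypothetical.

```python
def set_cxx_standard(flags, standard="c++20"):
    """Return a copy of flags with any existing -std=... replaced.

    Hypothetical helper: if no -std= flag is present, one is appended.
    """
    new_flag = f"-std={standard}"
    # Replace an existing standard flag in place to preserve flag order.
    result = [new_flag if flag.startswith("-std=") else flag for flag in flags]
    if new_flag not in result:
        result.append(new_flag)
    return result


old_flags = ["-O3", "-std=c++17"]  # assumed example flags
print(set_cxx_standard(old_flags))
```

The input list is left unmodified, so the same helper can be applied to several flag lists (for example, host and device flags) without side effects.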

rocking5566 · 2025-12-06T12:49:41Z

Actually, we are doing a similar thing.
We just finished testing and are about to send a PR from this branch:
https://github.com/ROCm/flash-attention/tree/ck_improve_v0.1.8

a4lg · 2025-12-06T13:10:21Z

@rocking5566
As long as CK is correctly updated, I'm not attached to my own changes.
BTW, my updated branch with the suggested changes will be ready tomorrow (I'm on a trip right now).

rocking5566 · 2025-12-06T13:57:28Z

> @rocking5566 As long as CK is correctly updated, I don't stick to my changes. BTW, my updated branch with suggested changes will be ready tomorrow (because I'm on a trip).

I submitted the PR and changed the C++ version:
#2052

We also tested correctness and performance with this version (commit id) of CK on both MI300 and MI350.

a4lg · 2025-12-07T03:35:06Z

Closing in favor of #2052.

[ROCm] Track latest composable_kernel

29b80d6

The latest version of `composable_kernel` supports more versatile architectures and no longer assumes wavefront size of 64. This commit updates the `composable_kernel` submodule along with necessary interface changes.

a4lg mentioned this pull request Dec 7, 2025

[AMD ROCm] Update to latest composable_kernel to improve performance #2052

Merged

a4lg closed this Dec 7, 2025

a4lg wants to merge 1 commit into Dao-AILab:main from a4lg:update-ck-for-versatile-amd-support-1
