[ROCm] Attention selector reordering #36702
gshtras wants to merge 4 commits into vllm-project:main
Conversation
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
Documentation preview: https://vllm--36702.org.readthedocs.build/en/36702/
Code Review
This pull request refactors the attention backend selection for ROCm to prioritize the ROCM_ATTN backend, which is now considered the most performant. It also removes the VLLM_ROCM_CUSTOM_PAGED_ATTN environment variable, simplifying configuration. As part of these changes, ROCM_ATTN now correctly reports that it does not support attention sinks, ensuring that more suitable backends like AITER_UNIFIED are chosen when sinks are required. My review identifies a potential issue with the new backend priority order which may not align with the intended logic.
This pull request has merge conflicts that must be resolved before it can be merged.
AMD CI build with this PR to compare against nightly: https://buildkite.com/vllm/amd-ci/builds/5975/steps/canvas |
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
This pull request has merge conflicts that must be resolved before it can be merged.
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
With the unit tests now able to handle this change (following #36025, #35334, and others),
this PR changes the priorities of the ROCm attention backends to prefer ROCM_ATTN.
Additionally, even though ROCM_ATTN technically supports sinks, it would fall back from the custom HIP attention kernel to a Triton implementation when sinks are used. The backend is therefore changed to report that it does not support sinks, so that the actual Triton backends (AITER and unified), which perform better, are selected instead.
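To illustrate the selection logic described above, here is a minimal sketch of a priority-ordered backend selector that skips backends reporting no sink support. The class, attribute, and backend names other than ROCM_ATTN and AITER_UNIFIED are assumptions for illustration, not vLLM's actual API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backend:
    """Illustrative backend descriptor (not vLLM's real class)."""
    name: str
    supports_sinks: bool


# Assumed priority order: ROCM_ATTN first, Triton-based backends after it.
PRIORITY = [
    Backend("ROCM_ATTN", supports_sinks=False),    # now reports no sink support
    Backend("AITER_UNIFIED", supports_sinks=True),
    Backend("TRITON_ATTN", supports_sinks=True),   # hypothetical fallback
]


def select_backend(needs_sinks: bool) -> Backend:
    """Return the first backend in priority order that meets the requirements."""
    for backend in PRIORITY:
        if needs_sinks and not backend.supports_sinks:
            # ROCM_ATTN is skipped here, so a Triton-based backend is chosen.
            continue
        return backend
    raise RuntimeError("no suitable attention backend")
```

With this ordering, a model without sinks gets ROCM_ATTN, while a model that requires sinks falls through to AITER_UNIFIED.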
Removing VLLM_ROCM_CUSTOM_PAGED_ATTN: if ROCM_ATTN is selected, the intent is to use this kernel anyway.
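A before/after sketch of dropping the environment-variable gate. The function names and the assumed default value are illustrative, not vLLM's actual code.

```python
import os


def use_custom_paged_attn_before() -> bool:
    # Old behavior (sketch): the custom paged-attention kernel was gated
    # behind an environment variable; "1" as the default is an assumption.
    return os.environ.get("VLLM_ROCM_CUSTOM_PAGED_ATTN", "1") == "1"


def use_custom_paged_attn_after() -> bool:
    # New behavior (sketch): once ROCM_ATTN has been selected, the custom
    # kernel is always used, so no environment check is needed.
    return True
```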
As a bonus, fixing the AITER supported-platform condition: AITER is not built for, and does not support, gfx90a.
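The gfx90a exclusion can be sketched as a small architecture check. The helper name and the list of supported architectures are assumptions; on ROCm the architecture string would typically come from `torch.cuda.get_device_properties(...).gcnArchName`.

```python
# Assumed list of architectures AITER is built for (illustrative only).
SUPPORTED_AITER_ARCHS = ("gfx942", "gfx950")


def aiter_supported(gcn_arch: str) -> bool:
    """Sketch of the fixed condition: AITER is never used on gfx90a."""
    if gcn_arch.startswith("gfx90a"):
        # AITER is not built for gfx90a, so exclude it explicitly.
        return False
    return any(gcn_arch.startswith(arch) for arch in SUPPORTED_AITER_ARCHS)
```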