[ROCm][CI] Fix test_cudagraph_mode failure in AMD CI#29367
tjtanaa merged 4 commits into vllm-project:main from
Conversation
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
Code Review
This pull request resolves a CI failure on ROCm by falling back to the Triton attention backend when an unsupported backend is selected, instead of raising a RuntimeError. This is a sensible approach to making the system more robust. My review includes a suggestion to make the warning message more specific about the reason for the fallback, which will improve clarity and aid future debugging.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Head branch was pushed to by a user without write access
Hey @tjtanaa, I have updated the test_cudagraph_mode test itself to cover ROCm attention backends. I also reverted the change that defaulted to TritonAttn when an invalid attention backend is requested. Could you please take another look? Thanks
We are seeing failures in the tests/v1/cudagraph/test_cudagraph_mode.py test in AMD CI after #26980 was merged. It fails with the error "V0 attention backends have been removed. Set VLLM_USE_V1=1 to select a supported backend", because the test tries to use the FlashAttn backend, which is not supported on ROCm. I updated the test to use ROCm attention backends when current_platform.is_rocm(). After this PR, we see:
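The fix above amounts to making the test's backend list platform-conditional. As a rough sketch of the idea (the helper name and backend strings here are illustrative, not the actual names used in the vLLM test), assuming a boolean flag such as the one returned by current_platform.is_rocm():

```python
# Hedged sketch: choose attention backends for a test based on platform.
# The function and backend names below are hypothetical stand-ins for the
# constants the real vLLM test parametrizes over.

def pick_attention_backends(is_rocm: bool) -> list[str]:
    """Return attention backend names valid for the current platform."""
    if is_rocm:
        # FlashAttn is unavailable on ROCm, so exercise Triton-based
        # backends there instead of hitting the "V0 attention backends
        # have been removed" error path.
        return ["TRITON_ATTN"]
    return ["FLASH_ATTN"]


if __name__ == "__main__":
    print(pick_attention_backends(True))
    print(pick_attention_backends(False))
```

A test parametrized over pick_attention_backends(current_platform.is_rocm()) would then run the same CUDA-graph-mode checks on both CUDA and ROCm CI without selecting a backend the platform cannot provide.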