[ROCm][CI] Fix test_cudagraph_mode failure in AMD CI#29367
tjtanaa merged 4 commits into vllm-project:main from
Conversation
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
Code Review
This pull request resolves a CI failure on ROCm by falling back to the Triton attention backend when an unsupported backend is selected, instead of raising a RuntimeError. This is a sensible approach to making the system more robust. My review includes a suggestion to make the warning message more specific about the reason for the fallback, which will improve clarity and aid future debugging.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Head branch was pushed to by a user without write access
Hey @tjtanaa, I have updated the test_cudagraph_mode test itself to cover ROCm attention backends. I also reverted the change that defaulted to TritonAttn when an invalid attention backend is requested. Could you please take another look? Thanks
We are seeing failures in the tests/v1/cudagraph/test_cudagraph_mode.py test in AMD CI after #26980 was merged. It fails with the error "V0 attention backends have been removed. Set VLLM_USE_V1=1 to select a supported backend", because the test tries to use the FlashAttn backend, which is not supported on ROCm. I updated the test to use ROCm attention backends when current_platform.is_rocm(). After this PR, we see:
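The fix above amounts to making the test's backend list platform-conditional. As a rough sketch of the idea (the helper name and backend strings here are illustrative, not the actual names used in the vLLM test), assuming a boolean flag such as the one returned by current_platform.is_rocm():

```python
# Hedged sketch: choose attention backends for a test based on platform.
# The function and backend names below are hypothetical stand-ins for the
# constants the real vLLM test parametrizes over.

def pick_attention_backends(is_rocm: bool) -> list[str]:
    """Return attention backend names valid for the current platform."""
    if is_rocm:
        # FlashAttn is unavailable on ROCm, so exercise Triton-based
        # backends there instead of hitting the "V0 attention backends
        # have been removed" error path.
        return ["TRITON_ATTN"]
    return ["FLASH_ATTN"]


if __name__ == "__main__":
    print(pick_attention_backends(True))
    print(pick_attention_backends(False))
```

A test parametrized over pick_attention_backends(current_platform.is_rocm()) would then run the same CUDA-graph-mode checks on both CUDA and ROCm CI without selecting a backend the platform cannot provide.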