[Hardware][AMD][CI][Bugfix] Fix Kernels Attention Cache test #32904

Merged
LucasWilkinson merged 1 commit into vllm-project:main from ROCm:fix_fp8_kernels_test on Jan 23, 2026
Conversation

mawong-amd (Contributor) commented Jan 23, 2026

Purpose

This PR fixes a failing test (kernels/attention/test_cache.py::test_reshape_and_cache_flash) caused by #30141. The reference dequantization implementation used in the test assumes the FP8 data format is e4m3fn, but on AMD gfx942-series cards, the FP8 data type used is e4m3fnuz instead.
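The mismatch can be illustrated with a minimal pure-Python sketch. This is not the code from the vLLM test; `decode_fp8_e4m3` is a hypothetical helper written only for illustration. It assumes the commonly documented layouts: both formats use 1 sign bit, 4 exponent bits and 3 mantissa bits, but e4m3fn uses an exponent bias of 7 while e4m3fnuz uses a bias of 8, has no negative zero, and encodes NaN as the single bit pattern 0x80.

```python
import math


def decode_fp8_e4m3(byte: int, fnuz: bool) -> float:
    """Decode one FP8 byte as e4m3fn (fnuz=False) or e4m3fnuz (fnuz=True).

    Hypothetical illustration helper, not part of vLLM or PyTorch.
    """
    sign = -1.0 if byte & 0x80 else 1.0
    exponent = (byte >> 3) & 0xF
    mantissa = byte & 0x7
    bias = 8 if fnuz else 7

    if fnuz:
        # e4m3fnuz: the would-be negative zero is the only NaN encoding.
        if byte == 0x80:
            return math.nan
    elif exponent == 0xF and mantissa == 0x7:
        # e4m3fn: all-ones exponent and mantissa encode NaN (either sign).
        return math.nan

    if exponent == 0:
        # Subnormal: no implicit leading 1.
        return sign * (mantissa / 8.0) * 2.0 ** (1 - bias)
    return sign * (1.0 + mantissa / 8.0) * 2.0 ** (exponent - bias)


# The same stored byte decodes to different values under the two formats.
print(decode_fp8_e4m3(0x38, fnuz=False))  # 1.0
print(decode_fp8_e4m3(0x38, fnuz=True))   # 0.5
```

Under these assumptions, every finite bit pattern decodes to half the value in e4m3fnuz that it has in e4m3fn, so a reference that hardcodes e4m3fn would disagree with a cache written as e4m3fnuz by roughly a factor of two.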

Test Plan

Run
pytest -sv tests/kernels/attention/test_cache.py -k test_reshape_and_cache_flash
on an MI325X as part of the Kernels Attention tests in AMD CI.

Test Result

The above test now passes.


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: Matthew Wong <Matthew.Wong2@amd.com>
mergify bot added the rocm (Related to AMD ROCm) and bug (Something isn't working) labels Jan 23, 2026
gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request provides a targeted and correct fix for a failing test related to FP8 dequantization. By replacing the hardcoded torch.float8_e4m3fn with current_platform.fp8_dtype(), the test now correctly handles different FP8 formats across various hardware platforms, such as the e4m3fnuz format on certain AMD GPUs. The change is well-scoped and effectively resolves the bug described, improving the test suite's robustness.

@LucasWilkinson LucasWilkinson enabled auto-merge (squash) January 23, 2026 14:53
github-actions bot added the ready (ONLY add when PR is ready to merge/full CI is needed) label Jan 23, 2026
yewentao256 (Member) left a comment

LGTM, thanks for the work!

@LucasWilkinson LucasWilkinson merged commit 305e53a into vllm-project:main Jan 23, 2026
24 of 25 checks passed
cwazai pushed a commit to cwazai/vllm that referenced this pull request Jan 25, 2026
…oject#32904)

Signed-off-by: Matthew Wong <Matthew.Wong2@amd.com>
Signed-off-by: 陈建华 <1647430658@qq.com>
lapy pushed a commit to lapy/vllm that referenced this pull request Jan 27, 2026
ItzDEXX pushed a commit to ItzDEXX/vllm that referenced this pull request Feb 19, 2026
Labels

bug (Something isn't working), ready (ONLY add when PR is ready to merge/full CI is needed), rocm (Related to AMD ROCm)

3 participants