[Hardware][AMD][CI][Bugfix] Fix Kernels Attention Cache test by mawong-amd · Pull Request #32904 · vllm-project/vllm

mawong-amd · 2026-01-23T03:50:29Z

Purpose

This PR fixes a failing test (kernels/attention/test_cache.py::test_reshape_and_cache_flash) caused by #30141. The reference dequantization implementation used in the test assumes the FP8 data format is e4m3fn, but on AMD gfx942-series cards, the FP8 data type used is e4m3fnuz instead.
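To illustrate why the assumed format matters, here is a minimal pure-Python sketch (not the test's actual reference code) that decodes one FP8 byte under each format. The helper name `decode_fp8_e4m3` is hypothetical. It relies on the published encodings: e4m3fn uses exponent bias 7 with NaN at S.1111.111, while e4m3fnuz uses bias 8 with 0x80 as its only NaN.

```python
import math


def decode_fp8_e4m3(byte: int, fnuz: bool) -> float:
    """Decode one FP8 E4M3 byte as e4m3fn (fnuz=False) or e4m3fnuz (fnuz=True)."""
    sign = (byte >> 7) & 0x1
    exp = (byte >> 3) & 0xF
    man = byte & 0x7
    if fnuz:
        # e4m3fnuz: bias 8, no negative zero, no inf; 0x80 is the only NaN.
        if byte == 0x80:
            return math.nan
        bias = 8
    else:
        # e4m3fn: bias 7, no inf; NaN is exponent 1111 with mantissa 111.
        if exp == 0xF and man == 0x7:
            return math.nan
        bias = 7
    if exp == 0:
        # Subnormal: no implicit leading one.
        magnitude = (man / 8.0) * 2.0 ** (1 - bias)
    else:
        magnitude = (1.0 + man / 8.0) * 2.0 ** (exp - bias)
    return -magnitude if sign else magnitude
```

Under these encodings the same finite bit pattern decodes to half the value in e4m3fnuz: the byte 0x38 is 1.0 as e4m3fn and 0.5 as e4m3fnuz. A reference that assumes e4m3fn therefore disagrees with a kernel that writes e4m3fnuz.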

Test Plan

Run
pytest -sv tests/kernels/attention/test_cache.py -k test_reshape_and_cache_flash
on an MI325X as part of the Kernels Attention tests in AMD CI.

Test Result

The above test now passes.

Signed-off-by: Matthew Wong <Matthew.Wong2@amd.com>

gemini-code-assist

Code Review

This pull request provides a targeted and correct fix for a failing test related to FP8 dequantization. By replacing the hardcoded torch.float8_e4m3fn with current_platform.fp8_dtype(), the test now correctly handles different FP8 formats across various hardware platforms, such as the e4m3fnuz format on certain AMD GPUs. The change is well-scoped and effectively resolves the bug described, improving the test suite's robustness.
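The shape of the change can be sketched without torch or vLLM. In this illustrative model, `platform_fp8_format` is a hypothetical stand-in for `current_platform.fp8_dtype()`, and the architecture check is an assumption based only on the PR description (gfx942-series uses e4m3fnuz). The `reference_dequant` helper is likewise hypothetical and handles finite values only.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Fp8Format:
    name: str
    bias: int


E4M3FN = Fp8Format("e4m3fn", bias=7)
E4M3FNUZ = Fp8Format("e4m3fnuz", bias=8)


def platform_fp8_format(gpu_arch: str) -> Fp8Format:
    """Hypothetical stand-in for current_platform.fp8_dtype()."""
    # Assumption taken from the PR description: gfx942-series uses e4m3fnuz.
    return E4M3FNUZ if gpu_arch.startswith("gfx942") else E4M3FN


def reference_dequant(raw: bytes, scale: float, fmt: Fp8Format) -> List[float]:
    """Dequantize finite FP8 E4M3 bytes; NaN encodings are not handled here."""
    out = []
    for b in raw:
        sign = -1.0 if b & 0x80 else 1.0
        exp, man = (b >> 3) & 0xF, b & 0x7
        if exp == 0:
            mag = (man / 8.0) * 2.0 ** (1 - fmt.bias)  # subnormal
        else:
            mag = (1.0 + man / 8.0) * 2.0 ** (exp - fmt.bias)
        out.append(sign * mag * scale)
    return out


cache_bytes = bytes([0x38, 0x40, 0xB8])  # example payload, not real cache data
scale = 0.5

# Before: the reference always assumes e4m3fn.
hardcoded = reference_dequant(cache_bytes, scale, E4M3FN)
# After: the reference asks the platform which FP8 format is in use.
platform = reference_dequant(cache_bytes, scale, platform_fp8_format("gfx942"))
```

In this model the hardcoded reference is off by a factor of two on the fnuz platform, which is the kind of mismatch a tolerance-based comparison would flag.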

yewentao256

LGTM, thanks for the work!

Use current_platform.fp8_dtype in Kernels Attention Cache test

7a7f0c0

Signed-off-by: Matthew Wong <Matthew.Wong2@amd.com>

mawong-amd requested review from WoosukKwon, mgoin, tlrmchlsmth and yewentao256 as code owners January 23, 2026 03:50

mergify bot added the rocm (Related to AMD ROCm) and bug (Something isn't working) labels Jan 23, 2026

gemini-code-assist bot reviewed Jan 23, 2026

mawong-amd mentioned this pull request Jan 23, 2026

Add llmcompressor fp8 kv-cache quant (per-tensor and per-attn_head) #30141

Merged

LucasWilkinson approved these changes Jan 23, 2026

LucasWilkinson enabled auto-merge (squash) January 23, 2026 14:53

github-actions bot added the ready (ONLY add when PR is ready to merge/full CI is needed) label Jan 23, 2026

yewentao256 approved these changes Jan 23, 2026

LucasWilkinson merged commit 305e53a into vllm-project:main Jan 23, 2026
24 of 25 checks passed

AndreasKaratzas mentioned this pull request Jan 23, 2026

[CI Failure]: mi325_8: Kernels Attention Test %N #32972

Closed

3 tasks

cwazai pushed a commit to cwazai/vllm that referenced this pull request Jan 25, 2026

[Hardware][AMD][CI][Bugfix] Fix Kernels Attention Cache test (vllm-pr…

04c168d

…oject#32904) Signed-off-by: Matthew Wong <Matthew.Wong2@amd.com> Signed-off-by: 陈建华 <1647430658@qq.com>

lapy pushed a commit to lapy/vllm that referenced this pull request Jan 27, 2026

[Hardware][AMD][CI][Bugfix] Fix Kernels Attention Cache test (vllm-pr…

f03adc5

…oject#32904) Signed-off-by: Matthew Wong <Matthew.Wong2@amd.com>

ItzDEXX pushed a commit to ItzDEXX/vllm that referenced this pull request Feb 19, 2026

[Hardware][AMD][CI][Bugfix] Fix Kernels Attention Cache test (vllm-pr…

3dd96f0

…oject#32904) Signed-off-by: Matthew Wong <Matthew.Wong2@amd.com>

LucasWilkinson merged 1 commit into vllm-project:main from ROCm:fix_fp8_kernels_test
