[Test] Add FP8 KV Cache Testing for MLA Backends by wzhao18 · Pull Request #34473 · vllm-project/vllm

wzhao18 · 2026-02-12T22:49:21Z

Purpose

This PR extends the MLA backend test coverage to include FP8 KV cache testing.

Test Plan

Ran the tests under tests/v1/attention/.

Test Result

Essential Elements of an Effective PR Description Checklist

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
The test plan, such as providing test command.
The test results, such as pasting the results comparison before and after, or e2e results
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
(Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

gemini-code-assist

Code Review

This pull request enhances the MLA backend test coverage by adding support for FP8 KV cache testing. The changes correctly parameterize the tests for different kv_cache_dtype values and generalize the test logic to handle various FP8 formats. My review found one critical issue where a safety check was removed, which could lead to a runtime error. I've provided a suggestion to fix it.
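For readers who want a concrete picture of what parameterizing a test over kv_cache_dtype can look like, here is a minimal, pure-Python sketch. It is not the code from this PR: the dtype strings, the tolerance table, and the helpers `fake_fp8_e4m3`, `roundtrip`, and `check_kv_cache_dtype` are all hypothetical, and the FP8 rounding is a simplified simulation (3 stored mantissa bits, clamp to 448, subnormals ignored) rather than a real kernel.

```python
import math

# Hypothetical per-dtype tolerances; the thresholds used by the real tests
# are not shown in this thread.
TOLERANCES = {"auto": 0.0, "fp8": 0.07, "fp8_e4m3": 0.07}

FP8_E4M3_MAX = 448.0  # largest finite value representable in e4m3


def fake_fp8_e4m3(x: float) -> float:
    """Round x to 4 significant bits and clamp to the e4m3 range.

    Simplified simulation: subnormals and NaN handling are ignored.
    """
    if x == 0.0:
        return 0.0
    x = max(-FP8_E4M3_MAX, min(FP8_E4M3_MAX, x))
    mantissa, exponent = math.frexp(x)  # x == mantissa * 2**exponent
    mantissa = round(mantissa * 16) / 16  # 1 implicit + 3 stored bits
    return math.ldexp(mantissa, exponent)


def roundtrip(values, kv_cache_dtype, scale=1.0):
    """Write values to a simulated KV cache and read them back."""
    if kv_cache_dtype == "auto":
        return list(values)  # no quantization in this sketch
    return [fake_fp8_e4m3(v / scale) * scale for v in values]


def max_rel_error(reference, actual):
    """Largest relative error over all non-zero reference values."""
    return max(abs(a - r) / abs(r) for r, a in zip(reference, actual) if r != 0.0)


def check_kv_cache_dtype(kv_cache_dtype):
    """Return True if the round-trip error stays within the dtype's tolerance."""
    values = [0.1 * i for i in range(1, 200)]
    restored = roundtrip(values, kv_cache_dtype)
    return max_rel_error(values, restored) <= TOLERANCES[kv_cache_dtype]


# One test body, run once per cache dtype.
for dtype in TOLERANCES:
    assert check_kv_cache_dtype(dtype), dtype
```

In a pytest suite, the final loop would typically become a `@pytest.mark.parametrize("kv_cache_dtype", [...])` decorator on the test function, so each dtype is reported as its own test case.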

tests/v1/attention/test_mla_backends.py

wzhao18 · 2026-02-13T04:41:28Z

@pavanimajety Could you help review this PR?

MatthewBonanni

LGTM, thanks for the contribution!

mgoin

LGTM, thanks

pavanimajety

Thanks for the contribution @wzhao18, LGTM!

Note for future work: we need to add tests for chunked prefill when it uses FP8 KV cache with MHA kernels.

wzhao18 · 2026-02-17T17:52:56Z

@pavanimajety Please feel free to ping me and Xin for assistance on improving the test coverage.

mergify · 2026-02-18T17:42:45Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @wzhao18.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

Signed-off-by: wzhao18 <wzhao18.sz@gmail.com>

yewentao256

LGTM, thanks for the work!

wzhao18 · 2026-02-18T20:56:48Z

@mgoin Relevant CI tests are passing. I believe the failing ones also fail on main.

yewentao256

One of the issues could be fixed by #34913

LucasWilkinson

LGTM, thanks

mergify bot added the v1 label Feb 12, 2026

gemini-code-assist bot reviewed Feb 12, 2026

tests/v1/attention/test_mla_backends.py (resolved)

wzhao18 force-pushed the wzhao/flashinfer-mla-fp8-tests branch from fc4170b to aafb941 February 13, 2026 04:14

mgoin requested review from LucasWilkinson and pavanimajety February 17, 2026 15:43

MatthewBonanni approved these changes Feb 17, 2026

mgoin approved these changes Feb 17, 2026

mgoin added the ready and ci/build labels Feb 17, 2026

pavanimajety approved these changes Feb 17, 2026

wzhao18 force-pushed the wzhao/flashinfer-mla-fp8-tests branch from 2cd12a0 to 29aecb2 February 18, 2026 17:39

wzhao18 requested review from ApostaC, WoosukKwon, aarnphm, alexm-redhat, bigPYJ1151, heheda12345, hmellor, markmc, njhill, noooop, orozery, patrickvonplaten, robertgshaw2-redhat, russellb, sighingnow, tjtanaa and ywang96 as code owners February 18, 2026 17:39

github-project-automation bot moved this to Ready in NVIDIA Feb 18, 2026

mergify bot added the cpu label Feb 18, 2026

github-project-automation bot moved this to To Triage in gpt-oss Issues & Enhancements Feb 18, 2026

mergify bot added the structured-output and tpu labels Feb 18, 2026

github-project-automation bot added this to Structured Output Feb 18, 2026

github-project-automation bot moved this from To Triage to Ready in gpt-oss Issues & Enhancements Feb 18, 2026

wzhao18 marked this pull request as draft February 18, 2026 17:40

mergify bot added the kv-connector label Feb 18, 2026

mergify bot added the needs-rebase label Feb 18, 2026

wzhao18 force-pushed the wzhao/flashinfer-mla-fp8-tests branch from 29aecb2 to e456a25 February 18, 2026 17:46

mergify bot removed the tpu label Feb 18, 2026

wzhao18 force-pushed the wzhao/flashinfer-mla-fp8-tests branch from e456a25 to 50c031d February 18, 2026 17:46

wzhao18 added 4 commits February 18, 2026 12:47

Test Fp8 flashinfer MLA backend

5800947

Signed-off-by: wzhao18 <wzhao18.sz@gmail.com>

Fix comments

8ef6a0b

Signed-off-by: wzhao18 <wzhao18.sz@gmail.com>

Fix format

4f1ac6b

Signed-off-by: wzhao18 <wzhao18.sz@gmail.com>

fix hopper test on flashMLA

1622bad

Signed-off-by: wzhao18 <wzhao18.sz@gmail.com>

wzhao18 force-pushed the wzhao/flashinfer-mla-fp8-tests branch from 50c031d to 1622bad February 18, 2026 17:47

wzhao18 marked this pull request as ready for review February 18, 2026 17:48

mergify bot removed the needs-rebase label Feb 18, 2026

yewentao256 approved these changes Feb 18, 2026

yewentao256 reviewed Feb 19, 2026

LucasWilkinson approved these changes Feb 19, 2026

LucasWilkinson enabled auto-merge (squash) February 19, 2026 21:51

yewentao256 approved these changes Feb 20, 2026

Merge branch 'main' into wzhao/flashinfer-mla-fp8-tests

2f2fdb3

LucasWilkinson merged commit f24b2de into vllm-project:main Feb 20, 2026
19 checks passed

github-project-automation bot moved this to Done in Structured Output Feb 20, 2026

LucasWilkinson merged 5 commits into vllm-project:main from wzhao18:wzhao/flashinfer-mla-fp8-tests

6 participants
