[Test] Add FP8 KV Cache Testing for MLA Backends #34473
LucasWilkinson merged 5 commits into vllm-project:main
Code Review
This pull request enhances the MLA backend test coverage by adding support for FP8 KV cache testing. The changes correctly parameterize the tests for different kv_cache_dtype values and generalize the test logic to handle various FP8 formats. My review found one critical issue where a safety check was removed, which could lead to a runtime error. I've provided a suggestion to fix it.
Force-pushed from fc4170b to aafb941.
@pavanimajety Could you help review this PR?
MatthewBonanni left a comment
LGTM, thanks for the contribution!
pavanimajety left a comment
Thanks for the contribution @wzhao18, LGTM!
Note to self for the future: we need to add tests for chunked prefill when it uses FP8 KV cache with MHA kernels.
@pavanimajety Please feel free to ping me and Xin for assistance on improving the test coverage.
Force-pushed from 2cd12a0 to 29aecb2.
This pull request has merge conflicts that must be resolved before it can be merged.
Force-pushed from 29aecb2 to e456a25.
Force-pushed from e456a25 to 50c031d.
Signed-off-by: wzhao18 <wzhao18.sz@gmail.com>
Force-pushed from 50c031d to 1622bad.
yewentao256 left a comment
LGTM, thanks for the work!
@mgoin Relevant CI tests are passing. I believe the failing ones also fail on main.
yewentao256 left a comment
One of the issues could be fixed by #34913.
Purpose
This PR improves the MLA backend test coverage to include fp8 kv cache testing.
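To make the idea concrete, here is a minimal, self-contained sketch of what testing across KV cache dtypes can look like. It is not the code in this PR: the dtype strings, the tolerance table, and the helpers `round_to_e4m3`, `simulate_kv_cache`, and `max_relative_error` are all hypothetical, and the FP8 rounding is simulated in pure Python rather than run through any MLA backend. The point is only that an FP8 cache needs a looser tolerance than an unquantized one.

```python
import math

# Hypothetical tolerance table; the dtype strings and values are illustrative
# assumptions, not taken from the PR's test files.
TOLERANCES = {"auto": 1e-6, "fp8_e4m3": 2.0 ** -4}

E4M3_MAX = 448.0  # largest finite value in the FP8 E4M3 (fn) format


def round_to_e4m3(x: float) -> float:
    """Round a float to the nearest FP8 E4M3 value, saturating at +/-448."""
    if x == 0.0:
        return 0.0
    x = max(-E4M3_MAX, min(E4M3_MAX, x))
    _, e = math.frexp(abs(x))      # abs(x) = m * 2**e with 0.5 <= m < 1
    exponent = max(e - 1, -6)      # -6 is the smallest normal exponent
    step = 2.0 ** (exponent - 3)   # 3 mantissa bits: 8 steps per binade
    return round(x / step) * step


def simulate_kv_cache(values, kv_cache_dtype):
    """Return the values as they would be read back from the cache."""
    if kv_cache_dtype == "auto":
        return list(values)        # no quantization in this sketch
    scale = max(abs(v) for v in values) / E4M3_MAX  # per-tensor scale
    return [round_to_e4m3(v / scale) * scale for v in values]


def max_relative_error(reference, actual):
    return max(abs(a - r) / abs(r) for r, a in zip(reference, actual))


# Stand-in for KV cache contents; a real test would compare attention outputs.
kv = [0.5, -1.25, 3.0, 7.75, -0.031]
for dtype, tol in TOLERANCES.items():
    err = max_relative_error(kv, simulate_kv_cache(kv, dtype))
    assert err <= tol, (dtype, err)
```

In a pytest suite the loop over `TOLERANCES` would typically become a `pytest.mark.parametrize` decorator on the test function, so each dtype is reported as its own test case.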
Test Plan
Tested tests/v1/attention/
Test Result