[Bugfix] Fix unstable silu_mul+nvfp4 quant fusion test #24370
Code Review
This pull request addresses an unstable silu_mul + nvfp4 quantization fusion test by improving the test data generation. Previously, random quantized tensors and scales were created independently, leading to potential accuracy issues and test flakiness. The new approach generates a floating-point tensor first and then quantizes it, ensuring consistency between the data and its scales. This is a solid improvement that makes the test more reliable. My review includes one high-severity suggestion to make a new utility function more robust by correctly handling tensors with negative values.
Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com>
    "model_class",
    cast(list[type], [TestSiluMulFp8QuantModel, TestSiluMulNvfp4QuantModel]
         if is_nvfp4_supported() else [TestSiluMulFp8QuantModel]))
To fix the mypy issue:
https://github.com/vllm-project/vllm/actions/runs/17517499414/job/49756911227?pr=24370
Error: tests/compile/test_silu_mul_quant_fusion.py:102: error: List item 0 has incompatible type "type[TestSiluMulFp8QuantModel]"; expected "type[object]" [list-item]
Error: tests/compile/test_silu_mul_quant_fusion.py:102: error: List item 1 has incompatible type "type[TestSiluMulNvfp4QuantModel]"; expected "type[object]" [list-item]
Error: tests/compile/test_silu_mul_quant_fusion.py:103: error: List item 0 has incompatible type "type[TestSiluMulFp8QuantModel]"; expected "type[object]" [list-item]
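The `cast` in the snippet above gives the conditional list an explicit element type so the type checker stops complaining. A minimal, self-contained sketch of the same pattern follows; `Fp8Model`, `Nvfp4Model`, and `nvfp4_supported` are hypothetical stand-ins for the names in the real test, not vLLM code.

```python
from typing import cast


class Fp8Model:
    """Hypothetical stand-in for TestSiluMulFp8QuantModel."""


class Nvfp4Model:
    """Hypothetical stand-in for TestSiluMulNvfp4QuantModel."""


def nvfp4_supported() -> bool:
    # Hypothetical capability check; the real test calls is_nvfp4_supported().
    return False


# cast() only informs the type checker. At runtime it returns its
# second argument unchanged, so the list contents are not affected.
MODEL_CLASSES = cast(
    list[type],
    [Fp8Model, Nvfp4Model] if nvfp4_supported() else [Fp8Model],
)
```

The resulting list can then be passed to `pytest.mark.parametrize` as in the diff.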
Purpose
After the silu_mul + quant fusion was enabled on CI, I observed that the fusion test with nvfp4 quant is unstable:
Passed:
https://buildkite.com/vllm/ci/builds/29637/steps/canvas?sid=01991d7c-263a-4e93-b62a-00096c5d7489
Failed:
https://buildkite.com/vllm/ci/builds/29627/steps/canvas?sid=01991ce7-509c-4c7d-9b45-a893b15243e7
The cause is that the unit test created the nvfp4 tensor and its scales randomly and independently of each other, which can break accuracy. This PR creates the initial nvfp4 tensor in a more consistent way, following the nvfp4 matmul test:
vllm/tests/kernels/quantization/test_nvfp4_scaled_mm.py
Lines 60 to 84 in 0eadaef
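To illustrate why quantizing a real floating-point tensor yields consistent data and scales, here is a simplified pure-Python sketch of block-scaled FP4 quantization. It is not the helper used in the vLLM tests: all names are hypothetical, scales are kept as plain Python floats (real NVFP4 stores per-block scales in FP8 E4M3 together with a global scale), and nothing is bit-packed. Note that the block maximum is taken over absolute values, so negative inputs are handled correctly.

```python
import random

# Magnitudes representable in FP4 E2M1 (the sign is a separate bit).
E2M1_MAGNITUDES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
E2M1_MAX = 6.0
BLOCK_SIZE = 16  # NVFP4 shares one scale across 16 consecutive elements


def quantize_block(block):
    """Return (codes, scale) such that code * scale approximates each value."""
    amax = max(abs(v) for v in block)  # abs() so negative values count too
    scale = amax / E2M1_MAX if amax > 0 else 1.0
    codes = []
    for v in block:
        target = abs(v) / scale
        # Round to the nearest representable magnitude.
        mag = min(E2M1_MAGNITUDES, key=lambda m: abs(m - target))
        codes.append(-mag if v < 0 else mag)
    return codes, scale


def quantize(values):
    """Quantize a flat list block by block."""
    blocks = [values[i:i + BLOCK_SIZE] for i in range(0, len(values), BLOCK_SIZE)]
    return [quantize_block(b) for b in blocks]


def dequantize(quantized):
    """Reconstruct approximate float values from (codes, scale) pairs."""
    return [c * s for codes, s in quantized for c in codes]


random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(64)]
q = quantize(x)
x_hat = dequantize(q)
max_err = max(abs(a - b) for a, b in zip(x, x_hat))
```

Because each scale is derived from the data it covers, the round-trip error per element is bounded by that block's scale (the largest gap between E2M1 magnitudes is 2, so rounding moves a value by at most 1 in code space). Codes and scales drawn independently at random carry no such guarantee.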
Test Plan && Test Result
tests/compile/test_silu_mul_quant_fusion.py::test_fusion_silu_and_mul_quant[False-TestSiluMulNvfp4QuantModel-128-64]
Run the test 500 times:
main:
PR: