[Bugfix] Fix Dynamo unexpected keyword argument #34320
vllm-bot merged 5 commits into vllm-project:main
Signed-off-by: Samu Tamminen <stammine@amd.com>
Code Review
This pull request addresses a TypeError that occurs with torch.compile on ROCm when the quant_fp8 custom operation is disabled. The error was caused by an unexpected use_triton keyword argument being passed through **kwargs. The fix involves changing the signatures of forward_cuda, forward_hip, and forward_native methods in the QuantFP8 class to explicitly include use_triton as a keyword argument. This change makes the API consistent across different implementations and resolves the issue with Dynamo tracing. The fix is correct, well-targeted, and improves code clarity.
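The signature mismatch described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the vLLM implementation: QuantBefore, QuantAfter, custom_op_enabled, call_disabled, and the method bodies are all hypothetical stand-ins, and only the method names forward_native and forward_hip and the use_triton argument come from the PR text. It shows why a keyword forwarded through **kwargs fails when the selected implementation does not declare it, and how a shared explicit signature avoids that.

```python
# Hypothetical stand-ins for illustration only; bodies are invented.


class QuantBefore:
    """Only one backend declares use_triton; forward relays **kwargs."""

    def __init__(self, custom_op_enabled: bool):
        self.custom_op_enabled = custom_op_enabled

    def forward_native(self, x, scale=None):
        # This path does not know about use_triton.
        return [v * (scale or 1.0) for v in x]

    def forward_hip(self, x, scale=None, use_triton=False):
        return [v * (scale or 1.0) for v in x]

    def forward(self, x, scale=None, **kwargs):
        # Pick the backend, then pass along whatever keywords arrived.
        impl = self.forward_hip if self.custom_op_enabled else self.forward_native
        return impl(x, scale, **kwargs)


class QuantAfter(QuantBefore):
    """Every backend declares use_triton explicitly."""

    def forward_native(self, x, scale=None, use_triton=False):
        # use_triton is accepted (and ignored) so the signatures match.
        return [v * (scale or 1.0) for v in x]

    def forward(self, x, scale=None, use_triton=False):
        impl = self.forward_hip if self.custom_op_enabled else self.forward_native
        return impl(x, scale, use_triton=use_triton)


def call_disabled(cls):
    """Call the op with the custom path disabled and report the outcome."""
    op = cls(custom_op_enabled=False)
    try:
        return op.forward([1.0, 2.0], scale=2.0, use_triton=True)
    except TypeError as exc:
        return f"TypeError: {exc}"


print(call_disabled(QuantBefore))  # a TypeError message naming use_triton
print(call_disabled(QuantAfter))
```

With matching signatures, the caller can pass use_triton regardless of which implementation is selected, which is the consistency the review summary refers to.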
yewentao256 left a comment
LGTM, thanks for the work!
yewentao256 left a comment
Could you take a look at the CI failure? It may be related.
Looking into it. Many of the CI tests fail with: Then
H100 is down, the rest are known failures on main.
Signed-off-by: Samu Tamminen <stammine@amd.com> Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Purpose
Fix QuantFP8 with torch.compile on ROCm when the CustomOp quant_fp8 is disabled with --compilation-config '{"custom_ops": ["-quant_fp8"]}'. The current main branch raises an error:
This was introduced in #33047.
Test Plan
Server
Test Result
After moving use_triton from kwargs to a positional argument, the Dynamo error disappears.