[Misc] Removed force_fp8_e4m3fnuz from FP8LinearOp#23725
[Misc] Removed force_fp8_e4m3fnuz from FP8LinearOp#23725ProExpertProg merged 8 commits intovllm-project:mainfrom
Conversation
Signed-off-by: Julien Lin <jullin@nvidia.com>
There was a problem hiding this comment.
Code Review
This pull request refactors the Fp8LinearOp to remove the force_fp8_e4m3fnuz parameter, simplifying its interface. The logic to force a specific backend for testing is now handled in the tests themselves using a new override_cutlass_fp8_supported context manager, which is a good improvement. However, I've found a potential issue in the test parametrization for cases where cutlass is not supported, which could leave a code path untested. My review includes suggestions to fix this.
ProExpertProg
left a comment
There was a problem hiding this comment.
LGTM, thanks for this follow up. Could you just add quick comments in the tests that we do this in order to test fusion for the non-cutlass path on cutlass platform?
Signed-off-by: Julien Lin <jullin@nvidia.com>
|
This pull request has merge conflicts that must be resolved before it can be |
|
@nvjullin it looks like the test failure is related, trying a fix |
Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Signed-off-by: Julien Lin <jullin@nvidia.com>
Head branch was pushed to by a user without write access
Signed-off-by: Julien Lin <jullin@nvidia.com> Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Signed-off-by: Julien Lin <jullin@nvidia.com> Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Signed-off-by: Julien Lin <jullin@nvidia.com> Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Purpose
Follow up on #22895.
Removed
force_fp8_e4m3fnuzand monkey-patch to test for torch code path on cutlass_fp8_supported platforms.Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
supported_models.mdandexamplesfor a new model.