[Bugfix][Hardware][AMD] Gate FP4 BMM on gfx950 to fix MI300X crash #35103
c0de128 wants to merge 1 commit into vllm-project:main from
Code Review
The pull request successfully addresses the reported crash on MI300X (gfx942) hardware by gating the FP4 Batched Matrix Multiply (BMM) feature to only be active on supported CDNA4 hardware (gfx950). By adding the on_gfx950() check within is_fp4bmm_enabled(), the system will now correctly fall back to FP8 on MI300X/MI325X instead of attempting to use unsupported MXFP4 operations. The implementation follows the repository's existing pattern of using local imports to handle circular dependencies between the operations and platform modules.
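To make the gating concrete, here is a minimal, self-contained sketch of the idea described above. It is not the code from this PR: `FakePlatform`, `select_bmm_dtype`, and the `aiter_enabled` / `fp4bmm_flag` parameters are hypothetical stand-ins, and the platform object is passed in explicitly rather than obtained through a local import as the real change reportedly does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FakePlatform:
    """Hypothetical stand-in for the platform module.

    gcn_arch is an architecture string such as "gfx942" or "gfx950".
    """

    gcn_arch: str

    def on_gfx950(self) -> bool:
        # True only on CDNA4 hardware (gfx950).
        return "gfx950" in self.gcn_arch


def is_fp4bmm_enabled(
    platform: FakePlatform,
    aiter_enabled: bool = True,  # assumed feature flag, for illustration
    fp4bmm_flag: bool = True,  # assumed feature flag, for illustration
) -> bool:
    # The feature flags alone are not sufficient: MXFP4 also needs gfx950,
    # so the hardware check is part of the same condition.
    return aiter_enabled and fp4bmm_flag and platform.on_gfx950()


def select_bmm_dtype(platform: FakePlatform) -> str:
    # Fall back to FP8 whenever FP4 BMM is not usable on this hardware.
    return "fp4" if is_fp4bmm_enabled(platform) else "fp8"
```

With this shape, a gfx942 device (MI300X/MI325X) never reaches the MXFP4 path even when both flags are set, which is the behavior the review describes.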
Force-pushed from 5694483 to bb15718.
@hongxiayang Could you take a look at this when you get a chance? This addresses the same issue as #34647 (gating FP4 BMM on gfx950 to fix MI300X crash) but with a minimal 3-line change in
Note: this is a minimal 3-line alternative to #34647, which has been inactive for 6 days. @hongxiayang you noted #34647 was "very verbose"; this PR gates FP4 BMM entirely within
MXFP4 quantization requires CDNA4 hardware (gfx950). Gate is_fp4bmm_enabled() on on_gfx950() so MI300X/MI325X (gfx942) gracefully falls back to FP8 instead of crashing.

Fixes vllm-project#34641

Signed-off-by: c0de128 <kevin.mckay@outlook.com>
Force-pushed from bb15718 to 7e601a6.
Rebased onto latest main to keep the branch current. For context: this is the minimal alternative to #34647 (which received CHANGES_REQUESTED). On the underlying issue (#34641), the consensus direction was to gate at the
The change is 3 lines: it adds an
Closing in favor of #35250, which includes this fix along with the same gfx950 gate for
Summary
- Gate is_fp4bmm_enabled() on on_gfx950() so MI300X/MI325X (gfx942) gracefully falls back to FP8 instead of crashing with RuntimeError: MXFP4 quantization is not supported on gfx942

Test plan
- is_fp4bmm_enabled() returns False, FP8 fallback used, no crash
- pre-commit run --all-files passes

Fixes #34641