Commit 9335e0e

use_workspace_output only supports w4a8_mxfp4_mxfp8 (gpt-oss) for now.
Signed-off-by: Bo Li <[email protected]>
1 parent 8248034 commit 9335e0e

1 file changed: +2 -1 lines changed

tensorrt_llm/_torch/modules/fused_moe/fused_moe_trtllm_gen.py

Lines changed: 2 additions & 1 deletion
@@ -503,7 +503,8 @@ def forward_impl(

         moe_output: Optional[torch.Tensor] = None
         use_workspace_output = False
-        if self.enable_alltoall and self.moe_alltoall_backend == "mnnvlthroughput":
+        # TODO: use_workspace_output only supports w4a8_mxfp4_mxfp8 (gpt-oss) for now
+        if self.enable_alltoall and self.moe_alltoall_backend == "mnnvlthroughput" and self.has_w4a8_mxfp4_mxfp8:
             moe_output = self.moe_a2a.get_combine_payload_tensor_in_workspace(
                 runtime_max_tokens_per_rank, self.hidden_size, torch.bfloat16)
             use_workspace_output = True
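
The effect of the added condition can be illustrated in isolation. The following is a minimal sketch, not code from the repository: the attribute names `enable_alltoall`, `moe_alltoall_backend` and `has_w4a8_mxfp4_mxfp8` are taken from the diff, while the `MoEFlagsSketch` class and the `should_use_workspace_output` helper are hypothetical names introduced here only to show which combinations of flags select the workspace output path after this commit.

```python
from dataclasses import dataclass


@dataclass
class MoEFlagsSketch:
    """Hypothetical stand-in for the module attributes the condition reads."""
    enable_alltoall: bool
    moe_alltoall_backend: str
    has_w4a8_mxfp4_mxfp8: bool


def should_use_workspace_output(flags: MoEFlagsSketch) -> bool:
    """Mirror the post-commit condition: all three checks must hold."""
    return bool(
        flags.enable_alltoall
        and flags.moe_alltoall_backend == "mnnvlthroughput"
        and flags.has_w4a8_mxfp4_mxfp8
    )


# Before the commit, the first two checks alone were sufficient; now a
# module without the w4a8_mxfp4_mxfp8 quantization mode keeps
# use_workspace_output = False and moe_output = None.
gpt_oss_like = MoEFlagsSketch(True, "mnnvlthroughput", True)
other_quant = MoEFlagsSketch(True, "mnnvlthroughput", False)
print(should_use_workspace_output(gpt_oss_like))
print(should_use_workspace_output(other_quant))
```

Under these assumptions, only the first configuration would take the branch that fetches the combine payload tensor from the workspace.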
