Merged
5 changes: 4 additions & 1 deletion — vllm_gaudi/ops/hpu_fused_moe.py

@@ -160,7 +160,10 @@ def forward_oot(
             permuted_weights=True,
             activation=layer.activation,
         )
-        return output.view(*(output.size(0), *input_shape[1:]))
+        if layer.dp_size > 1:
+            return output.view(*(output.size(0), *input_shape[1:]))
+        else:
Comment on lines +163 to +165
Copilot AI · Jan 21, 2026
The conditional logic based on dp_size > 1 appears to be a workaround for handling different input shapes rather than addressing the root cause. Consider explicitly checking the input tensor dimensionality (len(input_shape)) to make the intent clearer and more maintainable. This would better document why different reshaping strategies are needed and make the code less fragile if dp_size semantics change.

Suggested change
-        if layer.dp_size > 1:
-            return output.view(*(output.size(0), *input_shape[1:]))
-        else:
+        if len(input_shape) == 2:
+            # Handle 2D inputs where the leading dimension may have been
+            # modified (e.g. by data parallel dispatch); keep the trailing
+            # dimension(s) from the original shape and infer the leading one
+            # from the actual output tensor.
+            return output.view(output.size(0), *input_shape[1:])
+        else:
+            # For higher-rank inputs, restore the original shape directly.
+            return output.view(*input_shape)


     def reduce_output(self, states: torch.Tensor) -> torch.Tensor:
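The shape-restoration rule the review suggests can be illustrated without torch. Below is a minimal, hypothetical sketch (the helper name `restored_shape` and the example shapes are invented for illustration, not taken from the PR): for a 2D input, keep the trailing dimensions of the original shape but take the leading dimension from the actual output tensor, since data-parallel dispatch may have changed the number of rows; for higher-rank inputs, restore the original shape verbatim.

```python
def restored_shape(output_shape, input_shape):
    """Hypothetical helper mirroring the suggested reshape logic.

    output_shape: the shape of the tensor actually produced by the MoE op.
    input_shape:  the shape of the tensor the caller originally passed in.
    """
    if len(input_shape) == 2:
        # 2D case: the leading (token) dimension may differ from the
        # original input, so infer it from the output and keep only the
        # trailing dimensions of the original shape.
        return (output_shape[0], *input_shape[1:])
    # Higher-rank case: restore the original shape directly.
    return tuple(input_shape)

# DP dispatch padded the token dimension from 7 to 8 rows:
print(restored_shape((8, 128), (7, 128)))     # → (8, 128)
# 3D input whose layout was untouched by dispatch:
print(restored_shape((6, 128), (2, 3, 128)))  # → (2, 3, 128)
```

Keying the branch on `len(input_shape)` rather than `layer.dp_size` documents the actual invariant (which shapes need which reshape), so the code stays correct even if `dp_size` semantics change.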