Merged
3 changes: 2 additions & 1 deletion python/sglang/srt/models/deepseek_v2.py
@@ -233,7 +233,8 @@ def forward(self, hidden_states):
         )

         if (
-            hidden_states.shape[0] < 4
+            _is_cuda
+            and hidden_states.shape[0] < 4
             and hidden_states.shape[1] == 7168
             and self.weight.shape[0] == 256
             and _device_sm >= 90
Comment on lines 235 to 240
Contributor

medium

This change correctly restricts a CUDA-specific optimization path, preventing potential crashes on other hardware like AMD GPUs. To ensure long-term stability and prevent regressions, it would be beneficial to add a unit test for this logic. A test could simulate a non-CUDA environment (e.g., by mocking _is_cuda to False and _device_sm to a value >= 90) and verify that this specialized code path is not executed.
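
A rough sketch of what such a test might look like is below. It does not import sglang: `flags` and `takes_fast_path` are hypothetical stand-ins that mirror only the clauses visible in this hunk (the real condition may have further clauses), and the weight shape `(256, 7168)` is an assumed example. A real test would patch the actual module-level `_is_cuda` and `_device_sm` and check which branch `forward` takes.

```python
from types import SimpleNamespace
from unittest import mock

# Hypothetical stand-in for the module-level flags used in deepseek_v2.py.
flags = SimpleNamespace(_is_cuda=True, _device_sm=90)


def takes_fast_path(hidden_shape, weight_shape):
    """Mirror the guard clauses visible in the diff hunk (assumed complete here)."""
    return (
        flags._is_cuda
        and hidden_shape[0] < 4
        and hidden_shape[1] == 7168
        and weight_shape[0] == 256
        and flags._device_sm >= 90
    )


def test_fast_path_skipped_without_cuda():
    # Non-CUDA device that would otherwise satisfy every other clause.
    with mock.patch.object(flags, "_is_cuda", False), \
         mock.patch.object(flags, "_device_sm", 90):
        assert not takes_fast_path((2, 7168), (256, 7168))


def test_fast_path_taken_on_cuda_sm90():
    # Same shapes on a CUDA device with SM >= 90 should take the fast path.
    with mock.patch.object(flags, "_is_cuda", True), \
         mock.patch.object(flags, "_device_sm", 90):
        assert takes_fast_path((2, 7168), (256, 7168))
```

Patching the flags through `mock.patch.object` keeps the override scoped to each test, so the original values are restored afterwards.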
