add swiglu limits for shared experts activation #29
zyongye wants to merge 2 commits into ivanium:feat/dsv4-support from
Conversation
Signed-off-by: Yongye Zhu <zyy1102000@gmail.com>
Code Review
This pull request introduces clamping functionality to the silu_and_mul activation kernel and the SiluAndMul layer, primarily to support DeepSeek-V4. The changes include updates to the CUDA kernels to handle clamping logic, modifications to the C++ bindings, and the addition of a swiglu_limit parameter in the Python layer. A potential inconsistency was identified between the native PyTorch implementation and the CUDA implementation when the limit is set to zero, which could lead to divergent behavior.
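For context, a minimal sketch of how the clamped native path might fit together. The `SiluAndMul` name and the `swiglu_limit` clamping come from this PR's diff; the surrounding module structure (constructor, gate/up split) is an assumption for illustration, not the actual vLLM implementation:

```python
from typing import Optional

import torch
import torch.nn.functional as F


class SiluAndMul(torch.nn.Module):
    """Sketch: SiLU-and-multiply activation with optional clamping."""

    def __init__(self, swiglu_limit: Optional[float] = None):
        super().__init__()
        self.swiglu_limit = swiglu_limit

    def forward_native(self, x: torch.Tensor) -> torch.Tensor:
        # The fused projection holds the gate and up halves side by side.
        gate, up = x.chunk(2, dim=-1)
        if self.swiglu_limit is not None:
            # Clamp gate from above only; clamp up symmetrically.
            gate = torch.clamp(gate, max=self.swiglu_limit)
            up = torch.clamp(up, min=-self.swiglu_limit, max=self.swiglu_limit)
        return F.silu(gate) * up
```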
```python
if self.swiglu_limit is not None:
    gate = torch.clamp(gate, max=self.swiglu_limit)
    up = torch.clamp(up, min=-self.swiglu_limit, max=self.swiglu_limit)
```
There is a potential inconsistency between the native and CUDA implementations when swiglu_limit is set to 0.0. In forward_native, any value of swiglu_limit that is not None (including 0.0) will trigger clamping, effectively zeroing out the output. However, in forward_cuda (and the underlying CUDA kernel), clamping is only enabled if limit > 0.0. While swiglu_limit is typically a positive value, it's safer to align the logic to avoid discrepancies.
Suggested change:

```diff
-if self.swiglu_limit is not None:
-    gate = torch.clamp(gate, max=self.swiglu_limit)
-    up = torch.clamp(up, min=-self.swiglu_limit, max=self.swiglu_limit)
+if self.swiglu_limit is not None and self.swiglu_limit > 0:
+    gate = torch.clamp(gate, max=self.swiglu_limit)
+    up = torch.clamp(up, min=-self.swiglu_limit, max=self.swiglu_limit)
```
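To make the divergence concrete, a minimal repro sketch under the assumptions above: `forward_cuda_like` is a hypothetical stand-in that mirrors the kernel's `limit > 0.0` gating, not the real CUDA path. With `swiglu_limit=0.0`, the native branch clamps `up` into `[0, 0]` and zeroes the output, while the gated branch skips clamping entirely:

```python
import torch
import torch.nn.functional as F


def forward_native(x, swiglu_limit):
    gate, up = x.chunk(2, dim=-1)
    if swiglu_limit is not None:  # triggers even for 0.0
        gate = torch.clamp(gate, max=swiglu_limit)
        up = torch.clamp(up, min=-swiglu_limit, max=swiglu_limit)
    return F.silu(gate) * up


def forward_cuda_like(x, swiglu_limit):
    # Mirrors the kernel's gating: clamping only when limit > 0.0.
    gate, up = x.chunk(2, dim=-1)
    if swiglu_limit is not None and swiglu_limit > 0:
        gate = torch.clamp(gate, max=swiglu_limit)
        up = torch.clamp(up, min=-swiglu_limit, max=swiglu_limit)
    return F.silu(gate) * up


x = torch.randn(4, 8)
print(forward_native(x, 0.0).abs().max())     # tensor(0.) -- up clamped to zero
print(forward_cuda_like(x, 0.0).abs().max())  # nonzero -- clamping skipped
```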
Signed-off-by: Yongye Zhu <zyy1102000@gmail.com>
merged in vllm-project#40950
No description provided.