[Bugfix] bugfix for moe_mlp in vllm-ascend/v0.11.0-dev#4885
wangxiyuan merged 6 commits into vllm-project:v0.11.0-dev from
Conversation
Signed-off-by: tanqingshan (A) <50050625@china.huawei.com>
Code Review
This pull request refactors the cumsum_group_list function to support bidirectional conversions between cumulative sum and count representations of expert groups. While this is a good improvement for code clarity, the refactoring has introduced a critical bug in the conversion from cumulative sum to counts. I've provided a code suggestion to fix this. Additionally, I've pointed out that the test suite is missing coverage for this new conversion path, which is why the bug was not caught. Please add the suggested test case to prevent future regressions.
    if src_list_type == 0 and dst_list_type == 1:
        group_diff = torch.diff(group_list)
        new_group = torch.cat([group_diff[0].unsqueeze(0), group_diff], dim=0)
        return new_group
There is a bug in the logic for converting group_list from cumulative sum (src_list_type=0) to counts (dst_list_type=1). The implementation incorrectly uses group_diff[0] instead of group_list[0] to construct the new group tensor. This will lead to incorrect counts and subsequent errors in npu_dequant_swiglu_quant.
The previous implementation before this refactoring was correct. You should use group_list[0] to get the first element of the cumulative sum, which corresponds to the first count.
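To make the off-by-one concrete, here is a plain-Python sketch of the cumulative-sum-to-counts conversion the reviewer describes (hypothetical data, no torch dependency): the first count is the first cumulative value itself, and each later count is the difference between adjacent cumulative values.

```python
# Plain-Python illustration of converting a cumulative-sum group list to
# per-group counts; the data below is made up for demonstration.
def cumsum_to_counts(cumsum):
    # The first count equals the first cumulative value; later counts are
    # differences of adjacent cumulative values (what torch.diff computes).
    return [cumsum[0]] + [b - a for a, b in zip(cumsum, cumsum[1:])]

counts = [3, 1, 4, 2]            # tokens routed to each expert group
cumsum = [3, 4, 8, 10]           # running total of the counts above
print(cumsum_to_counts(cumsum))  # [3, 1, 4, 2]
```

With the buggy variant, the first element would be the first difference (1) rather than the first cumulative value (3), yielding [1, 1, 4, 2] for this data.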
Current:

    if src_list_type == 0 and dst_list_type == 1:
        group_diff = torch.diff(group_list)
        new_group = torch.cat([group_diff[0].unsqueeze(0), group_diff], dim=0)
        return new_group

Suggested:

    if src_list_type == 0 and dst_list_type == 1:
        group_diff = torch.diff(group_list)
        new_group = torch.cat([group_list[0].unsqueeze(0), group_diff], dim=0)
        return new_group
    group_list_type = 0
    result = cumsum_group_list(group_list, group_list_type)     # before
    result = cumsum_group_list(group_list, group_list_type, 0)  # after
    self.assertTrue(torch.equal(result, self.group_list))
The test suite for cumsum_group_list is incomplete. It's missing a test case for converting from src_list_type=0 (cumulative sum) to dst_list_type=1 (counts). Adding this test case would have caught the critical bug introduced in moe_mlp.py.
Please add a test to cover this conversion. For example:
def test_cumsum_group_list_from_type_0_to_1(self):
group_list_cumsum = self.experts.cumsum(dim=0)
result = cumsum_group_list(group_list_cumsum, src_list_type=0, dst_list_type=1)
    self.assertTrue(torch.equal(result, self.experts))
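A lightweight way to guard both conversion directions is a round-trip property check. The sketch below is plain Python with hypothetical helper names, independent of the vllm-ascend test harness, asserting that counts -> cumulative sum -> counts is the identity.

```python
import itertools

def counts_to_cumsum(counts):
    # counts -> cumulative sum (the dst_list_type=0 direction)
    return list(itertools.accumulate(counts))

def cumsum_to_counts(cumsum):
    # cumulative sum -> counts (the dst_list_type=1 direction)
    return [cumsum[0]] + [b - a for a, b in zip(cumsum, cumsum[1:])]

counts = [2, 5, 1, 7]
assert cumsum_to_counts(counts_to_cumsum(counts)) == counts
print("round trip ok")
```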
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to Contributing and Testing.
Signed-off-by: tanqingshan (A) <50050625@china.huawei.com>
Force-pushed from 56bf855 to ac2a72c
picked from #4822
What this PR does / why we need it?
This PR fixes a bug in the moe_mlp module by correcting the arguments passed to the torch_npu.npu_dequant_swiglu_quant function. It properly converts group_list from a cumulative sum to counts for the group_index parameter.
Does this PR introduce any user-facing change?
No