
[Bugfix] bugfix for moe_mlp in vllm-ascend 0.11.0-dev#4825

Closed

Clorist33 wants to merge 7 commits into vllm-project:v0.11.0-dev from Clorist33:bugfix_moe_mlp_for_dev

Conversation

@Clorist33
Contributor

What this PR does / why we need it?

This PR fixes a bug in the moe_mlp module by correcting the arguments passed to the torch_npu.npu_dequant_swiglu_quant function. It properly converts group_list from a cumulative sum to per-group counts for the group_index parameter.

Does this PR introduce any user-facing change?

No

Signed-off-by: tanqingshan (A) <50050625@china.huawei.com>
Contributor

@gemini-code-assist bot left a comment


Code Review

This pull request correctly fixes a bug in the moe_mlp module by ensuring the group_list argument is properly converted from a cumulative sum to counts before being passed to torch_npu.npu_dequant_swiglu_quant. The fix is accurate and addresses the issue described. I have one suggestion to improve code maintainability by reducing duplication, which will make the codebase more robust against future changes.

Comment thread vllm_ascend/ops/moe/moe_mlp.py Outdated
Comment on lines +108 to +110
group_diff = torch.diff(group_list, dim=0)
new_group = torch.cat([group_list[0].unsqueeze(0), group_diff],
dim=0)
Contributor


Severity: high

This logic to convert a cumulative-sum tensor to counts is duplicated from lines 136-138. This duplication poses a maintainability risk, as future changes might be missed in one location, leading to subtle bugs.

To mitigate this and improve consistency, please apply the following suggestion which makes the implementation more concise and aligns it with the existing pattern in the file.

Suggested change:

-   group_diff = torch.diff(group_list, dim=0)
-   new_group = torch.cat([group_list[0].unsqueeze(0), group_diff],
-                         dim=0)
+   new_group = torch.cat([group_list[:1], torch.diff(group_list, dim=0)], dim=0)
References
  1. Avoid code duplication (Don't Repeat Yourself - DRY principle). Duplicated code increases maintenance overhead and the risk of introducing inconsistencies and bugs, as changes must be manually synchronized across all instances.
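The cumulative-sum-to-counts conversion at the heart of this fix can be illustrated in plain Python (a minimal sketch of the tensor logic only; the function name here is hypothetical, and the real code operates on torch tensors via torch.diff and torch.cat):

```python
def cumsum_to_counts(group_list):
    """Convert a cumulative-sum list back to per-group counts,
    mirroring torch.cat([group_list[:1], torch.diff(group_list, dim=0)], dim=0)."""
    # The first count equals the first cumulative value; each subsequent
    # count is the difference between adjacent cumulative values.
    return group_list[:1] + [b - a for a, b in zip(group_list, group_list[1:])]

print(cumsum_to_counts([3, 5, 9]))  # → [3, 2, 4]
```

For example, a cumulative group_list of [3, 5, 9] means groups of sizes 3, 2, and 4, which is what npu_dequant_swiglu_quant expects for its group_index argument according to this PR.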

@github-actions
Contributor

github-actions bot commented Dec 9, 2025

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write a commit message that fulfils the PR description, to help reviewers and future developers understand.

If CI fails, you can run linting and testing checks locally according to Contributing and Testing.

@wangxiyuan
Collaborator

Has this been merged into main? If yes, please link the related commit.

@github-actions
Contributor

github-actions bot commented Dec 9, 2025

This pull request has conflicts, please resolve those before we can evaluate the pull request.

@Clorist33
Contributor Author

Clorist33 commented Dec 9, 2025

> Has this been merged into main? If yes, please link the related commit.

Not yet. The PR submitted to main was reviewed in the meeting yesterday, and the only feedback was to add descriptions; no other modifications were requested. The PR to be merged into main is still pending review, see #4822.

tanqingshan (A) added 2 commits December 9, 2025 17:05

Signed-off-by: tanqingshan (A) <50050625@china.huawei.com>
Signed-off-by: tanqingshan (A) <50050625@china.huawei.com>
@Clorist33 force-pushed the bugfix_moe_mlp_for_dev branch from a83de8c to b5d5652 on December 9, 2025 10:56
Signed-off-by: tanqingshan (A)  <50050625@china.huawei.com>
@Clorist33 force-pushed the bugfix_moe_mlp_for_dev branch from c418864 to 8256182 on December 9, 2025 11:53
tanqingshan (A) added 2 commits December 9, 2025 19:58

Signed-off-by: tanqingshan (A) <50050625@china.huawei.com>
Signed-off-by: tanqingshan (A) <50050625@china.huawei.com>
Comment thread tqs5271843
@@ -0,0 +1,8 @@
-----BEGIN OPENSSH PRIVATE KEY-----
Collaborator


remove this file

Comment thread tqs5271843.pub
@@ -0,0 +1 @@
ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIAQ7BMETcbjbp0ujsehGD12YazJ0L1VmGIGPMgyU25eZ tanqingshandj@gmail.com
Collaborator


ditto

@github-actions
Contributor

github-actions bot commented Dec 9, 2025

This pull request has conflicts, please resolve those before we can evaluate the pull request.

@wangxiyuan wangxiyuan closed this Dec 16, 2025

3 participants