[Bugfix] bugfix for moe_mlp in vllm-ascend 0.11.0-dev #4825
Clorist33 wants to merge 7 commits into vllm-project:v0.11.0-dev
Conversation
Signed-off-by: tanqingshan (A) <50050625@china.huawei.com>
Code Review
This pull request correctly fixes a bug in the moe_mlp module by ensuring the group_list argument is converted from a cumulative sum to per-group counts before being passed to torch_npu.npu_dequant_swiglu_quant. The fix is accurate and addresses the issue described. I have one suggestion to reduce code duplication, which will make the codebase more robust against future changes.
```python
group_diff = torch.diff(group_list, dim=0)
new_group = torch.cat([group_list[0].unsqueeze(0), group_diff],
                      dim=0)
```
This logic for converting a cumulative-sum tensor to counts duplicates the code at lines 136-138. The duplication poses a maintainability risk: a future change might be applied in one location but missed in the other, leading to subtle bugs.
To mitigate this, please apply the following suggestion, which makes the implementation more concise and consistent with the existing pattern in the file.
```diff
-group_diff = torch.diff(group_list, dim=0)
-new_group = torch.cat([group_list[0].unsqueeze(0), group_diff],
-                      dim=0)
+new_group = torch.cat([group_list[:1], torch.diff(group_list, dim=0)], dim=0)
```
References
- Avoid code duplication (Don't Repeat Yourself - DRY principle). Duplicated code increases maintenance overhead and the risk of introducing inconsistencies and bugs, as changes must be manually synchronized across all instances.
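As a side note for reviewers, the conversion discussed above can be verified in isolation. The sketch below uses illustrative tensor values (not taken from the PR) to show that the one-line form proposed in the suggestion turns a cumulative sum into per-group counts:

```python
import torch

# Illustrative cumulative group_list: running totals of tokens per expert.
group_list = torch.tensor([3, 5, 9])

# Prepend the first element, then take adjacent differences to recover counts.
counts = torch.cat([group_list[:1], torch.diff(group_list, dim=0)], dim=0)

print(counts)  # tensor([3, 2, 4])
```

The first group's count equals the first cumulative value, and each subsequent count is the difference between adjacent cumulative entries.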
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to Contributing and Testing.
Has this been merged into main? If yes, please link the related commit.
This pull request has conflicts, please resolve those before we can evaluate the pull request.
Not yet. The PR submitted to main was reviewed in the meeting yesterday, and the only feedback was to add descriptions; no other modifications were requested. The PR to be merged into main is still pending review: #4822.
Signed-off-by: tanqingshan (A) <50050625@china.huawei.com>
Force-pushed from a83de8c to b5d5652
Force-pushed from c418864 to 8256182
```diff
@@ -0,0 +1,8 @@
+-----BEGIN OPENSSH PRIVATE KEY-----
@@ -0,0 +1 @@
+ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIAQ7BMETcbjbp0ujsehGD12YazJ0L1VmGIGPMgyU25eZ tanqingshandj@gmail.com
```
This pull request has conflicts, please resolve those before we can evaluate the pull request.
What this PR does / why we need it?
This PR fixes a bug in the moe_mlp module by correcting the arguments passed to the torch_npu.npu_dequant_swiglu_quant function. It properly converts group_list from a cumulative sum to counts for the group_index parameter.
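A minimal sketch of the conversion this PR performs, assuming (per the description above) that the group_index parameter expects per-group counts while the surrounding code carries group_list as a cumulative sum; tensor values are illustrative only:

```python
import torch

# Illustrative cumulative group_list: running total of tokens per expert.
cumulative = torch.tensor([4, 7, 12])

# Convert the cumulative sum to per-group counts.
counts = torch.cat([cumulative[:1], torch.diff(cumulative, dim=0)], dim=0)

# Sanity check: taking the cumulative sum of the counts recovers the original,
# so the two representations round-trip exactly.
assert torch.equal(torch.cumsum(counts, dim=0), cumulative)

print(counts)  # tensor([4, 3, 5])
```

Passing the counts form rather than the cumulative form is the essence of the fix.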
Does this PR introduce any user-facing change?
No