[Bugfix] bugfix for moe_mlp by Clorist33 · Pull Request #4822 · vllm-project/vllm-ascend

Clorist33 · 2025-12-09T04:46:43Z

What this PR does / why we need it?

This PR fixes a bug in the moe_mlp module by correcting the arguments passed to the torch_npu.npu_dequant_swiglu_quant function. It converts group_list from a cumulative sum to per-group counts before passing it as the group_index parameter.

Does this PR introduce any user-facing change?

No

vLLM version: v0.12.0
vLLM main: vllm-project/vllm@ad32e3e

gemini-code-assist

Code Review

This pull request addresses a bug in the moe_mlp module by correctly converting the group_list argument from a cumulative sum to counts before passing it to torch_npu.npu_dequant_swiglu_quant. The logic for this conversion is sound. However, I've identified a potential edge case in the new code: it doesn't handle an empty group_list, which could lead to a runtime error. I have provided a suggestion to make the implementation more robust against this scenario.

gemini-code-assist · 2025-12-09T04:48:13Z

vllm_ascend/ops/fused_moe/moe_mlp.py

+            new_group = torch.cat([group_list[0].unsqueeze(0), group_diff],
+                                  dim=0)


Using group_list[0].unsqueeze(0) will raise an IndexError if group_list is an empty tensor. This can occur if no tokens are routed to experts on the current device. Using slicing group_list[:1] is more robust as it returns an empty tensor for an empty input, preventing a crash.

new_group = torch.cat([group_list[:1], group_diff], dim=0)
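The edge case can be reproduced without NPU hardware, because plain Python lists follow the same rule as 1-D tensors here: indexing an empty sequence raises IndexError, while slicing returns an empty sequence. The sketch below is a list-based stand-in for the conversion, not the code in moe_mlp.py, and the helper name cumsum_to_count is hypothetical.

```python
def cumsum_to_count(cumsum):
    """List-based stand-in for torch.cat([group_list[:1], torch.diff(group_list)])."""
    # cumsum[:1] is [] for an empty input; cumsum[0] would raise IndexError instead.
    head = cumsum[:1]
    # Adjacent differences recover how many tokens each later expert received.
    diffs = [curr - prev for prev, curr in zip(cumsum, cumsum[1:])]
    return head + diffs


print(cumsum_to_count([0, 2, 3, 3]))  # [0, 2, 1, 0]
print(cumsum_to_count([]))            # []
```

The first element passes through unchanged because the first cumulative value is also the first expert's count.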

github-actions · 2025-12-09T05:26:46Z

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:‌‌

A PR should do only one thing; smaller PRs enable faster reviews.
Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
Write the commit message by filling in the PR description to help reviewers and future developers understand the change.

If CI fails, you can run linting and testing checks locally according to Contributing and Testing.

zhoux77899 · 2025-12-09T06:42:56Z

vllm_ascend/ops/fused_moe/moe_mlp.py

+            group_diff = torch.diff(group_list, dim=0)
+            new_group = torch.cat([group_list[0].unsqueeze(0), group_diff],
+                                  dim=0)


It’s better to first check if there are cases where group_list_type != 1. You can extract a function to convert other types of group_list into count format, taking group_list and group_list_type as parameters, and determine how to perform the conversion within the function (refer to the cumsum_group_list function).

def count_group_list(group_list: torch.Tensor,
                     group_list_type: int) -> torch.Tensor:
    if group_list_type not in [0, 1, 2]:
        raise ValueError(
            f"group_list_type should be in [0, 1, 2], but received {group_list_type}"
        )
    if group_list_type == 0:
        # cumsum -> count
        return torch.cat((group_list[:1], torch.diff(group_list)))
    if group_list_type == 1:
        # already in count format
        return group_list
    # group_list_type == 2
    ...

Thanks for your suggestions. We have incorporated your proposal into the cumsum_group_list function. Additionally, could you please clarify what scenario corresponds to group_list_type == 2?

The changes are here:

import torch
from torch.nn.functional import pad


def cumsum_group_list(group_list: torch.Tensor,
                      src_list_type: int,
                      dst_list_type: int,
                      active_num: int = 0,
                      expert_num: int = 0) -> torch.Tensor:
    if src_list_type not in [0, 1, 2]:
        raise ValueError(
            f"group_list_type should be in [0, 1, 2], but received {src_list_type}"
        )

    if src_list_type == dst_list_type:
        return group_list
    if src_list_type == 1 and dst_list_type == 0:
        # count -> cumsum
        return group_list.cumsum(dim=0)
    if src_list_type == 0 and dst_list_type == 1:
        # cumsum -> count: the first count equals the first cumulative value
        group_diff = torch.diff(group_list)
        new_group = torch.cat([group_list[:1], group_diff], dim=0)
        return new_group

    # Remaining case, intended for src_list_type == 2 (key_value) -> cumsum
    experts = pad(group_list[:, 0], (1, 0))
    tokens = pad(group_list[:, 1].cumsum(dim=0), (1, 0))
    cumsum_group_list = torch.full(size=(expert_num, ),
                                   fill_value=active_num,
                                   dtype=group_list.dtype,
                                   device=group_list.device)

    for i, (start, end) in enumerate(zip(experts[:-1], experts[1:])):
        if end > start:
            cumsum_group_list[start:end] = tokens[i]

    return cumsum_group_list
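To make the final branch easier to follow, here is a list-based trace of the same loop. It assumes the key_value layout described later in this thread (rows of [expert_index, token_count], padded with [0, 0]) and assumes active_num is the total number of routed tokens. The function name key_value_to_cumsum is hypothetical, and plain lists stand in for tensors.

```python
def key_value_to_cumsum(kv, active_num, expert_num):
    """List-based trace of the key_value -> cumsum branch (illustrative only)."""
    # Left-pad the expert indices and the running token totals with 0,
    # mirroring pad(..., (1, 0)) in the tensor version.
    experts = [0] + [row[0] for row in kv]
    tokens = [0]
    for _, count in kv:
        tokens.append(tokens[-1] + count)

    # Every slot starts at active_num; slots from the last active expert
    # onward are never overwritten and keep that value.
    cumsum = [active_num] * expert_num
    for i, (start, end) in enumerate(zip(experts[:-1], experts[1:])):
        if end > start:
            # Experts in [start, end) share the running total accumulated
            # before key_value row i.
            for j in range(start, end):
                cumsum[j] = tokens[i]
    return cumsum


kv = [[1, 2], [2, 1], [0, 0], [0, 0]]
print(key_value_to_cumsum(kv, active_num=3, expert_num=4))  # [0, 2, 3, 3]
```

The padded [0, 0] rows never satisfy end > start, so they leave the result untouched.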

zhoux77899 · 2025-12-09T06:43:57Z

vllm_ascend/ops/fused_moe/moe_mlp.py

                quant_scale=None,
                quant_offset=None,
-                group_index=group_list,
+                group_index=new_group,


And use the function here.

Suggested change

group_index=new_group,

group_index=count_group_list(group_list, group_list_type)

zzzzwwjj · 2025-12-09T08:51:04Z

vllm_ascend/ops/fused_moe/moe_mlp.py

                      active_num: int = 0,
                      expert_num: int = 0) -> torch.Tensor:
-    if group_list_type not in [0, 1, 2]:
+    if src_list_type not in [0, 1, 2]:


What does src_list_type == 2 mean? How should the case src_list_type == 2 with dst_list_type == 0 be handled?

We have the same question about the src_list_type == 2 scenario in ops/fused_moe/moe_mlp.py on the main branch of the vllm-ascend repository. @zhoux77899 Could you please clarify this point?

Some ops like moe_init_routing_v2 can output the type 2 (key_value) group_list, but I’ve never seen this type of group_list actually used anywhere.

If there is a type 1 group_list like [0, 2, 1, 0]:

group_list_type = 0 means cumsum group_list, it will be [0, 2, 3, 3];

group_list_type = 1 means count group_list, it will be [0, 2, 1, 0];

group_list_type = 2 means key_value group_list, it will be [[1, 2], [2, 1], [0, 0], [0, 0]];

So group_list_type == 2 means a key_value group_list; would it be [[0, 0], [1, 2], [2, 1], [3, 0]]?

Apologies, when group_list_type = 2, group_list will be [[1, 2], [2, 1], [0, 0], [0, 0]]. It only contains tokens of active_num but is padded to expert_num. Maybe you can compare their differences using the script below.

import torch
import torch_npu

from vllm_ascend.ops.fused_moe.moe_mlp import cumsum_group_list


class GroupListTypeTester:

    def __init__(
        self,
        batch_size: int = 1,
        hidden_size: int = 768,
        active_experts: int = 2,
        num_experts: int = 4,
    ) -> None:
        self.batch_size = batch_size
        self.hidden_size = hidden_size
        self.active_experts = active_experts
        self.num_experts = num_experts
        self.x = torch.randn(size=(self.batch_size, self.hidden_size),
                             dtype=torch.bfloat16).npu()
        self.expert_idx = torch.randint(low=0,
                                        high=self.num_experts,
                                        size=(self.batch_size,
                                              self.active_experts),
                                        dtype=torch.int32).npu()
        self.scale = torch.randn(size=(self.batch_size, ),
                                 dtype=torch.float32).npu()
        self.offset = None
        self.init_routing_kwargs = {
            "x": self.x,
            "expert_idx": self.expert_idx,
            "scale": self.scale,
            "offset": self.offset,
            "active_num": self.active_experts,
            "expert_num": self.num_experts,
            "expert_tokens_num_flag": True,
            "quant_mode": -1,
            "active_expert_range": [0, self.num_experts],
            "row_idx_type": 0,
        }

    def __call__(self) -> None:
        count_group_list = self.output_count_group_list()
        kv_group_list = self.output_kv_group_list()
        print(f"{count_group_list=}, cumsum_group_list_from_count={cumsum_group_list(count_group_list, 1)}")
        print(f"{kv_group_list=}, cumsum_group_list_from_kv={cumsum_group_list(kv_group_list, 2, self.active_experts, self.num_experts)}")

    def output_count_group_list(self) -> torch.Tensor:
        _, _, group_list, _ = torch_npu.npu_moe_init_routing_v2(
            **self.init_routing_kwargs,
            expert_tokens_num_type=1,
        )
        return group_list

    def output_kv_group_list(self) -> torch.Tensor:
        _, _, group_list, _ = torch_npu.npu_moe_init_routing_v2(
            **self.init_routing_kwargs,
            expert_tokens_num_type=2,
        )
        return group_list


if __name__ == "__main__":
    tester = GroupListTypeTester()
    tester()
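The three group_list layouts described above can also be reproduced without NPU hardware. The following is a minimal sketch with plain Python lists, assuming the key_value layout is exactly what the single example in this thread shows (one [expert_index, token_count] row per active expert, padded with [0, 0] rows up to the expert count). The real operator output may differ in details, and both function names are hypothetical.

```python
from itertools import accumulate


def count_to_cumsum(counts):
    """Type 1 (count) -> type 0 (cumsum): running total of tokens per expert."""
    return list(accumulate(counts))


def count_to_key_value(counts):
    """Type 1 (count) -> type 2 (key_value), following the example in this thread."""
    # Keep [expert_index, token_count] only for experts that received tokens.
    active = [[idx, n] for idx, n in enumerate(counts) if n > 0]
    # Pad with [0, 0] rows so the result has one row per expert.
    padding = [[0, 0] for _ in range(len(counts) - len(active))]
    return active + padding


counts = [0, 2, 1, 0]
print(count_to_cumsum(counts))     # [0, 2, 3, 3]
print(count_to_key_value(counts))  # [[1, 2], [2, 1], [0, 0], [0, 0]]
```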

I think consolidating all types of group_list computations into a single function might be overly complex, as it requires handling 6 different cases, and many call sites would also need modifications. Splitting them into separate functions would be more reasonable.

You may also consider whether to include the group_list_type = 2 scenario at all. cumsum_group_list handles it as a safety net, but I'm unsure which use case it was designed for, and I have never seen it actually used anywhere.

github-actions · 2025-12-10T03:05:25Z

This pull request has conflicts, please resolve those before we can evaluate the pull request.

weijinqian0 · 2025-12-12T02:30:13Z

vllm_ascend/ops/fused_moe/moe_mlp.py

                    weight_scale=w1_scale,
                    x_scale=pertoken_scale,
-                    group_list=cumsum_group_list(group_list, group_list_type),
+                    group_list=group_list,


Need to check the range of group_list supported by the corresponding operator.

Confirmed. Code updated.

wangxiyuan · 2025-12-12T06:51:37Z

This needs to be merged to the dev branch, so I am merging it now.

@zzzzwwjj

I'd like to nominate @zzzzwwjj @realliujiaxu @LCAIZJ to join the vLLM Ascend committer team.

@zzzzwwjj
---
- Review Quality: He has completed 80+ reviews since April 2025, including high-quality reviews such as #3232 (comment), #4822 (comment), and #4768 (comment).
- Sustained Contributions: 15+ valuable bug fixes and refactors: https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+author%3Azzzzwwjj+is%3Aclosed+review%3Aapproved Continuous optimization of code architecture: https://github.com/vllm-project/vllm-ascend/pulls?q=author%3Azzzzwwjj+is%3Amerged
- Quality Contribution: #1229 #1979 #4359 #4878
- Community Involvement: He led #1147, the first refactor of AscendFusedMoE. He shared topics about large-scale distributed inference and reinforcement learning at the vLLM-Ascend meetup on August 2nd.

@realliujiaxu
---
- Review Quality: He has completed about [40+ reviews](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+commenter%3Arealliujiaxu+-author%3Arealliujiaxu+) since September, including #4868 (comment) and #2275 (comment).
- Sustained Contributions: He has completed [17 commits](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+author%3Arealliujiaxu+is%3Amerged), continuously optimizing the performance of the MoE model.
- Quality Contribution: Contributed the Flash Comm1 feature to the community, supporting both eager and aclgraph execution modes and compatible with multiple MoE models including DeepSeek and GLM4.5.
  - #3334
  - #3420
  - #3015
  co-author:
  - #3495
  - #4868
- Community Involvement:
  1. Completed two major refactors, enabling vllm-ascend to evolve more rapidly and robustly: [Linear module](#2867) and [rejection sampler](#4975)
  2. [Fixed 8 bugs](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+author%3Arealliujiaxu+is%3Amerged+bugfix+) in graph mode, spec decoding and async scheduling.

@LCAIZJ
---
- Review Quality: He's been the go-to reviewer for virtually all PD disaggregation and KV Pool related PRs, having completed [30+ reviews](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+commenter%3ALCAIZJ+is%3Aopen+-author%3ALCAIZJ+) since May 2025. Notable examples include [discussion_r2553887360](#4345 (comment)), [issuecomment-3540994801](#4161 (comment)), and [discussion_r2492593988](#3981 (comment)), all demonstrating thorough and insightful feedback.
- Sustained and Quality Contributions: His contributions reflect a strong grasp of both vLLM and vLLM Ascend codebases, particularly in prefill-decode disaggregation and KV pool areas ([7 PRs merged](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+author%3ALCAIZJ+is%3Amerged+)).
  - Prefill-Decode Disaggregation: Delivered KV transfer functionality using Mooncake TransferEngine and enabled layerwise KV transfer #1568 #2602
  - KV Pool: Developed the foundational KV Pool infrastructure and migrated it to the latest ADXL stack #2913 #3350
- Quality Contribution: #1568 #2602 #2913 #3350
- Community Involvement: He actively responds to [community issues](https://github.com/vllm-project/vllm-ascend/issues?q=is%3Aissue%20commenter%3ALCAIZJ%20is%3Aopen%20-author%3ALCAIZJ), continuously monitors functionality and accuracy issues related to PD disaggregation and KV Pool, and proactively delivers [bug fixes](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+author%3ALCAIZJ+is%3Amerged+bugfix).

- vLLM version: v0.12.0
- vLLM main: vllm-project/vllm@ad32e3e

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>

gemini-code-assist bot reviewed Dec 9, 2025

github-actions bot added the module:ops label Dec 9, 2025

zhoux77899 reviewed Dec 9, 2025

Clorist33 mentioned this pull request Dec 9, 2025

[Bugfix] bugfix for moe_mlp in vllm-ascend 0.11.0-dev #4825

Closed

weijinqian0 approved these changes Dec 9, 2025

zzzzwwjj requested changes Dec 9, 2025

Clorist33 force-pushed the bugfix_moe_mlp_new branch from fb6a23c to d4b6cc6 Compare December 9, 2025 09:16

github-actions bot added the module:tests label Dec 9, 2025

github-actions bot added the merge-conflicts label Dec 10, 2025

reset: re-submit all modifications after sync vllm-ascend/main

db29fda

Signed-off-by: tanqingshan (A) <50050625@china.huawei.com>

Clorist33 force-pushed the bugfix_moe_mlp_new branch from 760351f to db29fda Compare December 10, 2025 03:51

github-actions bot removed the merge-conflicts label Dec 10, 2025

Clorist33 requested a review from weijinqian0 December 10, 2025 07:16

tanqingshan (A) added 3 commits December 11, 2025 15:18

Use group_list = group_list to replace group_list = cumsum_group_list.

718d94b

Signed-off-by: tanqingshan (A) <50050625@china.huawei.com>

Use group_list = group_list to replace group_list = cumsum_group_list

ce478b3

Signed-off-by: tanqingshan (A) <50050625@china.huawei.com>

Merge branch 'bugfix_moe_mlp_new' of https://github.com/Clorist33/vll…

6ad93e9

…m-ascend into bugfix_moe_mlp_new

wangxiyuan mentioned this pull request Dec 11, 2025

[Bugfix] bugfix for moe_mlp in vllm-ascend/v0.11.0-dev #4885

Merged

zzzzwwjj added the ready (read for review) and ready-for-test (start test by label for PR) labels Dec 11, 2025

zzzzwwjj approved these changes Dec 11, 2025

weijinqian0 reviewed Dec 12, 2025

tanqingshan (A) added 2 commits December 12, 2025 11:04

update group_list

bf6b20b

Signed-off-by: tanqingshan (A) <50050625@china.huawei.com>

update group_list again

ee07e3a

Signed-off-by: tanqingshan (A) <50050625@china.huawei.com>

wangxiyuan merged commit 4984e8a into vllm-project:main Dec 12, 2025
25 checks passed

wangxiyuan mentioned this pull request Dec 18, 2025

Nominate new maintainers @zzzzwwjj @realliujiaxu @LCAIZJ #5152

Merged
