
[Feat] enable hierarchical mc2 ops on A2 by default #5545

Merged
realliujiaxu merged 1 commit into vllm-project:main from hwhaokun:mc2 on Jan 4, 2026



@hwhaokun hwhaokun commented Dec 31, 2025

What this PR does / why we need it?

Previously, it was necessary to set the environment variables HCCL_INTRA_PCIE_ENABLE=1 and HCCL_INTRA_ROCE_ENABLE=0. This PR enables hierarchical MC2 operations on A2 by default.
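For reference, the manual setup that this PR makes unnecessary can be sketched as below. The two variable names and values come from the description above; the helper name `apply_legacy_hccl_env` is hypothetical, and the sketch assumes the variables are set before the serving process initialises HCCL.

```python
import os

# Values quoted in the PR description; before this change they had to be set by hand.
LEGACY_HCCL_ENV = {
    "HCCL_INTRA_PCIE_ENABLE": "1",
    "HCCL_INTRA_ROCE_ENABLE": "0",
}


def apply_legacy_hccl_env(env=None):
    """Write the two variables into `env` (os.environ by default) and return it.

    Hypothetical helper for illustration only; it is not part of vllm-ascend.
    """
    target = os.environ if env is None else env
    target.update(LEGACY_HCCL_ENV)
    return target
```

Passing a plain dict, as in `apply_legacy_hccl_env({})`, leaves the real process environment untouched, which is convenient when building the environment for a subprocess.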

Does this PR introduce any user-facing change?

How was this patch tested?


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request correctly enables hierarchical MC2 operations on A2 devices by default, removing the need for specific environment variables. However, there is a potential issue in vllm_ascend/ops/fused_moe/token_dispatcher.py where the expert_scales parameter is now passed unconditionally. This could lead to incorrect behavior on non-A2 devices. I have provided comments and suggestions to make this parameter conditional, aligning it with the intended A2-specific change.

Comment on lines +149 to 150
"expert_scales": topk_weights.to(torch.float32),
}

high

The expert_scales parameter should not be added unconditionally, as it is specific to hierarchical communication which is being enabled for A2 devices. It should be moved into the conditional block for A2 devices. Please remove it from this general dictionary initialization.

        }

Comment on lines +152 to +153
if get_ascend_device_type() == AscendDeviceType.A2:
    kwargs_mc2["comm_alg"] = "hierarchy"

high

To ensure hierarchical communication is correctly configured only for A2 devices, the expert_scales parameter should be added here, inside the conditional block, along with comm_alg.

        if get_ascend_device_type() == AscendDeviceType.A2:
            kwargs_mc2["comm_alg"] = "hierarchy"
            kwargs_mc2["expert_scales"] = topk_weights.to(torch.float32)
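Taken together, the two review suggestions amount to the pattern sketched below. This is an illustration of the reviewer's proposal, not the merged code: `AscendDeviceType` is a stand-in enum (only the `A2` member is named in the review), `build_mc2_kwargs` is a hypothetical helper, and plain Python floats replace the `topk_weights.to(torch.float32)` tensor so the sketch runs without torch or Ascend libraries.

```python
from enum import Enum, auto


class AscendDeviceType(Enum):
    # Stand-in for the real enum; OTHER is a placeholder for any non-A2 device.
    A2 = auto()
    OTHER = auto()


def build_mc2_kwargs(device_type, topk_weights):
    """Return MC2 keyword arguments, adding hierarchy-specific keys only on A2."""
    kwargs_mc2 = {}  # device-independent arguments would be filled in here
    if device_type == AscendDeviceType.A2:
        kwargs_mc2["comm_alg"] = "hierarchy"
        # Stand-in for topk_weights.to(torch.float32) in the suggestion.
        kwargs_mc2["expert_scales"] = [float(w) for w in topk_weights]
    return kwargs_mc2
```

Keeping both keys inside the same branch means a non-A2 device never receives `expert_scales`, which is the behaviour the review comments ask for.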

@github-actions

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write the commit message by filling in the PR description, to help reviewers and future developers understand the change.

If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.

Signed-off-by: hwhaokun <haokun0405@163.com>
@realliujiaxu realliujiaxu changed the title [bugfix] enable hierarchical mc2 ops on A2 by default [Feat] enable hierarchical mc2 ops on A2 by default Jan 4, 2026
@realliujiaxu realliujiaxu added the labels ready (read for review) and ready-for-test (start test by label for PR) on Jan 4, 2026
@realliujiaxu realliujiaxu merged commit fb9fdcd into vllm-project:main Jan 4, 2026
60 of 65 checks passed
Toneymiller added a commit to Toneymiller/vllm-ascend that referenced this pull request Jan 5, 2026
Toneymiller added a commit to Toneymiller/vllm-ascend that referenced this pull request Jan 5, 2026
…ject#5545)"

This reverts commit fb9fdcd.

Signed-off-by: zxwang <1476209578@qq.com>
wangxiyuan pushed a commit that referenced this pull request Jan 5, 2026
…5611)

This reverts commit fb9fdcd.

### What this PR does / why we need it?
The reverted PR breaks the smoke test, which fails with the following error:
aclnnNeScalar:Kernel Run failed. opType: 25, NotEqual
        launch failed for NotEqual, errno:361001
<img width="1149" height="166"
alt="A6C9453D-4F0B-4256-DD80-A9C181DAB2D9"
src="https://github.com/user-attachments/assets/cab9c4b8-3fd1-4c6b-b424-474b46042726"
/>

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.13.0
- vLLM main:
vllm-project/vllm@7157596

Signed-off-by: zxwang <1476209578@qq.com>
845473182 pushed a commit to 845473182/vllm-ascend that referenced this pull request Jan 6, 2026
…to FIA_rebase

* 'main' of https://github.com/vllm-project/vllm-ascend: (58 commits)
  [Main2Main] Upgrade vllm commit to 0106 (vllm-project#5617)
  [CI]update bisheng version (vllm-project#5621)
  [UT][PCP&DCP] UT for block_table.py (vllm-project#5032)
  [Main2Main] Upgrade vllm commit to 0105 (vllm-project#5595)
  [CI] mv ops to correct path (vllm-project#5615)
  [BugFix] Fix Smoke Testing Bug for DSR1 longseq (vllm-project#5613)
  Revert "[Feat] enable hierarchical mc2 ops on A2 by default (vllm-project#5545)" (vllm-project#5611)
  [TRITON][TEST]Add nightly test for triton split_qkv_rmsnorm_rope (vllm-project#5267)
  [perf] Fix MLAPO weight disposal for KV-consumer MLA in PD-mix deploy... (vllm-project#5192)
  [docs] Correct image about prefill phase of PCP (vllm-project#5598)
  [CI] update triton-ascend version (vllm-project#5584)
  [P/D]Remove mooncake kvpool unused parameter `local_hostname` (vllm-project#5574)
  [Bugfix] record cos and sin cache in AscendRotaryEmbedding (vllm-project#5516)
  [bugfix] fix test_camem failed with triton-ascend (vllm-project#5492)
  [UT]add triton ops ut :  test_fused_qkvzba_split_reshape_cat (vllm-project#5474)
  [CI] Download models from ms (vllm-project#5405)
  Docs: Add A3 Docker image guidance for Atlas A3 machines (vllm-project#5256)
  [Doc] Add NNAL installation guide and requirements (vllm-project#5235)
  Add the requirement of arctic-inference which  speculative decoding with suffix_decode  (vllm-project#5045)
  [BugFix][Fusion] Fix graph fusion failure problem (vllm-project#5253)
  ...
Rozwel-dx pushed a commit to Rozwel-dx/vllm-ascend that referenced this pull request Jan 8, 2026
Rozwel-dx pushed a commit to Rozwel-dx/vllm-ascend that referenced this pull request Jan 8, 2026
…ject#5545)" (vllm-project#5611)

aipaes pushed a commit to aipaes/vllm-ascend that referenced this pull request Jan 15, 2026
…ject#5545)" (vllm-project#5611)

ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request Feb 28, 2026
Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>
ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request Feb 28, 2026
…ject#5545)" (vllm-project#5611)

Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>
maoxx241 pushed a commit to maoxx241/vllm-ascend that referenced this pull request Mar 2, 2026
maoxx241 pushed a commit to maoxx241/vllm-ascend that referenced this pull request Mar 2, 2026
…ject#5545)" (vllm-project#5611)

ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request Mar 4, 2026
Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>
ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request Mar 4, 2026
…ject#5545)" (vllm-project#5611)

Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>
LCAIZJ pushed a commit to LCAIZJ/vllm-ascend that referenced this pull request Mar 7, 2026
LCAIZJ pushed a commit to LCAIZJ/vllm-ascend that referenced this pull request Mar 7, 2026
…ject#5545)" (vllm-project#5611)


Labels

module:core, module:ops, ready (read for review), ready-for-test (start test by label for PR)


3 participants