Revert "[Feat] enable hierarchical mc2 ops on A2 by default (#5545)"#5611
Revert "[Feat] enable hierarchical mc2 ops on A2 by default (#5545)"#5611wangxiyuan merged 1 commit intovllm-project:mainfrom
Conversation
…ject#5545)" This reverts commit fb9fdcd. Signed-off-by: zxwang <1476209578@qq.com>
There was a problem hiding this comment.
Code Review
This pull request reverts a previous commit that enabled hierarchical mc2 ops on A2 devices by default. The changes reintroduce environment variable checks to conditionally enable this feature, making it non-default. My review focuses on ensuring the reverted code is robust. I've identified a potential issue where the feature could be accidentally enabled on non-A2 devices, and I've suggested a fix to make the check more specific.
| def is_hierarchical_communication_enabled(): | ||
| return (os.getenv("HCCL_INTRA_ROCE_ENABLE", "") == "0" | ||
| and os.getenv("HCCL_INTRA_PCIE_ENABLE", "") == "1") |
There was a problem hiding this comment.
The comments in both this file and token_dispatcher.py strongly suggest that this hierarchical communication feature is specifically for A2 devices. To prevent it from being accidentally enabled on other device types if the environment variables are set, which could lead to unexpected behavior or performance issues, it's safer to add an explicit device type check here.
| def is_hierarchical_communication_enabled(): | |
| return (os.getenv("HCCL_INTRA_ROCE_ENABLE", "") == "0" | |
| and os.getenv("HCCL_INTRA_PCIE_ENABLE", "") == "1") | |
| def is_hierarchical_communication_enabled(): | |
| return (get_ascend_device_type() == AscendDeviceType.A2 and | |
| os.getenv("HCCL_INTRA_ROCE_ENABLE", "") == "0" and os.getenv("HCCL_INTRA_PCIE_ENABLE", "") == "1") |
|
Please add revert reason in commit message |
done |
|
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according Contributing and Testing. |
…to FIA_rebase * 'main' of https://github.com/vllm-project/vllm-ascend: (58 commits) [Main2Main] Upgrade vllm commit to 0106 (vllm-project#5617) [CI]update bisheng version (vllm-project#5621) [UT][PCP&DCP] UT for block_table.py (vllm-project#5032) [Main2Main] Upgrade vllm commit to 0105 (vllm-project#5595) [CI] mv ops to correct path (vllm-project#5615) [BugFix] Fix Smoke Testing Bug for DSR1 longseq (vllm-project#5613) Revert "[Feat] enable hierarchical mc2 ops on A2 by default (vllm-project#5545)" (vllm-project#5611) [TRITON][TEST]Add nightly test for triton split_qkv_rmsnorm_rope (vllm-project#5267) [perf] Fix MLAPO weight disposal for KV-consumer MLA in PD-mix deploy... (vllm-project#5192) [docs] Correct image about prefill phase of PCP (vllm-project#5598) [CI] update triton-ascend version (vllm-project#5584) [P/D]Remove mooncake kvpool unused parameter `local_hostname` (vllm-project#5574) [Bugfix] record cos and sin cache in AscendRotaryEmbedding (vllm-project#5516) [bugfix] fix test_camem failed with triton-ascend (vllm-project#5492) [UT]add triton ops ut : test_fused_qkvzba_split_reshape_cat (vllm-project#5474) [CI] Download models from ms (vllm-project#5405) Docs: Add A3 Docker image guidance for Atlas A3 machines (vllm-project#5256) [Doc] Add NNAL installation guide and requirements (vllm-project#5235) Add the requirement of arctic-inference which speculative decoding with suffix_decode (vllm-project#5045) [BugFix][Fusion] Fix graph fusion failure problem (vllm-project#5253) ...
…ject#5545)" (vllm-project#5611) This reverts commit fb9fdcd. ### What this PR does / why we need it? this pr breaks the smoke test because of that leads the error of aclnnNeScalar:Kernel Run failed. opType: 25, NotEqual launch failed for NotEqual, errno:361001 <img width="1149" height="166" alt="A6C9453D-4F0B-4256-DD80-A9C181DAB2D9" src="https://github.com/user-attachments/assets/cab9c4b8-3fd1-4c6b-b424-474b46042726" /> ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: v0.13.0 - vLLM main: vllm-project/vllm@7157596 Signed-off-by: zxwang <1476209578@qq.com>
…ject#5545)" (vllm-project#5611) This reverts commit fb9fdcd. ### What this PR does / why we need it? this pr breaks the smoke test because of that leads the error of aclnnNeScalar:Kernel Run failed. opType: 25, NotEqual launch failed for NotEqual, errno:361001 <img width="1149" height="166" alt="A6C9453D-4F0B-4256-DD80-A9C181DAB2D9" src="https://github.com/user-attachments/assets/cab9c4b8-3fd1-4c6b-b424-474b46042726" /> ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: v0.13.0 - vLLM main: vllm-project/vllm@7157596 Signed-off-by: zxwang <1476209578@qq.com>
…ject#5545)" (vllm-project#5611) This reverts commit fb9fdcd. ### What this PR does / why we need it? this pr breaks the smoke test because of that leads the error of aclnnNeScalar:Kernel Run failed. opType: 25, NotEqual launch failed for NotEqual, errno:361001 <img width="1149" height="166" alt="A6C9453D-4F0B-4256-DD80-A9C181DAB2D9" src="https://github.com/user-attachments/assets/cab9c4b8-3fd1-4c6b-b424-474b46042726" /> ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: v0.13.0 - vLLM main: vllm-project/vllm@7157596 Signed-off-by: zxwang <1476209578@qq.com> Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>
…ject#5545)" (vllm-project#5611) This reverts commit fb9fdcd. ### What this PR does / why we need it? this pr breaks the smoke test because of that leads the error of aclnnNeScalar:Kernel Run failed. opType: 25, NotEqual launch failed for NotEqual, errno:361001 <img width="1149" height="166" alt="A6C9453D-4F0B-4256-DD80-A9C181DAB2D9" src="https://github.com/user-attachments/assets/cab9c4b8-3fd1-4c6b-b424-474b46042726" /> ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: v0.13.0 - vLLM main: vllm-project/vllm@7157596 Signed-off-by: zxwang <1476209578@qq.com>
…ject#5545)" (vllm-project#5611) This reverts commit fb9fdcd. ### What this PR does / why we need it? this pr breaks the smoke test because of that leads the error of aclnnNeScalar:Kernel Run failed. opType: 25, NotEqual launch failed for NotEqual, errno:361001 <img width="1149" height="166" alt="A6C9453D-4F0B-4256-DD80-A9C181DAB2D9" src="https://github.com/user-attachments/assets/cab9c4b8-3fd1-4c6b-b424-474b46042726" /> ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: v0.13.0 - vLLM main: vllm-project/vllm@7157596 Signed-off-by: zxwang <1476209578@qq.com> Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>
…ject#5545)" (vllm-project#5611) This reverts commit fb9fdcd. ### What this PR does / why we need it? this pr breaks the smoke test because of that leads the error of aclnnNeScalar:Kernel Run failed. opType: 25, NotEqual launch failed for NotEqual, errno:361001 <img width="1149" height="166" alt="A6C9453D-4F0B-4256-DD80-A9C181DAB2D9" src="https://github.com/user-attachments/assets/cab9c4b8-3fd1-4c6b-b424-474b46042726" /> ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: v0.13.0 - vLLM main: vllm-project/vllm@7157596 Signed-off-by: zxwang <1476209578@qq.com>
This reverts commit fb9fdcd.
What this PR does / why we need it?
this pr breaks the smoke test because of that leads the error of aclnnNeScalar:Kernel Run failed. opType: 25, NotEqual

launch failed for NotEqual, errno:361001
Does this PR introduce any user-facing change?
How was this patch tested?