Conversation

@MengqingCao commented Jul 30, 2025

What this PR does / why we need it?

Fix the pyhccl e2e tests to run on 2 cards.

Does this PR introduce any user-facing change?

N/A

How was this patch tested?

CI passed with existing test.

@Yikun left a comment

LGTM if multicard CI passed

codecov bot commented Jul 30, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.68%. Comparing base (b6a7f07) to head (e722cf6).
⚠️ Report is 612 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2094   +/-   ##
=======================================
  Coverage   73.68%   73.68%           
=======================================
  Files          96       96           
  Lines       10920    10920           
=======================================
  Hits         8046     8046           
  Misses       2874     2874           
Flag        Coverage Δ
unittests   73.68% <ø> (ø)


☔ View full report in Codecov by Sentry.

@Yikun commented Jul 30, 2025

Tested locally; let's merge this directly to recover CI. cc @ganyi1996ppo @jianzs @wangxiyuan

The failure was introduced in 4df8e00, because CI passed before f60bb47:

export IMAGE=quay.nju.edu.cn/ascend/vllm-ascend:main
docker run --rm \
--name yikun-test-3 \
--device /dev/davinci2 \
--device /dev/davinci3 \
--device /dev/davinci_manager \
--device /dev/devmm_svm \
--device /dev/hisi_hdc \
-v /usr/local/dcmi:/usr/local/dcmi \
-v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
-v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
-v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info \
-v /etc/ascend_install.info:/etc/ascend_install.info \
-v /root/.cache:/root/.cache \
-it $IMAGE bash

# git log
commit b6a7f07c701984eb5c76e474a74f8889f9c300c5 (grafted, HEAD -> main, origin/main)
Author: whx <[email protected]>
Date:   Tue Jul 29 23:53:19 2025 +0800

    [Perf][MoE] Improve MoE multistream parallel performace. (#1891)

    This PR designs the shared expert multi-stream parallelism of
    w8a8-dynamic-quantized MoE stage in more detail to achieve better
    performance.

    - vLLM version: v0.10.0
    - vLLM main:
    https://github.com/vllm-project/vllm/commit/2cc571199b1446f376ee019fcafda19155fc6b71

    Signed-off-by: whx-sjtu <[email protected]>
# pytest -sv tests/e2e/multicard/test_pyhccl_distributed.py
INFO 07-30 01:02:50 [__init__.py:38] Available plugins for group vllm.platform_plugins:
INFO 07-30 01:02:50 [__init__.py:40] - ascend -> vllm_ascend:register
INFO 07-30 01:02:50 [__init__.py:43] All plugins in this group will be loaded. Set `VLLM_PLUGINS` to control which plugins to load.
INFO 07-30 01:02:50 [__init__.py:226] Platform plugin ascend is activated
WARNING 07-30 01:02:52 [_custom_ops.py:20] Failed to import from vllm._C with ModuleNotFoundError("No module named 'vllm._C'")
INFO 07-30 01:02:54 [importing.py:63] Triton not installed or not compatible; certain GPU-related functions will not be available.
=================================================================================================================================================== test session starts ====================================================================================================================================================
platform linux -- Python 3.11.13, pytest-8.4.1, pluggy-1.6.0 -- /usr/local/python3.11.13/bin/python3.11
cachedir: .pytest_cache
rootdir: /vllm-workspace/vllm-ascend
configfile: pyproject.toml
plugins: anyio-4.9.0
collected 2 items

tests/e2e/multicard/test_pyhccl_distributed.py::test_pyhccl INFO 07-30 01:03:02 [__init__.py:38] Available plugins for group vllm.platform_plugins:
INFO 07-30 01:03:02 [__init__.py:40] - ascend -> vllm_ascend:register
INFO 07-30 01:03:02 [__init__.py:43] All plugins in this group will be loaded. Set `VLLM_PLUGINS` to control which plugins to load.
INFO 07-30 01:03:02 [__init__.py:226] Platform plugin ascend is activated
INFO 07-30 01:03:03 [__init__.py:38] Available plugins for group vllm.platform_plugins:
INFO 07-30 01:03:03 [__init__.py:40] - ascend -> vllm_ascend:register
INFO 07-30 01:03:03 [__init__.py:43] All plugins in this group will be loaded. Set `VLLM_PLUGINS` to control which plugins to load.
INFO 07-30 01:03:03 [__init__.py:226] Platform plugin ascend is activated
WARNING 07-30 01:03:05 [_custom_ops.py:20] Failed to import from vllm._C with ModuleNotFoundError("No module named 'vllm._C'")
WARNING 07-30 01:03:05 [_custom_ops.py:20] Failed to import from vllm._C with ModuleNotFoundError("No module named 'vllm._C'")
INFO 07-30 01:03:06 [importing.py:63] Triton not installed or not compatible; certain GPU-related functions will not be available.
INFO 07-30 01:03:06 [importing.py:63] Triton not installed or not compatible; certain GPU-related functions will not be available.
WARNING 07-30 01:03:14 [config.py:4898] Current vLLM config is not set.
WARNING 07-30 01:03:14 [platform.py:135] Model config is missing. This may indicate that we are running a test case
INFO 07-30 01:03:14 [platform.py:144] Compilation disabled, using eager mode by default
WARNING 07-30 01:03:17 [config.py:4898] Current vLLM config is not set.
WARNING 07-30 01:03:17 [platform.py:135] Model config is missing. This may indicate that we are running a test case
INFO 07-30 01:03:17 [platform.py:144] Compilation disabled, using eager mode by default
INFO 07-30 01:03:19 [utils.py:246] Found hccl from library libhccl.so
INFO 07-30 01:03:19 [utils.py:246] Found hccl from library libhccl.so
INFO 07-30 01:03:19 [pyhccl.py:83] vLLM is using pyhccl
INFO 07-30 01:03:19 [pyhccl.py:83] vLLM is using pyhccl
PASSED
tests/e2e/multicard/test_pyhccl_distributed.py::test_pyhccl_broadcast INFO 07-30 01:03:35 [__init__.py:38] Available plugins for group vllm.platform_plugins:
INFO 07-30 01:03:35 [__init__.py:40] - ascend -> vllm_ascend:register
INFO 07-30 01:03:35 [__init__.py:43] All plugins in this group will be loaded. Set `VLLM_PLUGINS` to control which plugins to load.
INFO 07-30 01:03:35 [__init__.py:226] Platform plugin ascend is activated
INFO 07-30 01:03:36 [__init__.py:38] Available plugins for group vllm.platform_plugins:
INFO 07-30 01:03:36 [__init__.py:40] - ascend -> vllm_ascend:register
INFO 07-30 01:03:36 [__init__.py:43] All plugins in this group will be loaded. Set `VLLM_PLUGINS` to control which plugins to load.
INFO 07-30 01:03:36 [__init__.py:226] Platform plugin ascend is activated
WARNING 07-30 01:03:38 [_custom_ops.py:20] Failed to import from vllm._C with ModuleNotFoundError("No module named 'vllm._C'")
WARNING 07-30 01:03:38 [_custom_ops.py:20] Failed to import from vllm._C with ModuleNotFoundError("No module named 'vllm._C'")
INFO 07-30 01:03:39 [importing.py:63] Triton not installed or not compatible; certain GPU-related functions will not be available.
INFO 07-30 01:03:40 [importing.py:63] Triton not installed or not compatible; certain GPU-related functions will not be available.
WARNING 07-30 01:03:46 [config.py:4898] Current vLLM config is not set.
WARNING 07-30 01:03:46 [platform.py:135] Model config is missing. This may indicate that we are running a test case
INFO 07-30 01:03:46 [platform.py:144] Compilation disabled, using eager mode by default
WARNING 07-30 01:03:47 [config.py:4898] Current vLLM config is not set.
WARNING 07-30 01:03:47 [platform.py:135] Model config is missing. This may indicate that we are running a test case
INFO 07-30 01:03:47 [platform.py:144] Compilation disabled, using eager mode by default
INFO 07-30 01:03:49 [utils.py:246] Found hccl from library libhccl.so
INFO 07-30 01:03:49 [utils.py:246] Found hccl from library libhccl.so
INFO 07-30 01:03:49 [pyhccl.py:83] vLLM is using pyhccl
INFO 07-30 01:03:49 [pyhccl.py:83] vLLM is using pyhccl
PASSED

2 passed in 68.82s (0:01:08)
root@7ec57ea5edb1:/vllm-workspace/vllm-ascend# git diff
diff --git a/tests/e2e/multicard/test_pyhccl_distributed.py b/tests/e2e/multicard/test_pyhccl_distributed.py
index e3d9aed..2300e0a 100644
--- a/tests/e2e/multicard/test_pyhccl_distributed.py
+++ b/tests/e2e/multicard/test_pyhccl_distributed.py
@@ -89,7 +89,7 @@ def worker_fn():


 def test_pyhccl():
-    distributed_run(worker_fn, 4)
+    distributed_run(worker_fn, 2)


 def broadcast_worker_fn():
@@ -118,4 +118,4 @@ def broadcast_worker_fn():


 def test_pyhccl_broadcast():
-    distributed_run(broadcast_worker_fn, 4)
+    distributed_run(broadcast_worker_fn, 2)

@Yikun merged commit d80b0cc into vllm-project:main Jul 30, 2025 (11 of 13 checks passed)
weijinqian0 pushed a commit to weijinqian0/vllm-ascend that referenced this pull request Jul 30, 2025
### What this PR does / why we need it?
Fix test on pyhccl to 2 cards

### Does this PR introduce _any_ user-facing change?
N/A

### How was this patch tested?
CI passed with existing test.
- vLLM version: v0.10.0
- vLLM main:
vllm-project/vllm@0d0cc9e

Signed-off-by: MengqingCao <[email protected]>
weijinqian0 pushed a commit to weijinqian0/vllm-ascend that referenced this pull request Jul 30, 2025
@MengqingCao MengqingCao deleted the pyhccl branch July 30, 2025 07:06
chopper0126 pushed a commit to chopper0126/vllm-ascend that referenced this pull request Sep 26, 2025
Angazenn pushed a commit to Angazenn/vllm-ascend that referenced this pull request Oct 21, 2025