[Refactor] Modify the binding logic to allocate CPU cores for each NPU card by Rozwel-dx · Pull Request #5555 · vllm-project/vllm-ascend

Rozwel-dx · 2025-12-31T07:17:17Z

[Refactor] Modify the binding logic to allocate CPU cores for each NPU card

What this PR does / why we need it?

Modify the binding logic to allocate CPU cores for each NPU card based on NUMA affinity, while isolating acl_thread/release_thread and other processes to prevent mutual interference.
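As a rough illustration of the idea described above, the sketch below splits each NUMA node's cores among the NPUs attached to it and reserves a few cores per NPU for helper threads such as acl_thread and release_thread. It is not the code from this PR: the function name `allocate_cores`, its parameters, and the "main"/"helpers" split are hypothetical, and the NPU-to-NUMA and NUMA-to-CPU maps are assumed to be already known.

```python
from typing import Dict, List


def allocate_cores(
    npu_to_numa: Dict[int, int],
    numa_to_cpus: Dict[int, List[int]],
    reserved_per_npu: int = 2,
) -> Dict[int, Dict[str, List[int]]]:
    """Split each NUMA node's cores evenly among the NPUs attached to it.

    Hypothetical helper: the last `reserved_per_npu` cores of each NPU's
    share are set aside for helper threads, the rest go to the main process.
    """
    # Group NPUs by the NUMA node they are attached to.
    numa_to_npus: Dict[int, List[int]] = {}
    for npu, numa in sorted(npu_to_numa.items()):
        numa_to_npus.setdefault(numa, []).append(npu)

    plan: Dict[int, Dict[str, List[int]]] = {}
    for numa, npus in numa_to_npus.items():
        cpus = sorted(numa_to_cpus[numa])
        share = len(cpus) // len(npus)
        if share <= reserved_per_npu:
            raise ValueError(
                f"NUMA node {numa} has too few cores for {len(npus)} NPUs"
            )
        for i, npu in enumerate(npus):
            chunk = cpus[i * share:(i + 1) * share]
            split = len(chunk) - reserved_per_npu
            plan[npu] = {"main": chunk[:split], "helpers": chunk[split:]}
    return plan


# Example topology (assumed): 4 NPUs, 2 NUMA nodes with 8 cores each.
plan = allocate_cores(
    {0: 0, 1: 0, 2: 1, 3: 1},
    {0: list(range(0, 8)), 1: list(range(8, 16))},
)
```

On Linux, a plan like this could then be applied with `os.sched_setaffinity(pid, cpus)`; how the PR itself applies the binding is not shown in this page.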

Does this PR introduce any user-facing change?

No

How was this patch tested?

Rozwel-dx@c85cc04

Signed-off-by: rowzwel_dx <1392851715@qq.com>

vLLM version: v0.13.0
vLLM main: vllm-project/vllm@7157596

gemini-code-assist

Code Review

This pull request refactors the CPU binding logic to be more sophisticated, allocating CPU cores for each NPU card based on NUMA affinity and isolating specific threads. The implementation has been moved into CpuAlloc and DeviceInfo classes, which is a good structural improvement.

My review has identified a few critical issues that need to be addressed:

A TypeError will occur when calling the refactored bind_cpus function from worker.py due to a mismatched function signature.
The CPU binding may be non-deterministic and incorrect because a list of Process IDs is not sorted before being used for mapping.
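The second point can be illustrated with a minimal sketch. It assumes, purely for illustration, that worker PIDs are paired with NPU IDs by position; sorting both sides first makes the pairing independent of the order in which the PIDs were discovered. The function name `map_pids_to_npus` is hypothetical and does not appear in the PR.

```python
from typing import Dict, Iterable


def map_pids_to_npus(pids: Iterable[int], npu_ids: Iterable[int]) -> Dict[int, int]:
    """Pair PIDs with NPU IDs deterministically by sorting both sides first."""
    ordered_pids = sorted(pids)
    ordered_npus = sorted(npu_ids)
    if len(ordered_pids) != len(ordered_npus):
        raise ValueError("expected exactly one PID per NPU")
    return dict(zip(ordered_pids, ordered_npus))


# The same inputs in any order yield the same mapping.
mapping = map_pids_to_npus([4312, 4100, 4205], [2, 0, 1])
```

Whether ascending PID order actually matches ascending NPU order depends on how the workers are launched; the sketch only guarantees that the result is reproducible.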

Additionally, there are opportunities for improving correctness and performance:

A potential TypeError in get_running_npus if a chip logic ID is not found.
Inefficient repeated calls to ps -Te when binding threads.

Please see the detailed comments for suggestions on how to fix these issues.
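One way to avoid the repeated `ps -Te` calls mentioned above is to run the command once, index its output by PID, and answer every thread lookup from that snapshot. The sketch below is an assumption-laden illustration, not the PR's code: the helper names are hypothetical, and it assumes the usual `PID SPID TTY TIME CMD` column layout with the thread name in the `CMD` column.

```python
import subprocess
from collections import defaultdict
from typing import Dict, List, Tuple

ThreadIndex = Dict[int, List[Tuple[int, str]]]


def parse_ps_threads(ps_output: str) -> ThreadIndex:
    """Index `ps -Te` output as {pid: [(tid, thread_name), ...]}."""
    threads: ThreadIndex = defaultdict(list)
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 4)  # assumed columns: PID SPID TTY TIME CMD
        if len(fields) < 5:
            continue
        threads[int(fields[0])].append((int(fields[1]), fields[4]))
    return dict(threads)


def snapshot_threads() -> ThreadIndex:
    """Run `ps -Te` a single time and return the parsed snapshot."""
    result = subprocess.run(
        ["ps", "-Te"], capture_output=True, text=True, check=True
    )
    return parse_ps_threads(result.stdout)


def find_threads(snapshot: ThreadIndex, pid: int, name: str) -> List[int]:
    """Look up thread IDs by name without spawning another process."""
    return [tid for tid, thread_name in snapshot.get(pid, []) if thread_name == name]


# Parsing a captured sample instead of calling ps, so the example is reproducible.
sample = "\n".join([
    "    PID    SPID TTY          TIME CMD",
    "   4100    4100 ?        00:00:01 python",
    "   4100    4107 ?        00:00:00 acl_thread",
    "   4100    4108 ?        00:00:00 release_thread",
    "   4205    4205 ?        00:00:01 python",
])
snapshot = parse_ps_threads(sample)
acl_tids = find_threads(snapshot, 4100, "acl_thread")
```

A snapshot can go stale if threads start after it is taken, so it would need to be captured after the threads of interest exist.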

github-actions · 2025-12-31T08:09:19Z

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

A PR should do only one thing; smaller PRs enable faster reviews.
Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
Write the commit message by filling in the PR description, to help reviewers and future developers understand the change.

If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.

github-actions · 2026-01-08T06:04:06Z

This pull request has conflicts, please resolve those before we can evaluate the pull request.

…d on NUMA affinity, while isolating acl_thread/release_thread and other processes to prevent mutual interference. Signed-off-by: Rozwel-dx <1392851715@qq.com>

Add cpu_binding UT Signed-off-by: Rozwel-dx <1392851715@qq.com>

…U card (vllm-project#5555) [Refactor] Modify the binding logic to allocate CPU cores for each NPU card ### What this PR does / why we need it? Modify the binding logic to allocate CPU cores for each NPU card based on NUMA affinity, while isolating acl_thread/release_thread and other processes to prevent mutual interference. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Rozwel-dx@c85cc04 Signed-off-by: rowzwel_dx <1392851715@qq.com> - vLLM version: v0.13.0 - vLLM main: vllm-project/vllm@7157596 Signed-off-by: Rozwel-dx <1392851715@qq.com>

…U card (vllm-project#5555) [Refactor] Modify the binding logic to allocate CPU cores for each NPU card ### What this PR does / why we need it? Modify the binding logic to allocate CPU cores for each NPU card based on NUMA affinity, while isolating acl_thread/release_thread and other processes to prevent mutual interference. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Rozwel-dx@c85cc04 Signed-off-by: rowzwel_dx <1392851715@qq.com> - vLLM version: v0.13.0 - vLLM main: vllm-project/vllm@7157596 Signed-off-by: Rozwel-dx <1392851715@qq.com> Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>

gemini-code-assist bot reviewed Dec 31, 2025

Comment thread vllm_ascend/cpu_binding.py Outdated

Comment thread vllm_ascend/worker/worker.py Outdated

Comment thread vllm_ascend/cpu_binding.py Outdated

Comment thread vllm_ascend/cpu_binding.py Outdated

github-actions bot added the module:core label Dec 31, 2025

Rozwel-dx force-pushed the main branch 2 times, most recently from 496d07b to d69f7a9 Compare January 5, 2026 02:40

wangxiyuan reviewed Jan 5, 2026

Comment thread vllm_ascend/worker/worker.py Outdated

Rozwel-dx force-pushed the main branch 6 times, most recently from 62fbf5c to 9788812 Compare January 8, 2026 06:03

github-actions bot added the merge-conflicts label Jan 8, 2026

Rozwel-dx force-pushed the main branch from 9788812 to e947d3a Compare January 8, 2026 06:15

github-actions bot removed the merge-conflicts label Jan 8, 2026

Rozwel-dx force-pushed the main branch from e947d3a to 4c756fc Compare January 8, 2026 07:01

realliujiaxu added the ready and ready-for-test labels Jan 9, 2026

Rozwel-dx force-pushed the main branch 2 times, most recently from 29e3c7b to 9293082 Compare January 9, 2026 08:42

Rozwel-dx force-pushed the main branch from 9293082 to ef991fb Compare January 12, 2026 08:49

wangxiyuan merged commit 8d57128 into vllm-project:main Jan 13, 2026
16 checks passed
