[bugfix][npugraph_ex] fix static kernel uninstall issue #6128
Merged
wangxiyuan merged 2 commits into vllm-project:main on Jan 26, 2026
Conversation
Signed-off-by: chencangtao <chencangtao@huawei.com>
Contributor
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to Contributing and Testing.
wangxiyuan approved these changes on Jan 26, 2026
845473182 pushed a commit to 845473182/vllm-ascend that referenced this pull request on Jan 28, 2026
…to qwen3next_rebase * 'main' of https://github.com/vllm-project/vllm-ascend: (86 commits, including [bugfix][npugraph_ex]fix static kernel uninstall issue, vllm-project#6128) ...
starmountain1997 pushed a commit to starmountain1997/vllm-ascend that referenced this pull request on Jan 31, 2026
ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request on Feb 28, 2026
maoxx241 pushed a commit to maoxx241/vllm-ascend that referenced this pull request on Mar 2, 2026
ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request on Mar 4, 2026
LCAIZJ pushed a commit to LCAIZJ/vllm-ascend that referenced this pull request on Mar 7, 2026
jiangyunfan1 pushed a commit to jiangyunfan1/vllm-ascend that referenced this pull request on Apr 9, 2026
yangzhe-2026 pushed a commit to yangzhe-2026/vllm-ascend that referenced this pull request on May 6, 2026
What this PR does / why we need it?
The static kernel in torch_npu is uninstalled through Python's atexit mechanism.
However, in vllm-ascend the worker process is killed outright when inference ends or the service stops, so the atexit handlers never run and the static kernel is never unloaded.
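For context, here is a minimal standalone demonstration (not part of this PR, everything in it is illustrative) that `atexit` handlers are bypassed when a process is hard-killed:

```python
import atexit
import os
import signal
import time

pid = os.fork()
if pid == 0:
    # Child: register a cleanup handler, then wait to be killed.
    atexit.register(lambda: print("child atexit handler ran"))
    signal.pause()
else:
    time.sleep(0.5)               # give the child time to register
    os.kill(pid, signal.SIGKILL)  # hard kill: atexit is bypassed entirely
    os.waitpid(pid, 0)
    print("child killed; its atexit handler never printed")
```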
When using the npugraph_ex backend with the static kernel enabled, we register a signal handler that explicitly unloads the static kernel.
Unloading can take noticeable time when many static kernels are installed, but vLLM kills the worker shortly after sending the terminate event. We therefore perform the unloading in a newly started process, which survives the worker's death (see the sketch below).
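A minimal sketch of the approach, under stated assumptions: `uninstall_static_kernels()` is a hypothetical stand-in for the actual torch_npu unload call, and the signal wiring is simplified relative to the real npugraph_ex handler.

```python
import multiprocessing
import os
import signal


def uninstall_static_kernels():
    # Hypothetical stand-in for the torch_npu call that unloads the
    # installed static kernels; with many kernels this can be slow.
    ...


def _on_terminate(signum, frame):
    # Run the slow unload in a fresh process, which keeps running
    # even after vLLM kills this worker moments later.
    multiprocessing.Process(target=uninstall_static_kernels).start()
    # Restore the default disposition and re-deliver the signal so
    # the worker still terminates the way vLLM expects.
    signal.signal(signum, signal.SIG_DFL)
    os.kill(os.getpid(), signum)


signal.signal(signal.SIGTERM, _on_terminate)
```

The design point is process isolation: the unload must not run inside the soon-to-be-killed worker itself, so it is handed off to a process that outlives it.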
Does this PR introduce any user-facing change?
How was this patch tested?
- vLLM version: v0.13.0
- vLLM main: vllm-project/vllm@d682094