[P/D][PCP]bugfix pcp force free twice caused logger error#6124
[P/D][PCP]bugfix pcp force free twice caused logger error#6124wangxiyuan merged 1 commit intovllm-project:mainfrom
Conversation
Signed-off-by: wangxiaoteng <wangxiaoteng@huawei.com>
|
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according Contributing and Testing. |
There was a problem hiding this comment.
Code Review
This pull request addresses a race condition in the KVCacheRecvingThread that could lead to resources being freed twice in Prefill Context Parallelism (PCP) scenarios. The bug occurred in multi-task requests where the run-once flag for _send_done_signal_to_free_remote_port was cleared prematurely. The fix involves reordering operations within the finally block of _handle_request to ensure the flag is checked before it's cleared, which correctly prevents the redundant execution. The change is logical and effectively resolves the issue.
| self._send_done_signal_to_free_remote_port(remote_request_id, remote_host, | ||
| remote_port_send_num) |
There was a problem hiding this comment.
Moving this call to the beginning of the finally block correctly resolves a race condition. For multi-task requests, the run-once flag (self.proc_not_transfer_request) was previously deleted on the final task before this function was called, causing its logic to execute twice. This change ensures the flag is checked before it's cleared when all_task_done is true, preventing the double execution.
…to qwen3next_rebase * 'main' of https://github.com/vllm-project/vllm-ascend: (51 commits) [Bugfix] Remove `use_aclgraph` in mtp_proposer and use `use_cuda_graph` (vllm-project#6032) [BugFix] fix 3vl dense model load quant weight (vllm-project#6100) [CP&SP] Integrate FIA operator in mla_cp._forward_decode (vllm-project#5641) [CI][Doc] Upgrade wheel building's CANN to 8.5.0 and update the Docs (vllm-project#6145) [CI]Install clang in dokerfile for triton ascend (vllm-project#4409) [Main] Upgrade PTA to 2.9.0 (vllm-project#6112) [Graph][Fusion] Add QKVNormRope and QKVNormRopeWithBias (vllm-project#5721) [P/D][PCP]bugfix pcp force free twice caused logger error (vllm-project#6124) [BugFix]converting pa get_workspace back to capturing (vllm-project#5833) [CI] optimize lint term (vllm-project#5986) [Bugfix] Fix Triton operator usage for multimodal models based on `the mrope_interleaved` parameter (vllm-project#6042) [bugfix][npugraph_ex]fix the model output type issue caused by manually modify FX graph (vllm-project#6015) [BugFix] Support setting tp=1 for the Eagle draft model to take effect (vllm-project#6097) [Misc] Bump mooncake version to v0.3.8.post1 (vllm-project#6110) [Feature]Enable DispatchGmmCombineDecode when eagle is moe with w8a8 or not moe [RFC: issue 5476] (vllm-project#5758) [bugfix] adapt_remote_request_id (vllm-project#6051) [Feature] Add support of new W4A4_LAOS_DYNAMIC quantization method (vllm-project#5143) [Feature] Support DSA-CP for Hybrid scenario (vllm-project#5702) [CI] Upgrade CANN to 8.5.0 (vllm-project#6070) Default enable MLAPO (vllm-project#5952) ...
) ### What this PR does / why we need it? The issue of the D node mistakenly sending the pull-end signal twice, leading to the P node printing logger errors abnormally, has been resolved. pick-from: #6124 ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? By ci Signed-off-by: wangxiaoteng <wangxiaoteng@huawei.com>
…ct#6124) ### What this PR does / why we need it? The issue of the D node mistakenly sending the pull-end signal twice, leading to the P node printing logger errors abnormally, has been resolved. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? By ci - vLLM version: v0.13.0 - vLLM main: vllm-project/vllm@d682094 Signed-off-by: wangxiaoteng <wangxiaoteng@huawei.com>
…lm-project#6132) ### What this PR does / why we need it? The issue of the D node mistakenly sending the pull-end signal twice, leading to the P node printing logger errors abnormally, has been resolved. pick-from: vllm-project#6124 ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? By ci Signed-off-by: wangxiaoteng <wangxiaoteng@huawei.com>
…ct#6124) ### What this PR does / why we need it? The issue of the D node mistakenly sending the pull-end signal twice, leading to the P node printing logger errors abnormally, has been resolved. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? By ci - vLLM version: v0.13.0 - vLLM main: vllm-project/vllm@d682094 Signed-off-by: wangxiaoteng <wangxiaoteng@huawei.com>
…ct#6124) ### What this PR does / why we need it? The issue of the D node mistakenly sending the pull-end signal twice, leading to the P node printing logger errors abnormally, has been resolved. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? By ci - vLLM version: v0.13.0 - vLLM main: vllm-project/vllm@d682094 Signed-off-by: wangxiaoteng <wangxiaoteng@huawei.com> Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>
…ct#6124) ### What this PR does / why we need it? The issue of the D node mistakenly sending the pull-end signal twice, leading to the P node printing logger errors abnormally, has been resolved. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? By ci - vLLM version: v0.13.0 - vLLM main: vllm-project/vllm@d682094 Signed-off-by: wangxiaoteng <wangxiaoteng@huawei.com>
…ct#6124) ### What this PR does / why we need it? The issue of the D node mistakenly sending the pull-end signal twice, leading to the P node printing logger errors abnormally, has been resolved. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? By ci - vLLM version: v0.13.0 - vLLM main: vllm-project/vllm@d682094 Signed-off-by: wangxiaoteng <wangxiaoteng@huawei.com> Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>
…ct#6124) ### What this PR does / why we need it? The issue of the D node mistakenly sending the pull-end signal twice, leading to the P node printing logger errors abnormally, has been resolved. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? By ci - vLLM version: v0.13.0 - vLLM main: vllm-project/vllm@d682094 Signed-off-by: wangxiaoteng <wangxiaoteng@huawei.com>
What this PR does / why we need it?
The issue of the D node mistakenly sending the pull-end signal twice, leading to the P node printing logger errors abnormally, has been resolved.
Does this PR introduce any user-facing change?
No
How was this patch tested?
By ci