[0.13.0][P/D][PCP] Bugfix: PCP force free called twice, causing logger errors #6132
Signed-off-by: wangxiaoteng <wangxiaoteng@huawei.com>
Code Review
This pull request addresses a critical bug where the D node could send a 'pull-end' signal twice to the P node. This was caused by a race condition in the _handle_request method's finally block. When all_task_done was true, the proc_not_transfer_request flag for a request was deleted before the final call to _send_done_signal_to_free_remote_port, causing the function's one-time logic to execute a second time for the same request. Moving the call to _send_done_signal_to_free_remote_port to before the all_task_done check correctly resolves this issue by ensuring the function is called before its state is cleaned up. The change is correct and effectively prevents the duplicate signal, which was causing erroneous logging on the P node.
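The ordering problem described above can be illustrated with a minimal toy model. This is a sketch under stated assumptions, not the actual vllm-ascend code: the class `ToyDecodeWorker`, the `fixed` switch, the `sent_signals` list, and the use of a plain dict for `proc_not_transfer_request` are all hypothetical. Only the names `_handle_request`, `all_task_done`, `proc_not_transfer_request`, and `_send_done_signal_to_free_remote_port` come from the review text, and their real signatures may differ.

```python
class ToyDecodeWorker:
    """Hypothetical stand-in for the D node; not the real implementation."""

    def __init__(self):
        # Assumed shape: request_id -> True once the pull-end signal was sent.
        self.proc_not_transfer_request = {}
        # Records every pull-end signal "sent" to the P node.
        self.sent_signals = []

    def _send_done_signal_to_free_remote_port(self, request_id):
        # One-time logic: skip if this request is already marked as signaled.
        if self.proc_not_transfer_request.get(request_id):
            return
        self.proc_not_transfer_request[request_id] = True
        self.sent_signals.append(request_id)

    def _handle_request(self, request_id, all_task_done, fixed):
        try:
            pass  # the KV transfer work for one task would happen here
        finally:
            if fixed:
                # Fixed order: signal first, while the flag still exists.
                self._send_done_signal_to_free_remote_port(request_id)
            if all_task_done:
                # Cleanup removes the flag that guards the one-time logic.
                self.proc_not_transfer_request.pop(request_id, None)
            if not fixed:
                # Buggy order: the flag is already gone, so this sends again.
                self._send_done_signal_to_free_remote_port(request_id)


def count_signals(fixed):
    """Run two tasks for one request and count pull-end signals sent."""
    worker = ToyDecodeWorker()
    worker._handle_request("req-1", all_task_done=False, fixed=fixed)
    worker._handle_request("req-1", all_task_done=True, fixed=fixed)
    return len(worker.sent_signals)


print("buggy order:", count_signals(fixed=False))
print("fixed order:", count_signals(fixed=True))
```

In this toy model the buggy ordering sends the signal once per task that observes a missing flag, while the fixed ordering sends it exactly once, because the guard is consulted before cleanup deletes it.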
What this PR does / why we need it?
Fixes an issue where the D node mistakenly sent the pull-end signal twice, which caused the P node to log spurious errors.
pick-from: #6124
Does this PR introduce any user-facing change?
No
How was this patch tested?
By CI.