[BugFix] scheduler: Fix resuming of preempted requests after async load#31583
Conversation
There was a problem hiding this comment.
Code Review
This pull request addresses a bug in the scheduler where preempted requests that undergo asynchronous KV cache loading were incorrectly treated as new requests instead of resumed requests. The fix introduces a new state tracking set, preempted_requests_being_async_loaded, to correctly manage the status of these requests. When a preempted request is scheduled for async loading, it's added to this set. Upon completion of the async load, its status is correctly restored to PREEMPTED. The changes also handle the cleanup of this tracking set when requests are aborted. The implementation is clear, well-contained, and effectively resolves the state transition issue. The logic appears robust and correct.
31c105d to
3d68a05
Compare
@NickLucche I've added a unit test. |
3d68a05 to
25476ad
Compare
Thanks! I've updated to use your simplified fix. |
Head branch was pushed to by a user without write access
This commit fixes the scheduler to correctly resume preempted requests after being async loaded. Signed-off-by: Or Ozeri <oro@il.ibm.com>
3162ea5 to
a2e70f0
Compare
|
@njhill I rebased and pushed a squashed commit |
…ad (vllm-project#31583) Signed-off-by: Or Ozeri <oro@il.ibm.com>
…ad (vllm-project#31583) Signed-off-by: Or Ozeri <oro@il.ibm.com> Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>
…ad (vllm-project#31583) Signed-off-by: Or Ozeri <oro@il.ibm.com>
The correct flow is for resuming preempted requests is:
RequestStatus.PREEMTPED -> resumed req in SchedulerOutput
However, for requests going through async loading after preemption this becomes:
RequestStatus.PREEMTPED -> RequestStatus.WAITING_FOR_REMOTE_KVS -> RequestStatus.WAITING
then, the scheduler incorrectly flags it as a new request in SchedulerOutput, instead of a resumed request.
This PR fixes the scheduler to correctly resume preempted requests after being async loaded.
RequestStatus.PREEMTPED -> RequestStatus.WAITING_FOR_REMOTE_KVS -> RequestStatus.PREEMTPED -> resumed
Note
Fixes incorrect classification of resumed requests after async KV loading.
scheduler.schedule(), when a request exitsWAITING_FOR_REMOTE_KVS, set status toPREEMPTEDifnum_preemptions > 0(otherwiseWAITING), so resumed requests are emitted as resumed (not new) inSchedulerOutput.is_async, add_step_until_kv_transfer_finished(...), and assert correct scheduling/output behavior for async KV transfers (including cached vs new reqs counts).Written by Cursor Bugbot for commit 25476ad33e86c13746f894393ae7a2c9d129ec83. This will update automatically on new commits. Configure here.
Note
Ensures preempted requests that underwent async KV loading are resumed correctly instead of being reclassified as new.
scheduler.schedule(), when leavingWAITING_FOR_REMOTE_KVS, setstatus=PREEMPTEDifnum_preemptions>0(elseWAITING), so resumed requests appear inscheduled_cached_reqsas resumed.is_async, add_step_until_kv_transfer_finished(...), and assert correct waiting/running transitions and cached vs new request counts.Request.num_preemptionsmeaning.Written by Cursor Bugbot for commit 399053a497be3ccfc1bb17179a156c1bbb1bb8ea. This will update automatically on new commits. Configure here.
Note
Cursor Bugbot is generating a summary for commit cb37857d209dbdca4f080484dcbd7a78aedd7dfc. Configure here.
Note
Fixes misclassification of resumed requests after async KV loading.
scheduler.schedule(), when leavingWAITING_FOR_REMOTE_KVS, setstatus=PREEMPTEDifnum_preemptions>0(elseWAITING) so resumed requests appear inscheduled_cached_reqs.is_async, add_step_until_kv_transfer_finished(...), and assert correct waiting/running transitions and cached vs new request counts.Request.num_preemptions.Written by Cursor Bugbot for commit cb37857d209dbdca4f080484dcbd7a78aedd7dfc. This will update automatically on new commits. Configure here.
Note
Fixes misclassification of resumed requests following async KV transfers.
scheduler.schedule(), when leavingWAITING_FOR_REMOTE_KVS, setstatus=PREEMPTEDifnum_preemptions>0(elseWAITING), so resumed requests are emitted as resumed inscheduled_cached_reqs.is_async, add_step_until_kv_transfer_finished(...), and assert correct waiting/running transitions and cached vs new request counts.Request.num_preemptionsmeaning.Written by Cursor Bugbot for commit 3162ea59039b916a9628a22d6c45f4a4bd0106d1. This will update automatically on new commits. Configure here.
Note
Fixes misclassification of resumed requests after async KV loading.
scheduler.schedule(), when leavingWAITING_FOR_REMOTE_KVS, set status toPREEMPTEDifnum_preemptions>0(elseWAITING), so resumed requests appear inscheduled_cached_reqsrather thanscheduled_new_reqs.is_async, add_step_until_kv_transfer_finished(...), and assert correct waiting/running transitions and cached vs new request counts.Request.num_preemptionscomment to reflect "times preempted".Written by Cursor Bugbot for commit a2e70f0. This will update automatically on new commits. Configure here.