
[BugFix] scheduler: Fix resuming of preempted requests after async load#31583

Merged
vllm-bot merged 1 commit into vllm-project:main from orozery:sched-async-load-preempted
Jan 10, 2026

Conversation

Collaborator

@orozery orozery commented Dec 31, 2025

The correct flow for resuming preempted requests is:
RequestStatus.PREEMPTED -> resumed req in SchedulerOutput

However, for requests going through async loading after preemption, this becomes:
RequestStatus.PREEMPTED -> RequestStatus.WAITING_FOR_REMOTE_KVS -> RequestStatus.WAITING
The scheduler then incorrectly flags the request as a new request in SchedulerOutput instead of a resumed request.

This PR fixes the scheduler to correctly resume preempted requests after they are async loaded:
RequestStatus.PREEMPTED -> RequestStatus.WAITING_FOR_REMOTE_KVS -> RequestStatus.PREEMPTED -> resumed
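The restored transition can be sketched as follows. This is a minimal, self-contained Python sketch of the idea, not vLLM's actual scheduler code; the helper name on_async_kv_load_finished and the stripped-down Request class are hypothetical.

```python
from enum import Enum, auto

class RequestStatus(Enum):
    WAITING = auto()
    WAITING_FOR_REMOTE_KVS = auto()
    PREEMPTED = auto()

class Request:
    def __init__(self) -> None:
        self.status = RequestStatus.WAITING
        self.num_preemptions = 0  # number of times this request was preempted

def on_async_kv_load_finished(request: Request) -> None:
    # When a request leaves WAITING_FOR_REMOTE_KVS, restore PREEMPTED
    # (rather than WAITING) if it had been preempted before the load,
    # so the scheduler emits it as a resumed request in SchedulerOutput
    # instead of a new one.
    if request.num_preemptions > 0:
        request.status = RequestStatus.PREEMPTED
    else:
        request.status = RequestStatus.WAITING
```

The key point is that num_preemptions already records whether the request was preempted, so no extra tracking state is needed to pick the right status.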


Note

Fixes incorrect classification of resumed requests after async KV loading.

  • In scheduler.schedule(), when a request exits WAITING_FOR_REMOTE_KVS, set status to PREEMPTED if num_preemptions > 0 (otherwise WAITING), so resumed requests are emitted as resumed (not new) in SchedulerOutput.
  • Test updates: parameterize KV preemption tests with is_async, add _step_until_kv_transfer_finished(...), and assert correct scheduling/output behavior for async KV transfers (including cached vs new reqs counts).

Written by Cursor Bugbot for commit 25476ad33e86c13746f894393ae7a2c9d129ec83.



Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request addresses a bug in the scheduler where preempted requests that undergo asynchronous KV cache loading were incorrectly treated as new requests instead of resumed requests. The fix introduces a new tracking set, preempted_requests_being_async_loaded, to manage the status of these requests: a preempted request is added to the set when scheduled for async loading, and its status is restored to PREEMPTED once the load completes. The changes also clean up the tracking set when requests are aborted. The implementation is well-contained and effectively resolves the state-transition issue.
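The set-based approach described in this review (later replaced by the simpler num_preemptions check) can be sketched as follows. This is a hypothetical illustration: only the set name preempted_requests_being_async_loaded comes from the review, while the method names start_async_load, finish_async_load, and abort are made up for the sketch and do not match vLLM's actual Scheduler API.

```python
class Scheduler:
    def __init__(self) -> None:
        # IDs of requests that were PREEMPTED when their async KV load began.
        self.preempted_requests_being_async_loaded: set[str] = set()

    def start_async_load(self, req_id: str, was_preempted: bool) -> None:
        # Remember that this request must return to PREEMPTED, not WAITING.
        if was_preempted:
            self.preempted_requests_being_async_loaded.add(req_id)

    def finish_async_load(self, req_id: str) -> str:
        # Restore the pre-load status so the scheduler resumes the request
        # instead of treating it as new.
        if req_id in self.preempted_requests_being_async_loaded:
            self.preempted_requests_being_async_loaded.discard(req_id)
            return "PREEMPTED"
        return "WAITING"

    def abort(self, req_id: str) -> None:
        # Clean up tracking state for aborted requests.
        self.preempted_requests_being_async_loaded.discard(req_id)
```

Compared with the merged fix, this version carries extra mutable state that must be kept consistent on abort, which is why the num_preemptions-based check was preferred.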

Collaborator

@NickLucche NickLucche left a comment


I am admittedly not too familiar with the full extent of the preemption lifecycle, perhaps @markmc can chime in on this?

In the meantime could we add a unit test to trace through this case? Thanks @orozery !

@orozery orozery force-pushed the sched-async-load-preempted branch from 31c105d to 3d68a05 on January 6, 2026 10:46
Collaborator Author

orozery commented Jan 6, 2026

In the meantime could we add a unit test to trace through this case? Thanks @orozery !

@NickLucche I've added a unit test.

Member

@njhill njhill left a comment


Thanks @orozery, good catch with this!

I think there is a simpler fix that can avoid introducing new state, pls see the inline comment.

@orozery orozery force-pushed the sched-async-load-preempted branch from 3d68a05 to 25476ad on January 10, 2026 16:17
Collaborator Author

orozery commented Jan 10, 2026

I think there is a simpler fix that can avoid introducing new state, pls see the inline comment.

Thanks! I've updated to use your simplified fix.

Member

@njhill njhill left a comment


Thanks @orozery!

@njhill njhill added the ready ONLY add when PR is ready to merge/full CI is needed label Jan 10, 2026
@njhill njhill added this to the v0.14.0 milestone Jan 10, 2026
@njhill njhill enabled auto-merge (squash) January 10, 2026 17:37
@njhill njhill disabled auto-merge January 10, 2026 18:02
@njhill njhill enabled auto-merge (squash) January 10, 2026 18:02
auto-merge was automatically disabled January 10, 2026 18:13

Head branch was pushed to by a user without write access

This commit fixes the scheduler to correctly resume preempted requests after being async loaded.

Signed-off-by: Or Ozeri <oro@il.ibm.com>
@orozery orozery force-pushed the sched-async-load-preempted branch from 3162ea5 to a2e70f0 on January 10, 2026 18:14
Collaborator Author

orozery commented Jan 10, 2026

@njhill I rebased and pushed a squashed commit

@njhill njhill enabled auto-merge (squash) January 10, 2026 18:18
@vllm-bot vllm-bot merged commit 0285997 into vllm-project:main Jan 10, 2026
47 of 49 checks passed
akh64bit pushed a commit to akh64bit/vllm that referenced this pull request Jan 16, 2026
dsuhinin pushed a commit to dsuhinin/vllm that referenced this pull request Jan 21, 2026
…ad (vllm-project#31583)

Signed-off-by: Or Ozeri <oro@il.ibm.com>
Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>
ItzDEXX pushed a commit to ItzDEXX/vllm that referenced this pull request Feb 19, 2026

Labels

ready (ONLY add when PR is ready to merge/full CI is needed), v1
