Skip to content

[KV Offload] Return None from lookup() for in-flight blocks#41795

Merged
orozery merged 1 commit into
vllm-project:mainfrom
ronensc:lookup-none
May 6, 2026
Merged

[KV Offload] Return None from lookup() for in-flight blocks#41795
orozery merged 1 commit into
vllm-project:mainfrom
ronensc:lookup-none

Conversation

@ronensc

@ronensc ronensc commented May 6, 2026

Copy link
Copy Markdown
Contributor

Purpose

CPUOffloadingManager.lookup() was returning False for both absent blocks and in-flight blocks (write started but not yet complete).
The base class defines None to mean "retry later" which is the right semantic for in-flight.
This makes the scheduler defer instead of treating an in-flight block as a miss.

Test Plan

pytest -v -s tests/v1/kv_offload/cpu/test_manager.py

Test Result

tests/v1/kv_offload/cpu/test_manager.py::test_already_stored_block_not_evicted_during_prepare_store[lru] PASSED
tests/v1/kv_offload/cpu/test_manager.py::test_already_stored_block_not_evicted_during_prepare_store[arc] PASSED
tests/v1/kv_offload/cpu/test_manager.py::test_cpu_manager PASSED
tests/v1/kv_offload/cpu/test_manager.py::TestARCPolicy::test_basic PASSED
tests/v1/kv_offload/cpu/test_manager.py::TestARCPolicy::test_t1_to_t2_promotion PASSED
tests/v1/kv_offload/cpu/test_manager.py::TestARCPolicy::test_eviction_with_load PASSED
tests/v1/kv_offload/cpu/test_manager.py::TestARCPolicy::test_adaptive_target PASSED
tests/v1/kv_offload/cpu/test_manager.py::TestARCPolicy::test_t1_t2_eviction_policy PASSED
tests/v1/kv_offload/cpu/test_manager.py::TestARCPolicy::test_ghost_list_bounds PASSED
tests/v1/kv_offload/cpu/test_manager.py::TestARCPolicy::test_touch_ordering PASSED
tests/v1/kv_offload/cpu/test_manager.py::TestARCPolicy::test_failed_store PASSED
tests/v1/kv_offload/cpu/test_manager.py::TestARCPolicy::test_full_scenario PASSED
tests/v1/kv_offload/cpu/test_manager.py::test_filter_reused_manager PASSED

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.

@ronensc ronensc requested review from ApostaC and orozery as code owners May 6, 2026 07:49

@claude claude Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@mergify mergify Bot added the v1 label May 6, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request updates the lookup method in the CPU manager to distinguish between a missing block and an in-flight write. Previously, the method returned a boolean indicating readiness; it now returns False if the block is not found, None if a write is in-flight, and True if the block is ready. The unit tests have been updated to reflect these new return values. I have no feedback to provide as there were no review comments to evaluate.

# lookup [1, 2] -> not ready
assert cpu_manager.lookup(to_key(1), _EMPTY_REQ_CTX) is False
assert cpu_manager.lookup(to_key(2), _EMPTY_REQ_CTX) is False
# lookup [1, 2] -> write in-flight, not yet ready

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now we don't test the False flow.

Can you add one assert for a lookup returning False?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few lines later in the test there are already several assertions covering the False case. Should I add another assertion here as well?

assert cpu_manager.lookup(to_key(3), _EMPTY_REQ_CTX) is False

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missed it, right.

@orozery orozery added the ready ONLY add when PR is ready to merge/full CI is needed label May 6, 2026
Signed-off-by: Ronen Schaffer <ronen.schaffer@ibm.com>
@orozery orozery merged commit f39bcf1 into vllm-project:main May 6, 2026
49 checks passed
@ronensc ronensc deleted the lookup-none branch May 6, 2026 14:39
ikaadil pushed a commit to ikaadil/vllm that referenced this pull request May 7, 2026
…ject#41795)

Signed-off-by: Ronen Schaffer <ronen.schaffer@ibm.com>
Signed-off-by: Ifta Khairul Alam Adil <ikaadil007@gmail.com>
libinta pushed a commit to libinta/vllm that referenced this pull request May 8, 2026
…ject#41795)

Signed-off-by: Ronen Schaffer <ronen.schaffer@ibm.com>
Signed-off-by: Libin Tang <libin.tang@intel.com>
weifang231 pushed a commit to weifang231/eb-vllm that referenced this pull request May 13, 2026
…ject#41795)

Signed-off-by: Ronen Schaffer <ronen.schaffer@ibm.com>
mfylcek pushed a commit to mfylcek/vllm that referenced this pull request May 19, 2026
jhu960213 pushed a commit to jhu960213/vllm that referenced this pull request May 20, 2026
…ject#41795)

Signed-off-by: Ronen Schaffer <ronen.schaffer@ibm.com>
mvanhorn pushed a commit to mvanhorn/vllm that referenced this pull request Jun 4, 2026
…ject#41795)

Signed-off-by: Ronen Schaffer <ronen.schaffer@ibm.com>
Signed-off-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready ONLY add when PR is ready to merge/full CI is needed v1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants