[KVConnector][Bugfix] Treat no-connector as no-op success in reset_connector_cache by aoshen02 · Pull Request #42693 · vllm-project/vllm

aoshen02 · 2026-05-15T01:12:16Z

Summary

Scheduler.reset_connector_cache() currently returns False when no KV connector is configured, which makes Scheduler.reset_prefix_cache(reset_running_requests, reset_connector=True) return False on every engine that doesn't have a connector — even when the local prefix cache reset itself succeeded.

A caller that interprets the return value as "did the reset I asked for succeed?" (e.g. an RL framework that asks for a clear before a weight update, or an ops tool that pipes the return value into an SLO check) sees a spurious failure for the most common deployment shape (no KV connector configured). The user-facing log line was also a warning, even though "no connector exists, so nothing to reset" is the expected steady-state behavior of most engines.

Change

Return True when self.connector is None. There is no external KV store to invalidate, so the operation trivially succeeds; the local prefix cache (handled earlier in reset_prefix_cache) decides the real return value.
Demote the log line from warning to debug: this branch is hit on every engine without a connector, which is the steady state rather than a misconfiguration.

The True/False/None return contract on KVConnector.reset_cache itself (None = not implemented, False = explicit failure, anything else = success) is unchanged; only the "no connector at all" path moves from a sentinel failure to a sentinel success, matching how reset_mm_cache / reset_encoder_cache already behave when their target structures are empty.

Why this is not a duplicate

gh pr list --repo vllm-project/vllm --state open --search "reset_connector_cache in:body" → no results
gh pr list --repo vllm-project/vllm --state open --search "reset_connector reset_prefix_cache" → no results
gh issue list --repo vllm-project/vllm --state open --search "reset_connector_cache" → no results

The only prior PR touching this method is #27170 (which introduced the reset_connector parameter). That PR is merged; this is a follow-up correctness fix in the same method.

Test plan

New unit test added: tests/v1/core/test_scheduler.py::test_reset_connector_cache_no_connector_is_no_op_success. It exercises:

Scheduler.reset_connector_cache() directly returns True on a no-connector scheduler.
Scheduler.reset_prefix_cache(reset_connector=True) end-to-end returns True on an idle no-connector scheduler.

Commands run locally:

pre-commit run --files vllm/v1/core/sched/scheduler.py tests/v1/core/test_scheduler.py
# → all hooks Passed (ruff check / format, typos, mypy-3.10, SPDX, etc.)

The full pytest invocation (pytest tests/v1/core/test_scheduler.py::test_reset_connector_cache_no_connector_is_no_op_success) could not be executed in my local sandbox (the editable vllm install points at a different worktree whose precompiled CUDA flash-attention extensions don't match upstream main). The standalone logic of the patched branch was sanity-checked with a minimal reproduction (4 cases: no-connector / connector returns True / False / None — all match expectation). CI will be the source of truth for the integration test.

AI assistance

This patch was authored with assistance from Claude Opus 4.7 (1M context). Every changed line was reviewed by me end-to-end before opening this PR.

🤖 Generated with Claude Code

…nnector_cache ``Scheduler.reset_connector_cache()`` currently returns ``False`` when no KV connector is configured, which makes ``Scheduler.reset_prefix_cache(reset_running_requests, reset_connector=True)`` return ``False`` on every engine that doesn't have a connector -- even when the local prefix cache reset itself succeeded. A caller that interprets the return value as "did the reset I asked for succeed?" (e.g. an RL framework that asks for a clear before a weight update, or an ops tool that pipes the return value into an SLO check) sees a spurious failure for the most common deployment shape (no KV connector configured). The user-facing log line was also a ``warning``, even though "no connector exists, so nothing to reset" is the expected steady-state behavior of most engines. Fix: - Return ``True`` when ``self.connector is None``. There is no external KV store to invalidate, so the operation trivially succeeds; the local prefix cache (handled earlier in ``reset_prefix_cache``) decides the real return value. - Demote the log line from ``warning`` to ``debug``: this branch is hit on every engine without a connector, which is the norm rather than a misconfiguration. The True/False/None return contract on ``KVConnector.reset_cache`` itself (None = not implemented, False = explicit failure, anything else = success) is unchanged; only the "no connector at all" path moves from a sentinel failure to a sentinel success, matching how ``reset_mm_cache`` / ``reset_encoder_cache`` already behave when their target structures are empty. Tests: ``tests/v1/core/test_scheduler.py::test_reset_connector_cache_no_connector_is_no_op_success`` covers both the direct method call and the ``reset_prefix_cache(reset_connector=True)`` cascade end-to-end on a no-connector scheduler. ## Duplicate-work check - ``gh pr list --repo vllm-project/vllm --state open --search "reset_connector_cache in:body"`` -> no results - ``gh pr list --repo vllm-project/vllm --state open --search "reset_connector reset_prefix_cache"`` -> no results - ``gh issue list --repo vllm-project/vllm --state open --search "reset_connector_cache"`` -> no results The only prior PR touching this method is vllm-project#27170 (which introduced the ``reset_connector`` parameter); that PR is merged and this is a follow-up correctness fix in the same method. ## AI assistance This patch was authored with assistance from Claude Opus 4.7 (1M context). Every changed line was reviewed by the submitter before opening the PR. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: aoshen524 <aoshen524@gmail.com>

claude

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

gemini-code-assist

Code Review

This pull request modifies the reset_connector_cache method in the scheduler to return True instead of False when no KV connector is configured, treating the operation as a successful no-op. This change ensures that reset_prefix_cache does not report a failure when a connector reset is requested but no connector is present. A new test case has been added to verify this behavior. I have no feedback to provide as there were no review comments to evaluate.

`AsyncLLM.pause_generation(clear_cache=True)` internally calls `EngineCore._reset_caches`, which then calls `Scheduler.reset_prefix_cache(reset_running_requests=True)` -- with the new `reset_connector` parameter (added in vllm-project#27170) defaulting to False. For engines with a configured external KV store connector (e.g. `MooncakeStoreConnector`, `LMCacheConnectorV1`, `HF3FSConnector`, ...) that means a user asking the engine to "clear cache" after a weight update gets the local prefix cache cleared but leaves the external store full of KV blocks hashed against the previous policy. For RL post-training workflows this is a silent stale-cache correctness hole: the next request can read external KV that was written by the old policy and serve it as a prefix hit against the new policy. This patch parameterizes `EngineCore._reset_caches` with `reset_connector: bool = True` and forwards the value to `reset_prefix_cache`. The default flips from False to True so that `pause_generation(clear_cache=True)` -- the user-facing API whose contract is "clear caches" -- now actually clears the external KV store too. `Scheduler.reset_connector_cache` already gracefully handles the no-connector case (warns + returns False), so this is a no-op for engines that don't configure a connector. Internal callers that want a local-only invalidation can override with `reset_connector=False`; outside callers can still bypass the cascade by calling `scheduler.reset_prefix_cache(reset_connector=False)` directly. This intentionally keeps `reset_connector` *out* of the user-facing `AsyncLLM.pause_generation` signature: threading an opt-in flag all the way up turns the safety net into a footgun (user forgets the flag -> silent stale KV). The user-facing surface stays a single `clear_cache` semantic ("clear means clear"). Future internal callers can override via the new kwarg if they need to. ## Why this is not a duplicate - `gh pr list --repo vllm-project/vllm --state open --search "_reset_caches reset_connector"` -> empty - `gh pr list --repo vllm-project/vllm --state open --search "pause_generation clear_cache reset_connector"` -> empty - `gh issue list --repo vllm-project/vllm --state open --search "pause_generation reset_connector"` -> empty Related PRs: - vllm-project#27170 introduced `reset_connector` on `Scheduler.reset_prefix_cache`. - vllm-project#42693 is a parallel correctness fix for the no-connector branch of the same code path (returns True instead of False when no connector is attached) -- independent of this change. - vllm-project#42694 implements `MooncakeStoreConnector.reset_cache`, the most immediate beneficiary of this cascade. ## Test plan No new test is added: the behavior change is exercised by the existing `tests/v1/core/test_reset_prefix_cache_e2e.py` (no-connector engine keeps working) and by the new tests in vllm-project#42694 (Mooncake connector now sees its `reset_cache` called via `pause_generation(clear_cache=True)`). Commands run locally: ```bash pre-commit run --files vllm/v1/engine/core.py # -> ruff check, ruff format, mypy-3.10, SPDX, etc. all Passed ``` The full pytest run could not be executed in my local sandbox (the editable `vllm` install in the venv points at a different worktree whose precompiled CUDA flash-attention extensions don't match upstream `main`). CI will be the source of truth. ## AI assistance This patch was authored with assistance from Claude Opus 4.7 (1M context). Every changed line was reviewed by the submitter before opening the PR. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: aoshen524 <aoshen524@gmail.com>

aoshen02 · 2026-05-15T01:54:07Z

Superseded by #42694, which now consolidates this no-op-success fix together with the engine cascade change (#42695, also being closed) and the Mooncake-side reset_cache implementation into a single end-to-end PR per review feedback.

aoshen02 requested review from ApostaC, WoosukKwon, alexm-redhat, heheda12345, njhill, orozery, robertgshaw2-redhat and ywang96 as code owners May 15, 2026 01:12

claude Bot reviewed May 15, 2026

View reviewed changes

mergify Bot added v1 bug Something isn't working labels May 15, 2026

gemini-code-assist Bot reviewed May 15, 2026

View reviewed changes

This was referenced May 15, 2026

[KVConnector][Mooncake] Wire reset_cache cascade end-to-end #42694

Merged

[Engine] Reset KV connector cache in pause_generation cascade #42695

Closed

aoshen02 closed this May 15, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[KVConnector][Bugfix] Treat no-connector as no-op success in reset_connector_cache#42693

[KVConnector][Bugfix] Treat no-connector as no-op success in reset_connector_cache#42693
aoshen02 wants to merge 1 commit into
vllm-project:mainfrom
aoshen02:aoshen524/reset-connector-cache-noop-success

aoshen02 commented May 15, 2026

Uh oh!

claude Bot left a comment

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

aoshen02 commented May 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

aoshen02 commented May 15, 2026

Summary

Change

Why this is not a duplicate

Test plan

AI assistance

Uh oh!

claude Bot left a comment

Choose a reason for hiding this comment

Claude Code Review

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

aoshen02 commented May 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant