[KVConnector] Introduce `bind_scheduler_state` API by NickLucche · Pull Request #41011 · vllm-project/vllm

NickLucche · 2026-04-27T12:53:18Z

Alternative approach to #39654.

This PR introduces a generic "post-init hook", scheduler-side, that allows a connector to peek into the state of the scheduler:

    # Scheduler-side methods
    # ==============================

    def bind_scheduler_state(self, scheduler_state: SchedulerState):
        """
        Bind the scheduler state to the connector.
        This function is called by the scheduler after initialization
        and before the first model execution.
        Args:
            scheduler_state (SchedulerState): the scheduler state.
        """
        return

while avoiding initialization patterns that are too narrowly tailored to one connector's use case (note this is also the driving motivation behind a potential ConnectorV2 API overhaul).

SchedulerState allows for expanding the kind of data that we may want to pipe in through this API, while ensuring access is read-only: scheduler-state is not meant to be consumed by the scheduler; it's a scheduler->connector relation.
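
To make the call order concrete, here is a minimal sketch of how a connector might use the hook. Only `bind_scheduler_state`, `SchedulerState`, and the `kv_cache_manager` field come from this PR; `ConnectorBase`, `PrefetchingConnector`, and `ToyScheduler` are hypothetical stand-ins, not vLLM classes.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class SchedulerState:
    # Only `kv_cache_manager` is named in this thread; typed loosely here.
    kv_cache_manager: Any


class ConnectorBase:
    """Simplified stand-in for the connector base class (assumption)."""

    def bind_scheduler_state(self, scheduler_state: SchedulerState):
        # Default no-op, mirroring the hook in the PR description.
        return


class PrefetchingConnector(ConnectorBase):
    """Hypothetical connector that keeps the bound state for later use."""

    def __init__(self) -> None:
        self.scheduler_state: Optional[SchedulerState] = None

    def bind_scheduler_state(self, scheduler_state: SchedulerState):
        self.scheduler_state = scheduler_state


class ToyScheduler:
    """Hypothetical scheduler: calls the hook once, at the end of __init__."""

    def __init__(self, connector: ConnectorBase, kv_cache_manager: Any) -> None:
        self.kv_cache_manager = kv_cache_manager
        self.connector = connector
        self.connector.bind_scheduler_state(
            SchedulerState(kv_cache_manager=kv_cache_manager)
        )
```

Connectors that do not need scheduler state simply inherit the no-op, so adding the hook should not require changes to existing implementations.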

Signed-off-by: NickLucche <nlucches@redhat.com>

NickLucche · 2026-04-27T12:56:39Z

cc @orozery @ivanium @ApostaC

gemini-code-assist

Code Review

This pull request refactors the KV connector interface by introducing a SchedulerState dataclass and a bind_scheduler_state method, replacing the previous bind_gpu_block_pool approach to improve extensibility. Feedback indicates that the SchedulerState docstring is misleading, as it claims to provide read-only access while the internal kv_cache_manager remains mutable.

gemini-code-assist · 2026-04-27T12:59:13Z

+class SchedulerState:
+    """
+    State of the scheduler that the connector can access, scheduler-side.
+    This dataclass ensures read-only access to scheduler state, while enabling


The docstring states that this dataclass "ensures read-only access to scheduler state". While the dataclass itself is frozen=True (preventing reassignment of its fields), the kv_cache_manager object it contains is mutable. A connector could technically call mutating methods on the manager. This is more of a design guideline than a technical enforcement, but the docstring might be slightly misleading in its current phrasing.
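
The distinction can be reproduced in isolation. In this sketch `ToyManager` and its `allocate` method are hypothetical and only stand in for the real manager; the point is that `frozen=True` blocks field reassignment but not mutation of the object a field refers to.

```python
from dataclasses import FrozenInstanceError, dataclass


class ToyManager:
    """Hypothetical stand-in for a mutable manager object."""

    def __init__(self) -> None:
        self.free_blocks = [0, 1, 2]

    def allocate(self) -> int:
        # Mutates internal state.
        return self.free_blocks.pop()


@dataclass(frozen=True)
class SchedulerState:
    kv_cache_manager: ToyManager


state = SchedulerState(kv_cache_manager=ToyManager())

# Reassigning a field of a frozen dataclass raises FrozenInstanceError.
try:
    state.kv_cache_manager = ToyManager()
    reassigned = True
except FrozenInstanceError:
    reassigned = False

# Calling a mutating method through the frozen wrapper still succeeds.
state.kv_cache_manager.allocate()
```

After this runs, the field was never reassigned, yet the manager's free list has shrunk, which is the gap between the docstring and the actual guarantee.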

orozery · 2026-04-27T15:40:36Z

@NickLucche The approach here is pretty similar to #39654, but somewhat more cumbersome.
It still lets the connectors access the block pool, and even the kv cache manager.
It still adds a dependency between base.py and vLLM internals.

I was rather thinking that SchedulerState would actually be an abstract class.
Its implementation would be in some new SchedulerConnectorMixin class, which will be inherited by Scheduler.
I can prepare a prototype if you prefer.
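
A rough sketch of what the abstract-class variant could look like. `SchedulerState` and `SchedulerConnectorMixin` are the names proposed above; the `num_waiting_requests` method, the `waiting` attribute, and `ToyConnector` are assumptions made only for illustration.

```python
from abc import ABC, abstractmethod


class SchedulerState(ABC):
    """Abstract view; the connector base module would depend only on this."""

    @abstractmethod
    def num_waiting_requests(self) -> int:
        """Hypothetical accessor, chosen only for illustration."""


class SchedulerConnectorMixin(SchedulerState):
    """Implements the view on top of scheduler attributes."""

    def num_waiting_requests(self) -> int:
        # Assumes the inheriting class defines `self.waiting`.
        return len(self.waiting)


class Scheduler(SchedulerConnectorMixin):
    """Toy scheduler, not the vLLM one."""

    def __init__(self) -> None:
        self.waiting: list[str] = []


class ToyConnector:
    """Hypothetical connector that only sees the abstract type."""

    def __init__(self) -> None:
        self.state: SchedulerState | None = None

    def bind_scheduler_state(self, scheduler_state: SchedulerState) -> None:
        self.state = scheduler_state
```

The scheduler passes itself to the connector, but the type the connector is written against exposes only the abstract methods. Note that this limits the interface by convention and typing; nothing at runtime stops a connector from reaching other attributes.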

ivanium · 2026-04-27T23:30:21Z

This PR looks okay to me and I think having a SchedulerState is indeed more general than merely block_pool. But I wonder if we need anything other than block_pool so far. I know currently block_pool is hidden inside the KV cache manager so you cannot expose that in the SchedulerState directly, so I am okay with the current status.

Regarding @orozery's proposal, I am hesitant because of the coupling. In the long run, I actually think we should try to decouple the scheduler and KV connector as much as possible, perhaps by moving all kv_connector stuff inside the KV cache manager.

orozery · 2026-04-28T03:23:04Z

> Regarding @orozery's proposal, I am hesitating because of the coupling. In the long run, I actually think we should try to decouple the scheduler and kv connector as much as possible, and perhaps moving all kv_connector stuff inside the KV cache manager.

I agree that KV connector fits better inside the KV cache manager.
i.e. the KV cache manager should be the one querying the KV connector for matched blocks.
I can downgrade my proposal one level down to the KV cache manager.

However, I think that KV connectors would benefit from being exposed to some scheduler-specific state.
Specifically, the list of waiting requests (and their tokens/block hashes, as well as their kv_transfer_params).
This will be useful for connectors planning ahead eviction and pre-fetching.

To summarize, I agree the connector hooks should be moved inside KV cache manager.
But I think we want to allow the connector a read-only view on the Scheduler state.

orozery · 2026-04-28T03:30:22Z

> To summarize, I agree the connector hooks should be moved inside KV cache manager. But I think we want to allow the connector a read-only view on the Scheduler state.

@ivanium Another way to achieve what I described is creating a two-level coupling:

1. Couple the KV connector with a read-only view of the KV cache manager
2. Couple the KV cache manager with a read-only view of the Scheduler state

We can start off with (1).
Once we have (2), we can extend the KV-connector view to include the Scheduler state view.

I think this is somewhat better than my previous suggestion, as it will allow the KV cache manager (from a GPU prefix cache POV) to benefit from possible optimizations thanks to access to the scheduler waiting list.
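
The two-level coupling could be sketched as nested read-only views. Every class and attribute name below (`SchedulerStateView`, `KVCacheManagerView`, `waiting_request_ids`, `num_free_blocks`, and the toy classes) is hypothetical; the properties have no setters, so the views expose reads only.

```python
from typing import Optional


class ToyScheduler:
    """Hypothetical scheduler holding a waiting list of request ids."""

    def __init__(self, waiting: list[str]) -> None:
        self.waiting = waiting


class ToyKVCacheManager:
    """Hypothetical manager with a free-block counter."""

    def __init__(self, num_free_blocks: int) -> None:
        self.num_free_blocks = num_free_blocks


class SchedulerStateView:
    """Step (2): read-only view of the scheduler."""

    def __init__(self, scheduler: ToyScheduler) -> None:
        self._scheduler = scheduler

    @property
    def waiting_request_ids(self) -> tuple[str, ...]:
        # Return a copy as a tuple so callers cannot edit the queue.
        return tuple(self._scheduler.waiting)


class KVCacheManagerView:
    """Step (1): read-only view of the manager, handed to the connector."""

    def __init__(
        self,
        manager: ToyKVCacheManager,
        scheduler_view: Optional[SchedulerStateView] = None,
    ) -> None:
        self._manager = manager
        self._scheduler_view = scheduler_view

    @property
    def num_free_blocks(self) -> int:
        return self._manager.num_free_blocks

    @property
    def scheduler(self) -> Optional[SchedulerStateView]:
        # None until step (2) is available.
        return self._scheduler_view
```

With only step (1) in place, a connector would receive `KVCacheManagerView(manager)` and see `scheduler` as `None`; once step (2) exists, the same view can carry a `SchedulerStateView` without changing the connector-facing type.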

NickLucche · 2026-04-28T08:37:44Z

> The approach here is pretty similar to #39654, but somewhat more cumbersome

Yes, that is because I was mostly fine with @ivanium's PR. I just generalized the hook to a post-init one so we don't have to add another one if the next change requires something that isn't strictly block_pool.

> perhaps moving all kv_connector stuff inside the KV cache manager

Not sure about this. Although this encapsulation would be nice to achieve, I think having the KVConnector at the scheduler level allows for a better request-level interface.
I.e., I also find myself needing a way to peek into the waiting request queue for PD here: https://docs.google.com/document/d/1i-O6kqY7WfF1lPyyftRpCQt5fwnFYIEDZKCxyB51Sjg/edit?usp=sharing.

> However, I think that KV connectors would benefit from being exposed to some scheduler-specific state.

Same situation and reasoning as in my reply above.

NickLucche · 2026-04-28T08:43:26Z

> Couple the KV cache manager with a read-only view of the Scheduler state

I am not sure how happy I would be to pipe this state through the manager just to expose it to the connector, tbh.
I think the current manager abstraction is quite nice and designed for internal use above all.
While KV connectors definitely share a lot of the semantics here, they're designed to also allow OOT implementations and not just for internals.
Therefore I would still be slightly more in favor of the most flexible/powerful coupling, that is scheduler <> connector.

markmc · 2026-04-28T11:33:09Z

See #39654 (comment) - I think we should be cautious about the API surface we expose to out-of-tree connectors, and this PR opens up even more than BlockPool which is probably already too wide a surface

init

2ac1421

Signed-off-by: NickLucche <nlucches@redhat.com>

NickLucche requested review from ApostaC, WoosukKwon, alexm-redhat, heheda12345, njhill, orozery, robertgshaw2-redhat, xuechendi and ywang96 as code owners April 27, 2026 12:53

claude Bot reviewed Apr 27, 2026


mergify Bot added v1 kv-connector labels Apr 27, 2026

NickLucche mentioned this pull request Apr 27, 2026

[Feat][KVConnector] Add bind_gpu_block_pool() to KVConnectorBase_V1 #39654

Open

5 tasks

gemini-code-assist Bot reviewed Apr 27, 2026


multiconn case

fc041da

Signed-off-by: NickLucche <nlucches@redhat.com>

orozery mentioned this pull request Apr 28, 2026

KVConnector: Introduce bind_scheduler_context #41120

Open

NickLucche wants to merge 2 commits into vllm-project:main from NickLucche:conn-bind-scheduler-state
