[KVConnector] Support worker -> scheduler metadata by orozery · Pull Request #31964 · vllm-project/vllm

orozery · 2026-01-08T11:04:04Z

This PR introduces a new build_worker_connector_meta KV connector API,
allowing workers to send arbitrary metadata back to the scheduler-side connector.
This aligns with the existing build_connector_metadata API, which does the same in the opposite direction (scheduler -> worker).

In particular, the OffloadingConnector needs this API to notify the scheduler side about offloaded blocks, even before a request is finished.
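
To illustrate the worker -> scheduler direction described above, here is a minimal sketch that uses toy stand-ins rather than vLLM's real classes. `ToyWorkerMeta`, `ToyWorkerConnector`, `ToySchedulerConnector`, `record_offload`, `update_from_worker`, and the `offloaded_blocks` field are all hypothetical names. The worker-side method name follows the diff hunk quoted later in this thread, and its real signature and call site may differ.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ToyWorkerMeta:
    """Hypothetical worker -> scheduler payload (not vLLM's real class)."""

    offloaded_blocks: set[int] = field(default_factory=set)


class ToyWorkerConnector:
    """Hypothetical worker side: records offloads, reports them per step."""

    def __init__(self) -> None:
        self._pending: set[int] = set()

    def record_offload(self, block_id: int) -> None:
        self._pending.add(block_id)

    def build_connector_worker_meta(self) -> ToyWorkerMeta | None:
        # Return None on idle steps so no payload is attached.
        if not self._pending:
            return None
        meta = ToyWorkerMeta(offloaded_blocks=self._pending)
        self._pending = set()
        return meta


class ToySchedulerConnector:
    """Hypothetical scheduler side: learns about offloads mid-request."""

    def __init__(self) -> None:
        self.known_offloaded: set[int] = set()

    def update_from_worker(self, meta: ToyWorkerMeta | None) -> None:
        if meta is not None:
            self.known_offloaded |= meta.offloaded_blocks


worker = ToyWorkerConnector()
scheduler = ToySchedulerConnector()
worker.record_offload(3)
worker.record_offload(7)
scheduler.update_from_worker(worker.build_connector_worker_meta())

# A second step with nothing new to report yields no metadata.
idle_meta = worker.build_connector_worker_meta()
scheduler.update_from_worker(idle_meta)
```

The point of the sketch is only that the scheduler side can learn about offloaded blocks on every step, without waiting for a request to finish.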

gemini-code-assist

Code Review

This pull request introduces a new API for workers to send metadata to the scheduler, which is a valuable addition for features like the OffloadingConnector. The implementation is well-structured and includes comprehensive tests. However, I've identified a critical correctness issue in MultiConnector.update_connector_output where state mutation is not handled safely, potentially leaving objects in an inconsistent state if an error occurs. My feedback includes a code suggestion to make this part of the implementation more robust.

vllm/distributed/kv_transfer/kv_connector/v1/multi_connector.py

mergify · 2026-01-08T11:11:14Z

Hi @orozery, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy or markdownlint failing?

mypy and markdownlint are run differently in CI. If the failure is related to either of these checks, please use the following commands to run them locally:

# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10
# For markdownlint
pre-commit run --hook-stage manual markdownlint

LucasWilkinson · 2026-01-10T02:16:05Z

cc @NickLucche

NickLucche

Agreed with @orozery offline to override this new pipe to coalesce worker->scheduler data propagation for a future Connector v2 interface.
I also believe we're already in partial agreement with @njhill (cc to check out PR if I missed something here).
=> For the time being, I think this is a generic channel that unlocks some development, and the implementation looks good on my side.

vllm/distributed/kv_transfer/kv_connector/v1/multi_connector.py

NickLucche · 2026-02-10T17:27:27Z

vllm/distributed/kv_transfer/kv_connector/v1/multi_connector.py

+                c.update_connector_output(connector_output)
+        finally:
+            # restore kv_connector_worker_meta
+            connector_output.kv_connector_worker_meta = multi_connector_worker_meta
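
For readers who have not seen the surrounding code, the hunk above appears to use a swap-and-restore pattern: a shared field is temporarily replaced per sub-connector and then restored in a `finally` block. The following is a rough sketch of that pattern only, under the assumption that the combined metadata is a tuple with one slot per sub-connector. `ToyOutput`, `fan_out`, `record`, and `fail` are hypothetical names, not the PR's code.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class ToyOutput:
    """Hypothetical stand-in for the connector output object."""

    kv_connector_worker_meta: Any = None


def fan_out(output: ToyOutput, handlers: list[Callable[[ToyOutput], None]]) -> None:
    # One slot per handler, or None when no worker metadata was sent.
    combined = output.kv_connector_worker_meta
    try:
        for i, handler in enumerate(handlers):
            # Temporarily expose only this handler's slot.
            output.kv_connector_worker_meta = (
                combined[i] if combined is not None else None
            )
            handler(output)
    finally:
        # Runs on success and on error, so the caller always gets the
        # combined value back.
        output.kv_connector_worker_meta = combined


seen: list[Any] = []


def record(output: ToyOutput) -> None:
    seen.append(output.kv_connector_worker_meta)


def fail(output: ToyOutput) -> None:
    raise RuntimeError("handler failed")


out = ToyOutput(kv_connector_worker_meta=("a", "b"))
try:
    fan_out(out, [record, fail])
except RuntimeError:
    pass
```

Even though the second handler raises, `out.kv_connector_worker_meta` ends up as the original tuple, which is the inconsistency the earlier automated review was concerned about.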


NickLucche · 2026-02-10T17:40:21Z

vllm/distributed/kv_transfer/kv_connector/v1/multi_connector.py

+            if metadata_list is None and kv_connector_worker_meta is not None:
+                metadata_list = [None] * i
+            if metadata_list is not None:
+                metadata_list.append(kv_connector_worker_meta)


I think this is not immediately obvious, but I don't have a cleaner solution other than doing an if any() check at the end.
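
To make the trade-off concrete, here is a small sketch comparing the lazy back-fill from the hunk with the any() alternative mentioned above. Both functions are hypothetical illustrations (`collect_lazy` and `collect_with_any` are not names from the PR), and they operate on a plain list of per-connector values instead of real connectors.

```python
from __future__ import annotations

from typing import Any


def collect_lazy(values: list[Any]) -> list[Any] | None:
    """Allocate the result only once a non-None value shows up."""
    result: list[Any] | None = None
    for i, value in enumerate(values):
        if result is None and value is not None:
            # Back-fill None for the i earlier entries so indices still
            # line up with the input positions.
            result = [None] * i
        if result is not None:
            result.append(value)
    return result


def collect_with_any(values: list[Any]) -> list[Any] | None:
    """Alternative: always build the list, then discard it if all None."""
    return list(values) if any(v is not None for v in values) else None


sparse = collect_lazy([None, "meta-1", None])
nothing = collect_lazy([None, None])
```

Both variants return the same result for the same input. The lazy one avoids building a list when every entry is None, at the cost of being harder to read.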

hickeyma

I like this in the main, thanks @orozery.

Overall, is the aim for all worker data to be returned through KVConnectorWorkerMetadata? Hence, stats and KV events are part of the class?

hickeyma · 2026-02-11T12:00:55Z

vllm/distributed/kv_transfer/kv_connector/v1/base.py

        """
        return None

+    def build_connector_worker_meta(self) -> KVConnectorWorkerMetadata | None:


Would this read better as get_connector_worker_meta(self)?

I chose to follow the convention used for the other direction (build_connector_meta).

hickeyma · 2026-02-11T12:15:08Z

vllm/v1/worker/kv_connector_model_runner_mixin.py

            output.kv_connector_stats = kv_connector.get_kv_connector_stats()
            output.kv_cache_events = kv_connector.get_kv_connector_kv_cache_events()
+            output.kv_connector_worker_meta = kv_connector.build_connector_worker_meta()


Does this mean that output.kv_connector_stats and output.kv_cache_events will be integrated into output.kv_connector_worker_meta at some stage? Hence all future data from workers will be returned in output.kv_connector_worker_meta?

hickeyma · 2026-02-11T13:08:07Z

vllm/distributed/kv_transfer/kv_connector/v1/multi_connector.py

+            if metadata_list is None and kv_connector_worker_meta is not None:
+                metadata_list = [None] * i
+            if metadata_list is not None:
+                metadata_list.append(kv_connector_worker_meta)


This is very nice code - pythonic and terse but makes great use of indices of connectors with the tuple.

vllm/distributed/kv_transfer/kv_connector/v1/multi_connector.py

orozery · 2026-02-11T13:26:16Z

Overall, is the aim for all worker data to be returned through KVConnectorWorkerMetadata? Hence, stats and KV events are part of the class?

My thinking:
KVConnectorWorkerMetadata will remain abstract (just like the respective KVConnectorMetadata), and will basically replace KVConnectorOutput.
The consumer of KVConnectorOutput is the scheduler (an exception is KVConnectorKVEvents, which is consumed only by the LMCache scheduler-side connector, not by the scheduler).
Instead of consuming ModelRunnerOutput.kv_connector_output, the scheduler will consume the output it requires (stats, finished requests, block errors) via a single new scheduler-side API.

To ease the common use case of aggregating across all workers (as with finished_sending, finished_recving, etc.), we can create a utility class inheriting from KVConnectorWorkerMetadata that includes generic aggregation logic (a sort of generalization of today's KVOutputAggregator).
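
One possible shape for such a utility class, as a hedged sketch only: `ToyWorkerMetadata`, `SetUnionMetadata`, and `aggregate_workers` are hypothetical names, and set union is used as a stand-in rule. The actual aggregation rule for fields such as finished_sending may be different (for example, waiting until every worker has reported a request).

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from functools import reduce


class ToyWorkerMetadata(ABC):
    """Hypothetical abstract base: each subclass defines how to merge."""

    @abstractmethod
    def aggregate(self, other: ToyWorkerMetadata) -> ToyWorkerMetadata:
        ...


class SetUnionMetadata(ToyWorkerMetadata):
    """Hypothetical utility subclass: merges by set union."""

    def __init__(self, items: set[str]) -> None:
        self.items = set(items)

    def aggregate(self, other: ToyWorkerMetadata) -> SetUnionMetadata:
        if not isinstance(other, SetUnionMetadata):
            raise TypeError("can only aggregate with SetUnionMetadata")
        return SetUnionMetadata(self.items | other.items)


def aggregate_workers(
    metas: list[ToyWorkerMetadata | None],
) -> ToyWorkerMetadata | None:
    """Fold the per-worker metadata into one object, skipping None."""
    present = [m for m in metas if m is not None]
    if not present:
        return None
    return reduce(lambda a, b: a.aggregate(b), present)


merged = aggregate_workers(
    [SetUnionMetadata({"req-1"}), None, SetUnionMetadata({"req-2"})]
)
```

Keeping the merge rule on the metadata class means the generic fold in `aggregate_workers` does not need to know what any particular connector sends.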

hickeyma

Sounds good, but I would like to get a better picture of the overall direction. Do you have an RFC or design doc that I could look at?

orozery · 2026-02-12T16:20:49Z

Sounds good, but I would like to get a better picture of the overall direction. Do you have an RFC or design doc that I could look at?

I'm working on an RFC for connector API refactoring. It will include other proposed changes as well.
Note that even just with this current API, KVConnectorKVEvents can be eliminated. I don't think there's a need to wait for the RFC.

hickeyma · 2026-02-12T16:26:11Z

Note that even just with this current API, KVConnectorKVEvents can be eliminated. I don't think there's a need to wait for the RFC.

Can you explain what you mean by removed? How will events from connector be processed by vLLM then?

orozery · 2026-02-12T16:29:25Z

Can you explain what you mean by removed? How will events from connector be processed by vLLM then?

The same way they are processed today, using the take_events function.

hickeyma · 2026-02-12T16:31:17Z

The same way they are processed today, using the take_events function.

How do they get from workers to scheduler process?

orozery · 2026-02-12T16:38:43Z

How do they get from workers to scheduler process?

KVConnectorWorkerMetadata

NickLucche

LGTM, thanks for the work @orozery. I only left one nit on the test; please double-check that I haven't overlooked that case.

vllm/distributed/kv_transfer/kv_connector/v1/multi_connector.py

NickLucche · 2026-02-13T08:23:45Z

tests/v1/kv_connector/unit/test_multi_connector.py

+
+    # ----------------------------- test aggregate ----------------------------
+
+    # aggregate ({"0a"}, None) and (None, {"1a"}) -> ({"0a"}, {"1a"})


nit: missing case

aggregate (None, {"1a"}) and (None, {"1b"}) -> (None, {. . .} )

Thanks! I've now added:

# aggregate ({"0a"}, None) and ({"0b"}, None) -> ({"0a", "0b"}, None)
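
The test comments in this thread suggest slot-wise aggregation over a tuple with one entry per sub-connector, where None means "nothing from this connector". Below is a sketch of that behaviour under the assumption that each slot holds a set merged by union. `aggregate_slot` and `aggregate_multi` are hypothetical helpers, not the PR's implementation.

```python
from __future__ import annotations

from typing import Optional

Slot = Optional[set]


def aggregate_slot(a: Slot, b: Slot) -> Slot:
    """Merge one connector's slot; None acts as the identity."""
    if a is None:
        return b
    if b is None:
        return a
    return a | b


def aggregate_multi(left: tuple[Slot, ...], right: tuple[Slot, ...]) -> tuple[Slot, ...]:
    """Merge two per-connector tuples position by position."""
    if len(left) != len(right):
        raise ValueError("both tuples need one slot per connector")
    return tuple(aggregate_slot(a, b) for a, b in zip(left, right))


# The two cases spelled out in the test comments above.
mixed = aggregate_multi(({"0a"}, None), (None, {"1a"}))
same_slot = aggregate_multi(({"0a"}, None), ({"0b"}, None))
```

Under these assumptions, `mixed` is `({"0a"}, {"1a"})` and `same_slot` is `({"0a", "0b"}, None)`, matching the two comments.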

hickeyma

LGTM, thanks @orozery for improving metadata handling.

This change removes kv_cache_events as a top-level field on KVConnectorOutput and instead carries KV cache events inside a connector-specific KVConnectorWorkerMetadata subclass (LMCacheWorkerMetadata). This aligns KV event transport with the generic worker-to-scheduler metadata mechanism introduced in PR vllm-project#31964, eliminating redundant aggregation code paths. Signed-off-by: Martin Hickey <martin.hickey@ie.ibm.com>

hickeyma · 2026-02-13T16:08:43Z

Note that even just with this current API, KVConnectorKVEvents can be eliminated.

@orozery I pushed an initial draft #34522 for this. Will iterate on it a bit more for now.

This commit introduces a new build_worker_connector_meta KV connector API allowing workers to send arbitrary metadata back to the scheduler-side connector. This aligns with the existing build_connector_metadata API, which does the same in the opposite direction (scheduler -> worker). Signed-off-by: Or Ozeri <oro@il.ibm.com>

Signed-off-by: Or Ozeri <oro@il.ibm.com> Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>

orozery requested review from ApostaC and NickLucche as code owners January 8, 2026 11:04

mergify bot added v1 kv-connector labels Jan 8, 2026

gemini-code-assist bot reviewed Jan 8, 2026

orozery force-pushed the connector-worker-metadata branch from 50468f2 to bc8ade9 Compare January 8, 2026 11:11

orozery force-pushed the connector-worker-metadata branch from bc8ade9 to d1de052 Compare January 8, 2026 11:13

LucasWilkinson assigned NickLucche Jan 10, 2026

orozery mentioned this pull request Jan 10, 2026

[BugFix] Wait for compute before offloading KV to CPU #31341

Merged

orozery mentioned this pull request Feb 2, 2026

[RFC]: Progressive KV Cache CPU Onloading #33526

Open

1 task

NickLucche reviewed Feb 10, 2026

hickeyma reviewed Feb 11, 2026

hickeyma reviewed Feb 12, 2026

NickLucche approved these changes Feb 13, 2026

hickeyma approved these changes Feb 13, 2026

orozery force-pushed the connector-worker-metadata branch from d1de052 to 4068559 Compare February 13, 2026 10:07

orozery added the ready label Feb 13, 2026

hickeyma mentioned this pull request Feb 13, 2026

[Refactor][KVConnector]: Move KV Cache Events into KVConnectorWorkerMetadata #34522

Open

5 tasks

orozery force-pushed the connector-worker-metadata branch from 7db7632 to 263ba3f Compare March 5, 2026 09:09

Merge branch 'main' into connector-worker-metadata

c8b9af7

NickLucche enabled auto-merge (squash) March 5, 2026 09:18

orozery added 5 commits March 6, 2026 07:24

Merge branch 'main' into connector-worker-metadata

dbaf1d6

Merge branch 'main' into connector-worker-metadata

e73c52b

Merge branch 'main' into connector-worker-metadata

45a4a14

Merge branch 'main' into connector-worker-metadata

6eea1bb

Merge branch 'main' into connector-worker-metadata

262348b

NickLucche merged commit a1a3523 into vllm-project:main Mar 11, 2026
49 checks passed

wendyliu235 pushed a commit to wendyliu235/vllm-public that referenced this pull request Mar 18, 2026

[KVConnector] Support worker -> scheduler metadata (vllm-project#31964)

6798c79

Signed-off-by: Or Ozeri <oro@il.ibm.com> Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>

fxdawnn pushed a commit to fxdawnn/vllm that referenced this pull request Mar 19, 2026

[KVConnector] Support worker -> scheduler metadata (vllm-project#31964)

c121a14

Signed-off-by: Or Ozeri <oro@il.ibm.com> Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>

