[V1] Support multiple kv connectors #17564

mgoin · 2025-05-01T23:26:39Z

Co-authored with @njhill.

We use the kv_connector_extra_config: dict[str, Any] field already present in KVTransferConfig to hold an ordered list of connector kwargs under the "connectors" key. The new MultiConnector then creates one connector from each item in KVTransferConfig.kv_connector_extra_config["connectors"]:

import copy

class MultiConnector(KVConnectorBase_V1):

    def __init__(self, vllm_config: "VllmConfig", role: KVConnectorRole):
        super().__init__(vllm_config=vllm_config, role=role)
        self._connectors = []
        # Child connector kwargs live under extra_config["connectors"], in config order.
        ktcs = vllm_config.kv_transfer_config.kv_connector_extra_config.get("connectors")
        assert ktcs is not None
        for ktc in ktcs:
            # Shallow-copy the parent config and swap in this child's KVTransferConfig.
            temp_config = copy.copy(vllm_config)
            temp_config.kv_transfer_config = KVTransferConfig(**ktc)
            self._connectors.append(KVConnectorFactory.create_connector_v1(temp_config, role))
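The per-child config handling in the constructor above relies on a shallow copy followed by rebinding a single attribute, so the parent config keeps its own MultiConnector settings. The sketch below illustrates that pattern in isolation. `ToyVllmConfig`, `ToyTransferConfig`, and `split_configs` are hypothetical stand-ins invented for this example; they are not vLLM classes.

```python
import copy
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ToyTransferConfig:
    # Hypothetical stand-in for KVTransferConfig; only the fields used here.
    kv_connector: str
    kv_role: str
    kv_connector_extra_config: dict[str, Any] = field(default_factory=dict)


@dataclass
class ToyVllmConfig:
    # Hypothetical stand-in for VllmConfig.
    model: str
    kv_transfer_config: ToyTransferConfig


def split_configs(parent: ToyVllmConfig) -> list[ToyVllmConfig]:
    """Return one shallow-copied config per entry under "connectors"."""
    entries = parent.kv_transfer_config.kv_connector_extra_config.get("connectors")
    if entries is None:
        raise ValueError("expected a 'connectors' list in kv_connector_extra_config")
    children = []
    for entry in entries:
        child = copy.copy(parent)  # shallow copy: other attributes stay shared
        child.kv_transfer_config = ToyTransferConfig(**entry)  # rebinds only on the copy
        children.append(child)
    return children


parent = ToyVllmConfig(
    model="toy-model",
    kv_transfer_config=ToyTransferConfig(
        kv_connector="MultiConnector",
        kv_role="kv_both",
        kv_connector_extra_config={
            "connectors": [
                {"kv_connector": "NixlConnector", "kv_role": "kv_both"},
                {
                    "kv_connector": "SharedStorageConnector",
                    "kv_role": "kv_both",
                    "kv_connector_extra_config": {"shared_storage_path": "local_storage"},
                },
            ]
        },
    ),
)

children = split_configs(parent)
print([c.kv_transfer_config.kv_connector for c in children])
print(parent.kv_transfer_config.kv_connector)
```

Because the copy is shallow, rebinding `kv_transfer_config` on a child does not touch the parent, while every other attribute is still shared by reference.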

Example usage:

CUDA_VISIBLE_DEVICES=0 NIXL_ROLE="SENDER" vllm serve meta-llama/Llama-3.2-1B-Instruct \
    --port 8100 \
    --enforce-eager \
    --disable-log-requests \
    --kv-transfer-config '{"kv_connector":"MultiConnector","kv_role":"kv_both","kv_connector_extra_config":{"connectors":[{"kv_connector":"NixlConnector","kv_role":"kv_both"},{"kv_connector":"SharedStorageConnector","kv_role":"kv_both","kv_connector_extra_config":{"shared_storage_path":"local_storage"}}]}}'
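
The nested JSON passed to --kv-transfer-config is easy to mistype by hand. As a minimal sketch, the same value can be built from plain Python dicts and serialized with the standard library. The variable names are illustrative only, and nothing here calls vLLM.

```python
import json
import shlex

# Child connectors, listed in the order MultiConnector should consult them.
connectors = [
    {"kv_connector": "NixlConnector", "kv_role": "kv_both"},
    {
        "kv_connector": "SharedStorageConnector",
        "kv_role": "kv_both",
        "kv_connector_extra_config": {"shared_storage_path": "local_storage"},
    },
]

multi_config = {
    "kv_connector": "MultiConnector",
    "kv_role": "kv_both",
    "kv_connector_extra_config": {"connectors": connectors},
}

# Compact separators reproduce the whitespace-free form used on the command line.
kv_transfer_arg = json.dumps(multi_config, separators=(",", ":"))

# shlex.quote makes the value safe to paste into a POSIX shell command.
print("--kv-transfer-config", shlex.quote(kv_transfer_arg))
```

Building the list in Python keeps the connector order explicit, which matters because the order determines which connector is consulted first.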

github-actions · 2025-05-01T23:26:48Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small, essential subset of CI tests to catch errors quickly. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

mgoin · 2025-05-06T12:13:12Z

Depends on #17686 landing first, since this PR relies on the KVConnectorBase_V1 API changes and additions made there.

mergify · 2025-05-12T16:47:28Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @mgoin.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

tlrmchlsmth

LGTM. My main comment is that we should document the load behavior, i.e. make sure the connector precedence order is written down somewhere.

maobaolong · 2025-05-14T06:41:41Z

@mgoin @njhill Thanks for this feature! It looks useful. I want to use connectorA as the P/D transfer connector and connectorB as an offload-reuse connector. How should I configure them so that connectorA and connectorB co-exist?

njhill · 2025-05-14T15:20:17Z

> @mgoin @njhill Thanks for this feature! It looks useful. I want to use connectorA as the P/D transfer connector and connectorB as an offload-reuse connector. How should I configure them so that connectorA and connectorB co-exist?

@maobaolong We'll add better docs/comments for this, but the initial logic is to:

- Load from the first connector that advertises available tokens (in the order specified in the config)
- Save to all connectors
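
The two rules above (load from the first connector that advertises tokens, save to every connector) can be sketched with stub objects. `StubConnector`, `pick_loader`, and `save_to_all` are hypothetical names made up for this illustration; the real MultiConnector implements the KVConnectorBase_V1 interface, which is not reproduced here.

```python
from typing import Optional


class StubConnector:
    """Hypothetical connector that advertises a fixed number of loadable tokens."""

    def __init__(self, name: str, available_tokens: int):
        self.name = name
        self.available_tokens = available_tokens
        self.saved: list[str] = []

    def num_loadable_tokens(self, request_id: str) -> int:
        return self.available_tokens

    def save(self, request_id: str) -> None:
        self.saved.append(request_id)


def pick_loader(connectors: list[StubConnector], request_id: str) -> Optional[StubConnector]:
    """Return the first connector, in config order, that offers any tokens."""
    for connector in connectors:
        if connector.num_loadable_tokens(request_id) > 0:
            return connector
    return None  # nothing to load; the request is computed from scratch


def save_to_all(connectors: list[StubConnector], request_id: str) -> None:
    """Saving is not selective: every connector gets the request."""
    for connector in connectors:
        connector.save(request_id)


connectors = [
    StubConnector("first", available_tokens=0),
    StubConnector("second", available_tokens=16),
    StubConnector("third", available_tokens=32),
]

chosen = pick_loader(connectors, "req-1")
save_to_all(connectors, "req-1")
print(chosen.name if chosen else None)
print([c.saved for c in connectors])
```

Note that "third" advertises more tokens than "second" but is not chosen: under this rule, position in the config list decides precedence, not the token count.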

maobaolong · 2025-05-14T15:38:01Z

> > @mgoin @njhill Thanks for this feature! It looks useful. I want to use connectorA as the P/D transfer connector and connectorB as an offload-reuse connector. How should I configure them so that connectorA and connectorB co-exist?
>
> @maobaolong We'll add better docs/comments for this, but the initial logic is to:
>
> - Load from the first connector that advertises available tokens (in the order specified in the config)
> - Save to all connectors

@njhill Thanks for the explanation. I'd like to know more about the motivation and the intended scenario; some more description would help. So I guess this PR cannot solve the co-existence issue for a P/D transfer connector and an offload-reuse connector?

njhill · 2025-05-14T19:42:38Z

> > > @mgoin @njhill Thanks for this feature! It looks useful. I want to use connectorA as the P/D transfer connector and connectorB as an offload-reuse connector. How should I configure them so that connectorA and connectorB co-exist?
> >
> > @maobaolong We'll add better docs/comments for this, but the initial logic is to:
> >
> > - Load from the first connector that advertises available tokens (in the order specified in the config)
> > - Save to all connectors
>
> @njhill Thanks for the explanation. I'd like to know more about the motivation and the intended scenario; some more description would help. So I guess this PR cannot solve the co-existence issue for a P/D transfer connector and an offload-reuse connector?

@maobaolong This is intended to be a starting point. It can work with P/D + offload-reuse if the P/D connector uses the kv transfer params passed in the request to decide whether it should offer tokens to load. That is how the current NixlConnector works: it only returns a nonzero token count from get_num_new_matched_tokens if the kv transfer params in the request contain do_remote_prefill=True. Otherwise the load falls back to the second connector, which can be the one that reuses offloaded KV cache.

We can extend it with other loading selection logic if you have additional ideas...
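
As a rough illustration of the fallback described in this comment, the sketch below gates a P/D-style connector on a `do_remote_prefill` flag in the request's kv transfer params and lets an offload-style connector serve everything else. Both classes, the `matched_tokens` method, and `choose` are hypothetical simplifications written for this example; they do not mirror the actual NixlConnector code or the KVConnectorBase_V1 signatures.

```python
from typing import Any, Optional


class RemotePrefillConnector:
    """Hypothetical P/D connector: offers tokens only for remote-prefill requests."""

    def matched_tokens(self, prompt_len: int,
                       kv_transfer_params: Optional[dict[str, Any]]) -> int:
        if kv_transfer_params and kv_transfer_params.get("do_remote_prefill"):
            return prompt_len
        return 0  # decline, so the next connector in the list is consulted


class OffloadCacheConnector:
    """Hypothetical offload connector: offers whatever prefix it has cached."""

    def __init__(self, cached_prefix_len: int):
        self.cached_prefix_len = cached_prefix_len

    def matched_tokens(self, prompt_len: int,
                       kv_transfer_params: Optional[dict[str, Any]]) -> int:
        return min(prompt_len, self.cached_prefix_len)


def choose(connectors: list, prompt_len: int,
           kv_transfer_params: Optional[dict[str, Any]]) -> tuple[Optional[str], int]:
    """Return (connector class name, token count) for the first connector offering tokens."""
    for connector in connectors:
        count = connector.matched_tokens(prompt_len, kv_transfer_params)
        if count > 0:
            return type(connector).__name__, count
    return None, 0


# P/D connector first, offload-reuse connector second.
ordered = [RemotePrefillConnector(), OffloadCacheConnector(cached_prefix_len=64)]

remote_choice = choose(ordered, 100, {"do_remote_prefill": True})
local_choice = choose(ordered, 100, None)
print(remote_choice)
print(local_choice)
```

The design point is that the first connector must be able to decline a request; a connector that always advertises tokens would shadow every connector listed after it.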

mergify bot added ci/build v1 labels May 2, 2025

njhill force-pushed the multi-kv-connectors branch from 02b54bd to f6a48c5 Compare May 3, 2025 15:36

mergify bot added the frontend label May 3, 2025

mgoin and others added 11 commits May 3, 2025 08:42

[V1] Support multiple kv connectors

3a70eda

Signed-off-by: mgoin <[email protected]> Signed-off-by: Nick Hill <[email protected]>

Example script

89a5788

Signed-off-by: mgoin <[email protected]>

.

4fa62d4

Signed-off-by: mgoin <[email protected]>

Add test

15ca542

Signed-off-by: mgoin <[email protected]>

make mypy happy

cd5af12

Signed-off-by: mgoin <[email protected]>

move MultiKVConnectorMetadata to multi_connector.py

5a9a314

Signed-off-by: Nick Hill <[email protected]>

minor simplifications

2f5e537

Signed-off-by: Nick Hill <[email protected]>

Remove script

014cb2c

Signed-off-by: mgoin <[email protected]>

michael inprogress

7370d83

Signed-off-by: Nick Hill <[email protected]>

Make sure we pop requests from connector dict

df0a82b

Signed-off-by: mgoin <[email protected]>

req_id -> request_id

8423135

Signed-off-by: mgoin <[email protected]>

njhill force-pushed the multi-kv-connectors branch from f6a48c5 to 8423135 Compare May 3, 2025 15:44

Update with better test

551fff1

Signed-off-by: mgoin <[email protected]>

mgoin changed the title ~~[WIP] Support multiple kv connectors~~ [V1] Support multiple kv connectors May 6, 2025

mgoin marked this pull request as ready for review May 6, 2025 12:11

mgoin marked this pull request as draft May 6, 2025 12:12

njhill added 3 commits May 6, 2025 11:41

add comment to test

10a26b6

Signed-off-by: Nick Hill <[email protected]>

Handle other new methods and latest API changes

35f3748

Signed-off-by: Nick Hill <[email protected]>

update get_finished() method with new arg

ff4e083

Signed-off-by: Nick Hill <[email protected]>

njhill pushed a commit to neuralmagic/vllm that referenced this pull request May 10, 2025

[V1] Support multiple kv connectors

ea17042

From PR vllm-project#17564 Co-authored-by: Nick Hill <[email protected]> Signed-off-by: mgoin <[email protected]> Signed-off-by: Nick Hill <[email protected]>

mergify bot added the needs-rebase label May 12, 2025

mgoin added 2 commits May 12, 2025 17:00

Merge branch 'main' into multi-kv-connectors

a0f7136

Signed-off-by: mgoin <[email protected]>

Move test to v1/kv_connector/unit

fc65a18

Signed-off-by: mgoin <[email protected]>

mergify bot removed the needs-rebase label May 12, 2025

mgoin marked this pull request as ready for review May 12, 2025 17:03

njhill and others added 2 commits May 12, 2025 10:39

update KVTransferParams typing

e5e1191

Signed-off-by: Nick Hill <[email protected]>

Fix typing issue for get_finished

8537d93

Signed-off-by: mgoin <[email protected]>

tlrmchlsmth approved these changes May 12, 2025

mgoin added the ready label May 13, 2025

remove @dataclass from KVConnectorMetadata

9a767d1

This was there originally but inadvertently dropped Signed-off-by: Nick Hill <[email protected]>

Merge branch 'main' into multi-kv-connectors

59ba66f

Add class docstring

abf38bf

Signed-off-by: Nick Hill <[email protected]>

simon-mo merged commit 2142035 into vllm-project:main May 14, 2025
62 of 64 checks passed

njhill deleted the multi-kv-connectors branch May 15, 2025 01:31

maobaolong mentioned this pull request May 20, 2025

[V0] Support multiple kv connectors #18395

Closed

maobaolong mentioned this pull request Jul 11, 2025

[Feature]: co-exist of multiply kv connector #16397

Closed

1 task
