fix multiconnector for multi connector use push kv connector by liziyu179 · Pull Request #31144 · vllm-project/vllm

liziyu179 · 2025-12-22T12:13:59Z

Purpose

a bugfix for multi connector use push kv connector，pass real blocks to kv connector.

Fixes #31145.

Test Plan

Test Result

Essential Elements of an Effective PR Description Checklist

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
The test plan, such as providing test command.
The test results, such as pasting the results comparison before and after, or e2e results
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
(Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

gemini-code-assist

Code Review

This pull request addresses a bug in the MultiConnector by correctly passing real KV cache blocks to connectors configured as KV producers with PUT_ASYNC or PUT send types. The change ensures that all relevant connectors receive the necessary data for their operations, improving the correctness of KV transfer in multi-connector setups. The logic appears sound and directly resolves the described issue.

mergify · 2025-12-22T12:32:12Z

Hi @liziyu179, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy or markdownlint failing?

mypy and markdownlint are run differently in CI. If the failure is related to either of these checks, please use the following commands to run them locally:

# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10
# For markdownlint
pre-commit run --hook-stage manual markdownlint

github-actions · 2025-12-22T12:33:17Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run fastcheck CI which starts running only a small and essential subset of CI tests to quickly catch errors.

You ask your reviewers to trigger select CI tests on top of fastcheck CI.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

If you have any questions, please reach out to us on Slack at https://slack.vllm.ai.

🚀

Signed-off-by: liziyu <liziyu16@huawei.com>

LCAIZJ · 2026-01-04T02:44:40Z

@NickLucche Could you please take a look at the issue addressed by this PR when you have time?

NickLucche

Thanks for spotting this @liziyu179 !

Personally, I think this level of detail should be abstracted away from the multiconnector, but more generally I am not sure that prioritizing push connectors (with connector-specific format) is the way to go.
I also understand that given the inherent flexibility of the interface, cases like this where some connectors are not "compatible" may arise again.
Perhaps we should consider again some "priority" level of make the inherent one more clear?

@njhill do you have a better idea on how the MultiConnector behavior should be modified to accomodate this case?

cc @sdavidbd who may also be interested in the discussion

sdavidbd · 2026-01-07T13:40:45Z

I agree with @NickLucche that the MultiConnector implementation should remain agnostic to any specific connector’s internal semantics.

As I understand it, MultiConnector today supports loading a given request’s KV from a single connector (specifically, the first connector that reports a non-zero match via get_num_new_matched_tokens()), while supporting save/send via multiple connectors.

On the API semantics: I believe update_state_after_alloc() was designed to be tightly coupled to get_num_new_matched_tokens() and intended for the load path. The num_external_tokens parameter is explicitly “the number of tokens that will be loaded from the external KV cache,” (i.e., the value returned by get_num_new_matched_tokens()). In Scheduler::schedule, comments in the code also frame update_state_after_alloc() as load-related (e.g., “used to determine if a load is needed”).

Accordingly, if a connector is using update_state_after_alloc() as the mechanism to derive save operations, that assumption will not hold under MultiConnector’s current contract: when there is no external hit, update_state_after_alloc() will not be invoked at all, by design. (Related: it seems the NixlConnector's CPU save path will also run into the same issue.)

For deriving save/send work, the intended mechanism appears to be build_connector_meta(), which receives SchedulerOutput and therefore has access to the per-request metadata accumulated during scheduling (including any metadata contributed via update_state_after_alloc() when applicable).

Concretely, I think the correct fix here is to derive save/send work in build_connector_meta().

That said, I agree the name update_state_after_alloc() can be misleading, since it reads like a general “post-allocation hook” for both load and save. I would suggest either renaming it to make the intent explicit (e.g., update_state_for_load_after_alloc()), or considering deprecating it altogether - since in principle build_connector_meta() can be sufficient to derive both load and save work.

njhill · 2026-01-09T22:47:12Z

I agree with what @sdavidbd wrote, the intention is for get_num_new_matched_tokens and update_state_after_alloc to be used on the loading/receiving side only, and to use other hooks for the sending/saving.

@liziyu179 it would help if you could provide an example of a "push" connector implementation of the kind you're talking about here.

Related: it seems the NixlConnector's CPU save path will also run into the same issue.

@sdavidbd I don't quite understand what you mean by this?

sdavidbd · 2026-01-13T11:19:32Z

Related: it seems the NixlConnector's CPU save path will also run into the same issue.

@sdavidbd I don't quite understand what you mean by this?

@njhill I was referring to the NixlConnector path that offloads KV to a CPU buffer (added for TPU support in PR #18293). In that implementation, the connector appears to mark/derive the save work in update_state_after_alloc(). Under MultiConnector, update_state_after_alloc() is not invoked at all on do_remote_decode, so relying on it to schedule CPU offload/save work would not be reliable when MultiConnector is used.

liziyu179 requested review from ApostaC and NickLucche as code owners December 22, 2025 12:14

mergify bot added the kv-connector label Dec 22, 2025

gemini-code-assist bot reviewed Dec 22, 2025

View reviewed changes

liziyu179 mentioned this pull request Dec 22, 2025

[Bug]: [P/D] multi-connector cannot be used together with P2pNcclConnector that uses a put mode(push kv cache from P node to D node) #31145

Open

1 task

fix multiconnector for multi connector use push kv connector

9ce2335

Signed-off-by: liziyu <liziyu16@huawei.com>

liziyu179 force-pushed the fix_multi_connector branch from 9d1de9e to 9ce2335 Compare December 23, 2025 01:02

NickLucche reviewed Jan 5, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix multiconnector for multi connector use push kv connector#31144

fix multiconnector for multi connector use push kv connector#31144
liziyu179 wants to merge 1 commit intovllm-project:mainfrom
liziyu179:fix_multi_connector

liziyu179 commented Dec 22, 2025 •

edited by github-actions bot

Loading

Uh oh!

gemini-code-assist bot left a comment

Uh oh!

mergify bot commented Dec 22, 2025

Uh oh!

github-actions bot commented Dec 22, 2025

Uh oh!

LCAIZJ commented Jan 4, 2026

Uh oh!

NickLucche left a comment

Uh oh!

sdavidbd commented Jan 7, 2026

Uh oh!

njhill commented Jan 9, 2026

Uh oh!

sdavidbd commented Jan 13, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Uh oh!

Conversation

liziyu179 commented Dec 22, 2025 • edited by github-actions bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Purpose

Test Plan

Test Result

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

mergify bot commented Dec 22, 2025

Uh oh!

github-actions bot commented Dec 22, 2025

Uh oh!

LCAIZJ commented Jan 4, 2026

Uh oh!

NickLucche left a comment

Choose a reason for hiding this comment

Uh oh!

sdavidbd commented Jan 7, 2026

Uh oh!

njhill commented Jan 9, 2026

Uh oh!

sdavidbd commented Jan 13, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

liziyu179 commented Dec 22, 2025 •

edited by github-actions bot

Loading