[KVConnector][1/N] PP-aware handshake aggregation and intermediate-PP output plumbing#43720

Merged
WoosukKwon merged 4 commits into vllm-project:main from zixi-qi:pr0/pp-disagg-foundation
Jun 5, 2026

Conversation

@zixi-qi

@zixi-qi zixi-qi commented May 26, 2026

Collaborator

Purpose

The goal of this PR is to make the KV connector, engine, model runner, and GPU worker PP-aware, laying the foundation for supporting PP (pipeline parallelism) + PD disaggregation.

  • This PR is connector-agnostic: the changes here are needed for both the NIXL connector and the Mooncake connector to support PP + PD.
  • This PR is a pure PP-aware refactor and does not introduce any behavior changes.

Changes

  • Widen `TransferTopology._engines` to `dict[(engine_id, pp_rank), …]`; extend `EngineTransferInfo` with `remote_pp_rank` / `start_layer` / `end_layer` (default `0`). Existing public methods keep their names and call shapes; PP-aware helpers gain an optional `remote_pp_rank: int = 0`.
  • New `SupportsPP` ABC next to `SupportsHMA` (abstract `set_xfer_handshake_metadata_pp_aware` + `supports_pp(connector)` helper). `MultiConnector` inherits `SupportsPP` and dispatches per child.
  • Worker emits `{(pp_rank, tp_rank): metadata}`. Engine core converts at the connector boundary: PP-aware connectors get the tuple-keyed dict; non-PP connectors get the unwrapped `{tp_rank: metadata}` via the existing `set_xfer_handshake_metadata`.
  • Non-last-PP-rank returns `EMPTY_MODEL_RUNNER_OUTPUT` (optionally carrying `kv_connector_output`) instead of `None`, so the KV output aggregator's non-None invariant holds.
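
The key widening in the first bullet can be illustrated with a small sketch. This is not the PR's code: `EngineInfoSketch` and `TopologySketch` are hypothetical stand-ins for `EngineTransferInfo` and `TransferTopology`, and the `register` / `get` methods are assumptions made for illustration. Only the `(engine_id, pp_rank)` key shape, the three new fields, and the `remote_pp_rank: int = 0` default come from the description above.

```python
from dataclasses import dataclass


@dataclass
class EngineInfoSketch:
    """Hypothetical stand-in for EngineTransferInfo; new fields default to 0."""

    remote_pp_rank: int = 0
    start_layer: int = 0
    end_layer: int = 0


class TopologySketch:
    """Hypothetical stand-in for TransferTopology, keyed by (engine_id, pp_rank)."""

    def __init__(self) -> None:
        self._engines: dict[tuple[str, int], EngineInfoSketch] = {}

    def register(self, engine_id: str, info: EngineInfoSketch) -> None:
        # One entry per remote pipeline stage of the same engine.
        self._engines[(engine_id, info.remote_pp_rank)] = info

    def get(self, engine_id: str, remote_pp_rank: int = 0) -> EngineInfoSketch:
        # Callers that never pass remote_pp_rank keep resolving the pp_rank 0 entry.
        return self._engines[(engine_id, remote_pp_rank)]


topology = TopologySketch()
topology.register("prefill-0", EngineInfoSketch())  # non-PP caller, all defaults
topology.register(
    "prefill-0",
    EngineInfoSketch(remote_pp_rank=1, start_layer=16, end_layer=32),
)

print(topology.get("prefill-0"))                    # entry for pp_rank 0
print(topology.get("prefill-0", remote_pp_rank=1))  # entry for pp_rank 1
```

Because the extra argument is optional and defaults to `0`, a call site written before the change resolves the same entry it did when the dict was keyed by `engine_id` alone, which is consistent with the "no behavior changes" claim.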

Test Plan

  • New unit tests: `test_transfer_topology_sharded.py`, `test_handshake_pp_aggregation.py`, `test_pp_intermediate_output.py` — 9 passed.
  • Existing `test_nixl_connector.py` (59 passed, 2 skipped — same skips on main) and `test_mooncake_connector.py` pass unchanged.
  • `pre-commit run --all-files` clean.
  • Ran PD disagg and non-PD + PP cases end to end and verified everything still works as expected.

AI-assisted PR. The submitter has reviewed every line.

@zixi-qi zixi-qi changed the title from "[KVConnector] Foundation: PP-aware handshake aggregation and intermediate-PP output plumbing" to "[KVConnector] PP-aware handshake aggregation and intermediate-PP output plumbing" May 27, 2026
@njhill

njhill commented May 27, 2026

Member

@zixi-qi fyi I've also included the "intermediate-PP output plumbing" part of this in #43732 which cleans up some of the PP kv connector handling more generally.

@zixi-qi

zixi-qi commented May 27, 2026

Collaborator Author

> @zixi-qi fyi I've also included the "intermediate-PP output plumbing" part of this in #43732 which cleans up some of the PP kv connector handling more generally.

Sounds good! I will rebase once your PR merges

@zixi-qi

zixi-qi commented May 27, 2026

Collaborator Author

@zixi-qi zixi-qi marked this pull request as ready for review May 27, 2026 23:01
@mergify

mergify Bot commented May 29, 2026

Contributor

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @zixi-qi.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify Bot added the needs-rebase label May 29, 2026
@zixi-qi zixi-qi force-pushed the pr0/pp-disagg-foundation branch from e6532ef to c1336b2 May 29, 2026 06:19
@mergify mergify Bot removed the needs-rebase label May 29, 2026

@njhill njhill left a comment

Member

Thanks @zixi-qi, this looks mostly good to me.

However I think the SupportsPP interface is a bit heavy, it would be cleaner to manage this inside the KVConnectorBase_V1 class, and just check for existence of the pp_aware method on self rather than using a marker class.

Would be good for @NickLucche to bless the TransferTopology updates too when he has time!

@zixi-qi

zixi-qi commented May 30, 2026

Collaborator Author

> Thanks @zixi-qi, this looks mostly good to me.
>
> However I think the SupportsPP interface is a bit heavy, it would be cleaner to manage this inside the KVConnectorBase_V1 class, and just check for existence of the pp_aware method on self rather than using a marker class.
>
> Would be good for @NickLucche to bless the TransferTopology updates too when he has time!

Updated to remove SupportsPP interface, thanks for the suggestion!

@zixi-qi zixi-qi changed the title from "[KVConnector] PP-aware handshake aggregation and intermediate-PP output plumbing" to "[KVConnector][1/N] PP-aware handshake aggregation and intermediate-PP output plumbing" May 30, 2026

@njhill njhill left a comment

Member

Thanks @zixi-qi that looks much cleaner. I think we can clean up some comments further though.

One question - what happens if a non-PP-aware connector is used in a PP context. Can that still be valid? If not, we should think about where to validate this.

Comment thread vllm/v1/engine/core.py Outdated
Comment thread vllm/distributed/kv_transfer/kv_connector/v1/multi_connector.py Outdated
Comment thread vllm/v1/worker/gpu_worker.py Outdated
@zixi-qi

zixi-qi commented Jun 3, 2026

Collaborator Author

> Thanks @zixi-qi that looks much cleaner. I think we can clean up some comments further though.
>
> One question - what happens if a non-PP-aware connector is used in a PP context. Can that still be valid? If not, we should think about where to validate this.

Thanks for the suggestion! I have cleaned up the excessive comments and will check the stacked PRs to do the same.

> non-PP-aware connector is used in a PP context.

Good catch on this! I added validation to the default `set_xfer_handshake_metadata_pp_aware` implementation that throws if `pp_rank > 0` in this scenario.
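
A minimal sketch of what such a default could look like, under stated assumptions: `ConnectorBaseSketch` is a hypothetical stand-in rather than vLLM's `KVConnectorBase_V1`, and the argument types, the `ValueError` choice, and the `handshake_metadata` attribute are illustrative. Only the two method names and the "throw if `pp_rank > 0`" rule come from this thread.

```python
from typing import Any


class ConnectorBaseSketch:
    """Hypothetical base class; not the real KVConnectorBase_V1."""

    def set_xfer_handshake_metadata(self, metadata: dict[int, Any]) -> None:
        # Existing tp_rank-keyed entry point that non-PP connectors implement.
        self.handshake_metadata = metadata

    def set_xfer_handshake_metadata_pp_aware(
        self, metadata: dict[tuple[int, int], Any]
    ) -> None:
        # Default for connectors that are not PP-aware: refuse any entry from
        # a pipeline stage other than 0, then fall back to the tp_rank-keyed
        # method with the pp_rank dropped from each key.
        unwrapped: dict[int, Any] = {}
        for (pp_rank, tp_rank), value in metadata.items():
            if pp_rank > 0:
                raise ValueError(
                    f"connector is not PP-aware but got pp_rank={pp_rank}"
                )
            unwrapped[tp_rank] = value
        self.set_xfer_handshake_metadata(unwrapped)


connector = ConnectorBaseSketch()
connector.set_xfer_handshake_metadata_pp_aware({(0, 0): "tp0", (0, 1): "tp1"})
print(connector.handshake_metadata)

try:
    connector.set_xfer_handshake_metadata_pp_aware({(1, 0): "stage1-tp0"})
except ValueError as exc:
    print(f"rejected: {exc}")
```

A PP-aware connector would override the tuple-keyed method and consume the full dict, so the check only fires for connectors that rely on the default.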

@zixi-qi

zixi-qi commented Jun 3, 2026

Collaborator Author

cc @NickLucche would be great if you could take a look as well :)

@zixi-qi zixi-qi added the `ready` label (ONLY add when PR is ready to merge/full CI is needed) Jun 3, 2026
zixi-qi added 3 commits June 4, 2026 22:19
Co-authored-by: Claude
Signed-off-by: zixi-qi <zixi@inferact.ai>
… base

Co-authored-by: Claude
Signed-off-by: zixi-qi <zixi@inferact.ai>
…fault

Co-authored-by: Claude
Signed-off-by: zixi-qi <zixi@inferact.ai>
Co-authored-by: Claude
Signed-off-by: zixi-qi <zixi@inferact.ai>
@zixi-qi zixi-qi force-pushed the pr0/pp-disagg-foundation branch from d9ba472 to 97fdc87 June 4, 2026 22:23
@WoosukKwon WoosukKwon merged commit 96229fa into vllm-project:main Jun 5, 2026
70 of 72 checks passed
JisoLya pushed a commit to JisoLya/vllm that referenced this pull request Jun 5, 2026
… output plumbing (vllm-project#43720)

Signed-off-by: zixi-qi <zixi@inferact.ai>
Signed-off-by: JisoLya <523420504@qq.com>
knight0528 pushed a commit to knight0528/vllm that referenced this pull request Jun 8, 2026
… output plumbing (vllm-project#43720)

Signed-off-by: zixi-qi <zixi@inferact.ai>
ekagra-ranjan pushed a commit to ekagra-ranjan/vllm that referenced this pull request Jun 9, 2026
… output plumbing (vllm-project#43720)

Signed-off-by: zixi-qi <zixi@inferact.ai>
Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>
waqahmed-amd-fi pushed a commit to waqahmed-amd-fi/vllm that referenced this pull request Jun 10, 2026
… output plumbing (vllm-project#43720)

Signed-off-by: zixi-qi <zixi@inferact.ai>
Signed-off-by: Waqar Ahmed <waqar.ahmed@amd.com>
Saddss pushed a commit to Saddss/vllm that referenced this pull request Jun 14, 2026
… output plumbing (vllm-project#43720)

Signed-off-by: zixi-qi <zixi@inferact.ai>

Labels

kv-connector, ready (ONLY add when PR is ready to merge/full CI is needed), v1

3 participants