[Feature] Bagel: Support tp+cfg parallel using mooncake transfer engine connector by natureofnature · Pull Request #2705 · vllm-project/vllm-omni

natureofnature · 2026-04-12T10:29:06Z


Progress

PR1: [Omni Connector] Omni Transfer Engine Connector: Enable 1-receiver-to-N-senders to support Bagel TP/CFG parallel #2731
PR2: Support tp+cfg parallel using mooncake transfer engine connector (This PR)

Purpose

Support TP + CFG parallel with the Mooncake transfer engine connector.
Fix the loss of TP heads when using the connector.
Taking i2i with TP2 (AR) -> TP2 (DIT) over shared memory as an example (50ae1de), the output image looks like this:

fix: build per-stage sampling_params_list and pass cfg_text/img_scale in serving_chat
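
The TP head-loss fix listed above can be pictured with a small sketch. It assumes (this is not taken from the PR's code) that under tensor parallelism each rank holds a contiguous slice of the KV heads, so a receiver that reads from only one sender rank sees only part of the heads. The helpers `slice_heads` and `merge_heads` are hypothetical names used for illustration only.

```python
# Hypothetical illustration: KV heads sharded across TP ranks.
# Function names and the contiguous layout are assumptions, not vllm-omni APIs.


def slice_heads(heads: list[str], tp_size: int, rank: int) -> list[str]:
    """Return the contiguous block of heads owned by `rank` under TP sharding."""
    if len(heads) % tp_size != 0:
        raise ValueError("number of heads must be divisible by tp_size")
    per_rank = len(heads) // tp_size
    return heads[rank * per_rank:(rank + 1) * per_rank]


def merge_heads(shards: dict[int, list[str]]) -> list[str]:
    """Concatenate per-rank shards in rank order to rebuild the full head list."""
    merged: list[str] = []
    for rank in sorted(shards):
        merged.extend(shards[rank])
    return merged


all_heads = [f"h{i}" for i in range(4)]

# AR stage with TP=2: each sender rank holds half of the heads.
sender_shards = {r: slice_heads(all_heads, 2, r) for r in range(2)}

# Reading only rank 0 silently drops the heads held by rank 1.
only_rank0 = sender_shards[0]

# Pulling from every sender rank and merging restores all heads.
restored = merge_heads(sender_shards)
```

In this toy setup, `only_rank0` contains half of the heads while `restored` matches `all_heads`, which is the property a rank-aware transfer needs to preserve.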

Test Plan

L3 tests

pytest \
  tests/e2e/offline_inference/test_bagel_text2img.py \
  tests/e2e/offline_inference/test_bagel_img2img.py \
  tests/e2e/offline_inference/test_bagel_understanding.py \
  tests/e2e/offline_inference/test_bagel_lora.py \
  tests/e2e/online_serving/test_bagel_online.py \
  tests/e2e/online_serving/test_bagel_expansion.py \
  -x -v

Manual tests
2.1 Text to image
2.2 Image to Image
Prompt: Let the woman wear a white dress
Image:

Test Result

L3 tests
Before this PR:

Using this PR:

AR-TP	DIT-TP	DIT-cfg	mooncake transfer engine connector result (t2i)	mooncake transfer engine connector result (i2i)
1	1	1
1	2	1
2	1	1
2	2	1
2	2	3	TO BE UPLOADED	TO BE UPLOADED
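
For the last row of the table (DIT-TP 2, DIT-cfg 3), one rough way to picture the layout is sketched here. Two things are assumptions rather than facts from this PR: that the DiT stage uses `tp * cfg` workers, and that each CFG rank handles one branch. The branch name `cond` is made up for the example; `cfg_text` and `cfg_img` echo the field names mentioned in the PR.

```python
# Hypothetical layout sketch; the worker-count formula and branch names are assumptions.


def dit_world_size(tp: int, cfg: int) -> int:
    """Number of DiT workers if every CFG rank runs its own TP group (assumed)."""
    return tp * cfg


def assign_branches(branches: list[str], cfg_size: int) -> dict[int, list[str]]:
    """Distribute CFG branches over CFG ranks in round-robin order."""
    assignment: dict[int, list[str]] = {rank: [] for rank in range(cfg_size)}
    for i, branch in enumerate(branches):
        assignment[i % cfg_size].append(branch)
    return assignment


branches = ["cond", "cfg_text", "cfg_img"]

# DIT-cfg = 3: one branch per CFG rank.
layout = assign_branches(branches, cfg_size=3)

# DIT-TP = 2, DIT-cfg = 3: worker count under the assumed layout.
workers = dit_world_size(tp=2, cfg=3)
```

With `cfg_size=1` the same helper places every branch on rank 0, which corresponds to the rows where DIT-cfg is 1.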


chatgpt-codex-connector · 2026-04-12T10:29:14Z

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

hsliuustc0106 · 2026-04-12T21:04:13Z

Draft PR - ready for full review when draft status removed.

This PR is substantial (>1000 LOC / >10 files). Please run L3 tests locally and paste the results in the PR description:

Test Result section should include:

L3 test results (hardware, model, resolution/steps, pass/fail status)
Manual test results showing tp+cfg parallel works correctly

natureofnature · 2026-04-13T02:21:33Z

@princepride PTAL

princepride · 2026-04-13T02:24:45Z

Please resolve conflicts first


…and Bagel/engine integration

fix: build per-stage sampling_params_list and pass cfg_text/img_scale in serving_chat

KV Transfer Manager (rank-aware TP):
- Embed from_rank/to_rank into connector keys for per-rank addressing
- Rank mapping for heterogeneous TP (M:N) topologies
- Sender-side slice and receiver-side merge hooks for KV head redistribution
- Per-rank ZMQ port calculation using KV_RANK_PORT_STRIDE
- receive_multi_kv_cache_distributed() for pulling from multiple sender ranks
- Deduplicate serialization: shared _build_tensors_desc/_build_header_bytes and _populate_caches helpers (~91 lines saved)
- Replace traceback.print_exc() with logger.exception()
- Remove dead from_model_config alias and get_connector() wrapper

CFG Distribution (kv_transfer_manager):
- _discover_cfg_branch_roles() auto-detects branch roles from sampling_params
- _build_cfg_rank_local_payloads() partitions branch KVs across CFG ranks
- Generic contract: cfg_active_branch, cfg_branch_roles, cfg_branch_past_key_values, cfg_branch_kv_metadata

OmniSamplingParams (data.py):
- Add generic CFG fields for model-agnostic branch contract

Bagel pipeline:
- Read generic cfg_branch_* fields first, fall back to legacy cfg_text/cfg_img fields for backward compatibility

GroupCoordinator:
- Fix send_object/recv_object assert: rank_in_group instead of global rank
- Initialize self.shm_broadcaster = None

Engine TP auto-inference (async_omni_engine.py):
- Add _tp_size_for_stage, _inject_inferred_kv_tp_topology helpers
- Auto-infer from_tp/to_tp for adjacent stages in OmniKVCacheConfig
- Use local _inject_kv_stage_info with TP topology support

stage_engine_core_client.py:
- Document rank-0 base port and KV_RANK_PORT_STRIDE adjustment

cfg scales parameter pass:
- Pass cfg_text/img_scale in serving_chat, to make cfg scale controllable

Tests: test_tp_rank_aware (rank-aware keys, hetero TP merge/slice, CFG leader distribution, payload application, multi-source receive) and test_async_omni_engine_stage_init (TP auto-inference)

Signed-off-by: natureofnature <wzliu@connect.hku.hk>
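
The rank-aware addressing described in this commit message (ranks embedded in keys, M:N rank mapping, per-rank ports) could look roughly like the sketch below. Everything in it is an assumption made for illustration: the key format, the mapping rule, the stride value of 16, and the choice to derive the port from the sender rank. Only the name `KV_RANK_PORT_STRIDE` comes from the commit message; the functions are hypothetical and are not the PR's implementation.

```python
# Hypothetical sketch of rank-aware addressing; formats and values are assumptions.

KV_RANK_PORT_STRIDE = 16  # assumed value; the real constant is defined in vllm-omni


def sender_ranks_for(to_rank: int, from_tp: int, to_tp: int) -> list[int]:
    """Sender ranks whose KV heads overlap with receiver `to_rank`."""
    if from_tp >= to_tp:
        if from_tp % to_tp != 0:
            raise ValueError("from_tp must be a multiple of to_tp")
        ratio = from_tp // to_tp
        # Each receiver merges `ratio` consecutive sender shards.
        return list(range(to_rank * ratio, (to_rank + 1) * ratio))
    if to_tp % from_tp != 0:
        raise ValueError("to_tp must be a multiple of from_tp")
    # Several receivers share one sender; each keeps a slice of its heads.
    return [to_rank // (to_tp // from_tp)]


def connector_key(request_id: str, from_rank: int, to_rank: int) -> str:
    """Embed both ranks in the key so each (sender, receiver) pair is distinct."""
    return f"{request_id}_from{from_rank}_to{to_rank}"


def rank_port(base_port: int, rank: int) -> int:
    """Offset the rank-0 base port by a fixed stride per rank."""
    return base_port + rank * KV_RANK_PORT_STRIDE


def plan_transfers(request_id: str, from_tp: int, to_tp: int, base_port: int):
    """For each receiver rank, list the (key, sender_port) pairs it would pull."""
    plan = {}
    for to_rank in range(to_tp):
        plan[to_rank] = [
            (connector_key(request_id, src, to_rank), rank_port(base_port, src))
            for src in sender_ranks_for(to_rank, from_tp, to_tp)
        ]
    return plan


# TP2 sender stage feeding a TP1 receiver: one receiver pulls from two senders.
plan = plan_transfers("req-0", from_tp=2, to_tp=1, base_port=5000)
```

The 2:1 case is the one where a single receiver has to pull from several sender ranks, which is the situation `receive_multi_kv_cache_distributed()` is described as handling.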

natureofnature · 2026-04-16T05:53:02Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bef2b86c21

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Signed-off-by: natureofnature <wzliu@connect.hku.hk>

natureofnature · 2026-04-16T06:29:50Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ce20e23582


Signed-off-by: natureofnature <wzliu@connect.hku.hk>

natureofnature · 2026-04-16T06:47:05Z

@codex review

chatgpt-codex-connector · 2026-04-16T06:55:58Z

Codex Review: Didn't find any major issues. Already looking forward to the next diff.


…ne connector (vllm-project#2705)

Signed-off-by: natureofnature <wzliu@connect.hku.hk>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>

natureofnature requested a review from hsliuustc0106 as a code owner April 12, 2026 10:29

natureofnature marked this pull request as draft April 12, 2026 10:29

This was referenced Apr 13, 2026

[RFC]: Support Bagel using mooncakeTransferEngineConnector JiusiServe/vllm-omni#145

Closed

[Omni Connector] Omni Transfer Engine Connector: Enable 1-receiver-to-N-senders to support Bagel TP/CFG parallel #2731

Merged

natureofnature force-pushed the bugfix/fix_tp_cfg_parallel branch from 389df75 to 087f3d7 Compare April 15, 2026 09:35

natureofnature changed the title ~~[Bugfix] Bagel: Support tp+cfg parallel using mooncake transfer engine connector~~ [WIP][Bugfix] Bagel: Support tp+cfg parallel using mooncake transfer engine connector Apr 15, 2026

natureofnature marked this pull request as ready for review April 15, 2026 09:39

natureofnature force-pushed the bugfix/fix_tp_cfg_parallel branch from 087f3d7 to bef2b86 Compare April 15, 2026 10:22

Gaohan123 added this to the v0.20.0 milestone Apr 15, 2026

natureofnature changed the title ~~[WIP][Bugfix] Bagel: Support tp+cfg parallel using mooncake transfer engine connector~~ [Bugfix] Bagel: Support tp+cfg parallel using mooncake transfer engine connector Apr 16, 2026

natureofnature changed the title ~~[Bugfix] Bagel: Support tp+cfg parallel using mooncake transfer engine connector~~ [Feature] Bagel: Support tp+cfg parallel using mooncake transfer engine connector Apr 16, 2026

hsliuustc0106 added the ready label to trigger buildkite CI label Apr 16, 2026

hsliuustc0106 added the merge-test label to trigger buildkite merge test CI label Apr 16, 2026

chatgpt-codex-connector Bot reviewed Apr 16, 2026


Comment thread vllm_omni/distributed/omni_connectors/utils/kv_utils.py

reconstruct from ranks

ce20e23

Signed-off-by: natureofnature <wzliu@connect.hku.hk>

chatgpt-codex-connector Bot reviewed Apr 16, 2026


Comment thread vllm_omni/entrypoints/openai/serving_chat.py

keep cfg safe

44fb9da

Signed-off-by: natureofnature <wzliu@connect.hku.hk>

Merge branch 'main' into bugfix/fix_tp_cfg_parallel

18b1155

princepride enabled auto-merge (squash) April 16, 2026 08:00

hsliuustc0106 disabled auto-merge April 16, 2026 08:25

hsliuustc0106 merged commit 4d816ff into vllm-project:main Apr 16, 2026
8 checks passed

This was referenced Apr 17, 2026

[RFC]: Bagel Performance Optimization - CFG/TP Mooncake TE Support JiusiServe/vllm-omni#207

Open

[Bugfix]pass TP size to diffusion config #2867

Open

hsliuustc0106 merged 4 commits into vllm-project:main from natureofnature:bugfix/fix_tp_cfg_parallel

4 participants
