[Feature] Bagel: Support tp+cfg parallel using mooncake transfer engine connector#2705

Merged
hsliuustc0106 merged 4 commits into vllm-project:main from
natureofnature:bugfix/fix_tp_cfg_parallel
Apr 16, 2026

Conversation

@natureofnature
Contributor

@natureofnature natureofnature commented Apr 12, 2026

Purpose

  1. Support mooncake transfer engine TP + CFG parallel.
  2. Fix the TP head loss issue when using the connector.
     Taking i2i TP2 (AR) -> TP2 (DiT) over shared memory as an example (50ae1de),
     the output image looked like:
     (image: iter_0_white_slot0)
  3. Fix: build a per-stage sampling_params_list and pass cfg_text/img_scale in serving_chat.
     (screenshot: 2026-04-13 10-37-01)
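The per-stage sampling-params fix can be sketched roughly as follows. This is an illustrative sketch only: the real serving_chat builds OmniSamplingParams objects, and the stage names and dict fields here are assumptions, not the actual API.

```python
def build_sampling_params_list(stages, cfg_text_scale, cfg_img_scale):
    """Build one params entry per pipeline stage, attaching the CFG scales
    only to the stage that consumes them (assumed here to be the DiT stage)."""
    params_list = []
    for stage in stages:
        params = {"stage": stage}
        if stage == "dit":  # AR stage does not consume CFG scales
            params["cfg_text_scale"] = cfg_text_scale
            params["cfg_img_scale"] = cfg_img_scale
        params_list.append(params)
    return params_list
```

The point of the fix is that each stage gets its own params entry, so the user-supplied CFG scales reach the diffusion stage instead of being dropped.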

Test Plan

  1. L3 tests

     pytest tests/e2e/offline_inference/test_bagel_text2img.py \
            tests/e2e/offline_inference/test_bagel_img2img.py \
            tests/e2e/offline_inference/test_bagel_understanding.py \
            tests/e2e/offline_inference/test_bagel_lora.py \
            tests/e2e/online_serving/test_bagel_online.py \
            tests/e2e/online_serving/test_bagel_expansion.py \
            -x -v

  2. Manual tests
     2.1 Text to image
     2.2 Image to image
         Prompt: Let the woman wear a white dress
         Image: (image: i2i_org)

Test Result

  1. L3 tests
     Before this PR: (screenshot: 2026-04-16 09-31-38)
     Using this PR: (screenshot: 2026-04-16 09-35-10)

  2. Manual tests

     | AR-TP | DIT-TP | DIT-cfg | mooncake transfer engine connector result (t2i) | mooncake transfer engine connector result (i2i) |
     |-------|--------|---------|-------------------------------------------------|-------------------------------------------------|
     | 1     | 1      | 1       | iter_0_slot0                                    | iter_0_white_slot0                              |
     | 1     | 2      | 1       | iter_0_slot0                                    | iter_0_white_slot0                              |
     | 2     | 1      | 1       | iter_0_slot0                                    | iter_0_white_slot0                              |
     | 2     | 2      | 1       | iter_0_slot0                                    | iter_0_white_slot0                              |
     | 2     | 2      | 3       | TO BE UPLOADED                                  | TO BE UPLOADED                                  |

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan. Please provide the test scripts and commands. State the reasons if your change doesn't require additional test scripts. For test file guidelines, please check the test style doc.
  • The test results. Please paste the results comparison before and after, or the e2e results.
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model. Please run mkdocs serve to sync the documentation editions to ./docs.
  • (Optional) Release notes update. If your change is user-facing, please update the release notes draft.


@chatgpt-codex-connector

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

@natureofnature natureofnature marked this pull request as draft April 12, 2026 10:29
@hsliuustc0106
Collaborator

Draft PR - ready for full review when draft status removed.

This PR is substantial (>1000 LOC / >10 files). Please run L3 tests locally and paste the results in the PR description:

Test Result section should include:

  • L3 test results (hardware, model, resolution/steps, pass/fail status)
  • Manual test results showing tp+cfg parallel works correctly

@natureofnature
Contributor Author

@princepride PTAL

@princepride
Collaborator

Please resolve conflicts first

@natureofnature natureofnature force-pushed the bugfix/fix_tp_cfg_parallel branch from 389df75 to 087f3d7 Compare April 15, 2026 09:35
@natureofnature natureofnature changed the title [Bugfix] Bagel: Support tp+cfg parallel using mooncake transfer engine connector [WIP][Bugfix] Bagel: Support tp+cfg parallel using mooncake transfer engine connector Apr 15, 2026
@natureofnature natureofnature marked this pull request as ready for review April 15, 2026 09:39
@chatgpt-codex-connector

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

…and Bagel/engine integration

fix: build per-stage sampling_params_list and pass cfg_text/img_scale in serving_chat

KV Transfer Manager — rank-aware TP:
- Embed from_rank/to_rank into connector keys for per-rank addressing
- Rank mapping for heterogeneous TP (M:N) topologies
- Sender-side slice and receiver-side merge hooks for KV head redistribution
- Per-rank ZMQ port calculation using KV_RANK_PORT_STRIDE
- receive_multi_kv_cache_distributed() for pulling from multiple sender ranks
- Deduplicate serialization: shared _build_tensors_desc/_build_header_bytes
  and _populate_caches helpers (~91 lines saved)
- Replace traceback.print_exc() with logger.exception()
- Remove dead from_model_config alias and get_connector() wrapper
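A rough sketch of the rank-aware addressing these bullets describe. The key layout, stride value, and M:N mapping below are illustrative assumptions, not the connector's actual wire format:

```python
KV_RANK_PORT_STRIDE = 16  # assumed value; the real constant is documented in stage_engine_core_client.py

def make_connector_key(request_id: str, from_rank: int, to_rank: int) -> str:
    """Embed sender/receiver ranks in the key so each rank pair is addressed separately."""
    return f"{request_id}:kv:{from_rank}->{to_rank}"

def rank_zmq_port(base_port: int, rank: int) -> int:
    """Per-rank ZMQ port derived from the rank-0 base port."""
    return base_port + rank * KV_RANK_PORT_STRIDE

def map_sender_ranks(to_rank: int, from_tp: int, to_tp: int) -> list[int]:
    """Heterogeneous M:N TP mapping: which sender ranks a receiver pulls from.

    When the sender TP is larger, each receiver merges several senders
    (receiver-side merge); when smaller, each receiver slices one sender's
    KV heads (sender-side slice)."""
    if from_tp >= to_tp:
        factor = from_tp // to_tp
        return [to_rank * factor + i for i in range(factor)]
    return [to_rank * from_tp // to_tp]
```

Under this sketch, receive_multi_kv_cache_distributed() would iterate the list returned by map_sender_ranks and merge the per-sender slices.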

CFG Distribution (kv_transfer_manager):
- _discover_cfg_branch_roles() auto-detects branch roles from sampling_params
- _build_cfg_rank_local_payloads() partitions branch KVs across CFG ranks
- Generic contract: cfg_active_branch, cfg_branch_roles,
  cfg_branch_past_key_values, cfg_branch_kv_metadata
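A minimal sketch of the partitioning contract, assuming a simple round-robin assignment of branch KVs to CFG ranks; the real _build_cfg_rank_local_payloads may partition differently:

```python
def build_cfg_rank_local_payloads(branch_kvs: dict, cfg_ranks: list[int]) -> dict:
    """Partition CFG branch KV payloads across CFG ranks.

    branch_kvs: maps a branch role (e.g. "cfg_text", "cfg_img") to its KV payload.
    Returns {rank: {role: payload}} so each rank receives only its own branches.
    """
    payloads = {rank: {} for rank in cfg_ranks}
    # Sort roles for a deterministic assignment, then deal them out round-robin.
    for i, (role, kv) in enumerate(sorted(branch_kvs.items())):
        rank = cfg_ranks[i % len(cfg_ranks)]
        payloads[rank][role] = kv
    return payloads
```

Each rank then knows its cfg_active_branch from the payload it received, matching the generic contract listed above.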

OmniSamplingParams (data.py):
- Add generic CFG fields for model-agnostic branch contract

Bagel pipeline:
- Read generic cfg_branch_* fields first, fall back to legacy
  cfg_text/cfg_img fields for backward compatibility
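The generic-first, legacy-fallback read can be sketched like this; the attribute access pattern is an assumption about how the pipeline resolves a branch's KV cache:

```python
from types import SimpleNamespace

def get_branch_past_kv(sampling_params, branch: str, legacy_attr: str):
    """Prefer the generic cfg_branch_past_key_values mapping; fall back to
    the legacy per-model attribute (cfg_text / cfg_img) for compatibility."""
    generic = getattr(sampling_params, "cfg_branch_past_key_values", None)
    if generic and branch in generic:
        return generic[branch]
    return getattr(sampling_params, legacy_attr, None)

# New-style params carry the generic branch mapping; old-style only the legacy field.
new_style = SimpleNamespace(cfg_branch_past_key_values={"cfg_text": "generic_kv"})
old_style = SimpleNamespace(cfg_text="legacy_kv")
```

The fallback keeps older callers working while new callers migrate to the model-agnostic contract.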

GroupCoordinator:
- Fix send_object/recv_object assert: rank_in_group instead of global rank
- Initialize self.shm_broadcaster = None

Engine TP auto-inference (async_omni_engine.py):
- Add _tp_size_for_stage, _inject_inferred_kv_tp_topology helpers
- Auto-infer from_tp/to_tp for adjacent stages in OmniKVCacheConfig
- Use local _inject_kv_stage_info with TP topology support
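The adjacent-stage inference can be sketched as follows, assuming each stage exposes its TP size; the helper name and output shape are illustrative, not the signatures of _tp_size_for_stage / _inject_inferred_kv_tp_topology:

```python
def infer_kv_tp_topology(stage_tp_sizes: list[int]) -> list[dict]:
    """For each adjacent (sender, receiver) stage pair, record the TP
    topology (from_tp, to_tp) the KV connector config needs."""
    return [
        {"from_stage": i, "to_stage": i + 1,
         "from_tp": stage_tp_sizes[i], "to_tp": stage_tp_sizes[i + 1]}
        for i in range(len(stage_tp_sizes) - 1)
    ]
```

This is the piece that lets OmniKVCacheConfig omit from_tp/to_tp and still get a correct M:N mapping between stages.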

stage_engine_core_client.py:
- Document rank-0 base port and KV_RANK_PORT_STRIDE adjustment

CFG scale parameter pass:
- pass cfg_text/img_scale in serving_chat so the CFG scales are
  user-controllable

Tests: test_tp_rank_aware (rank-aware keys, hetero TP merge/slice, CFG
leader distribution, payload application, multi-source receive) and
test_async_omni_engine_stage_init (TP auto-inference)

Signed-off-by: natureofnature <wzliu@connect.hku.hk>
@natureofnature natureofnature force-pushed the bugfix/fix_tp_cfg_parallel branch from 087f3d7 to bef2b86 Compare April 15, 2026 10:22
@Gaohan123 Gaohan123 added this to the v0.20.0 milestone Apr 15, 2026
@natureofnature natureofnature changed the title [WIP][Bugfix] Bagel: Support tp+cfg parallel using mooncake transfer engine connector [Bugfix] Bagel: Support tp+cfg parallel using mooncake transfer engine connector Apr 16, 2026
@natureofnature natureofnature changed the title [Bugfix] Bagel: Support tp+cfg parallel using mooncake transfer engine connector [Feature] Bagel: Support tp+cfg parallel using mooncake transfer engine connector Apr 16, 2026
@hsliuustc0106 hsliuustc0106 added the ready label to trigger buildkite CI label Apr 16, 2026
@natureofnature
Contributor Author

@codex review

@hsliuustc0106 hsliuustc0106 added the merge-test label to trigger buildkite merge test CI label Apr 16, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bef2b86c21

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread vllm_omni/distributed/omni_connectors/utils/kv_utils.py
Signed-off-by: natureofnature <wzliu@connect.hku.hk>
@natureofnature
Contributor Author

@codex review


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ce20e23582


Comment thread vllm_omni/entrypoints/openai/serving_chat.py
Signed-off-by: natureofnature <wzliu@connect.hku.hk>
@natureofnature
Contributor Author

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Already looking forward to the next diff.


@princepride princepride enabled auto-merge (squash) April 16, 2026 08:00
@hsliuustc0106 hsliuustc0106 disabled auto-merge April 16, 2026 08:25
@hsliuustc0106 hsliuustc0106 merged commit 4d816ff into vllm-project:main Apr 16, 2026
8 checks passed
lvliang-intel pushed a commit to lvliang-intel/vllm-omni that referenced this pull request Apr 20, 2026
…ne connector (vllm-project#2705)

Signed-off-by: natureofnature <wzliu@connect.hku.hk>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>

Labels

merge-test label to trigger buildkite merge test CI ready label to trigger buildkite CI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants