Skip to content

[Model]Support MiniCPM-o 4.5#3642

Merged
Gaohan123 merged 44 commits into
vllm-project:mainfrom
tc-mb:Support-MiniCPM-o-4.5
Jun 1, 2026
Merged

[Model]Support MiniCPM-o 4.5#3642
Gaohan123 merged 44 commits into
vllm-project:mainfrom
tc-mb:Support-MiniCPM-o-4.5

Conversation

@tc-mb

@tc-mb tc-mb commented May 15, 2026

Copy link
Copy Markdown
Contributor

Purpose

Hi team — I'm from the MiniCPM-V / MiniCPM-o model team and have been
maintaining the integration of the V and O series into vLLM. We have been
following vllm-omni since day one, and this PR brings both
MiniCPM-o 2.6 and MiniCPM-o 4.5 into the project.

Apologies for the delay — over the past weeks the team has been focused on
shipping the MiniCPM-o technical report
(arXiv 2604.27393) and the
MiniCPM-V 4.6 release. Hope this PR makes it easier for the community to
serve the MiniCPM-o family on vllm-omni, and we look forward to deeper
collaboration on omni-modal serving going forward.

What's added

Models (vllm_omni/model_executor/models/)

  • minicpmo_4_5/: full omni pipeline for MiniCPM-o 4.5
    • minicpmo_4_5_omni.py — top-level conditional generation wrapper
    • minicpmo_4_5_omni_llm.py — thinker (LLM) stage
    • minicpmo_4_5_omni_tts.py — talker (TTS) stage
    • minicpmo_4_5_omni_t2w.py — token-to-waveform stage
  • minicpmo_2_6/: full omni pipeline for MiniCPM-o 2.6 (same 4-file
    layout as 4.5).
  • Registry entries in model_executor/models/registry.py for all 8 new
    architectures (MiniCPMO{26,45}Omni{,LLM,TTS,T2W}ForConditionalGeneration).

Stage input processors
(vllm_omni/model_executor/stage_input_processors/)

  • minicpmo_2_6_omni.py, minicpmo_4_5_omni.py — provide
    llm2tts / tts2t2w adapters wired into the stage YAMLs below.

Default stage configs (vllm_omni/model_executor/stage_configs/)

  • minicpmo.yaml — MiniCPM-o 2.6 default
  • minicpmo_8x4090.yaml — MiniCPM-o 2.6 on an 8×4090 host
  • minicpmo45_2gpu.yaml — MiniCPM-o 4.5, 2-GPU layout
  • minicpmo45_3gpu.yaml — MiniCPM-o 4.5, 3-GPU (thinker TP=2)
  • minicpmo45_8x4090.yaml — MiniCPM-o 4.5 on an 8×4090 host

Online serving example (examples/online_serving/minicpmo/)

  • gradio_demo.py, run_gradio_demo.sh, README.md — single Gradio UI
    that drives both 2.6 and 4.5 endpoints over the OpenAI-compatible API.

API server (vllm_omni/entrypoints/openai/api_server.py)

  • Pre-load the model's HF config with trust_remote_code=True (with GPU
    visibility temporarily hidden) so HuggingFace transformers_modules is
    registered in the API server process. This is required for ZMQ pickle
    deserialization of MiniCPM-o stage outputs that reference dynamic
    modules. Failures cleanly fall through, so non-trust_remote_code
    models are unaffected.

Notes

This PR was merged with the latest main (clean fast-forward from
fdb0efea); MiniCPM-o-specific code lives entirely under the paths
listed above and the changes outside those paths are limited to the
registry entry and the API-server pre-load described above.

Test Plan

We validate both models via the OpenAI-compatible server and the Gradio
demo shipped in examples/online_serving/minicpmo/.

1. Launch a backend server

# MiniCPM-o 4.5, 8×4090 layout (thinker TP=4 on GPU0-3, talker+t2w share GPU4)
vllm-omni serve <path-to-MiniCPM-o-4_5> \
    --stage-configs-path vllm_omni/model_executor/stage_configs/minicpmo45_8x4090.yaml \
    --trust-remote-code \
    --host 0.0.0.0 --port 8099

# MiniCPM-o 2.6
vllm-omni serve <path-to-MiniCPM-o-2_6> \
    --stage-configs-path vllm_omni/model_executor/stage_configs/minicpmo.yaml \
    --trust-remote-code \
    --host 0.0.0.0 --port 8091

tc-mb and others added 9 commits March 19, 2026 10:42
Signed-off-by: tc-mb <tianchi_cai@icloud.com>
Co-authored-by: GKangaroo <1095103651@qq.com>
Signed-off-by: tc-mb <tianchi_cai@icloud.com>
Co-authored-by: GKangaroo <gqx24@mails.tsinghua.edu.cn>
Signed-off-by: tc-mb <tianchi_cai@icloud.com>
Signed-off-by: tc-mb <tianchi_cai@icloud.com>
Signed-off-by: tc-mb <tianchi_cai@icloud.com>
Signed-off-by: tc-mb <tianchi_cai@icloud.com>
Signed-off-by: tc-mb <tianchi_cai@icloud.com>
Signed-off-by: tc-mb <tianchi_cai@icloud.com>
Signed-off-by: tc-mb <tianchi_cai@icloud.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 74b5e5fd67

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread vllm_omni/model_executor/models/minicpmo_2_6/minicpmo_2_6_omni_t2w.py Outdated
Comment thread vllm_omni/model_executor/models/minicpmo_4_5/minicpmo_4_5_omni_tts.py Outdated
Comment thread vllm_omni/model_executor/stage_configs/minicpmo.yaml Outdated
tc-mb added 4 commits May 15, 2026 16:22
Signed-off-by: tc-mb <tianchi_cai@icloud.com>
Signed-off-by: tc-mb <tianchi_cai@icloud.com>
Signed-off-by: tc-mb <tianchi_cai@icloud.com>
Signed-off-by: tc-mb <tianchi_cai@icloud.com>
@hsliuustc0106

Copy link
Copy Markdown
Collaborator

thanks for you contribution, can we split it into 2 PRs?

@tc-mb tc-mb force-pushed the Support-MiniCPM-o-4.5 branch from 74b5e5f to 8fd1276 Compare May 15, 2026 09:53
@tc-mb

tc-mb commented May 15, 2026

Copy link
Copy Markdown
Contributor Author

thanks for you contribution, can we split it into 2 PRs?

No problem.

I'll continue merging this PR with minicpm-o4.5, and then open another PR to merge o2.6. Do you think this is appropriate?

Opening two PRs simultaneously might require back-and-forth discussions about the merging syntax. Merging one first will allow me to understand the requirements for merging vllm-omni, saving you the trouble of reviewing twice.

@lishunyang12

lishunyang12 commented May 15, 2026

Copy link
Copy Markdown
Collaborator

thanks for you contribution, can we split it into 2 PRs?

No problem.

I'll continue merging this PR with minicpm-o4.5, and then open another PR to merge o2.6. Do you think this is appropriate?

Opening two PRs simultaneously might require back-and-forth discussions about the merging syntax. Merging one first will allow me to understand the requirements for merging vllm-omni, saving you the trouble of reviewing twice.

We can focus on MiniCPM-o 4.5 first. You can down scope this pr so that we can fast forward the reviewing process.

Signed-off-by: tc-mb <tianchi_cai@icloud.com>
@tc-mb tc-mb changed the title [Model]Support MiniCPM-o 2.6 & MiniCPM-o 4.5 [Model]Support MiniCPM-o 4.5 May 16, 2026
Signed-off-by: tc-mb <tianchi_cai@icloud.com>
tc-mb added 4 commits May 25, 2026 14:36
…rch fallback (avoids 2.6 collision)

Signed-off-by: tc-mb <tianchi_cai@icloud.com>
…ilently returning empty audio

Signed-off-by: tc-mb <tianchi_cai@icloud.com>
…tra instead of doc-only

Signed-off-by: tc-mb <tianchi_cai@icloud.com>
…e info delivery and OmniOutput packaging

Signed-off-by: tc-mb <tianchi_cai@icloud.com>
@tc-mb tc-mb force-pushed the Support-MiniCPM-o-4.5 branch from fdc5c79 to fb6abc2 Compare May 25, 2026 07:53
@zhumingjue138

Copy link
Copy Markdown
Contributor

please add UT case if it is necessary

Signed-off-by: tc-mb <tianchi_cai@icloud.com>
@tc-mb tc-mb requested a review from yenuo26 as a code owner May 26, 2026 14:23
@tc-mb

tc-mb commented May 26, 2026

Copy link
Copy Markdown
Contributor Author

please add UT case if it is necessary

ok, added a unit-test suite for the MiniCPM-o 4.5 path in tests/model_executor/models/minicpmo_4_5
PTAL.

@Nightwing-77 Nightwing-77 May 31, 2026

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we have E2E tests covering offline and online inference!?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to add them in nightly-test, please follow the corresponding md

@Gaohan123 Gaohan123 left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks

@Gaohan123 Gaohan123 merged commit 2b7249f into vllm-project:main Jun 1, 2026
6 of 8 checks passed
@tc-mb tc-mb deleted the Support-MiniCPM-o-4.5 branch June 1, 2026 14:23
86MaxCao pushed a commit to 86MaxCao/vllm-omni that referenced this pull request Jun 4, 2026
Signed-off-by: tc-mb <tianchi_cai@icloud.com>
Co-authored-by: GKangaroo <1095103651@qq.com>
Co-authored-by: GKangaroo <gqx24@mails.tsinghua.edu.cn>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
Nughm3 pushed a commit to Nughm3/vllm-omni that referenced this pull request Jun 18, 2026
Signed-off-by: tc-mb <tianchi_cai@icloud.com>
Co-authored-by: GKangaroo <1095103651@qq.com>
Co-authored-by: GKangaroo <gqx24@mails.tsinghua.edu.cn>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

high priority high priority issue, needs to be done asap merge-test label to trigger buildkite merge test CI ready label to trigger buildkite CI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants