[Model]Support MiniCPM-o 4.5 by tc-mb · Pull Request #3642 · vllm-project/vllm-omni

tc-mb · 2026-05-15T08:00:07Z

Purpose

Hi team — I'm from the MiniCPM-V / MiniCPM-o model team and have been
maintaining the integration of the V and O series into vLLM. We have been
following vllm-omni since day one, and this PR brings both
MiniCPM-o 2.6 and MiniCPM-o 4.5 into the project.

Apologies for the delay — over the past weeks the team has been focused on
shipping the MiniCPM-o technical report
(arXiv 2604.27393) and the
MiniCPM-V 4.6 release. Hope this PR makes it easier for the community to
serve the MiniCPM-o family on vllm-omni, and we look forward to deeper
collaboration on omni-modal serving going forward.

What's added

Models (vllm_omni/model_executor/models/)

minicpmo_4_5/: full omni pipeline for MiniCPM-o 4.5
- minicpmo_4_5_omni.py — top-level conditional generation wrapper
- minicpmo_4_5_omni_llm.py — thinker (LLM) stage
- minicpmo_4_5_omni_tts.py — talker (TTS) stage
- minicpmo_4_5_omni_t2w.py — token-to-waveform stage
minicpmo_2_6/: full omni pipeline for MiniCPM-o 2.6 (same 4-file
layout as 4.5).
Registry entries in model_executor/models/registry.py for all 8 new
architectures (MiniCPMO{26,45}Omni{,LLM,TTS,T2W}ForConditionalGeneration).

Stage input processors
(vllm_omni/model_executor/stage_input_processors/)

minicpmo_2_6_omni.py, minicpmo_4_5_omni.py — provide
llm2tts / tts2t2w adapters wired into the stage YAMLs below.

Default stage configs (vllm_omni/model_executor/stage_configs/)

minicpmo.yaml — MiniCPM-o 2.6 default
minicpmo_8x4090.yaml — MiniCPM-o 2.6 on an 8×4090 host
minicpmo45_2gpu.yaml — MiniCPM-o 4.5, 2-GPU layout
minicpmo45_3gpu.yaml — MiniCPM-o 4.5, 3-GPU (thinker TP=2)
minicpmo45_8x4090.yaml — MiniCPM-o 4.5 on an 8×4090 host

Online serving example (examples/online_serving/minicpmo/)

gradio_demo.py, run_gradio_demo.sh, README.md — single Gradio UI
that drives both 2.6 and 4.5 endpoints over the OpenAI-compatible API.

API server (vllm_omni/entrypoints/openai/api_server.py)

Pre-load the model's HF config with trust_remote_code=True (with GPU
visibility temporarily hidden) so HuggingFace transformers_modules is
registered in the API server process. This is required for ZMQ pickle
deserialization of MiniCPM-o stage outputs that reference dynamic
modules. Failures cleanly fall through, so non-trust_remote_code
models are unaffected.

Notes

This PR was merged with the latest main (clean fast-forward from
fdb0efea); MiniCPM-o-specific code lives entirely under the paths
listed above and the changes outside those paths are limited to the
registry entry and the API-server pre-load described above.

Test Plan

We validate both models via the OpenAI-compatible server and the Gradio
demo shipped in examples/online_serving/minicpmo/.

1. Launch a backend server

# MiniCPM-o 4.5, 8×4090 layout (thinker TP=4 on GPU0-3, talker+t2w share GPU4)
vllm-omni serve <path-to-MiniCPM-o-4_5> \
    --stage-configs-path vllm_omni/model_executor/stage_configs/minicpmo45_8x4090.yaml \
    --trust-remote-code \
    --host 0.0.0.0 --port 8099

# MiniCPM-o 2.6
vllm-omni serve <path-to-MiniCPM-o-2_6> \
    --stage-configs-path vllm_omni/model_executor/stage_configs/minicpmo.yaml \
    --trust-remote-code \
    --host 0.0.0.0 --port 8091

Signed-off-by: tc-mb <tianchi_cai@icloud.com> Co-authored-by: GKangaroo <1095103651@qq.com>

Signed-off-by: tc-mb <tianchi_cai@icloud.com> Co-authored-by: GKangaroo <gqx24@mails.tsinghua.edu.cn>

Signed-off-by: tc-mb <tianchi_cai@icloud.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 74b5e5fd67

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Signed-off-by: tc-mb <tianchi_cai@icloud.com>

hsliuustc0106 · 2026-05-15T09:49:28Z

thanks for you contribution, can we split it into 2 PRs?

tc-mb · 2026-05-15T10:04:15Z

thanks for you contribution, can we split it into 2 PRs?

No problem.

I'll continue merging this PR with minicpm-o4.5, and then open another PR to merge o2.6. Do you think this is appropriate?

Opening two PRs simultaneously might require back-and-forth discussions about the merging syntax. Merging one first will allow me to understand the requirements for merging vllm-omni, saving you the trouble of reviewing twice.

lishunyang12 · 2026-05-15T15:43:54Z

thanks for you contribution, can we split it into 2 PRs?

No problem.

I'll continue merging this PR with minicpm-o4.5, and then open another PR to merge o2.6. Do you think this is appropriate?

Opening two PRs simultaneously might require back-and-forth discussions about the merging syntax. Merging one first will allow me to understand the requirements for merging vllm-omni, saving you the trouble of reviewing twice.

We can focus on MiniCPM-o 4.5 first. You can down scope this pr so that we can fast forward the reviewing process.

Signed-off-by: tc-mb <tianchi_cai@icloud.com>

…rch fallback (avoids 2.6 collision) Signed-off-by: tc-mb <tianchi_cai@icloud.com>

…ilently returning empty audio Signed-off-by: tc-mb <tianchi_cai@icloud.com>

…tra instead of doc-only Signed-off-by: tc-mb <tianchi_cai@icloud.com>

…e info delivery and OmniOutput packaging Signed-off-by: tc-mb <tianchi_cai@icloud.com>

Signed-off-by: tc-mb <tianchi_cai@icloud.com>

zhumingjue138 · 2026-05-26T12:18:53Z

please add UT case if it is necessary

Signed-off-by: tc-mb <tianchi_cai@icloud.com>

tc-mb · 2026-05-26T14:24:35Z

please add UT case if it is necessary

ok, added a unit-test suite for the MiniCPM-o 4.5 path in tests/model_executor/models/minicpmo_4_5
PTAL.

Signed-off-by: tc-mb <tianchi_cai@icloud.com>

Nightwing-77 · 2026-05-31T19:39:56Z

could we have E2E tests covering offline and online inference!?

I suggest to add them in nightly-test, please follow the corresponding md

Signed-off-by: tc-mb <tianchi_cai@icloud.com>

Gaohan123

LGTM. Thanks

Signed-off-by: tc-mb <tianchi_cai@icloud.com> Co-authored-by: GKangaroo <1095103651@qq.com> Co-authored-by: GKangaroo <gqx24@mails.tsinghua.edu.cn> Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>

tc-mb and others added 9 commits March 19, 2026 10:42

Add MiniCPM-o 2.6 omni model support

8ea3c5b

Signed-off-by: tc-mb <tianchi_cai@icloud.com> Co-authored-by: GKangaroo <1095103651@qq.com>

fix all bugs for vllm 0.18

985d2e2

Signed-off-by: tc-mb <tianchi_cai@icloud.com> Co-authored-by: GKangaroo <gqx24@mails.tsinghua.edu.cn>

fix audio input

d19c136

Signed-off-by: tc-mb <tianchi_cai@icloud.com>

support minicpm-o 4.5

81744fe

Signed-off-by: tc-mb <tianchi_cai@icloud.com>

use minicpmo type

b742162

Signed-off-by: tc-mb <tianchi_cai@icloud.com>

fix api error

042275c

Signed-off-by: tc-mb <tianchi_cai@icloud.com>

fix api & gradio

56f3953

Signed-off-by: tc-mb <tianchi_cai@icloud.com>

fix demo

a50201b

Signed-off-by: tc-mb <tianchi_cai@icloud.com>

Merge branch 'main' into Support-MiniCPM-o-4.5

7d56870

Signed-off-by: tc-mb <tianchi_cai@icloud.com>

tc-mb requested review from Gaohan123, ZeldaHuang, gcanlin, hsliuustc0106, linyueqian, princepride, tzhouam, yuanheng-zhao and ywang96 as code owners May 15, 2026 08:00

chatgpt-codex-connector Bot reviewed May 15, 2026

View reviewed changes

Comment thread vllm_omni/model_executor/models/minicpmo_2_6/minicpmo_2_6_omni_t2w.py Outdated

Comment thread vllm_omni/model_executor/models/minicpmo_4_5/minicpmo_4_5_omni_tts.py Outdated

Comment thread vllm_omni/model_executor/stage_configs/minicpmo.yaml Outdated

tc-mb added 4 commits May 15, 2026 16:22

Declare MiniCPM-o 2.6 token-to-waveform runtime dependencies

525df9d

Signed-off-by: tc-mb <tianchi_cai@icloud.com>

Fail loudly when MiniCPM-o 4.5 stepaudio2 dependency is missing

e0a9e42

Signed-off-by: tc-mb <tianchi_cai@icloud.com>

Use a 2-GPU layout in the default MiniCPM-o 2.6 stage config

796b6d0

Signed-off-by: tc-mb <tianchi_cai@icloud.com>

Fix CI ruff lint errors in MiniCPM-o code

cedf8ce

Signed-off-by: tc-mb <tianchi_cai@icloud.com>

tc-mb force-pushed the Support-MiniCPM-o-4.5 branch from 74b5e5f to 8fd1276 Compare May 15, 2026 09:53

Scope this PR down to MiniCPM-o 4.5 only

227d67c

Signed-off-by: tc-mb <tianchi_cai@icloud.com>

tc-mb changed the title ~~[Model]Support MiniCPM-o 2.6 & MiniCPM-o 4.5~~ [Model]Support MiniCPM-o 4.5 May 16, 2026

Fix typos in MiniCPM-o 4.5 LLM stage

4f81a43

Signed-off-by: tc-mb <tianchi_cai@icloud.com>

tc-mb added 4 commits May 25, 2026 14:36

fix MiniCPM-o 4.5 pipeline routing: add hf_config_predicate to gate a…

13663b4

…rch fallback (avoids 2.6 collision) Signed-off-by: tc-mb <tianchi_cai@icloud.com>

fix MiniCPM-o 4.5 TTS init: re-raise non-import failures instead of s…

4ddafeb

…ilently returning empty audio Signed-off-by: tc-mb <tianchi_cai@icloud.com>

fix MiniCPM-o 4.5 stepaudio2 dep: declare under [minicpmo] install ex…

fd8336a

…tra instead of doc-only Signed-off-by: tc-mb <tianchi_cai@icloud.com>

fix MiniCPM-o 4.5 stage 1 wiring: route talker via wrapper for runtim…

fb6abc2

…e info delivery and OmniOutput packaging Signed-off-by: tc-mb <tianchi_cai@icloud.com>

tc-mb force-pushed the Support-MiniCPM-o-4.5 branch from fdc5c79 to fb6abc2 Compare May 25, 2026 07:53

tc-mb and others added 4 commits May 25, 2026 15:57

Merge branch 'main' into Support-MiniCPM-o-4.5

22456b5

fix typo in stage_config comment: mis-routing -> misrouting (typos hook)

ee96bff

Signed-off-by: tc-mb <tianchi_cai@icloud.com>

Merge branch 'main' into Support-MiniCPM-o-4.5

7150ca1

Signed-off-by: tc-mb <tianchi_cai@icloud.com>

Merge branch 'main' into Support-MiniCPM-o-4.5

f74a303

add MiniCPM-o 4.5 pipeline registration and llm2tts bridge UTs

352b2c6

Signed-off-by: tc-mb <tianchi_cai@icloud.com>

tc-mb requested a review from yenuo26 as a code owner May 26, 2026 14:23

tc-mb and others added 2 commits May 27, 2026 12:57

Merge branch 'main' into Support-MiniCPM-o-4.5

afd07b4

fix typos hook false positive in MiniCPM-o 4.5 UT doc comment

52cfc97

Signed-off-by: tc-mb <tianchi_cai@icloud.com>

Sy0307 mentioned this pull request May 27, 2026

[WIP] [Full Duplex] Feat: Support Full-Duplex realtime runtime & add MiniCPM-o 4.5 demo #3907

Draft

10 tasks

Merge branch 'main' into Support-MiniCPM-o-4.5

ab4bf10

Nightwing-77 reviewed May 31, 2026

View reviewed changes

hsliuustc0106 reviewed Jun 1, 2026

View reviewed changes

Comment thread vllm_omni/model_executor/models/minicpmo_4_5/minicpmo_4_5_omni_llm.py

tc-mb and others added 2 commits June 1, 2026 16:40

add SPDX header and adapted-from notice to MiniCPM-o 4.5 source files

4bd3f1d

Signed-off-by: tc-mb <tianchi_cai@icloud.com>

Merge branch 'main' into Support-MiniCPM-o-4.5

a02a52a

Gaohan123 approved these changes Jun 1, 2026

View reviewed changes

Gaohan123 merged commit 2b7249f into vllm-project:main Jun 1, 2026
6 of 8 checks passed

tc-mb deleted the Support-MiniCPM-o-4.5 branch June 1, 2026 14:23

tc-mb mentioned this pull request Jun 2, 2026

add MiniCPM-o 4.5 recipe under recipes/OpenBMB #4067

Merged

amy-why-3459 mentioned this pull request Jun 3, 2026

[BugFix] Fix the issue of vllm failing to start. #4105

Merged

5 tasks

linyueqian mentioned this pull request Jun 4, 2026

[New Model]: Add MiniCPM-o-4_5 support #3337

Closed

5 tasks

tc-mb mentioned this pull request Jun 12, 2026

[Model] Support MiniCPM-o 2.6 #4387

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Model]Support MiniCPM-o 4.5#3642

[Model]Support MiniCPM-o 4.5#3642
Gaohan123 merged 44 commits into
vllm-project:mainfrom
tc-mb:Support-MiniCPM-o-4.5

tc-mb commented May 15, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

hsliuustc0106 commented May 15, 2026

Uh oh!

tc-mb commented May 15, 2026

Uh oh!

lishunyang12 commented May 15, 2026 •

edited

Loading

Uh oh!

zhumingjue138 commented May 26, 2026

Uh oh!

tc-mb commented May 26, 2026

Uh oh!

Nightwing-77 May 31, 2026 •

edited

Loading

Uh oh!

hsliuustc0106 Jun 1, 2026

Uh oh!

Uh oh!

Gaohan123 left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

10 participants

Conversation

tc-mb commented May 15, 2026

Purpose

What's added

Notes

Test Plan

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

Uh oh!

Uh oh!

hsliuustc0106 commented May 15, 2026

Uh oh!

tc-mb commented May 15, 2026

Uh oh!

lishunyang12 commented May 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

zhumingjue138 commented May 26, 2026

Uh oh!

tc-mb commented May 26, 2026

Uh oh!

Nightwing-77 May 31, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

hsliuustc0106 Jun 1, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Gaohan123 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

10 participants

lishunyang12 commented May 15, 2026 •

edited

Loading

Nightwing-77 May 31, 2026 •

edited

Loading