[Voxtral TTS] Fix Voxtral TTS input with text and ref_audio by y123456y78 · Pull Request #2750 · vllm-project/vllm-omni

y123456y78 · 2026-04-13T20:34:20Z

Purpose

Input with ref_audio (mm_data) + text fails because VoxtralTTSMultiModalProcessor does not work with the HF _apply_hf_processor_mm_only path directly: the class does not inherit from the Transformers ProcessorMixin, since it uses the Mistral tokenizer for preprocessing.
Prefix voice cloning still works because it sends text + voice id (no mm_input).

Test Plan

pytest -s -v   tests/model_executor/stage_input_processors/test_voxtral_tts_async_chunk.py   \
tests/model_executor/models/voxtral_tts/test_cuda_graph_acoustic_transformer.py   \
tests/model_executor/models/voxtral_tts/test_audio_tokenizer_parsing.py   \
tests/e2e/online_serving/test_voxtral_tts.py \
tests/model_executor/models/voxtral_tts/test_text_preprocess.py  \
tests/e2e/offline_inference/test_voxtral_tts.py

Test Result

Signed-off-by: Chen-Yo Sun <chenyo.sun@mistral.ai>

chatgpt-codex-connector · 2026-04-13T20:40:49Z

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

hsliuustc0106 · 2026-04-13T21:05:58Z

PR #2750 - [Voxtral TTS] Fix input with text and ref_audio

OVERALL: NO BLOCKERS
VERDICT: COMMENT

Correctness: PASS, Reliability: PASS, Breaking: PASS, Tests: PASS, Docs: PASS, Security: PASS

Summary: Bugfix that overrides _apply_hf_processor_mm_only to use dummy text for mm_input. 23 lines. Gates pass, tests pass. No blockers.
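
The override described in the summary can be pictured with a toy sketch. This is not the code from the PR: only the method name _apply_hf_processor_mm_only comes from this thread, while the classes, the helper _apply_text_and_mm, the DUMMY_TEXT constant, and the simplified signatures are all hypothetical stand-ins. It assumes the default mm-only path expects a ProcessorMixin-style processor, and that a tokenizer-based processor can reuse its text + mm path with placeholder text instead.

```python
from typing import Any, Dict, Mapping

# Hypothetical placeholder text; the real value used in the PR is not shown in this thread.
DUMMY_TEXT = "placeholder"


class ToyBaseProcessor:
    """Hypothetical stand-in for a multimodal processor base class."""

    def _apply_hf_processor_mm_only(self, mm_items: Mapping[str, Any]) -> Dict[str, Any]:
        # Assumed default behaviour: relies on an HF ProcessorMixin-style
        # processor, which a tokenizer-based processor does not provide.
        raise TypeError("processor is not a ProcessorMixin-style object")


class ToyTTSProcessor(ToyBaseProcessor):
    """Hypothetical processor that preprocesses with its own tokenizer."""

    def _apply_text_and_mm(self, text: str, mm_items: Mapping[str, Any]) -> Dict[str, Any]:
        # Toy "tokenization": one id per character, with audio passed through.
        return {
            "input_ids": [ord(ch) for ch in text],
            "audio": list(mm_items.get("audio", [])),
        }

    def _apply_hf_processor_mm_only(self, mm_items: Mapping[str, Any]) -> Dict[str, Any]:
        # Override: run the text + mm path with dummy text, then drop the
        # text outputs so that only the multimodal features remain.
        outputs = self._apply_text_and_mm(DUMMY_TEXT, mm_items)
        outputs.pop("input_ids")
        return outputs


if __name__ == "__main__":
    processor = ToyTTSProcessor()
    print(processor._apply_hf_processor_mm_only({"audio": [0.1, 0.2]}))
```

In this sketch the caller of the mm-only path does not need to know that the processor lacks a ProcessorMixin interface; the subclass hides that difference behind the same method name.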

codeHackeR321 · 2026-04-15T12:20:29Z

Hi @y123456y78, can you please tell me which open-source Voxtral TTS model you are using? I could not find voice cloning support on the official HF model page. Community discussions say the voice cloning weights have not been released: https://huggingface.co/mistralai/Voxtral-4B-TTS-2603/discussions/17

y123456y78 added 2 commits April 13, 2026 20:28

Fix voxtral tts mm + text input

5e5f33b

Signed-off-by: Chen-Yo Sun <chenyo.sun@mistral.ai>

Update comment

0f350ff

Signed-off-by: Chen-Yo Sun <chenyo.sun@mistral.ai>

y123456y78 changed the title ~~[Voxtral TTS] Fix Voxtral TTS mm + text input~~ [Voxtral TTS] Fix Voxtral TTS input with text and mm data Apr 13, 2026

y123456y78 changed the title ~~[Voxtral TTS] Fix Voxtral TTS input with text and mm data~~ [Voxtral TTS] Fix Voxtral TTS input with text and ref_audio Apr 13, 2026

y123456y78 marked this pull request as ready for review April 13, 2026 20:40

y123456y78 requested a review from hsliuustc0106 as a code owner April 13, 2026 20:40

ywang96 approved these changes Apr 13, 2026

Merge branch 'main' into fix-voxtral-tts-mm-input

91e9653

ywang96 enabled auto-merge (squash) April 13, 2026 22:04

ywang96 disabled auto-merge April 13, 2026 22:04

ywang96 enabled auto-merge (squash) April 13, 2026 22:04

linyueqian added the ready label to trigger buildkite CI label Apr 13, 2026

ywang96 merged commit dd13891 into vllm-project:main Apr 13, 2026
7 of 8 checks passed

Celeste-jq pushed a commit to IsleOfDawnlight/vllm-omni-voxcpm that referenced this pull request Apr 14, 2026

[Voxtral TTS] Fix Voxtral TTS input with text and ref_audio (vllm-pro…

f617a73

…ject#2750) Signed-off-by: Chen-Yo Sun <chenyo.sun@mistral.ai>

alex-jw-brooks pushed a commit to alex-jw-brooks/vllm-omni that referenced this pull request Apr 14, 2026

[Voxtral TTS] Fix Voxtral TTS input with text and ref_audio (vllm-pro…

6a1e90b

…ject#2750) Signed-off-by: Chen-Yo Sun <chenyo.sun@mistral.ai>

lengrongfu pushed a commit to lengrongfu/vllm-omni that referenced this pull request May 1, 2026

[Voxtral TTS] Fix Voxtral TTS input with text and ref_audio (vllm-pro…

5f8d923

…ject#2750) Signed-off-by: Chen-Yo Sun <chenyo.sun@mistral.ai>

clodaghwalsh17 pushed a commit to clodaghwalsh17/nm-vllm-omni-ent that referenced this pull request May 12, 2026

[Voxtral TTS] Fix Voxtral TTS input with text and ref_audio (vllm-pro…

56a62b9

…ject#2750) Signed-off-by: Chen-Yo Sun <chenyo.sun@mistral.ai>

ywang96 merged 3 commits into vllm-project:main from y123456y78:fix-voxtral-tts-mm-input

5 participants