[Bugfix][Refactor] Migrate Voxtral TTS config and parser registry#3065

Merged
princepride merged 6 commits into
vllm-project:mainfrom
yuanheng-zhao:fix/voctral-tts-registry
Apr 25, 2026
@yuanheng-zhao
Collaborator

@yuanheng-zhao yuanheng-zhao commented Apr 23, 2026

Purpose

Part of #3066

This PR

  • refactors the Voxtral TTS config and parser registration into a single location under vllm_omni/transformers_utils/, consistent with upstream vLLM
  • removes the following unrelated log messages, which previously appeared even when serving models other than Voxtral TTS:
(APIServer pid=928653) WARNING 04-23 05:55:32 [config.py:347] Config format `mistral` is already registered, and will be overwritten by the new parser class `<class 'vllm_omni.model_executor.models.voxtral_tts.configuration_voxtral_tts.VoxtralTTSConfigParser'>`.
(APIServer pid=928653) INFO 04-23 05:55:32 [config.py:358] Registered config parser `<class 'vllm_omni.model_executor.models.voxtral_tts.configuration_voxtral_tts.VoxtralTTSConfigParser'>` with config format `mistral`

NOTE:
The following error already occurs on the main branch: the safetensors lookup fails, and Voxtral TTS repo resolution proceeds through a fallback path instead:

(APIServer pid=1263900) ERROR 04-23 14:31:03 [repo_utils.py:47] Error retrieving safetensors: 'mistralai/Voxtral-4B-TTS-2603' is not a safetensors repo. Couldn't find 'model.safetensors.index.json' or 'model.safetensors' files., retrying 1 of 2
(APIServer pid=1263900) ERROR 04-23 14:31:05 [repo_utils.py:45] Error retrieving safetensors: 'mistralai/Voxtral-4B-TTS-2603' is not a safetensors repo. Couldn't find 'model.safetensors.index.json' or 'model.safetensors' files.

This PR keeps that path intact; fixing the error and its fallback is out of scope.
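
The log pattern in the note above (one retry, a final error, then continued startup) can be sketched as a bounded retry followed by a fallback. All names here (`resolve_with_fallback`, `primary`, `fallback`, `failing_safetensors_lookup`) are hypothetical and do not correspond to functions in repo_utils.py.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def resolve_with_fallback(
    primary: Callable[[], T],
    fallback: Callable[[], T],
    max_attempts: int = 2,
) -> tuple[T, list[str]]:
    """Try primary up to max_attempts times, then use fallback.

    Returns the resolved value and the error messages collected on the way.
    """
    errors: list[str] = []
    for attempt in range(1, max_attempts + 1):
        try:
            return primary(), errors
        except Exception as exc:  # sketch only; real code should catch narrower errors
            suffix = (
                f", retrying {attempt} of {max_attempts}"
                if attempt < max_attempts
                else ""
            )
            errors.append(f"Error retrieving safetensors: {exc}{suffix}")
    return fallback(), errors


def failing_safetensors_lookup() -> str:
    # Hypothetical primary lookup that always fails, as in the logs above.
    raise FileNotFoundError("not a safetensors repo")


value, errors = resolve_with_fallback(
    failing_safetensors_lookup, lambda: "fallback-path"
)
```

With `max_attempts=2`, the first collected message carries the retry suffix and the second does not, mirroring the two ERROR lines.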

cc @princepride , @linyueqian , @y123456y78

Test Plan

  1. Run any model other than Voxtral-TTS and check that no unrelated Voxtral-TTS log messages or warnings appear in the init logs.
  2. Existing e2e tests for Voxtral-TTS
pytest -s tests/e2e/offline_inference/test_voxtral_tts.py tests/e2e/online_serving/test_voxtral_tts.py
  3. Voxtral-TTS offline examples
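
Step 1 of the plan could be automated by scanning captured startup logs for Voxtral-specific lines. A minimal sketch, assuming the logs have already been captured to a string; the function name, the match pattern, and the `INFO server started` sample line are illustrative and not part of the repository's test suite.

```python
import re

# Matches any mention of Voxtral TTS, case-insensitively,
# with "-", "_", a space, or no separator between the two words.
VOXTRAL_PATTERN = re.compile(r"voxtral[-_ ]?tts", re.IGNORECASE)


def find_voxtral_lines(log_text: str) -> list[str]:
    """Return every log line that mentions Voxtral TTS."""
    return [
        line for line in log_text.splitlines() if VOXTRAL_PATTERN.search(line)
    ]


# Sample input built from the warning quoted in the PR description,
# plus one made-up unrelated line.
before = (
    "WARNING [config.py:347] Config format `mistral` is already registered, "
    "and will be overwritten by the new parser class "
    "`<class 'vllm_omni.model_executor.models.voxtral_tts."
    "configuration_voxtral_tts.VoxtralTTSConfigParser'>`.\n"
    "INFO server started\n"
)
after = "INFO server started\n"

offending_before = find_voxtral_lines(before)
offending_after = find_voxtral_lines(after)
```

An empty result for a non-Voxtral model's init logs would correspond to the check passing.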

Test Result

  1. Run with Ming
vllm serve Jonathan1909/Ming-flash-omni-2.0 \
    --omni \
    --stage-configs-path vllm_omni/model_executor/stage_configs/ming_flash_omni_tts.yaml \
    --port 8091 \
    --log-stats

Voxtral-TTS log messages and warnings no longer appear in the init logs.

  2. pytest
# pytest -s tests/e2e/offline_inference/test_voxtral_tts.py tests/e2e/online_serving/test_voxtral_tts.py

8 passed, 16 warnings in 537.44s (0:08:57)
  3. Offline example results: see my subsequent comments in this PR.

Signed-off-by: Yuanheng Zhao <jonathan.zhaoyh@gmail.com>
@yuanheng-zhao yuanheng-zhao marked this pull request as ready for review April 23, 2026 15:18
@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d2d5acc830

Comment thread vllm_omni/transformers_utils/configs/voxtral_tts.py
@yuanheng-zhao yuanheng-zhao changed the title [Bugfix][Misc] Fix Voctral TTS config and parser registry [Bugfix][Refactor] Fix Voctral TTS config and parser registry Apr 24, 2026
@yuanheng-zhao yuanheng-zhao changed the title [Bugfix][Refactor] Fix Voctral TTS config and parser registry [Bugfix][Refactor] Migrate Voxtral TTS config and parser registry Apr 24, 2026
@yuanheng-zhao
Collaborator Author

yuanheng-zhao commented Apr 24, 2026

Voxtral-TTS offline examples

Script: examples/offline_inference/voxtral_tts/README.md

Streaming with neutral_female voice preset:

python3 examples/offline_inference/voxtral_tts/end2end.py \
    --num-prompts 32 --concurrency 8 --streaming --write-audio --voice neutral_female \
    --model mistralai/Voxtral-4B-TTS-2603 \
    --text "That eerie silence after the first storm was just the calm before another round of chaos, wasn't it?"

Output logs (partial)

Request 0: saved 174720 samples (7.28s) to output_audio/tts_output_0.wav
  Request 0 chunk 0: no_wait | arrived=0.0ms | chunk_dur=400.0ms
  Request 0 chunk 1: no_wait | arrived=42.2ms | chunk_dur=400.0ms
  Request 0 chunk 2: no_wait | arrived=99.4ms | chunk_dur=400.0ms
  Request 0 chunk 3: no_wait | arrived=135.7ms | chunk_dur=400.0ms
  Request 0 chunk 4: no_wait | arrived=209.9ms | chunk_dur=400.0ms
  Request 0 chunk 5: no_wait | arrived=514.1ms | chunk_dur=2000.0ms
  Request 0 chunk 6: no_wait | arrived=798.3ms | chunk_dur=2000.0ms
  Request 0 chunk 7: no_wait | arrived=1039.9ms | chunk_dur=1280.0ms
Request 0: TTFA=0.5367s | Generation=1.5769s | Audio=7.28s | RTF=4.6167 | WaitRate=0.00% (0/8)

All requests: Generation=1.5870s | TotalAudio=7.28s | Concurrency=1 | AvgTTFA=0.5367s | RTF(total)=4.5872 | RTF(per-request)=4.6167 | WaitRate=0.00% (0/8)
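
The chunk timeline above can be replayed to see why every chunk is reported as no_wait. This sketch assumes that a chunk counts as a wait when it arrives after all previously received audio has finished playing; that definition is an interpretation of the log, not taken from end2end.py, and `count_waits` is a hypothetical helper.

```python
# (arrival time relative to the first chunk in ms, chunk duration in ms),
# copied from the Request 0 log lines above.
chunks = [
    (0.0, 400.0), (42.2, 400.0), (99.4, 400.0), (135.7, 400.0),
    (209.9, 400.0), (514.1, 2000.0), (798.3, 2000.0), (1039.9, 1280.0),
]


def count_waits(chunks: list[tuple[float, float]]) -> int:
    """Count chunks that arrive after the buffered audio has run out."""
    waits = 0
    playhead_end = 0.0  # time at which all audio received so far finishes playing
    for i, (arrived, duration) in enumerate(chunks):
        if i > 0 and arrived > playhead_end:
            waits += 1
        # This chunk starts playing when it arrives or when the previous one ends.
        playhead_end = max(playhead_end, arrived) + duration
    return waits


waits = count_waits(chunks)
wait_rate = waits / len(chunks)
total_audio_s = sum(duration for _, duration in chunks) / 1000.0
# Sample rate implied by "saved 174720 samples (7.28s)".
sample_rate = 174720 / total_audio_s
```

Under that assumption the buffer never runs dry: the last chunk arrives at about 1.04 s, while 6 s of audio is already queued. The chunk durations sum to the logged 7.28 s, which implies a 24 kHz sample rate.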

tts_output_0.wav

32 prompts, 8 concurrent requests per wave, streaming with neutral_female voice:

python3 examples/offline_inference/voxtral_tts/end2end.py \
    --num-prompts 32 --concurrency 8 --streaming --write-audio --voice neutral_female \
    --model mistralai/Voxtral-4B-TTS-2603 \
    --text "That eerie silence after the first storm was just the calm before another round of chaos, wasn't it?"

Output logs (partial)

All requests: Generation=9.6359s | TotalAudio=239.68s | Concurrency=8 | AvgTTFA=0.5348s | RTF(total)=24.8738 | RTF(per-request)=3.4928 | WaitRate=0.00% (0/262)
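
The RTF figures in both runs are consistent with RTF being audio duration divided by generation time, so higher means faster than real time. That definition is inferred from the logged numbers rather than taken from the script. A quick check:

```python
def rtf(audio_s: float, generation_s: float) -> float:
    """Real-time factor as seconds of audio produced per second of generation."""
    return audio_s / generation_s


# Inputs are the rounded values printed in the logs, so results match
# the logged RTFs only to about three decimal places.
single_request = rtf(7.28, 1.5769)   # Request 0, single-request run
single_total = rtf(7.28, 1.5870)     # "All requests" line, single-request run
batch_total = rtf(239.68, 9.6359)    # "All requests" line, 32 prompts at concurrency 8
```

The three values agree with the logged 4.6167, 4.5872, and 24.8738 to within rounding.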

@yuanheng-zhao
Collaborator Author

PTAL @princepride , @Gaohan123 , @linyueqian
cc @y123456y78

Collaborator

@princepride princepride left a comment

LGTM

@lishunyang12 lishunyang12 added the ready label to trigger buildkite CI label Apr 25, 2026
@princepride princepride merged commit 6b52db9 into vllm-project:main Apr 25, 2026
8 checks passed
@yuanheng-zhao yuanheng-zhao deleted the fix/voctral-tts-registry branch April 25, 2026 10:20
xiaohajiayou pushed a commit to xiaohajiayou/vllm-omni that referenced this pull request Apr 30, 2026
lengrongfu pushed a commit to lengrongfu/vllm-omni that referenced this pull request May 1, 2026
sphinxkkkbc pushed a commit to sphinxkkkbc/vllm-omni that referenced this pull request May 4, 2026
…lm-project#3065)

Signed-off-by: Yuanheng Zhao <jonathan.zhaoyh@gmail.com>
Signed-off-by: sphinxkkkbc <binchengkang8@gmail.com>
clodaghwalsh17 pushed a commit to clodaghwalsh17/nm-vllm-omni-ent that referenced this pull request May 12, 2026