[Voxtral TTS] Support voxtral tts cfg_alpha sampling params via temperature by y123456y78 · Pull Request #2243 · vllm-project/vllm-omni

y123456y78 · 2026-03-26T22:56:34Z

Purpose

Support voxtral tts cfg_alpha sampling params via temperature
Add corresponding unit test
Clean up the CUDA graph code and its tests

Testing Plan

pytest -s -v \
  tests/model_executor/stage_input_processors/test_voxtral_tts_async_chunk.py \
  tests/model_executor/models/voxtral_tts/test_cuda_graph_acoustic_transformer.py \
  tests/model_executor/models/voxtral_tts/test_audio_tokenizer_parsing.py \
  tests/e2e/online_serving/test_voxtral_tts.py \
  tests/model_executor/models/voxtral_tts/test_text_preprocess.py \
  tests/e2e/offline_inference/test_voxtral_tts.py

Result

Signed-off-by: Chen-Yo Sun <chenyo.sun@mistral.ai>

lishunyang12

left a couple comments

lishunyang12 · 2026-04-02T15:55:27Z

+        sampling_metadata = kwargs.get("sampling_metadata")
+        if sampling_metadata is None or sampling_metadata.temperature is None:
+            raise ValueError(
+                "VoxtralTTS requires a non-zero 'temperature' sampling parameter (used as cfg_alpha for flow-matching)."


This will silently accept temperature=0, which would zero out the conditional velocity entirely (pure unconditional generation). Should probably validate temperature > 0 here, or at least != 0.
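As an illustration of the stricter check suggested above, here is a minimal sketch. It assumes the temperature reaches the model as a single scalar; `FakeSamplingMetadata` and `resolve_cfg_alpha` are hypothetical names used only for this example and are not part of vllm-omni.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeSamplingMetadata:
    """Hypothetical stand-in for the real sampling metadata object."""

    temperature: Optional[float] = None


def resolve_cfg_alpha(sampling_metadata: Optional[FakeSamplingMetadata]) -> float:
    """Return temperature as cfg_alpha, rejecting missing or non-positive values."""
    if sampling_metadata is None or sampling_metadata.temperature is None:
        raise ValueError("a 'temperature' sampling parameter is required (used as cfg_alpha)")
    alpha = float(sampling_metadata.temperature)
    # Reject 0, negatives, NaN, and infinities instead of passing them through.
    if not math.isfinite(alpha) or alpha <= 0.0:
        raise ValueError(f"'temperature' must be a finite value > 0, got {alpha!r}")
    return alpha
```

If temperature arrives as a per-request tensor rather than a scalar, the same condition would need to be applied element-wise.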

lishunyang12 · 2026-04-02T15:55:27Z

    final_output_type: text
    default_sampling_params:
-      temperature: 0.0
+      # NOTE: VoxtralTTS repurposes 'temperature' as the CFG alpha


Nit: might be worth adding a user-facing note somewhere (CLI help, docs) that temperature controls CFG strength for voxtral-tts — otherwise people will set temperature=0.7 expecting normal sampling behavior and get confused.
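To show why the value behaves differently from a sampling temperature, here is a sketch of the common classifier-free guidance blend. The PR excerpt does not show the exact formula Voxtral TTS uses, so the form below is an assumption chosen to match the earlier remark that a value of 0 yields pure unconditional generation; `cfg_blend` is a hypothetical helper.

```python
from typing import List, Sequence


def cfg_blend(v_cond: Sequence[float], v_uncond: Sequence[float], cfg_alpha: float) -> List[float]:
    """Blend conditional and unconditional velocities (assumed CFG form).

    v = v_uncond + cfg_alpha * (v_cond - v_uncond)
    cfg_alpha = 0 gives v_uncond, cfg_alpha = 1 gives v_cond, and values
    above 1 extrapolate past the conditional prediction.
    """
    if len(v_cond) != len(v_uncond):
        raise ValueError("v_cond and v_uncond must have the same length")
    return [u + cfg_alpha * (c - u) for c, u in zip(v_cond, v_uncond)]
```

Under this assumed form, a request with temperature=0.7 would produce a blend weighted 70% toward the conditional prediction rather than a change in sampling randomness.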

lishunyang12 · 2026-04-02T15:55:27Z

        padded_size = self._get_padded_size(actual_size)
        if padded_size is None or padded_size not in self.graphs:
-            return self.model.compute_mm_logits(hidden_states)
+            return self.model.compute_mm_logits(hidden_states, cfg_alpha=cfg_alpha)


The 1D -> 2D reshape (unsqueeze(1)) happens inside decode_one_frame, but in the graph path static_cfg_alpha is already (size, 1). This means the eager fallback via compute_mm_logits will unsqueeze, but the graph path skips it. Works today but the shape contract is fragile — a comment on the expected shape at this interface would help.
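One way to make the shape contract explicit is a small check at the interface. The sketch below works on plain shape tuples so it runs without PyTorch; `normalized_cfg_alpha_shape` is a hypothetical helper, and the two accepted shapes are taken from the comment above rather than from the actual code.

```python
from typing import Tuple


def normalized_cfg_alpha_shape(shape: Tuple[int, ...], batch_size: int) -> Tuple[int, int]:
    """Return the 2D shape cfg_alpha is expected to have after normalization.

    Accepts (batch_size,), as described for the eager path before unsqueeze(1),
    or (batch_size, 1), as described for the graph path's static buffer.
    Any other shape violates the contract and raises ValueError.
    """
    if shape == (batch_size,):
        return (batch_size, 1)  # caller would still need to add the trailing dim
    if shape == (batch_size, 1):
        return (batch_size, 1)  # already in the expected layout
    raise ValueError(
        f"cfg_alpha must have shape ({batch_size},) or ({batch_size}, 1), got {shape}"
    )
```

With a tensor in hand, the call would look like `normalized_cfg_alpha_shape(tuple(cfg_alpha.shape), batch_size)`, where `batch_size` is whichever size the calling path expects (the padded size on the graph path).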

chatgpt-codex-connector · 2026-04-20T19:50:53Z

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

lishunyang12 · 2026-04-23T06:42:47Z

Resolve conflicts.

Support voxtral tts cfg_alpha sampling params in eager mode

03f3d9b

Signed-off-by: Chen-Yo Sun <chenyo.sun@mistral.ai>

y123456y78 force-pushed the chenyo/voxtral-tts-sampling-params branch from ea989bd to 03f3d9b March 26, 2026 22:57

y123456y78 added 15 commits March 26, 2026 23:11

Support cuda graph

dcb0466

Signed-off-by: Chen-Yo Sun <chenyo.sun@mistral.ai>

Update help message

f16176e

Signed-off-by: Chen-Yo Sun <chenyo.sun@mistral.ai>

Support cfg_alpha in gradio_demo

ddd4939

Signed-off-by: Chen-Yo Sun <chenyo.sun@mistral.ai>

Update test to support cfg_alpha

ad68fff

Signed-off-by: Chen-Yo Sun <chenyo.sun@mistral.ai>

Merge branch 'main' into chenyo/voxtral-tts-sampling-params

c0cfb59

add test code

6ad25bd

Signed-off-by: Chen-Yo Sun <chenyo.sun@mistral.ai>

clean up

59f0036

Signed-off-by: Chen-Yo Sun <chenyo.sun@mistral.ai>

clean up

966a821

Signed-off-by: Chen-Yo Sun <chenyo.sun@mistral.ai>

Update test

7f93b24

Signed-off-by: Chen-Yo Sun <chenyo.sun@mistral.ai>

Align variable name for test and actual model

c694424

Signed-off-by: Chen-Yo Sun <chenyo.sun@mistral.ai>

Clean up and add test

8c04ea6

Signed-off-by: Chen-Yo Sun <chenyo.sun@mistral.ai>

Set temperature in end2end.py

b7d62d1

Signed-off-by: Chen-Yo Sun <chenyo.sun@mistral.ai>

Cleanup log

e899b90

Signed-off-by: Chen-Yo Sun <chenyo.sun@mistral.ai>

Clean up duplicate

8140668

Signed-off-by: Chen-Yo Sun <chenyo.sun@mistral.ai>

Fix format

f98a647

Signed-off-by: Chen-Yo Sun <chenyo.sun@mistral.ai>

y123456y78 changed the title ~~[Voxtral TTS] Support voxtral tts cfg_alpha sampling params~~ [Voxtral TTS] Support voxtral tts cfg_alpha sampling params via temperature Mar 27, 2026

y123456y78 added 4 commits March 27, 2026 18:22

Fix test

1470b0b

Signed-off-by: Chen-Yo Sun <chenyo.sun@mistral.ai>

Revert code cleanup

bb7c395

Signed-off-by: Chen-Yo Sun <chenyo.sun@mistral.ai>

Fix

80689c3

Signed-off-by: Chen-Yo Sun <chenyo.sun@mistral.ai>

Fix format

40d2163

Signed-off-by: Chen-Yo Sun <chenyo.sun@mistral.ai>

linyueqian mentioned this pull request Mar 30, 2026

[Feature][Voxtral TTS] Support per-model extra sampling params & add cfg_alpha for Voxtral TTS #2338

Merged

lishunyang12 reviewed Apr 2, 2026

y123456y78 marked this pull request as ready for review April 20, 2026 19:50

y123456y78 requested a review from hsliuustc0106 as a code owner April 20, 2026 19:50

y123456y78 marked this pull request as draft April 20, 2026 19:50

linyueqian added this to the v0.20.0 milestone Apr 22, 2026

linyueqian added the "ready" label (label to trigger buildkite CI) Apr 22, 2026

y123456y78 closed this Apr 23, 2026

y123456y78 wants to merge 20 commits into vllm-project:main from y123456y78:chenyo/voxtral-tts-sampling-params

3 participants
