[Bugfix] Fix left_context_size type mismatch in non-async Base Code2Wav path by NickCao · Pull Request #2052 · vllm-project/vllm-omni

NickCao · 2026-03-20T16:36:45Z

Purpose

Disclaimer: this PR contains AI-generated code

Qwen3TTSCode2Wav.forward() compares ctx_frames against 0 (line 287: "if ctx_frames > 0"), but the non-async Base path passes left_context_size as a single-element list [ref_code_len] to survive serialize_additional_information(), which only supports tensor and list values (plain ints are dropped). The async chunk path bypasses serialization and passes a plain int directly.

The list wrapper in talker2code2wav() is intentional — without it the serializer drops the key and ctx_frames silently falls back to 0, causing ref_code context to never be trimmed from the output audio.

Fix the consumer (Qwen3TTSCode2Wav.forward) to unwrap the list when present, handling both the serialized list form (non-async) and the plain int form (async chunk path).
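The mismatch and the consumer-side normalization can be reproduced in plain Python. This is a minimal sketch, not the code from the PR: `normalize_left_context` and `compare_raw` are hypothetical names, and the value 12 stands in for `ref_code_len`.

```python
# Hypothetical stand-in for the normalization done in the consumer;
# the function names here are illustrative and not taken from the PR.
def normalize_left_context(val):
    """Return an int from either a plain int or a single-element list."""
    if isinstance(val, list):
        val = val[0] if val else 0
    return int(val)


def compare_raw(val):
    """Mimic an unpatched `if ctx_frames > 0` check on the raw value."""
    try:
        return val > 0
    except TypeError as exc:
        return f"TypeError: {exc}"


serialized_form = [12]  # non-async path: [ref_code_len] (12 is an assumed value)
direct_form = 12        # async chunk path: plain int

print(compare_raw(direct_form))      # the int compares fine
print(compare_raw(serialized_form))  # list vs. int comparison is rejected
print(normalize_left_context(serialized_form), normalize_left_context(direct_form))
```

After normalization both forms yield the same int, so the downstream comparison behaves identically on either path.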

The bug was introduced in PR #1731 (761eff9), which added ref_code support to the non-async path but did not account for the type mismatch between the serialized list and the downstream int comparison. It was masked by the token overflow crash (max_model_len=32768 < prompt tokens), which prevented the code from reaching the comparison.

Fixes: #2030

Test Plan

After modifying qwen3_tts_no_async_chunk.yaml to increase max_model_len, run:

/workspace/.venv/bin/python -m pytest -sv tests/e2e/online_serving/test_qwen3_tts_base_expansion.py --run-level "advanced_model"

Test Result

TBD

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c8add1d79d

…av path

Qwen3TTSCode2Wav.forward() compares ctx_frames against 0 (line 287: "if ctx_frames > 0"), but the non-async Base path passes left_context_size as a single-element list [ref_code_len] to survive serialize_additional_information(), which only supports tensor and list values (plain ints are dropped). The async chunk path bypasses serialization and passes a plain int directly.

The list wrapper in talker2code2wav() is intentional: without it the serializer drops the key and ctx_frames silently falls back to 0, causing ref_code context to never be trimmed from the output audio.

Fix the consumer (Qwen3TTSCode2Wav.forward) to unwrap the list when present, handling both the serialized list form (non-async) and the plain int form (async chunk path).

The bug was introduced in PR vllm-project#1731 (761eff9) which added ref_code support to the non-async path but did not account for the type mismatch between serialized list and the int comparison downstream. It was masked by the token overflow crash (max_model_len=32768 < prompt tokens) which prevented the code from reaching the comparison.

Fixes: vllm-project#2030

Signed-off-by: Nick Cao <ncao@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>

linyueqian · 2026-03-21T05:13:22Z

@Sy0307 @yenuo26 ptal.

Sy0307 · 2026-03-22T17:07:42Z

LGTM. Confirmed that the type mismatch was indeed introduced by #1731. Thanks for the fix :)

lishunyang12

Fix looks correct. A couple of suggestions.

lishunyang12 · 2026-03-22T17:27:29Z

+                    # plain ints); async chunk path sends a plain int.
+                    # Handle both.
+                    if isinstance(val, list):
+                        val = val[0] if val else 0


This silently discards extra elements if the list has len > 1. Not currently possible, but a defensive assert or warning would save debugging time if the producer changes.

Suggested change

- val = val[0] if val else 0
+ val = val[0] if len(val) == 1 else val[0]

Actually, simpler — just index [0] and let it raise IndexError on empty rather than silently returning 0:

Suggested change

- val = val[0] if val else 0
+ val = val[0]
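To make the trade-off between these variants concrete, here is a small sketch comparing how each behaves on empty and multi-element lists. The function names are hypothetical; `unwrap_checked` is an assumed stricter variant in the spirit of the "defensive assert" mentioned above, not something proposed verbatim in this review.

```python
def unwrap_lenient(val):
    # Patch as submitted: an empty list falls back to 0, extra elements are ignored.
    return val[0] if val else 0


def unwrap_indexing(val):
    # Simpler suggestion: an empty list raises IndexError, extra elements are ignored.
    return val[0]


def unwrap_checked(val):
    # Hypothetical stricter variant: anything but exactly one element is an error.
    if len(val) != 1:
        raise ValueError(f"expected a single-element list, got {len(val)} elements")
    return val[0]


def describe(fn, val):
    """Return the result of fn(val), or the name of the exception it raises."""
    try:
        return fn(val)
    except (IndexError, ValueError) as exc:
        return type(exc).__name__


for sample in ([7], [], [7, 8]):
    print(sample, [describe(fn, sample) for fn in (unwrap_lenient, unwrap_indexing, unwrap_checked)])
```

Only the checked variant surfaces a multi-element list; plain indexing surfaces an empty list but still discards extra elements silently.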

lishunyang12 · 2026-03-22T17:27:29Z

+                    # Handle both.
+                    if isinstance(val, list):
+                        val = val[0] if val else 0
+                    left_context_size[i] = int(val)


Nit: the int() cast is good but consider doing the normalization in a small helper — fish_speech and any future consumer that passes scalars through additional_information will hit the same list-vs-scalar issue. A shared unwrap_scalar(val) in e.g. serialization.py would avoid duplicating this pattern.
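A shared helper along these lines might look like the sketch below. It is an assumption about what `unwrap_scalar` could do, not an existing function in the repository; it handles only Python lists, tuples, ints and floats, and tensor values would need their own branch.

```python
def unwrap_scalar(val):
    """Return a scalar from either a bare scalar or a single-element list/tuple.

    Hypothetical helper: raises instead of guessing when the input is ambiguous.
    """
    if isinstance(val, (list, tuple)):
        if len(val) != 1:
            raise ValueError(f"expected exactly one element, got {len(val)}")
        val = val[0]
    if not isinstance(val, (int, float)):
        raise TypeError(f"expected int or float, got {type(val).__name__}")
    return val


# Both transport forms resolve to the same value.
print(unwrap_scalar([12]), unwrap_scalar(12), unwrap_scalar((0.5,)))
```

A consumer would then write something like `left_context_size[i] = int(unwrap_scalar(val))` and no longer needs to know which path produced the value.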

lishunyang12 · 2026-03-22T17:27:29Z

        codec_codes = audio_codes.transpose(0, 1).cpu().reshape(-1).tolist()
+        # Wrap ref_code_len in a list: serialize_additional_information()
+        # only preserves tensor and list values; plain ints are dropped.
+        # The consumer (Qwen3TTSCode2Wav.forward) unwraps the list.


The real fix should be in serialize_additional_information — it should support scalar int/float values instead of silently dropping them. Wrapping in a list at the producer and unwrapping at the consumer is a workaround that every caller has to know about. Would you consider adding scalar support to AdditionalInformationEntry (e.g. a scalar_data field) in a follow-up?
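The idea of scalar support can be illustrated with a toy serializer. Everything below is hypothetical: `ToyEntry`, `toy_serialize` and `toy_deserialize` are invented for this sketch and do not reflect the real `AdditionalInformationEntry` or `serialize_additional_information()` implementations; only the `scalar_data` field name is borrowed from the comment above.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class ToyEntry:
    # Toy model of a serialized entry; not the real AdditionalInformationEntry.
    list_data: Optional[list] = None
    scalar_data: Optional[Union[int, float]] = None


def toy_serialize(info: dict, allow_scalars: bool) -> dict:
    """Serialize supported values; unsupported ones are silently dropped."""
    out = {}
    for key, value in info.items():
        if isinstance(value, list):
            out[key] = ToyEntry(list_data=value)
        elif allow_scalars and isinstance(value, (int, float)):
            out[key] = ToyEntry(scalar_data=value)
        # Any other value is skipped, mirroring the drop behavior described above.
    return out


def toy_deserialize(entries: dict) -> dict:
    """Recover the original values from the toy entries."""
    return {
        key: entry.list_data if entry.list_data is not None else entry.scalar_data
        for key, entry in entries.items()
    }


payload = {"left_context_size": 7}
print(toy_deserialize(toy_serialize(payload, allow_scalars=False)))  # key is lost
print(toy_deserialize(toy_serialize(payload, allow_scalars=True)))   # key survives as an int
```

With scalar support in the serializer, the producer could pass the int directly and the list wrapper plus consumer-side unwrapping would no longer be needed.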

NickCao · 2026-03-23T13:15:39Z

Closing in favor of @Sy0307's pending fix of the root cause.

Sy0307 · 2026-03-25T12:40:38Z

Plz continue as #2104 has been merged. @NickCao

NickCao requested a review from hsliuustc0106 as a code owner March 20, 2026 16:36

chatgpt-codex-connector Bot reviewed Mar 20, 2026

Comment thread vllm_omni/model_executor/stage_input_processors/qwen3_tts.py Outdated

NickCao force-pushed the fix/left-context-size-type branch from c8add1d to f70ed06 Compare March 20, 2026 16:43

NickCao changed the title ~~[Bugfix] Fix left_context_size passed as list instead of int in non-async Base path~~ [Bugfix] Fix left_context_size type mismatch in non-async Base Code2Wav path Mar 20, 2026

linyueqian added the ready label to trigger buildkite CI label Mar 22, 2026

lishunyang12 reviewed Mar 22, 2026

NickCao closed this Mar 23, 2026

NickCao wants to merge 1 commit into vllm-project:main from NickCao:fix/left-context-size-type
