@@ -219,7 +219,14 @@ def forward(
 if i >= len(left_context_size):
     break
 if "left_context_size" in info:
-    left_context_size[i] = info["left_context_size"]
+    val = info["left_context_size"]
+    # Non-async path sends a list (required by
+    # serialize_additional_information which drops
+    # plain ints); async chunk path sends a plain int.
+    # Handle both.
+    if isinstance(val, list):
+        val = val[0] if val else 0
Collaborator

This silently discards extra elements if the list has more than one element. That cannot happen today, but a defensive assert or warning would save debugging time if the producer ever changes.

Suggested change
- val = val[0] if val else 0
+ val = val[0] if len(val) == 1 else val[0]

Actually, simpler: just index [0] and let it raise IndexError on an empty list rather than silently returning 0:

Suggested change
- val = val[0] if val else 0
+ val = val[0]

+    left_context_size[i] = int(val)
Collaborator

Nit: the int() cast is good, but consider moving the normalization into a small helper. fish_speech and any future consumer that passes scalars through additional_information will hit the same list-vs-scalar issue. A shared unwrap_scalar(val) in, e.g., serialization.py would avoid duplicating this pattern.
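
For illustration, a minimal sketch of what such a helper might look like. The name unwrap_scalar comes from the comment above; its signature, the strict single-element check, and the choice of ValueError are assumptions, not code from this PR. Tensor inputs are not handled here.

```python
from typing import Any


def unwrap_scalar(val: Any) -> int:
    """Hypothetical helper: normalize a plain scalar or a one-element list to int.

    Assumption: a wrapped value must contain exactly one element, so an empty
    or multi-element list is treated as a producer bug and raises.
    """
    if isinstance(val, (list, tuple)):
        if len(val) != 1:
            raise ValueError(
                f"expected exactly one element, got {len(val)}"
            )
        val = val[0]
    return int(val)
```

With a helper along these lines, the consumer side would reduce to something like `left_context_size[i] = unwrap_scalar(info["left_context_size"])`, and both the list path and the plain-int path would go through the same check.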

for i, req_ids in enumerate(request_ids_list):
if req_ids.numel() < 1:
parsed.append((0, 0))
@@ -45,6 +45,9 @@ def talker2code2wav(
ref_code_len = 0
# Code2Wav expects codebook-major flat: [Q*num_frames]
codec_codes = audio_codes.transpose(0, 1).cpu().reshape(-1).tolist()
# Wrap ref_code_len in a list: serialize_additional_information()
# only preserves tensor and list values; plain ints are dropped.
# The consumer (Qwen3TTSCode2Wav.forward) unwraps the list.
Collaborator

The real fix belongs in serialize_additional_information: it should support scalar int/float values instead of silently dropping them. Wrapping in a list at the producer and unwrapping at the consumer is a workaround that every caller has to know about. Would you consider adding scalar support to AdditionalInformationEntry (e.g. a scalar_data field) in a follow-up?
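
To make the idea concrete, here is a rough, self-contained sketch of scalar-preserving serialization. Everything in it is hypothetical: Entry is a stand-in, not the real AdditionalInformationEntry, the field names list_data and scalar_data are assumed, and serialize_info does not reproduce the actual serialize_additional_information (tensor handling is left out entirely).

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Union

Scalar = Union[int, float]


@dataclass
class Entry:
    """Hypothetical stand-in for AdditionalInformationEntry (fields assumed)."""
    list_data: Optional[List[Any]] = None
    scalar_data: Optional[Scalar] = None


def serialize_info(info: Dict[str, Any]) -> Dict[str, Entry]:
    """Sketch: keep lists and plain int/float values; skip anything else."""
    out: Dict[str, Entry] = {}
    for key, value in info.items():
        if isinstance(value, (int, float)):
            # Scalars get their own field instead of being dropped.
            out[key] = Entry(scalar_data=value)
        elif isinstance(value, list):
            out[key] = Entry(list_data=value)
        # Other types (tensors included) are out of scope for this sketch.
    return out
```

If scalars survived serialization like this, the producer could send `{"left_context_size": ref_code_len}` directly and the consumer would no longer need to unwrap a list.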

additional_information = {"left_context_size": [ref_code_len]} if ref_code_len > 0 else None
code2wav_inputs.append(
OmniTokensPrompt(