fix(fish_speech): use from_indices() instead of decode() for DAC decoder by ianliuy · Pull Request #2668 · vllm-project/vllm-omni

ianliuy · 2026-04-10T04:19:15Z

Summary

Fix TypeError: DAC.decode() takes 2 positional arguments but 3 were given when running fish_speech TTS online server.

Root Cause

fish_speech_dac_decoder.py:298 calls self._codec.decode(codes_bqf, feature_lengths), but:

Wrong method: DAC.decode(z) expects a continuous latent tensor. The code passes discrete codebook indices (torch.long). The correct method is from_indices(indices).
Wrong arg count: Both decode() and from_indices() accept only 1 positional argument. The extra feature_lengths causes the TypeError.
Wrong return unpacking: decode() and from_indices() return a single tensor, not a (wav, lengths) tuple.
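
The argument-count failure can be reproduced without the real codec. The sketch below uses a hypothetical stand-in class, `StubDAC`, whose two methods mirror only the one-positional-argument shape described above; it is not the actual DAC implementation, and the list values are placeholders for tensors.

```python
class StubDAC:
    """Hypothetical stand-in: mirrors only the call signatures described above."""

    def decode(self, z):
        # The real method expects a continuous latent tensor; here we echo the input.
        return z

    def from_indices(self, indices):
        # The real method dequantizes the indices first; here we echo the input.
        return indices


codec = StubDAC()
codes = [[0, 1, 2]]      # placeholder for discrete codebook indices
feature_lengths = [3]    # placeholder for per-item frame counts

try:
    # Passing a second positional argument reproduces the reported failure mode.
    codec.decode(codes, feature_lengths)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

The printed message ends with "takes 2 positional arguments but 3 were given": `self` counts as the first positional argument, so a method declared with one parameter rejects a call that supplies two.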

Fix

# Before
wav_batch, audio_lengths = self._codec.decode(codes_bqf, feature_lengths)

# After
wav_batch = self._codec.from_indices(codes_bqf)
audio_lengths = torch.clamp(
    feature_lengths * self._hop_length,
    max=wav_batch.shape[-1],
)

from_indices() internally does quantizer.decode(indices) -> decoder(z), the correct indices -> audio path
audio_lengths computed from feature_lengths * hop_length (mathematically exact for this architecture)
torch.clamp(max=...) as defensive bound
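
The length arithmetic above can be checked by hand. The following is a plain-Python sketch of the same computation, with `min` standing in for `torch.clamp(max=...)`; the `hop_length` value and the helper name `audio_lengths_from_frames` are assumptions chosen for illustration, not values taken from the codec.

```python
def audio_lengths_from_frames(feature_lengths, hop_length, max_samples):
    """Plain-Python equivalent of clamp(feature_lengths * hop_length, max=max_samples)."""
    return [min(n * hop_length, max_samples) for n in feature_lengths]


hop_length = 512                 # assumed value, for illustration only
feature_lengths = [10, 7]        # frames per batch item; the second item is padded
max_samples = 10 * hop_length    # stands in for wav_batch.shape[-1]

lengths = audio_lengths_from_frames(feature_lengths, hop_length, max_samples)
print(lengths)  # [5120, 3584]
```

The upper bound only matters if a frame count ever exceeds what the decoded batch actually contains; in the normal case each length is simply the frame count times the hop length.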

Also updated _FakeCodec in tests to match the new API.
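
The updated `_FakeCodec` itself is not shown in this thread. A test double following the new calling convention might look like the sketch below; the class name `FakeCodecSketch`, the nested-list shapes, and the `hop_length` value are all illustrative assumptions rather than the contents of the test file.

```python
class FakeCodecSketch:
    """Hypothetical test double: one argument in, one waveform batch out."""

    def __init__(self, hop_length):
        self.hop_length = hop_length

    def from_indices(self, indices):
        # indices is assumed to be nested lists shaped [batch][codebooks][frames].
        frames = len(indices[0][0])
        samples = frames * self.hop_length
        # Return a single object (not a tuple): silent mono audio, [batch][1][samples].
        return [[[0.0] * samples] for _ in indices]


codec = FakeCodecSketch(hop_length=4)
codes = [[[0, 1, 2], [3, 4, 5]]]   # batch=1, codebooks=2, frames=3
wav = codec.from_indices(codes)
print(len(wav), len(wav[0][0]))    # 1 12
```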

Fixes #2643

chatgpt-codex-connector · 2026-04-10T04:19:20Z

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

hsliuustc0106 · 2026-04-10T08:16:17Z

DCO check failed (ACTION_REQUIRED). Please amend your commit with git commit -s --amend to add the Signed-off-by line.

The DAC codec's decode() method accepts only a continuous latent tensor (z), but the decoder was passing discrete codebook indices along with feature_lengths, causing:

TypeError: DAC.decode() takes 2 positional arguments but 3 were given

Switch to from_indices(), which correctly handles discrete codebook indices by first dequantizing through the RVQ, then decoding to waveform.

Compute audio_lengths from feature_lengths * hop_length since from_indices() returns a single tensor (not a tuple).

Update _FakeCodec in tests to match the new calling convention.

Fixes vllm-project#2643

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Yiyang Liu <yiyangliu@microsoft.com>
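
The "dequantize through the RVQ" step mentioned in the commit message can be illustrated with a toy example. In residual vector quantization, each level contributes one codebook vector per frame and the latent is their sum. The sketch below shows only that summation in plain Python; the function name, codebook values, and shapes are assumptions, and a real codec's quantizer may apply further projections that are omitted here.

```python
def rvq_dequantize(indices, codebooks):
    """Sum the selected codebook vectors across quantizer levels, per frame.

    indices:   [levels][frames] integer codes
    codebooks: [levels][codebook_size][dim] float vectors
    """
    num_frames = len(indices[0])
    dim = len(codebooks[0][0])
    latents = [[0.0] * dim for _ in range(num_frames)]
    for level, level_indices in enumerate(indices):
        for t, idx in enumerate(level_indices):
            vec = codebooks[level][idx]
            for d in range(dim):
                latents[t][d] += vec[d]
    return latents


codebooks = [
    [[1.0, 0.0], [0.0, 1.0]],        # level 0: coarse vectors
    [[0.25, 0.25], [-0.25, 0.5]],    # level 1: residual corrections
]
indices = [[0, 1], [1, 0]]           # two levels, two frames

latents = rvq_dequantize(indices, codebooks)
print(latents)  # [[0.75, 0.5], [0.25, 1.25]]
```

This is why integer indices cannot be handed to a method that expects the continuous latent: the lookup-and-sum has to happen first.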

ianliuy · 2026-04-10T17:28:03Z

Amended the commit with git commit -s --amend to add the Signed-off-by line. DCO check should pass now. cc - @hsliuustc0106

ianliuy · 2026-04-12T00:30:14Z

The ReadTheDocs failure is unrelated to this PR: it timed out during pip install .[docs] (15min limit). Other PRs (#2377, #2517) are hitting the same RTD timeout. This PR only touches fish_speech_dac_decoder.py and its test file, with no docs changes.

Signed-off-by: Yiyang Liu <37043548+ianliuy@users.noreply.github.com>

ianliuy · 2026-04-14T03:58:10Z

Hi @hsliuustc0106, gentle ping 🙏 This PR and two others are all approved by @lishunyang12 with green CI, waiting on your review as the remaining requested reviewer.

All three are small fixes:

This PR (fix(fish_speech): use from_indices() instead of decode() for DAC decoder #2668): +8/-6, DAC decoder API fix (from_indices() instead of decode())
fix: do not apply FP8 quant config to vision/audio encoders for pre-quantized checkpoints #2702: FP8 encoder quant guard (tested locally by lishunyang12)
[Bugfix] Make mrope kwargs optional in HunyuanImage3 get_mrope_input_positions #2654: +3/-3, signature-only change (mrope kwargs optional)

Happy to address any feedback. I just wanted to batch these together to keep them on your radar whenever you have a moment. Thanks!

lishunyang12

LGTM. The fix is correct and well-documented in the PR description.

What was wrong:

DAC.decode(z) expects a continuous latent tensor, but the code passed discrete codebook indices (torch.long). The correct entry point for indices is from_indices().
decode() / from_indices() both accept only 1 positional arg, so the extra feature_lengths caused the TypeError.
The old code unpacked a (wav, lengths) tuple, but both methods return a single tensor.

Why the fix is correct:

from_indices(codes_bqf) is the right DAC API for discrete codebook indices -> audio waveform.
audio_lengths = clamp(feature_lengths * hop_length, max=wav_batch.shape[-1]) is the standard way to recover sample-domain lengths from frame-domain lengths for a fixed-hop-length codec, with a defensive upper bound.
The _FakeCodec test mock is correctly updated to match the new single-tensor return.

No concerns.

Replacing with inline comments

ianliuy · 2026-04-30T06:16:46Z

Hi @hsliuustc0106 and @lishunyang12, just a polite follow-up on this small DAC decoder fix.

The branch is mergeable, DCO/pre-commit/build checks are green, and the prior approval was dismissed after the force-push. Would you be willing to take another look when you have a chance?

ianliuy requested a review from hsliuustc0106 as a code owner April 10, 2026 04:19

ianliuy force-pushed the fix/issue-2643 branch from c11353a to 67cfe97 April 10, 2026 17:09

lishunyang12 previously approved these changes Apr 11, 2026

Merge branch 'main' into fix/issue-2643

591d980

chore: retrigger CI (build timed out)

92299ab

Signed-off-by: Yiyang Liu <37043548+ianliuy@users.noreply.github.com>

lishunyang12 previously approved these changes Apr 16, 2026

ianliuy wants to merge 3 commits into vllm-project:main from ianliuy:fix/issue-2643

3 participants
