The Minion randomly intruded into my audio #783

Haoran1272 · 2024-12-25T10:46:17Z

Self Checks

This template is only for bug reports. For questions, please visit Discussions.
I have thoroughly reviewed the project documentation (installation, training, inference) but couldn't find information to solve my problem.
I have searched for existing issues, including closed ones.
I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
[FOR CHINESE USERS] Please be sure to submit issues in English, otherwise they will be closed. Thank you! :)
Please do not modify this template and fill in all required fields.

Cloud or Self Hosted

Self Hosted (Source)

Environment Details

Nvidia3090, Python 3.10, torch==2.4.1, torchvision==0.19.1, torchaudio==2.4.1

Steps to Reproduce

/root/miniconda3/bin/python -m tools.api_server --listen 0.0.0.0:6006 --llama-checkpoint-path "/usr/github/fish-speech/checkpoints/fish-speech-1.5" --decoder-checkpoint-path "/usr/github/fish-speech/checkpoints/fish-speech-1.5/firefly-gan-vq-fsq-8x1024-21hz-generator.pth" --decoder-config-name firefly_gan_vq --compile

✔️ Expected Behavior

No response

❌ Actual Behavior

Please listen to the last few seconds of this audio, where a Minion's voice appears.
https://saysay-bucket1.s3.us-west-1.amazonaws.com/uploads/default/20241224/4f75658f38eb0c163acced94328a73b6e78275bb.mp3
Text: By the end of this century, we will have reached a technological singularity, where quantum computing leads to a paradigm shift in epistemology.
The issue is intermittent: out of my 100 audio files, 13 have similar occurrences.
Please help: how should I solve this problem?

Haoran1272 · 2025-01-02T09:13:35Z

Listen to this: https://saysay-bucket1.s3.us-west-1.amazonaws.com/uploads/default/20250102/6f025e260633d038db74c5fd218098975e505a77.mp3

mashdragon · 2025-01-07T02:38:58Z

This happens all the time for me. Generate a few and choose the median length one.
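A minimal sketch of that workaround, assuming the candidate takes are already in memory and that a take with stray trailing audio tends to be an outlier in length. `pick_median_length` is a hypothetical helper, not part of fish-speech, and byte length is only a fair proxy for duration when all takes share the same format and bitrate.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def pick_median_length(
    candidates: Sequence[T],
    length_of: Callable[[T], float] = len,
) -> T:
    """Return the candidate whose length is the median of the batch.

    `length_of` maps a candidate to a comparable length, e.g. byte count
    or duration in seconds. Hypothetical helper for illustration only.
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    ranked = sorted(candidates, key=length_of)
    # Use the lower median for even-sized batches so the result is
    # always one of the actual candidates.
    return ranked[(len(ranked) - 1) // 2]


# Example with stand-in byte strings in place of real audio data.
takes = [b"\x00" * 1000, b"\x00" * 1800, b"\x00" * 1050]
best = pick_median_length(takes)
```

An odd number of takes (three or five) keeps the choice unambiguous. Passing a `length_of` that returns the decoded duration would be more reliable than raw byte length for variable-bitrate formats.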

20km-shimakaze · 2025-01-10T02:04:49Z

In my case, I need to preprocess the text to remove *, for example *args -> args. This prevents some of the erroneous sounds, but there may be other characters that need preprocessing that I haven't noticed.

mashdragon · 2025-01-10T16:03:06Z

@20km-shimakaze Which characters have you found that you need to remove? Or instead, which character sets do you keep?

Ginzyl · 2025-01-13T02:12:31Z

I have the same problem when generating Chinese. It seems to occur randomly: the same sentence can be generated correctly on another attempt.

Haoran1272 · 2025-01-13T02:16:55Z

I haven't found a solution yet, but increasing the audio duration of the material seems to reduce the probability of occurrence.

20km-shimakaze · 2025-01-17T14:46:52Z

> @20km-shimakaze Which characters have you found that you need to remove? Or instead, which character sets do you keep?

I deleted the character '*'. It is not pronounced during TTS anyway, and I found that even after generating the audio several times, the pronunciation goes wrong once this character is read. So I deleted it.
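The preprocessing step could be sketched as below. `strip_unpronounced` is a hypothetical helper name, and the default set contains only '*', the one character reported here; which other characters need removing is still an open question in this thread.

```python
def strip_unpronounced(text: str, chars: str = "*") -> str:
    """Remove characters that should not be spoken before text is sent to TTS.

    `chars` lists the characters to drop. Only '*' is confirmed in this
    thread; anything else added here is an assumption.
    """
    # str.translate with a None mapping deletes the matching characters.
    return text.translate({ord(c): None for c in chars})


cleaned = strip_unpronounced("def run(*args): pass")
```

Keeping the character set as a parameter makes it easy to extend once more problematic characters are identified.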

Haoran1272 added the bug label Dec 25, 2024

Haoran1272 closed this as completed Jan 13, 2025

Haoran1272 reopened this Jan 13, 2025
