【Hackathon 8th No.9】Reproduce the losses needed for DAC training in PaddleSpeech #3954

suzakuwcx · 2024-12-17T01:07:18Z

PR types

New features

PR changes

APIs

Describe

paddle-bot · 2024-12-17T01:07:23Z

Thanks for your contribution!

suzakuwcx · 2024-12-17T01:42:48Z

In the original DAC repository, the training data is generated randomly. To assess accuracy, I sampled ten loss values and saved them in PyTorch tensor (.pt) format. The original repository showed no numerical errors, but the Paddle implementation shows a deviation that I am still tracing. Here are the test results:

| index | AudioSignal_STFT | AudioSignal_MEL | Tensor_STFT | Tensor_MEL |
| --- | --- | --- | --- | --- |
| 0 | 2.86102294921875e-06 | 9.5367431640625e-07 | 6.103515625e-05 | 6.198883056640625e-05 |
| 1 | 9.5367431640625e-07 | 4.76837158203125e-07 | 6.103515625e-05 | 0.000194549560546875 |
| 2 | 0.0 | 9.5367431640625e-07 | 3.0517578125e-05 | 0.0005588531494140625 |
| 3 | 9.5367431640625e-07 | 4.76837158203125e-07 | 2.09808349609375e-05 | 4.9114227294921875e-05 |
| 4 | 4.76837158203125e-07 | 4.76837158203125e-07 | 0.00012445449829101562 | 7.05718994140625e-05 |
| 5 | 4.76837158203125e-06 | 0.0 | 0.03401947021484375 | 0.012701988220214844 |
| 6 | 4.76837158203125e-07 | 9.5367431640625e-07 | 0.00014448165893554688 | 0.0003781318664550781 |
| 7 | 3.814697265625e-06 | 9.5367431640625e-07 | 0.2675790786743164 | 0.03282880783081055 |
| 8 | 9.5367431640625e-07 | 0.0 | 0.22318553924560547 | 0.03190279006958008 |
| 9 | 2.384185791015625e-06 | 0.0 | 0.22658443450927734 | 0.03314781188964844 |

suzakuwcx · 2024-12-18T07:25:03Z

With this patch, the new test results are as follows:

diff --git a/paddlespeech/t2s/modules/losses.py b/paddlespeech/t2s/modules/losses.py
index 029ad1be..ce5f441d 100644
--- a/paddlespeech/t2s/modules/losses.py
+++ b/paddlespeech/t2s/modules/losses.py
@@ -501,7 +502,7 @@ def stft(x,
     real = x_stft.real()
     imag = x_stft.imag()
 
-    return paddle.sqrt(paddle.clip(real**2 + imag**2, min=1e-7)).transpose(
+    return paddle.clip(paddle.sqrt(real**2 + imag**2), min=clamp_eps).transpose(
         [0, 2, 1])

@@ -930,7 +930,7 @@ class MelSpectrogram(nn.Layer):
         real = real.transpose([0, 2, 1])
         imag = imag.transpose([0, 2, 1])
         x_power = real**2 + imag**2
-        x_amp = paddle.sqrt(paddle.clip(x_power, min=self.eps))
+        x_amp = paddle.clip(paddle.sqrt(x_power), min=self.eps)

| index | Tensor_STFT | Tensor_MEL |
| --- | --- | --- |
| 0 | 2.86102294921875e-06 | 9.5367431640625e-07 |
| 1 | 9.5367431640625e-07 | 4.76837158203125e-07 |
| 2 | 9.5367431640625e-07 | 9.5367431640625e-07 |
| 3 | 9.5367431640625e-07 | 4.76837158203125e-07 |
| 4 | 4.76837158203125e-07 | 4.76837158203125e-07 |
| 5 | 3.814697265625e-06 | 0.0 |
| 6 | 0.0 | 9.5367431640625e-07 |
| 7 | 2.86102294921875e-06 | 4.76837158203125e-06 |
| 8 | 0.0 | 1.430511474609375e-06 |
| 9 | 2.384185791015625e-06 | 4.76837158203125e-07 |
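
The patch above swaps the order of the square root and the clip. A minimal scalar sketch of why the order can matter is given here, using only the Python standard library rather than Paddle. The function names and the threshold value of 1e-7 are illustrative assumptions; the actual default of `clamp_eps` is not shown in this thread.

```python
import math


def sqrt_of_clipped(power, eps):
    """Old ordering: clip the power first, then take the square root."""
    return math.sqrt(max(power, eps))


def clipped_sqrt(power, eps):
    """New ordering: take the square root first, then clip the magnitude."""
    return max(math.sqrt(power), eps)


eps = 1e-7  # illustrative threshold, not necessarily the value used in the PR

# For small power values the two orderings give different magnitudes:
# the old one floors the result at sqrt(eps), the new one floors it at eps.
for power in (0.0, 1e-10, 1e-4, 1.0):
    print(power, sqrt_of_clipped(power, eps), clipped_sqrt(power, eps))
```

For inputs well above the threshold the two orderings agree, so any difference would be confined to near-silent spectrogram bins. Whether this fully accounts for the deviation reported earlier is not established by this sketch.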

zxcd · 2024-12-19T02:52:25Z

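Hello developer, thank you for participating! Because your Hackathon task is largely complete, this PR has been locked. Please finish the locked PR as soon as possible and make sure it is merged by January 3, 2025. No bonus will be awarded for a PR that is not merged by the deadline.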

luotao1 · 2025-01-03T02:29:35Z

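📢: Please finish the locked PR as soon as possible and make sure it is merged by January 10, 2025 (no further extensions). No bonus will be awarded for a PR that is not merged by the deadline.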

zxcd · 2025-01-13T03:51:43Z

paddlespeech/t2s/modules/losses.py


 import librosa
 import numpy as np
 import paddle
 from paddle import nn
 from paddle.nn import functional as F
+from paddleaudio.audiotools import AudioSignal
+from paddleaudio.audiotools import STFTParams


Is SISDRLoss missing?

zxcd · 2025-01-13T03:55:03Z

tests/unit/tts/test_losses.py

+        loss_1.backward()
+        loss_1_grad = signal.audio_data.grad.sum()
+
+        assert abs(


I suggest using np.testing.assert_allclose. Can these losses pass at a tolerance of 1e-6?

suzakuwcx · 2025-01-13

Currently not. After debugging, I found that the error comes from 'paddle.signal.stft' (without CUDA), so I need to compare the '_VF' implementation with the Paddle one. I'm confident the error can be reduced to 0 once this is fixed.
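
The exchange above concerns which tolerance the loss comparison can meet, and a later commit switches the tests to relative error with a threshold of 1e-5. A minimal standard-library sketch of a relative-error check with a descriptive failure message follows. The helper names and the two loss values are hypothetical and are not taken from the PR's test file.

```python
def relative_error(actual, expected):
    """Return |actual - expected| / |expected| (expected must be nonzero)."""
    return abs(actual - expected) / abs(expected)


def assert_close(actual, expected, rtol=1e-5):
    """Fail with a descriptive message when the relative error exceeds rtol."""
    err = relative_error(actual, expected)
    assert err <= rtol, (
        f"relative error {err:.3e} exceeds rtol={rtol:.0e} "
        f"(actual={actual}, expected={expected})")


# Hypothetical loss values, chosen only to illustrate the two criteria.
reference_loss = 7.68
paddle_loss = 7.68003

# The absolute difference is larger than 1e-6, but the relative error is
# below 1e-5, so the relative check passes at the default tolerance.
print(abs(paddle_loss - reference_loss))
print(relative_error(paddle_loss, reference_loss))
assert_close(paddle_loss, reference_loss)
```

A relative criterion scales with the magnitude of the loss, which is one reason it can be preferable when loss values vary widely across test cases.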

[Hackathon 7th No.56] Reproduce the losses needed for DAC training in PaddleSpeech

2c04e0a

paddle-bot bot added the contributor label Dec 17, 2024

mergify bot added T2S Test labels Dec 17, 2024

Using pre-commit to format code

b741545

luotao1 mentioned this pull request Dec 17, 2024

【Hackathon 7th】Open Source Contribution Individual Challenge PaddlePaddle/Paddle#68244

Closed

suzakuwcx added 3 commits December 29, 2024 00:09

t2s/modules/losses.py: Add a 'clamp_eps' parameter to dynamically adjust the clipping threshold

37f60d6

tests/unit/tts/test_losses.py: Add gradient tests and update precision calculation methods - Change precision threshold to '1e-5' - Use relative error instead of absolute error

c1a8f99

tests/unit/tts/test_losses.py: Add error message on assert failed

fd5365c

zxcd reviewed Jan 13, 2025

luotao1 changed the title ~~[Hackathon 7th No.56] Reproduce the losses needed for DAC training in PaddleSpeech~~ 【Hackathon 8th No.9] Reproduce the losses needed for DAC training in PaddleSpeech Jan 14, 2025

luotao1 changed the title ~~【Hackathon 8th No.9] Reproduce the losses needed for DAC training in PaddleSpeech~~ 【Hackathon 8th No.9】Reproduce the losses needed for DAC training in PaddleSpeech Jan 14, 2025

luotao1 mentioned this pull request Jan 14, 2025

【Hackathon 8th】Open Source Contribution Individual Challenge (Preview Edition) PaddlePaddle/Paddle#70746

Open
