Currently, I am trying to train a WaveGlow model from scratch to implement the inverse STFT function. I am using 20K samples of noise + speech to train the system, and I am attaching the configuration of the WaveGlow model I am training. After 420K iterations, I synthesized the audio waveform from an STFT input. The results have a whistling sound in them. Can anyone suggest approximately how many iterations I should train the model for, whether 20K samples are sufficient to train the system, and any other guidelines for improving the model?
I observed that at iteration 345484 the loss suddenly increased to 2059746816.000000000, which explains the spike. I also plotted the loss over shorter ranges of iterations and observed that there are spikes in between.
Furthermore, I am attaching the training and data-loader code for the STFT-based model for your reference.
The main purpose of creating an inverse-STFT-based WaveGlow is to use it as a pretrained model and then train it further for speech enhancement.
Can you suggest what I should do to optimize the model well?
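For anyone who wants to locate such spikes in their own training log, here is a minimal sketch. It assumes the per-iteration losses are already available as a plain Python list. The function name `find_loss_spikes`, the window size, and the threshold factor are hypothetical choices for illustration and are not part of the WaveGlow code. The loss series is synthetic; only the spike value is taken from the report above.

```python
from statistics import median


def find_loss_spikes(losses, window=50, factor=10.0):
    """Return (index, loss) pairs whose magnitude exceeds `factor` times
    the magnitude of the median of the preceding `window` losses."""
    spikes = []
    for i, loss in enumerate(losses):
        history = losses[max(0, i - window):i]
        if not history:
            continue  # no baseline yet for the first value
        baseline = abs(median(history))
        # Guard against a zero baseline so the threshold stays positive.
        if abs(loss) > factor * max(baseline, 1e-8):
            spikes.append((i, loss))
    return spikes


# Synthetic loss curve (made-up values) with one injected spike.
losses = [-4.0 + 0.01 * (i % 5) for i in range(200)]
losses[120] = 2059746816.0  # spike value reported in this issue

spikes = find_loss_spikes(losses)
print(spikes)
```

The median is used as the baseline because a single extreme value inside the window barely moves it, so one spike does not mask the next.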
{
    "train_config": {
        "fp16_run": false,
        "output_directory": "checkpoints",
        "epochs": 100000,
        "learning_rate": 1e-4,
        "sigma": 1.0,
        "iters_per_checkpoint": 20000,
        "batch_size": 1,
        "seed": 1234,
        "checkpoint_path": "",
        "with_tensorboard": false
    },
    "data_config": {
        "training_files": "train_list.txt",
        "segment_length": 16000,
        "sampling_rate": 16000,
        "filter_length": 511,
        "hop_length": 256,
        "win_length": 511,
        "mel_fmin": 0.0,
        "mel_fmax": 8000.0
    },
    "dist_config": {
        "dist_backend": "nccl",
        "dist_url": "tcp://localhost:54321"
    }
}
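To put the numbers above in context, here is a rough sketch of a few quantities that follow from this configuration. It rests on several assumptions: that "20K samples" means 20,000 training utterances, that each utterance contributes one segment per epoch, that one iteration is one batch, and that the STFT is one-sided and centered (as with `torch.stft` and `center=True`). The variable names are only illustrative.

```python
# Values copied from the config in this issue.
sampling_rate = 16000
segment_length = 16000
filter_length = 511
hop_length = 256
batch_size = 1

# Values from the issue text (assumption: 20K samples = 20,000 utterances).
num_samples = 20_000
iterations_done = 420_000

# A one-sided STFT with FFT size n has n // 2 + 1 frequency bins.
freq_bins = filter_length // 2 + 1

# Assumption: centered STFT, which yields 1 + L // hop frames for L samples.
frames_per_segment = 1 + segment_length // hop_length

# Duration of one training segment in seconds.
segment_seconds = segment_length / sampling_rate

# Assumption: one iteration is one batch, one segment per utterance per epoch.
iters_per_epoch = num_samples // batch_size
epochs_done = iterations_done / iters_per_epoch

print(f"frequency bins per frame: {freq_bins}")
print(f"frames per segment:       {frames_per_segment}")
print(f"segment duration (s):     {segment_seconds}")
print(f"epochs after 420K iters:  {epochs_done}")
```

Under these assumptions, 420K iterations at batch size 1 correspond to about 21 passes over the data, each seeing a single one-second segment per utterance.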