Vertical bars in audio from model output #24
-
Hi, I was checking the outputs of my Mel-Band Roformer model and I see lots of vertical lines in the output audio that aren't there in the input. I am using this code for inference: https://github.com/ZFTurbo/Music-Source-Separation-Training/blob/main/inference.py. Here are my model settings:
I am training with a batch size of 12, 8 seconds of audio, and a learning rate of 5.0e-05.
[Picture of the vertical bars]
Has anyone else experienced this and managed to fix it? (My model has only trained for 24 hours on a single GPU, so it's possible the problem goes away as the model trains longer.)
-
Hey Kimberley Jensen! Love your work in the source separation space.
Please try again with a different window function for the (i)STFT. I think the authors of the original paper used a Hann window, which should be much more appropriate than what you have right now. The current default is all ones (a rectangular window), which (I think) leads to those unwanted little clicks at a regular interval. The click interval matches the hop length. It would be nice to have this fixed upstream too.
PyTorch is in the process of switching its default window function to Hann, but that takes time, so until then we have to set it explicitly.
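For anyone landing here later, a minimal sketch of what "set it explicitly" can look like with plain `torch.stft` / `torch.istft`. The `n_fft` and `hop_length` values and the random input are placeholders I picked for illustration, not the settings from the post, and this is not the code from the model repository:

```python
import torch

# Illustrative values only (assumptions, not the poster's config).
n_fft, hop_length = 2048, 512
audio = torch.randn(1, 44100)  # stand-in for one second of mono audio

# Build the Hann window once, on the same device as the audio.
window = torch.hann_window(n_fft, device=audio.device)

# Analysis: pass the window explicitly instead of relying on the all-ones default.
spec = torch.stft(audio, n_fft=n_fft, hop_length=hop_length,
                  window=window, return_complex=True)

# (a model would modify `spec` here)

# Synthesis: reuse the same window so analysis and synthesis match.
recon = torch.istft(spec, n_fft=n_fft, hop_length=hop_length,
                    window=window, length=audio.shape[-1])

# With an unmodified spectrogram the round trip should be close to exact.
max_err = (recon - audio).abs().max().item()
```

The same `window` tensor has to be passed to both calls; if only one side uses Hann, the overlap-add normalization no longer matches the analysis.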
-
Update: @lucidrains patched it in 0.3.7. Excellent response time 🤘😄
-
There is a problem now:
We need to do something about the device. I fixed it in an ugly way because `self.device` is incorrect during initialization:
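One possible cleaner alternative, offered as a sketch rather than as the fix described above or the patch in the library: register the window as a buffer so it follows the module across devices, and never look at `self.device` during `__init__`. `STFTFrontEnd` and its default sizes are hypothetical names and values for illustration only:

```python
import torch
from torch import nn


class STFTFrontEnd(nn.Module):  # hypothetical helper, not the library's class
    def __init__(self, n_fft=2048, hop_length=512):
        super().__init__()
        self.n_fft = n_fft
        self.hop_length = hop_length
        # A non-persistent buffer moves with .to()/.cuda() but stays out of
        # the state_dict, so checkpoints saved without it still load.
        self.register_buffer("window", torch.hann_window(n_fft), persistent=False)

    def forward(self, audio):
        # self.window is already on the module's device at call time.
        return torch.stft(audio, n_fft=self.n_fft, hop_length=self.hop_length,
                          window=self.window, return_complex=True)


front_end = STFTFrontEnd()
spec = front_end(torch.randn(2, 44100))  # (batch, freq_bins, frames)
```

Another option along the same lines is to create the window inside `forward` with `device=audio.device`, at the cost of rebuilding it on every call.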