Hi! Thanks for the amazing open source work!
I was looking through `onnx_export.py` and `onnx_bench.py` and was wondering how to run them end to end in a standalone Colab notebook.
Specifically, how do we replace `dummy_specs = torch.rand(1, 257, 60)` with an mp3/wav audio file (of variable length) converted to a torch Tensor (by the rmvpe model? I'm really new to speech model architectures, so I'm not sure) for use with the ONNX-converted checkpoint.
Thanks!
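For concreteness, something like the sketch below is what I mean by "converting audio to a tensor." It is only my guess at the kind of preprocessing involved, not the repo's actual pipeline: the 257 frequency bins suggest `n_fft=512`, and the sample rate and hop length here are placeholders that would need to match the project's config.

```python
# Minimal sketch (assumed parameters, not the repo's preprocessing):
# load a wav file and produce a magnitude spectrogram shaped like
# dummy_specs = torch.rand(1, 257, 60), except with a variable time axis.
import torch
import torchaudio

def wav_to_specs(path: str, target_sr: int = 24000,
                 n_fft: int = 512, hop_length: int = 128) -> torch.Tensor:
    wav, sr = torchaudio.load(path)                # (channels, samples)
    wav = wav.mean(dim=0, keepdim=True)            # downmix to mono
    if sr != target_sr:
        wav = torchaudio.functional.resample(wav, sr, target_sr)
    spec = torch.stft(wav, n_fft=n_fft, hop_length=hop_length,
                      win_length=n_fft, window=torch.hann_window(n_fft),
                      center=True, return_complex=True)
    return spec.abs()                              # (1, n_fft // 2 + 1, T) = (1, 257, T)

# specs = wav_to_specs("input.wav")
# print(specs.shape)  # torch.Size([1, 257, T]); T depends on the audio length
```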
I believe you are referring to the pitch estimation (rmvpe) and might be looking at the MMVC1.5 branch (v1.5.0.0_SiFiGAN).
In this context, the size of the input tensor is intentionally fixed rather than dynamic. The length of the time dimension differs between `specs` and `sin`, `d0`, `d1`, `d2`, `d3`, and ONNX cannot handle such inputs dynamically, which is why the size is fixed.
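In practice that means variable-length features have to be sliced or zero-padded to the fixed frame count before being fed to the exported model. A rough sketch is below; the frame count, input name `"specs"`, and model filename are placeholders, so check `onnx_export.py` and `sess.get_inputs()` for the real values, and note that the other inputs (`sin`, `d0`...`d3`) need the same fixed-length treatment.

```python
# Illustrative sketch: fit a (1, 257, T) feature into the fixed time length
# expected by the exported ONNX model. Names and sizes are assumptions.
import numpy as np
import onnxruntime as ort

FIXED_FRAMES = 60  # must match the time length of the dummy inputs used at export

def pad_or_trim(feat: np.ndarray, frames: int = FIXED_FRAMES) -> np.ndarray:
    """Slice or zero-pad the last (time) axis to exactly `frames` frames."""
    t = feat.shape[-1]
    if t >= frames:
        return feat[..., :frames]
    pad = np.zeros((*feat.shape[:-1], frames - t), dtype=feat.dtype)
    return np.concatenate([feat, pad], axis=-1)

# sess = ort.InferenceSession("exported_model.onnx")
# print([i.name for i in sess.get_inputs()])   # check the real input names and shapes
# specs_fixed = pad_or_trim(specs.numpy().astype(np.float32))
# outputs = sess.run(None, {"specs": specs_fixed})  # plus the other fixed-length inputs
```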