C++ infer #5
@maozhiqiang That's unexpectedly slow. On my computer (a 6-year-old laptop) with the same hparams it runs only a little slower than real time. Let's check a few things:
Sorry about the lack of detailed instructions. I'll get it done...
Hi @geneing! Thank you for your reply. The training log is:
Run ccmake or cmake-gui. Switch to advanced mode ("t" in ccmake / a checkbox in cmake-gui). Find the CMAKE_BUILD_TYPE entry and set it to RelWithDebInfo. Find CMAKE_CXX_FLAGS_RELWITHDEBINFO and edit it to include the -ffast-math flag. You can also set the build type from the cmake command line: https://cmake.org/pipermail/cmake/2008-March/020347.html
I just added:
Sounds right. The Eigen3 library that I use employs every templating trick to get the best performance. When optimized, its performance is excellent; in debug mode it is extremely inefficient. A few more flags to play with: -ffast-math and -march=native.
Thank you @geneing! But the output is all noise. Is the path I'm using wrong?
```python
import sys
sys.path.insert(0, 'lib/build-src-RelDebInfo')  # use correct path to the shared library WaveRNNVocoder...so
import WaveRNNVocoder
```
Thank you!
Hi @geneing!
@maozhiqiang Could you please attach the mel data you are using as input? Then I can try to reproduce your problem.
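(If it helps, the input array can be dumped to .npy for attaching; a minimal sketch, assuming mel is the (n_mels, frames) float32 array your pipeline produces:)

```python
# Save the mel input so it can be attached to the issue for reproduction.
import numpy as np

def dump_mel(mel: np.ndarray, path: str = 'mel.npy') -> None:
    """Persist a (n_mels, frames) mel array as .npy."""
    np.save(path, mel.astype(np.float32))
```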
@geneing Thank you! The test mels are as follows:
@maozhiqiang Works for me with b2f5fc1. Commands:

```python
import numpy as np
import WaveRNNVocoder

vocoder = WaveRNNVocoder.Vocoder()
```

The speech is a bit noisy and quiet. Cantonese?
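(For context, a fuller version of that test might look like the sketch below. The loadWeights/melToWav method names and the file paths are assumptions for illustration, not verbatim from the comment above:)

```python
# Minimal end-to-end sketch, assuming the bindings expose loadWeights()
# and melToWav() (method names and paths are assumptions).
import sys
sys.path.insert(0, 'lib/build-src-RelDebInfo')  # path to the compiled module

import numpy as np
import WaveRNNVocoder

vocoder = WaveRNNVocoder.Vocoder()
vocoder.loadWeights('model.bin')             # weight file from convert_model.py

mel = np.load('mel.npy').astype(np.float32)  # expected shape: (n_mels, frames)
wav = vocoder.melToWav(mel)                  # synthesized waveform
```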
@geneing Thank you! My result is still noise, and I'm using the same code! I don't know why!
I'm not sure how to help you. Here's the weight file I'm using.
Thank you! I will try!
Hi! Thank you for this awesome work! I successfully trained the PyTorch WaveRNN model:
Now I am trying to run inference on the CPU using the C++ library. I compiled the library and ran convert_model.py, but when I try to run inference I get Aborted (core dumped). If I use the model weights you shared in the comment above, it runs perfectly fine. Anything I might have missed here? Thanks for your help :)
@alexdemartos Would it be possible for you to obtain a stack trace when this error happens? You may have to recompile in debug mode and either run under gdb or open the core file. That would make it a lot easier to find the cause.
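(Since the crash happens inside a shared library called from Python, the standard-library faulthandler module is another low-effort option: it prints the Python-side traceback when the process dies on a fatal signal, which at least pinpoints the call into the .so. The weight-file path and the loadWeights name below are assumptions. A minimal sketch:)

```python
# Enable fault handling BEFORE calling into the native module, so a
# SIGSEGV/SIGABRT inside the .so still dumps the Python traceback.
import faulthandler
faulthandler.enable()

import WaveRNNVocoder

vocoder = WaveRNNVocoder.Vocoder()
vocoder.loadWeights('converted_model.bin')  # placeholder path; method name assumed
```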
Hi @geneing, thanks for your fast response. Sorry, I am not very experienced with C++ debugging. I compiled the library in debug mode, but I don't really know how to extract any detailed info. This is the error from the .so library:
I tried to debug with gdb and the vocoder binary, but it crashes when loading the mel (npy) file (even with your model):
Nevertheless, I noticed that loading your model prints the following details:
While loading mine, the last part of the model is not there:
@geneing I tested out the model you have in […]. It's taking me 9.5 seconds to generate 6 seconds of audio. Also, the audio output quality is pretty poor.
@acrosson The network in this repo is designed for best performance on CPU: low op count, with branching and memory access optimized for pipelined processors. For best performance on GPU you would use something like WaveGlow: no branching, and a massive op count amortized over thousands of simple compute cores.

As for the sound quality, let's check whether it's due to pruning. I observe that the quality drop with pruning is quite sharp past some "critical" pruning fraction, and this critical fraction depends on the dataset used for training. When training with noisier datasets, I have to keep more weights after pruning to maintain sound quality. If you go to your checkpoints/eval directory, you should have wav outputs every 10K steps or so. Listen to the output at around step 40000. If it sounds OK, then check later steps; the step at which it starts to sound bad will tell you what fraction of the weights you can prune. When training with https://github.com/mozilla/TTS/ I can prune up to 90% of the weights with little impact on quality.

I applied an FIR filter with a pass band between 95 and 7600 Hz to the M-AILABS dataset I used for training. Here's one sample synthesized from mels: https://drive.google.com/open?id=1T-D3jHrI8tlb9EwohaAEdFP0ddwK7LfJ
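(For the band-limiting step mentioned above, a minimal sketch of an FIR band-pass over 95-7600 Hz using SciPy; the tap count and sample rate are assumptions, not values from this thread:)

```python
# Minimal sketch: band-pass training audio to 95-7600 Hz with a
# linear-phase FIR filter. numtaps and sample_rate are assumptions.
import numpy as np
from scipy.signal import firwin, lfilter

sample_rate = 22050  # assumed sample rate of the training data
numtaps = 255        # assumed filter length; odd length keeps a type-I filter

# pass_zero=False with two cutoffs yields a band-pass design.
taps = firwin(numtaps, [95.0, 7600.0], pass_zero=False, fs=sample_rate)

def bandlimit(wav: np.ndarray) -> np.ndarray:
    """Apply the FIR band-pass to a mono float waveform."""
    return lfilter(taps, [1.0], wav)
```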
@maozhiqiang How did you export the model for C++ inference? Thank you.
Hi all,
Hello @geneing, I use hparams like the ones below and get a good result, but inference takes about 8 s. Is there any way to speed up inference? Thank you. Model parameters:
Hi @geneing! I used the C++ code for inference, but it is very slow:

```
mels shape: (80, 500)
take times:53.759968996047974
```

Seven seconds of audio takes about 53 seconds!
My hparams:
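(As a sanity check on numbers like these, the real-time factor can be measured directly; the hop length, sample rate, paths, and the loadWeights/melToWav names below are assumptions:)

```python
# Rough real-time-factor measurement for the vocoder bindings.
# hop_length/sample_rate are assumed values -- use the ones from hparams.
import time
import numpy as np
import WaveRNNVocoder

hop_length = 275     # assumed
sample_rate = 22050  # assumed

vocoder = WaveRNNVocoder.Vocoder()
vocoder.loadWeights('model.bin')             # method name/path assumed

mel = np.load('mel.npy').astype(np.float32)  # e.g. shape (80, 500)

start = time.time()
wav = vocoder.melToWav(mel)
elapsed = time.time() - start

audio_seconds = mel.shape[1] * hop_length / sample_rate
print(f'{elapsed:.1f} s for {audio_seconds:.1f} s of audio '
      f'(real-time factor {elapsed / audio_seconds:.2f})')
```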