
Common toolbox issues and how to fix them #431

Closed
ghost opened this issue Jul 19, 2020 · 10 comments


ghost commented Jul 19, 2020

This issue will be used to document common toolbox issues and how to fix them. To keep the signal-to-noise ratio high, please do not reply here. Instead, report problems and suggest additions or improvements by opening a new issue.


ghost commented Jul 19, 2020

NoBackendError

Summary

Typically fixed by installing ffmpeg. Windows users should follow these instructions: https://video.stackexchange.com/a/20496

If you still experience NoBackendError after installing ffmpeg, try the instructions below and seek support from librosa if needed.

More information

This error occurs when opening an mp3 file: audioread (a dependency of librosa) needs additional software to open mp3 files. The following is taken from https://github.com/librosa/librosa#audioread and may be helpful:

To fuel audioread with more audio-decoding power (e.g., for reading MP3 files),
you may need to install either ffmpeg or GStreamer.

Note that on some platforms, audioread needs at least one of the programs to work properly.

If you are using Anaconda, install ffmpeg by calling

conda install -c conda-forge ffmpeg

If you are not using Anaconda, here are some common commands for different operating systems:

  • Linux (apt-get): apt-get install ffmpeg or apt-get install gstreamer1.0-plugins-base gstreamer1.0-plugins-ugly
  • Linux (yum): yum install ffmpeg or yum install gstreamer1.0-plugins-base gstreamer1.0-plugins-ugly
  • Mac: brew install ffmpeg or brew install gstreamer
  • Windows: download binaries from the website linked in the librosa README

For GStreamer, you also need to install the Python bindings with

pip install pygobject
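
After installing ffmpeg, you can verify that Python can see it and that mp3 loading works. A minimal check (example.mp3 is a placeholder for any mp3 file you have on hand):

import shutil
import librosa

# A common cause of a persistent NoBackendError is that ffmpeg was
# installed but is not on PATH; shutil.which returns None in that case.
print("ffmpeg on PATH:", shutil.which("ffmpeg"))

# If ffmpeg is found, librosa (via audioread) should now open mp3 files.
wav, sr = librosa.load("example.mp3", sr=None)  # placeholder file name
print(f"Loaded {len(wav)} samples at {sr} Hz")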


ghost commented Jul 24, 2020

All questions about Colab Notebook

The Colab Notebook is a community-developed resource that lets users run the toolbox without having a GPU or going through a complicated setup. Those obstacles have since been removed: CPU support has been added (#366) and the installation process streamlined (#375). We recommend that you use a normal Python environment.

Users who still prefer the Colab Notebook should understand that no official support will be provided, though you are welcome to ask questions and get help from the community. If you believe you've found a bug in the underlying toolbox code, please try to replicate the issue in a normal Python environment. We are not Colab Notebook users and are unable to troubleshoot Colab Notebook errors.


ghost commented Jul 26, 2020

No module named 'tensorflow.contrib'

Summary

The toolbox requires TensorFlow 1.15; this error occurs when TensorFlow 2.x is installed instead.

Solution

Install TensorFlow 1.15.
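
For example, with pip:

pip install tensorflow==1.15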

If you get the pip error Could not find a version that satisfies the requirement tensorflow==1.15, then you are likely using Python 3.8+, which is incompatible with TF 1.15. To resolve this, switch to Python 3.6 or 3.7.


ghost commented Jul 27, 2020

GPU support for the toolbox

Configuring GPU support for the toolbox is difficult. Fortunately, the toolbox can run on the CPU. Download the 423_add_cpu_mode branch on my fork. Then you can run python demo_toolbox.py --cpu and it will override the CUDA_VISIBLE_DEVICES environment variable to force the toolbox to run in CPU-only mode.
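
For reference, the mechanism behind such a flag is simple. Here is an illustrative sketch of what a --cpu option can do (this is not the exact code from the 423_add_cpu_mode branch, which may differ):

import argparse
import os

parser = argparse.ArgumentParser()
parser.add_argument("--cpu", action="store_true", help="force CPU-only mode")
args = parser.parse_args()

if args.cpu:
    # An empty CUDA_VISIBLE_DEVICES hides all GPUs from CUDA,
    # so frameworks fall back to the CPU.
    os.environ["CUDA_VISIBLE_DEVICES"] = ""

import torch
print("CUDA available:", torch.cuda.is_available())  # False with --cpu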

If you must have GPU support, questions about CUDA installation should be submitted to a different support channel. Try asking your question in the CUDA setup and installation section of the NVIDIA developer forums: https://forums.developer.nvidia.com/c/accelerated-computing/cuda/cuda-setup-and-installation


ghost commented Jul 28, 2020

Pip error: Could not find a version that satisfies the requirement tensorflow==1.15

You are likely using Python 3.8+, which is incompatible with TensorFlow 1.15. To resolve this, you will need to switch to Python 3.6 or 3.7.
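
If you use conda, one straightforward way to get a compatible interpreter is to create a fresh environment (rtvc is just an example name):

conda create -n rtvc python=3.7
conda activate rtvc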

The supported Python versions are listed in README.md. Please follow the setup instructions carefully to avoid subsequent problems.

Python 3.8+ will be supported when the synthesizer code is upgraded to a PyTorch-only implementation. This is currently in the final stages of development and testing (#472).


ghost commented Jul 28, 2020

OSError: [WinError 193] %1 is not a valid Win32 application

Summary

This error occurs when you have multiple Python installations that conflict with each other.

For example, this is the traceback for #163:

Traceback (most recent call last):
  File "D:\Real-Time-Voice-Cloning-master\demo_cli.py", line 2, in <module>
    from utils.argutils import print_args
  File "D:\Real-Time-Voice-Cloning-master\utils\argutils.py", line 2, in <module>
    import numpy as np
  File "C:\Users\username\AppData\Roaming\Python\Python37\site-packages\numpy\__init__.py", line 140, in <module>
    from . import _distributor_init
  File "C:\Users\username\AppData\Roaming\Python\Python37\site-packages\numpy\_distributor_init.py", line 26, in <module>
    WinDLL(os.path.abspath(filename))
  File "C:\Users\username\AppData\Local\Programs\Python\Python37\lib\ctypes\__init__.py", line 364, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: [WinError 193] %1 is not a valid Win32 application
>>> 

From the traceback, you can see two Python installations conflicting with each other:
C:\Users\username\AppData\Roaming\Python\Python37\
C:\Users\username\AppData\Local\Programs\Python\Python37\

Solution

Look at your traceback and identify the conflicting Python environments. Then check your PATH and remove one of them. The environment you keep must still have all of the toolbox requirements installed. A similar issue is reported here: pytorch/pytorch#27693
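
To see which environment is actually in use, a quick generic diagnostic like this can help (not part of the toolbox):

import sys
import numpy as np

print(sys.executable)   # the interpreter that is running
print(np.__file__)      # the numpy that this interpreter imports
for p in sys.path:      # every location searched for packages
    print(p)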


ghost commented Aug 4, 2020

Invalid syntax def print_args(args: argparse.Namespace, parser=None)

Summary

On many systems (Linux in particular), the python command defaults to Python 2.x, which is not compatible with the toolbox. You will need to replace python with python3 in all commands to use the correct version.
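
To confirm which interpreter each command invokes:

python --version
python3 --version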

More information

The error message looks like this:

$: python demo_cli.py
Traceback (most recent call last):
  File "demo_cli.py", line 2, in <module>
    from utils.argutils import print_args
  File "Real-Time-Voice-Cloning-master/utils/argutils.py", line 22
    def print_args(args: argparse.Namespace, parser=None):
                       ^
SyntaxError: invalid syntax


ghost commented Aug 13, 2020

How to train your own models

First things first: we do not provide any official support for training a model on your own data. Anyone who trains a model is expected to be capable of coding in Python and solving the inevitable error messages on their own. You are welcome to open an issue if you get stuck, but no one will walk you through the entire process.

Most users will want to train a synthesizer model. In most cases the pretrained encoder and vocoder can be reused.

1. Practice training with LibriSpeech

Corentin has a wiki page for replicating the training of the pretrained models: https://github.com/CorentinJ/Real-Time-Voice-Cloning/wiki/Training

I recommend that you work through the preprocessing and training steps for the synthesizer, using the LibriSpeech train-clean-100 and train-clean-360 datasets. It is not necessary to wait until the model is fully trained, but you should verify that the code works on your platform before switching to your own dataset.
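
For reference, the synthesizer steps on that wiki page follow this general shape (the script names and argument order here are a best-effort recollection of the repository layout; check the wiki and each script's --help for the exact invocation, and replace <datasets_root> with your own path):

python synthesizer_preprocess_audio.py <datasets_root>
python synthesizer_preprocess_embeds.py <datasets_root>/SV2TTS/synthesizer
python synthesizer_train.py my_run <datasets_root>/SV2TTS/synthesizer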

2. Dataset preparation

Assembling the dataset is perhaps the hardest part for most users. For training with your own data, you will need to get your dataset into this format: #437 (comment)

Once you have successfully preprocessed the data, the training commands work as before.

Additional notes

Finetuning a single-speaker model

If you do not require a multi-speaker model, use the process in #437 to finetune the existing models to a single speaker. This can be done in a reasonable amount of time on CPU.

Considerations - languages other than English

  1. Update synthesizer/utils/symbols.py to contain all valid characters in your text transcripts (the characters you want to train on); a sketch follows after this list. Here is an example for Swedish: https://github.com/blue-fish/Real-Time-Voice-Cloning/commit/3eb96df1c6b4b3e46c28c6e75e699bffc6dd43be Be aware that anyone who wants to run the model you've created will need to make the same changes to their symbols file.
  2. For best results with multi-speaker models (for voice cloning)
    • The speaker encoder is trained on English and may not work well for other languages. If you have a large number of voice samples in your target language, you may wish to train a new encoder, or at least finetune an existing one. Data preprocessing for the encoder is not a smooth process, so set your expectations accordingly.
    • There are some very good speaker encoders shared in Training from scratch #126, but their model size of 768 is too big to be practical for cloning. You can use this process to import the relevant weights from the model and finetune it to a more useful dimension: Training a new encoder model #458 (comment)
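
As an illustration of point 1 above, extending the character set for Swedish might look like this (a sketch only; the variable names follow the common Tacotron-style symbols file, but the real synthesizer/utils/symbols.py may differ, so apply the idea rather than the literal text):

# synthesizer/utils/symbols.py (sketch)
_pad = "_"
_eos = "~"
# Add å, ä, ö for Swedish; list every character your transcripts use,
# or unseen characters will break text processing during training.
_characters = "abcdefghijklmnopqrstuvwxyzåäö1234567890!'(),-.:;? "

symbols = [_pad, _eos] + list(_characters)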


ghost commented Oct 25, 2020

For over two months, nothing has been asked frequently enough to warrant updating this FAQ, and support questions have subsided, so it may no longer be needed. I am closing this issue, but its contents remain searchable.

lenover12 commented

Is it possible to get a phoneme printed output live? (even inaccurate)
