Documentation? #306
Have you installed the NVIDIA container toolkit? Does the sample NVIDIA container run and detect the GPU?
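If not, a quick smoke test is to run one of NVIDIA's CUDA images and check that nvidia-smi sees the GPU (the image tag here is just an example):

```sh
docker run --rm --runtime=nvidia --gpus all \
    nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi
```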
In your original post, I don't see "--runtime=nvidia --gpus all" in your docker command.
No difference.
Personally, I didn't see much advantage to running training in Docker, so I just ran training on my bare-metal host OS. But I was curious why you're having so much trouble. Your docker command shows that you're running a container called "cloning". My guess is that container doesn't have piper installed, so, as expected, you can't run piper_train in it. You could either install the piper modules into that container interactively or build a new container. I looked at TRAINING.md and it recommends the following: "It is highly recommended to train with the following Dockerfile:"
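For reference, the snippet in TRAINING.md is, as far as I can tell, these three lines (double-check the repo itself):

```dockerfile
FROM nvcr.io/nvidia/pytorch:22.03-py3

RUN pip3 install \
    'pytorch-lightning'

ENV NUMBA_CACHE_DIR=.numba_cache
```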
But this seems incomplete to me too, since the container built from that Dockerfile also doesn't have the tools installed. If you build a container from the above Dockerfile, run it, and do pip list from the command line, you'll see there are no piper-* modules. My guess is the 3 lines in the above Dockerfile will not build a container that's ready to do piper training. There are numerous ways to fix this. Here are a couple:

1. Run the container and install piper interactively: clone the repo and pip-install the modules inside it.
2. Extend the Dockerfile so the image itself clones piper and installs the modules, as in the sketch below.
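A rough, untested sketch of option 2, following the setup steps in TRAINING.md (the /piper path is my own choice):

```dockerfile
FROM nvcr.io/nvidia/pytorch:22.03-py3

RUN apt-get update && apt-get install -y git

# Clone piper and install the training modules (steps mirror TRAINING.md)
RUN git clone https://github.com/rhasspy/piper.git /piper
WORKDIR /piper/src/python
RUN pip3 install --upgrade pip wheel setuptools \
 && pip3 install -e . \
 && bash build_monotonic_align.sh

ENV NUMBA_CACHE_DIR=.numba_cache
```

(As it turns out later in this thread, this base image ships Python 3.8, so the pip install step will fail on piper-phonemize; the Python 3.10 fix below is needed as well.)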
Either way, once the piper environment is configured in the container, you should be able to run piper_train in your container.
Re bare metal: I tried and got errors, so I was hoping doing it inside a container would resolve that.
Well, the TRAINING.md Dockerfile example starts with:

FROM nvcr.io/nvidia/pytorch:22.03-py3

And if you run that container and run python --version, you'll see it's Python 3.8, which I know from experience can't build the piper modules. I've settled on using Python 3.10, since 3.9, 3.11, and 3.12 all didn't work for me. This error specifically:

11.19 ERROR: No matching distribution found for piper-phonemize~=1.1.0

seems to happen with Python 3.8. I've created a new Dockerfile which installs Python 3.10. It's still building, but once it does, I can verify that piper_train runs properly with GPU support.
OK, that worked after rebuilding the container with Python 3.10. Specifically, I added the following in the Dockerfile before installing piper or creating the venv:
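(The exact lines weren't preserved in this copy of the thread; a typical build-from-source block for Python 3.10, presumably close to what was used, looks like this, with 3.10.13 as an example version:)

```dockerfile
# Build CPython 3.10 from source; "make altinstall" installs python3.10
# without replacing the system python3 binary.
RUN wget https://www.python.org/ftp/python/3.10.13/Python-3.10.13.tgz \
 && tar xzf Python-3.10.13.tgz \
 && cd Python-3.10.13 \
 && ./configure --enable-optimizations \
 && make -j"$(nproc)" \
 && make altinstall
```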
Earlier in the Dockerfile I install some other apt packages so that Python compiles; those include:
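(The original list wasn't preserved; these are the usual build dependencies for compiling CPython, so treat this as an educated guess:)

```dockerfile
RUN apt-get update && apt-get install -y \
    build-essential wget \
    libssl-dev zlib1g-dev libbz2-dev libreadline-dev \
    libsqlite3-dev libncursesw5-dev libffi-dev liblzma-dev
```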
Some of those should be removed after Python is compiled, to reduce the size of the container. After the above modifications, the container sees the GPU and piper_train loads correctly. I didn't do any training since the GPU is already busy.
Any chance you can push that container to Docker Hub? I'm stuck here:
And this is what I currently have for the Dockerfile:
I don't see a git clone of the piper GitHub repo in your Dockerfile.
I didn't see that in the code you shared, nor can I figure out how to incorporate it in the Dockerfile I'm trying to put together. This is clearly above my head. I will keep my fingers crossed for a public container, Dockerfile, or improved documentation. Thanks for the help.
I used to think Docker was amazing for people like me; this experience has changed my mind:
This seems like a transient issue. They happen. Did you try running the container build again?
Tried to build it one more time, without changing anything, and it worked. Alas:
While in the container, before running "python3 -m piper_train", try running "source .venv/bin/activate".
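In other words, something like this inside the container (the /piper checkout path is an assumption based on TRAINING.md's layout):

```sh
cd /piper/src/python           # wherever piper was cloned in the image
source .venv/bin/activate      # venv created during the piper setup
python3 -m piper_train --help  # the module should resolve now
```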
Thanks. Getting closer, I think:
I don't understand the error, given that I'm running that command inside the container. I'm running this to get into the container:
In case it is relevant:
I'm guessing your dataset.jsonl has the paths from your local machine, like /home/ignacio. Have a look in it and see. Either pre-process your wav files IN the container, use the same paths in the container, or do a search-and-replace in dataset.jsonl. I think it's easier to just use the exact same paths in the container. So:

-v /home/ignacio/piper-checkpoints:/piper-checkpoints

becomes:

-v /home/ignacio/piper-checkpoints:/home/ignacio/piper-checkpoints

and the same for the other -v options too. You're almost there. This should work.
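Putting that together, the docker run command would look something like this (the image name piper-train, the -it/--rm flags, and the second volume for training data are my assumptions, not quoted from the thread):

```sh
docker run --rm -it --runtime=nvidia --gpus all \
    -v /home/ignacio/piper-checkpoints:/home/ignacio/piper-checkpoints \
    -v /home/ignacio/piper-training:/home/ignacio/piper-training \
    piper-train bash
```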
Thank you so much. In addition to the Dockerfile above, these are the steps I followed (in case someone else is having the same troubles I did):
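(The exact commands didn't survive in this copy of the thread; reconstructing from the discussion, the sequence was roughly the following, with image name, paths, and piper_train arguments as placeholders:)

```sh
# 1. Build the image from the Python 3.10 Dockerfile
docker build -t piper-train .

# 2. Run it with host paths mirrored to the same paths inside the container
docker run --rm -it --runtime=nvidia --gpus all \
    -v /home/ignacio/piper-checkpoints:/home/ignacio/piper-checkpoints \
    piper-train bash

# 3. Inside the container: activate the venv, then preprocess and train
cd /piper/src/python
source .venv/bin/activate
python3 -m piper_train.preprocess ...  # arguments elided; see TRAINING.md
python3 -m piper_train ...             # arguments elided; see TRAINING.md
```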
Glad you were able to get it sorted. Hopefully anyone trying to train with a Docker container will find this thread. Not sure if you can change the title of the thread, but that would help them find it.
I think I got to the point of exporting the model:
Where do I find model.ckpt and model.onnx? I tried running this:
You probably need something like:
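Going by TRAINING.md, that's the export_onnx entry point. The .ckpt files land under lightning_logs/version_N/checkpoints/ in your training directory, since that's where PyTorch Lightning writes them. Paths below are placeholders:

```sh
python3 -m piper_train.export_onnx \
    /path/to/model.ckpt \
    /path/to/model.onnx

cp /path/to/training_dir/config.json /path/to/model.onnx.json
```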
Don't forget to copy the json file too.
Can't help there. Don't know anything at all about Home Assistant, never used it. Hopefully someone who has will chime in. Maybe a post to the Home Assistant forums will help you get the custom voice added. My guess is the voices are listed in a config file somewhere, or some scan operation needs to be performed.
Ignoring Home Assistant, to get it to work inside piper do you only need to copy the files next to the other ones? That is: when I look at the log of my piper container, it seems like piper is not aware that the file is there:
But if I get inside the piper container the files are clearly there:
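As far as I understand, piper loads a voice from the .onnx file and expects its config to sit next to it with the same base name plus .onnx.json, something like this (hypothetical voice name and directory):

```
/data/en_US-mycustom-medium.onnx
/data/en_US-mycustom-medium.onnx.json
```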
I'm trying to follow https://github.com/rhasspy/piper/blob/master/TRAINING.md and I got to the step of getting into the container you recommend. However, it is not clear from the documentation if I have to install additional things, or if I have to go to some given directory. This is what I did:
I'm probably missing something obvious, but it would be great if that obvious step was documented. Thanks!