wizards & preprocess: issues when using a multi-folder deep path to store the files (WAV file location, extra ../logs_and_checkpoints, and return code) #127

Closed
marctessier opened this issue Oct 13, 2023 · 2 comments

@marctessier
Collaborator

There are problems when you use the wizard and tell it to save the files a few folders deep instead of in the default location (./).

Example: saving under ./test/newfolder/ with a project named FOLDER:

pwd
/home/tes001/DT/tes001/AM
(EveryVoice) $ everyvoice new-dataset
What would you like to call this project? FOLDER
Great! Launching New Dataset Wizard 🧙 for project named 'FOLDER'.
? Where should the New Dataset Wizard save your files? ./test/newfolder/
New Dataset Wizard 🧙 will put your files here: 'test/newfolder/FOLDER'
? Where are your audio files? ./wavs
? Where is your data filelist? ./ALL.tsv

Then I run the preprocess:

(EveryVoice) $ cd test/newfolder/FOLDER
(EveryVoice) $ psub -N Tfolder -cpus 16 -mem 64G everyvoice preprocess --cpus 16  config/everyvoice-text-to-spec.yaml
Submitted batch job 6328
6328

It did not use the right "wav" folder. It should have used /home/tes001/DT/tes001/AM/wav, but the log shows it looking under test/newfolder instead:
(EveryVoice) $ cat Tfolder.e6328

============ Starting job 6328 on Fri Oct 13 15:12:06 EDT 2023 on node ib18be-008.collab.science.gc.ca OS "CentOS Linux 7 (Core)"
2023-10-13 15:12:10.591 | INFO     | everyvoice.config.shared_types:convert_path:126 - Directory at ../logs_and_checkpoints does not exist. Creating...
2023-10-13 15:12:10.861 | ERROR    | everyvoice.preprocessor:dataset_sanity_checks:519 - Data directory '/gpfs/fs3c/nrc/dt/tes001/AM/test/newfolder/wavs' does not exist. Please check your config file.
============ Finished job 6328 on Fri Oct 13 15:12:11 EDT 2023 with rc=0
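
The path mismatch in the log can be illustrated with a small sketch. It assumes, without having checked EveryVoice's code, that the config stores the audio location relative to the project directory; the stored value "../wavs" is a guess that happens to reproduce the same test/newfolder/wavs suffix as the error above. The directory names are taken from the session; the variable names are only for illustration.

```python
import posixpath

# Paths taken from the session above; posixpath keeps the result identical on any OS.
start_dir = "/home/tes001/DT/tes001/AM"
project_dir = posixpath.join(start_dir, "test/newfolder/FOLDER")
audio_dir = posixpath.join(start_dir, "wavs")  # the wizard answer "./wavs"

# Relative path that actually leads from the project directory to the audio.
correct_rel = posixpath.relpath(audio_dir, start=project_dir)

# Assumption: a stored value of "../wavs", resolved from the project directory.
wrong_abs = posixpath.normpath(posixpath.join(project_dir, "../wavs"))

print(correct_rel)  # ../../../wavs
print(wrong_abs)    # /home/tes001/DT/tes001/AM/test/newfolder/wavs
```

With a project three folders below the starting directory, the relative path needs three ".." components, which is why a value that works for the default ./ location would break here.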

I also noticed the rc=0. It should have been rc=1 or some other non-zero code, since this was not a "real" success.
Below is the extra, empty ../logs_and_checkpoints directory that was created and mentioned in the logs.

 pwd
/home/tes001/DT/tes001/AM/test/newfolder/FOLDER
(EveryVoice) $ cd ../
(EveryVoice) $ ll
total 1.0K
drwxrwxr-x 5 tes001 nrc_ict 4.0K Oct 13 15:12 FOLDER
drwxrwxr-x 2 tes001 nrc_ict 4.0K Oct 13 15:12 logs_and_checkpoints
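
On the return code and the stray directory, here is a minimal sketch of the behaviour being asked for: validate the inputs first, return a non-zero code on failure, and only then create output directories. This is not EveryVoice's code; `preprocess` and its parameters are hypothetical stand-ins.

```python
import sys
import tempfile
from pathlib import Path


def preprocess(data_dir: Path, output_dir: Path) -> int:
    """Hypothetical entry point that returns a process exit code."""
    # Check inputs before any side effects, so a failed run leaves nothing behind.
    if not data_dir.is_dir():
        print(f"Data directory '{data_dir}' does not exist.", file=sys.stderr)
        return 1  # non-zero tells the shell or job scheduler the run failed
    # Create the output directory only once the inputs are known to be valid.
    output_dir.mkdir(parents=True, exist_ok=True)
    return 0


# Usage: a missing data directory gives a non-zero code and creates nothing.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    missing_rc = preprocess(root / "wavs", root / "logs_and_checkpoints")
    created = (root / "logs_and_checkpoints").exists()
```

In a command-line tool the returned value would typically be passed to `sys.exit(...)`, so that a batch job reports the failure in its rc.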
@roedoejet
Member

@marctessier and @SamuelLarkin - has this been fixed in #128?

@marctessier
Collaborator Author

Nice, I just re-tested this using the latest from the main branch.

It does look resolved now :-). It created the folder structure where I was expecting it (nested several folders deep). Preprocess, train and synth all worked and produced a wav file from the data that I quickly trained on.

Closing this now.
