Heavily CPU Dependent #19

Open
Teravus opened this issue Mar 5, 2020 · 5 comments

Comments

@Teravus

Teravus commented Mar 5, 2020

I'm not sure whether this is a bug or an implementation detail, but training with the train.py script seems to be very heavily CPU dependent. I checked that some parts are processed on the GPU, but each step spends about 16 seconds on the CPU followed by only a tiny burst of CUDA activity, so training is essentially CPU bound for me. I've been training a model for 14 days on an NVIDIA GeForce RTX 2080 Ti and it has just reached epoch 76,000.

@Teravus
Author

Teravus commented Mar 5, 2020

Side note: the CPU time appears to be spent loading the .wav files.
Would it be better to keep an array/map of the loaded wav data and simply clone it in memory, instead of reloading it from disk? (This is loading from an SSD.)

@Teravus
Author

Teravus commented Mar 7, 2020

Note: It runs at about twice the speed per batch with Librosa caching enabled.
Environment variables: LIBROSA_CACHE_DIR = dir, LIBROSA_CACHE_LEVEL = 50
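A minimal sketch of how the variables can be set from Python: librosa reads them when it is first imported, so they have to be in place before the import (e.g. at the very top of train.py). The cache path below is only an example.

# Enable librosa's disk cache; these variables are read when librosa
# is first imported, so set them before the import.
import os

os.environ["LIBROSA_CACHE_DIR"] = "/tmp/librosa_cache"  # example path
os.environ["LIBROSA_CACHE_LEVEL"] = "50"

import librosa  # imported only after the cache variables are set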

@zakajd

zakajd commented Mar 31, 2020

@Teravus you could consider rewriting the dataloader with the NVIDIA DALI framework, which can decode wav files and convert them to spectrograms on the GPU instead of the CPU.
See the NVIDIA DALI documentation for details.
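A rough sketch of what such a pipeline could look like with DALI's fn API. The operator names and arguments here are assumptions to be checked against the current DALI documentation; "wavs/" is a placeholder directory and the spectrogram/mel parameters are only illustrative, not taken from this repo.

# Sketch of a DALI pipeline that reads wav files and computes mel
# spectrograms; audio decoding runs on the CPU, the spectrogram on the GPU.
import nvidia.dali.fn as fn
import nvidia.dali.types as types
from nvidia.dali import pipeline_def


@pipeline_def(batch_size=16, num_threads=4, device_id=0)
def audio_pipeline(file_root):
    # Read raw file contents (file_root is a placeholder path)
    encoded, _ = fn.readers.file(file_root=file_root, random_shuffle=True)
    # Decode the wav data to a float waveform
    audio, sr = fn.decoders.audio(encoded, dtype=types.FLOAT, downmix=True)
    # Move the waveform to the GPU and compute the spectrogram there
    spec = fn.spectrogram(audio.gpu(), nfft=1024,
                          window_length=1024, window_step=256)
    mel = fn.mel_filter_bank(spec, sample_rate=22050, nfilter=80)
    return fn.to_decibels(mel, multiplier=10.0)


pipe = audio_pipeline(file_root="wavs/")  # placeholder directory
pipe.build()
mel_batch, = pipe.run()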

@ark626

ark626 commented May 31, 2020

@Teravus THANKS for that hint, the librosa cache is a HUGE improvement for me. Since I'm training on a Jetson AGX Xavier, where the CPU is a lot weaker, it is more than twice as fast.

The GPU is now busy almost all the time, because it can quickly reload data from the cache instead of recalculating everything on every step.

ms/batch is now around 1000-1200, plus about 15 seconds to generate the samples.
Overall it is roughly 4.5 times faster, and the cache uses around 2.5 GB.

@fdb

fdb commented Mar 20, 2023

I fixed this by loading all audio into memory. Depending on the size of your input dataset, this might be feasible.

In dataset.py, add the following code at the end of the AudioDataset __init__ method:

# Load all audio files into memory
self.audio_data = []
for audio_file in self.audio_files:
    audio, _ = self.load_wav_to_torch(audio_file)
    self.audio_data.append(audio)

Then in __getitem__, replace:

# Read audio
filename = self.audio_files[index]
audio, sampling_rate = self.load_wav_to_torch(filename)

With this:

# Get audio from memory
audio = self.audio_data[index]
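Whether this is feasible mostly depends on how much audio you have; a quick way to estimate the cache footprint is shown below (cache_size_mb is a hypothetical helper, not part of the repo).

# Hypothetical helper: total size of the cached tensors, in megabytes.
def cache_size_mb(tensors):
    return sum(t.numel() * t.element_size() for t in tensors) / 1e6

# e.g. print(cache_size_mb(dataset.audio_data)) after building the dataset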

fdb added a commit to fdb/melgan-neurips that referenced this issue Mar 20, 2023