Heavily CPU Dependent #19

Open
Teravus opened this issue Mar 5, 2020 · 5 comments

Comments

@Teravus

Teravus commented Mar 5, 2020

I'm not sure whether this is a bug or an implementation detail, but training with the train.py script seems to be very heavily CPU dependent. I checked that some parts are processed on the GPU, but each step spends about 16 seconds on the CPU followed by only a tiny burst of CUDA activity, so training is essentially CPU bound for me. I've been training a model for 14 days on an NVIDIA GeForce RTX 2080 Ti and it has just reached epoch 76,000.

@Teravus
Author

Teravus commented Mar 5, 2020

Side note: the CPU time appears to be spent loading the .wav files.
Would it be better to keep an array/map of the loaded wav data and simply clone it in memory, instead of reloading it from disk? (This is loading from an SSD.)

@Teravus
Author

Teravus commented Mar 7, 2020

Note: It runs at about twice the speed per batch with Librosa caching enabled.
Environment variables: LIBROSA_CACHE_DIR = dir, LIBROSA_CACHE_LEVEL = 50
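A minimal sketch of how the variables can be set from Python: librosa reads them when it is first imported, so they have to be in place before the import (e.g. at the very top of train.py). The cache path below is only an example.

# Enable librosa's disk cache; these variables are read when librosa
# is first imported, so set them before the import.
import os

os.environ["LIBROSA_CACHE_DIR"] = "/tmp/librosa_cache"  # example path
os.environ["LIBROSA_CACHE_LEVEL"] = "50"

import librosa  # imported only after the cache variables are set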

@zakajd

zakajd commented Mar 31, 2020

@Teravus you could consider rewriting the dataloader with the NVIDIA DALI framework, which can decode wav files and convert them to spectrograms on the GPU instead of the CPU.
See the NVIDIA DALI documentation for details.
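A rough sketch of what such a pipeline could look like with DALI's fn API. The operator names and arguments here are assumptions to be checked against the current DALI documentation; "wavs/" is a placeholder directory and the spectrogram/mel parameters are only illustrative, not taken from this repo.

# Sketch of a DALI pipeline that reads wav files and computes mel
# spectrograms; audio decoding runs on the CPU, the spectrogram on the GPU.
import nvidia.dali.fn as fn
import nvidia.dali.types as types
from nvidia.dali import pipeline_def


@pipeline_def(batch_size=16, num_threads=4, device_id=0)
def audio_pipeline(file_root):
    # Read raw file contents (file_root is a placeholder path)
    encoded, _ = fn.readers.file(file_root=file_root, random_shuffle=True)
    # Decode the wav data to a float waveform
    audio, sr = fn.decoders.audio(encoded, dtype=types.FLOAT, downmix=True)
    # Move the waveform to the GPU and compute the spectrogram there
    spec = fn.spectrogram(audio.gpu(), nfft=1024,
                          window_length=1024, window_step=256)
    mel = fn.mel_filter_bank(spec, sample_rate=22050, nfilter=80)
    return fn.to_decibels(mel, multiplier=10.0)


pipe = audio_pipeline(file_root="wavs/")  # placeholder directory
pipe.build()
mel_batch, = pipe.run()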

@ark626

ark626 commented May 31, 2020

@Teravus THANKS for that hint, the librosa cache is a HUGE improvement for me. Since I'm training on a Jetson AGX Xavier, where the CPU is a lot weaker, it is more than twice as fast.

The GPU is now busy almost all the time, because it can quickly reload data from the cache instead of recalculating everything on every step.

ms/batch is now around 1000-1200, plus about 15 seconds to generate the samples.
Overall it is roughly 4.5 times faster, and the cache uses around 2.5 GB.

@fdb

fdb commented Mar 20, 2023

I fixed this by loading all audio into memory. Depending on the size of your input dataset, this might be feasible.

In dataset.py, add the following code at the end of the AudioDataset __init__ method:

# Load all audio files into memory
self.audio_data = []
for audio_file in self.audio_files:
    audio, _ = self.load_wav_to_torch(audio_file)
    self.audio_data.append(audio)

Then in __getitem__, replace:

# Read audio
filename = self.audio_files[index]
audio, sampling_rate = self.load_wav_to_torch(filename)

With this:

# Get audio from memory
audio = self.audio_data[index]
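Whether this is feasible mostly depends on how much audio you have; a quick way to estimate the cache footprint is shown below (cache_size_mb is a hypothetical helper, not part of the repo).

# Hypothetical helper: total size of the cached tensors, in megabytes.
def cache_size_mb(tensors):
    return sum(t.numel() * t.element_size() for t in tensors) / 1e6

# e.g. print(cache_size_mb(dataset.audio_data)) after building the dataset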

fdb added a commit to fdb/melgan-neurips that referenced this issue Mar 20, 2023