
Batch extraction for kaldi features #947

Merged: 15 commits merged into lhotse-speech:master on Jan 19, 2023

Conversation

desh2608 (Collaborator)

This PR adds an extract_batch() method for the Kaldi-compatible features implemented in Lhotse.

NOTE: We rename Torchaudio's Spectrogram feature to TorchaudioSpectrogram to be consistent with TorchaudioMfcc and TorchaudioFbank.
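
For context, a minimal usage sketch of the new method (the import path and the exact extract_batch() signature are assumptions based on this discussion, not verified against the merged code):

import torch
from lhotse.features.kaldi.extractors import Fbank  # Kaldi-compatible, nn.Module-based extractor (assumed path)

extractor = Fbank()
# Two mono utterances of different lengths; extract_batch() is assumed to take a
# list of waveforms plus the sampling rate and return one feature matrix per input.
waves = [torch.randn(16000), torch.randn(24000)]
feats = extractor.extract_batch(waves, sampling_rate=16000)
for f in feats:
    print(f.shape)  # (num_frames, num_mel_bins)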

pzelasko (Collaborator) previously approved these changes Jan 18, 2023

@pzelasko left a comment:

Thanks! LGTM once the tests pass. You might need to try tweaking some settings; maybe the configurations for both extractors in the test are not identical.

assert feats.shape == (1604, 257)


def test_kaldi_spectrogram_extractor_vs_torchaudio(recording):
desh2608 (Collaborator, Author)

This test is currently failing with the following message:

E       AssertionError: Tensor-likes are not close!
E       
E       Mismatched elements: 3874 / 412228 (0.9%)
E       Greatest absolute difference: 22.66506841033697 at index (826, 0) (up to 1e-05 allowed)
E       Greatest relative difference: 17694.401589361405 at index (1029, 0) (up to 0.0001 allowed)

desh2608 (Collaborator, Author)

On changing the assertion to torch.testing.assert_allclose(feats, np.exp(feats_ta)), I get:

E       AssertionError: Tensor-likes are not close!
E       
E       Mismatched elements: 1604 / 412228 (0.4%)
E       Greatest absolute difference: 8.267975032700633 at index (325, 0) (up to 1e-05 allowed)
E       Greatest relative difference: 0.999999999856555 at index (826, 0) (up to 0.0001 allowed)

1604 is the number of frames, and it seems the two outputs only differ in the 0th feature column.

desh2608 (Collaborator, Author)

Ahh, the 0th column is the log energy for both, so we should not exponentiate that column when comparing.
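
For reference, a minimal sketch of the comparison fix described here (the helper name is hypothetical; the tolerances mirror the defaults shown in the error messages above):

import torch

def assert_spectrograms_close(feats: torch.Tensor, feats_ta: torch.Tensor) -> None:
    # Hypothetical helper: `feats` is the linear-domain Kaldi spectrogram and
    # `feats_ta` the log-domain torchaudio spectrogram; both carry log energy in
    # column 0, so only the remaining columns are exponentiated before comparing.
    feats_ta_lin = feats_ta.clone()
    feats_ta_lin[:, 1:] = feats_ta_lin[:, 1:].exp()
    torch.testing.assert_allclose(feats, feats_ta_lin)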

@desh2608 (Collaborator, Author)

> Thanks! LGTM once the tests pass. You might need to try tweaking some settings; maybe the configurations for both extractors in the test are not identical.

@pzelasko Made a small fix to the test and now it's passing. Ready to merge.

@desh2608 desh2608 changed the title Batch extraction for kaldi features [WIP] Batch extraction for kaldi features Jan 18, 2023
@desh2608 desh2608 changed the title [WIP] Batch extraction for kaldi features Batch extraction for kaldi features Jan 18, 2023
pzelasko (Collaborator) previously approved these changes Jan 18, 2023

@pzelasko left a comment:

LGTM - please do the honors :)

@desh2608 (Collaborator, Author)

@pzelasko I noticed some room for optimization in the batched feature extraction. Before this PR, we had 2 extractors that support batch extraction: kaldifeat and S3PRL. Both of them take a list of tensors and perform padding/collation internally, so we do not need to collate beforehand.

In this PR, we add a batch extraction method for the nn.Module-based Kaldi-compatible extractors (originally implemented by Jesus). These extractors take a batch of padded/collated samples as input, so an easy solution is to pad and batch the list of tensors before passing them to the extractor.

However, recall that the extract_batch() method is most often used from CutSet.compute_and_store_features_batch(), which uses PyTorch dataloaders to create batches in the background. Therefore, we add a collate option to this method (False by default), which specifies whether to collate the batch of samples. It can be set to True for the extractors that work with batched tensors, so that no additional GPU time is spent on collation (see the sketch below).

Another small optimization is to off-load the saving of the extracted features to a background thread.

I will write some tests for the newly added functionality if these changes seem reasonable to you.
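
To make the collate option concrete, here is a minimal sketch (not the actual Lhotse implementation; the padding logic and the extractor interface are simplified assumptions) of a batch extraction routine that either collates the waveforms itself or defers to an extractor that collates internally:

from typing import List, Sequence

import torch

def extract_batch(extractor, samples: Sequence[torch.Tensor], sampling_rate: int, collate: bool = False) -> List[torch.Tensor]:
    # Sketch only: `extractor.extract` is assumed to accept either a padded
    # (batch, num_samples) tensor or a single waveform, as in the discussion above.
    if collate:
        lengths = [s.shape[-1] for s in samples]
        batch = torch.zeros(len(samples), max(lengths))
        for i, s in enumerate(samples):
            batch[i, : lengths[i]] = s
        feats = extractor.extract(batch, sampling_rate)  # (batch, num_frames, num_features)
        # A real implementation would also trim the frames introduced by padding.
        return list(feats)
    # Extractors such as kaldifeat/S3PRL pad and collate internally.
    return [extractor.extract(s, sampling_rate) for s in samples]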


futures.append(executor.submit(_save_worker, cuts, features))
progress.update(len(cuts))

desh2608 (Collaborator, Author)

Something that I find weird is that this works even though I have not called future.result() anywhere?

pzelasko (Collaborator)

When the executor is used as a context manager, it blocks on __exit__ until all submitted tasks have finished (it calls shutdown(wait=True), which joins the worker threads).

desh2608 (Collaborator, Author)

Makes sense, thanks!
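
For reference, a minimal sketch of the pattern discussed above (the function and variable names are hypothetical, not the actual PR code), showing why no explicit future.result() call is needed:

from concurrent.futures import ThreadPoolExecutor

def _save_worker(cuts, features):
    # Hypothetical worker: write the extracted `features` for `cuts` to storage.
    ...

def store_all(batches):
    with ThreadPoolExecutor(max_workers=1) as executor:
        for cuts, features in batches:
            executor.submit(_save_worker, cuts, features)
    # Leaving the `with` block calls executor.shutdown(wait=True), which joins the
    # worker threads, so all submitted saves are complete here. Note that an
    # exception raised inside _save_worker is swallowed unless .result() is checked.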

@pzelasko (Collaborator)

Sounds good to me! Once you finish, I think I will use your changes to build an example of how to use Lhotse Shar (so the data will be stored sequentially in shards) for re-storing recordings and then computing the features. I have been postponing writing the tutorials for quite a while now, but that data format works fantastically in real-world settings.

@desh2608 (Collaborator, Author)

> Sounds good to me! Once you finish, I think I will use your changes to build an example of how to use Lhotse Shar (so the data will be stored sequentially in shards) for re-storing recordings and then computing the features. I have been postponing writing the tutorials for quite a while now, but that data format works fantastically in real-world settings.

Awesome! I'm mostly done with this PR now, but I will perhaps sit on it until tomorrow and use the new batch extractors on some of my tasks to make sure everything works seamlessly. I will ping you when I think it is ready to review/merge.

@desh2608 (Collaborator, Author)

@pzelasko I think this can be merged now.

@pzelasko (Collaborator)

Thanks!!

@pzelasko pzelasko merged commit 302013d into lhotse-speech:master Jan 19, 2023
@pzelasko pzelasko added this to the v1.13 milestone Jan 20, 2023
pzelasko added a commit that referenced this pull request Feb 1, 2023
The outputs can be numpy arrays due to the logic in
`FeatureExtractor.extract_batch` added in #947