Speech Commands v0.01 & v0.02 dataset #996

yfyeung · 2023-03-15T07:16:07Z

Speech Commands is an audio dataset of spoken words designed to help train and evaluate keyword spotting systems.
paper: https://arxiv.org/pdf/1804.03209.pdf

yfyeung · 2023-03-15T10:46:09Z

Hi, I am building a wake word recipe in icefall for this dataset.
I would appreciate your review of the data preparation part. @pzelasko

desh2608 · 2023-03-15T13:47:28Z

Perhaps the recipe should be called "speech_commands" instead of "speech_commands001", and the version should be provided as a parameter for download/prepare.

yfyeung · 2023-03-15T14:12:52Z

> Perhaps the recipe should be called "speech_commands" instead of "speech_commands001", and the version should be provided as a parameter for download/prepare.

Ok, I will implement this.

csukuangfj · 2023-03-15T14:16:12Z

> Perhaps the recipe should be called "speech_commands" instead of "speech_commands001", and the version should be provided as a parameter for download/prepare.

In that case, I suggest that we support both v1 and v2.

yfyeung · 2023-03-15T14:19:16Z

> > Perhaps the recipe should be called "speech_commands" instead of "speech_commands001", and the version should be provided as a parameter for download/prepare.
>
> In that case, I suggest that we support both v1 and v2.

Ok, I have implemented the recipe "speech_commands002" locally. I will merge v1 and v2 into one recipe.

desh2608 · 2023-03-15T14:59:44Z

Thanks! You can look at the VoxCeleb recipe for an example where we support v1 and v2.

yfyeung · 2023-03-16T07:48:34Z

@desh2608 Hi, I have finished this recipe; it supports both the Speech Commands v0.01 and v0.02 datasets.
I would appreciate your review of the data preparation part.

pzelasko

Thanks, I left two comments. Can you address them before we merge?

pzelasko · 2023-03-16T12:48:51Z

lhotse/recipes/speechcommands.py

+    :return: the path to downloaded and extracted directory with data.
+    """
+
+    return _download_speechcommands(


Do we really need v1 and v2 to have separate download functions even though they share the same logic? I think it's cleaner to rename _download_speechcommands to download_speechcommands, set the default version to the latest one, and expose the version argument in the CLI (so that we have one CLI program for this rather than two).
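A minimal sketch of the shape this suggestion describes: one public download function whose version argument defaults to the latest release, and one CLI entry point that exposes that argument. This is not the code from the PR. The argument names, the directory naming scheme, and the use of argparse are assumptions made for illustration, and the actual download step is left out.

```python
import argparse
from pathlib import Path
from typing import Optional, Sequence

# Versions named in this PR; the last entry is treated as the latest.
SUPPORTED_VERSIONS = ("0.01", "0.02")
LATEST_VERSION = SUPPORTED_VERSIONS[-1]


def download_speechcommands(
    target_dir: Path,
    speechcommands_version: str = LATEST_VERSION,
) -> Path:
    """Single download entry point shared by both dataset versions.

    Hypothetical signature: the parameter names and the directory layout
    below are illustrative assumptions, not the merged implementation.
    """
    if speechcommands_version not in SUPPORTED_VERSIONS:
        raise ValueError(
            f"Unsupported version {speechcommands_version!r}; "
            f"expected one of {SUPPORTED_VERSIONS}"
        )
    corpus_dir = Path(target_dir) / f"speech_commands_v{speechcommands_version}"
    corpus_dir.mkdir(parents=True, exist_ok=True)
    # The real download and extraction would happen here; omitted in this sketch.
    return corpus_dir


def build_parser() -> argparse.ArgumentParser:
    """One CLI program with a version option, instead of one program per version."""
    parser = argparse.ArgumentParser(description="Download Speech Commands (sketch).")
    parser.add_argument("target_dir", type=Path)
    parser.add_argument(
        "--version",
        dest="speechcommands_version",
        choices=SUPPORTED_VERSIONS,
        default=LATEST_VERSION,
    )
    return parser


def main(argv: Optional[Sequence[str]] = None) -> Path:
    args = build_parser().parse_args(argv)
    return download_speechcommands(args.target_dir, args.speechcommands_version)
```

With this layout, calling main(["data", "--version", "0.01"]) selects v0.01, while omitting the option falls back to the latest version.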

pzelasko · 2023-03-16T12:53:12Z

lhotse/recipes/speechcommands.py

+    )
+
+
+def _prepare_train_valid(


Could you split this function into separate prepare_train and prepare_valid? I don't think you really need a generator here if you restructure the code, and the way to use this as-is is a little confusing.

OK, I will split it.
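For illustration, a rough sketch of what the split requested above could look like: two plain functions that each return a list, in place of a single generator that yields both subsets. It is not the merged code. It assumes the corpus layout of one sub-directory per word plus validation_list.txt and testing_list.txt at the top level, it returns relative paths rather than Lhotse manifests, and every helper name other than prepare_train and prepare_valid is hypothetical.

```python
from pathlib import Path
from typing import List, Set


def _read_split_list(corpus_dir: Path, list_name: str) -> Set[str]:
    """Read a split file with one relative wav path per line (hypothetical helper)."""
    lines = (corpus_dir / list_name).read_text().splitlines()
    return {line.strip() for line in lines if line.strip()}


def _all_word_wavs(corpus_dir: Path) -> List[str]:
    """List wav files under the per-word directories as sorted relative paths.

    Directories whose name starts with "_" (assumed to hold background noise)
    are skipped.
    """
    return sorted(
        p.relative_to(corpus_dir).as_posix()
        for p in corpus_dir.glob("*/*.wav")
        if not p.parent.name.startswith("_")
    )


def prepare_valid(corpus_dir: Path) -> List[str]:
    """Return the validation files listed in validation_list.txt."""
    return sorted(_read_split_list(corpus_dir, "validation_list.txt"))


def prepare_train(corpus_dir: Path) -> List[str]:
    """Return every word recording not held out for validation or testing."""
    held_out = _read_split_list(corpus_dir, "validation_list.txt") | _read_split_list(
        corpus_dir, "testing_list.txt"
    )
    return [p for p in _all_word_wavs(corpus_dir) if p not in held_out]
```

Each function can then be called on its own, so a caller that only needs the validation subset does not have to step through a generator to reach it.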

yfyeung · 2023-03-16T16:20:49Z

@pzelasko I have addressed both comments; please take another look.

pzelasko · 2023-03-16T21:23:00Z

LGTM!

yfyeung force-pushed the speechcommand branch 2 times, most recently from 2d94ad4 to aff5912 March 15, 2023 08:02

yfyeung changed the title ~~Speech Commands v0.01 dataset~~ Speech Commands v0.01 & v0.02 dataset Mar 15, 2023

yfyeung changed the title ~~Speech Commands v0.01 & v0.02 dataset~~ [WIP] Speech Commands v0.01 & v0.02 dataset Mar 15, 2023

Add Speech Commands v0.01 & v0.02, merging into one recipe

4068076

yfyeung force-pushed the speechcommand branch from 0b524b9 to 4068076 March 16, 2023 07:20

Yifan Yang added 4 commits March 16, 2023 15:22

Fix for isort

58283c5

Fix for black

5758a06

Fix flake8

b72e83c

Fix flake8

4dce856

yfyeung changed the title ~~[WIP] Speech Commands v0.01 & v0.02 dataset~~ Speech Commands v0.01 & v0.02 dataset Mar 16, 2023

Remove debug information

0e8a557

pzelasko reviewed Mar 16, 2023


Yifan Yang and others added 4 commits March 16, 2023 23:36

Restructure code

16298a8

Fix for flake8

2d57e04

Fix for black

01d1087

Merge branch 'master' into speechcommand

3fc791e

pzelasko merged commit 7e8d6b0 into lhotse-speech:master Mar 16, 2023

yfyeung deleted the speechcommand branch March 17, 2023 01:14

pzelasko added this to the v1.13 milestone Mar 21, 2023

yfyeung mentioned this pull request Mar 22, 2023

mutli train k2-fsa/icefall#955

Closed
