Speech Commands v0.01 & v0.02 dataset #996
Hi, I am building a wake word recipe in icefall for this dataset.
Perhaps the recipe should be called "speech_commands" instead of "speech_commands001", and the version should be provided as a parameter for download/prepare.
Ok, I will implement this.
In that case, I suggest that we support both v1 and v2.
Ok, I have implemented the recipe "speech_commands002" locally. I will merge v1 and v2 into one recipe.
Thanks! You can look at the VoxCeleb recipe for an example where we support v1 and v2.
@desh2608 Hi, I have finished this recipe; it supports both the Speech Commands v0.01 and v0.02 datasets.
Thanks, I left two comments. Can you address them before we merge?
lhotse/recipes/speechcommands.py
Outdated
    :return: the path to downloaded and extracted directory with data.
    """

    return _download_speechcommands(
Do we really need v1 and v2 to have separate download functions even though they share the same logic? I think it's cleaner to rename _download_speechcommands into download_speechcommands and set the default version to the latest one, and then expose the version argument in the CLI (so that we have one CLI program for this rather than two).
OK.
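The suggestion above could look roughly like the sketch below: one public download function with a version argument that defaults to the latest release, plus a single CLI parser exposing that argument. This is only an illustration, not the code merged in this PR. BASE_URL is a placeholder, and the fetch parameter, SUPPORTED_VERSIONS, the archive naming and build_parser are assumptions made for the example; extraction of the archive is left out.

```python
import argparse
from pathlib import Path
from typing import Callable, Optional
from urllib.request import urlretrieve

# Assumption: the supported version tags, with the newest one listed last.
SUPPORTED_VERSIONS = ("v0.01", "v0.02")
# Assumption: placeholder base URL, not the real download location.
BASE_URL = "https://example.com/speech_commands"


def download_speechcommands(
    target_dir,
    version: str = SUPPORTED_VERSIONS[-1],
    fetch: Optional[Callable[[str, str], object]] = None,
) -> Path:
    """Download the archive for one dataset version and return its path.

    `fetch` is a hypothetical hook (url, destination) that lets callers
    swap out the downloader, e.g. in tests; it defaults to urlretrieve.
    """
    if version not in SUPPORTED_VERSIONS:
        raise ValueError(
            f"Unknown version {version!r}; expected one of {SUPPORTED_VERSIONS}"
        )
    fetch = urlretrieve if fetch is None else fetch
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    archive_name = f"speech_commands_{version}.tar.gz"
    archive_path = target_dir / archive_name
    # Skip the download when the archive is already present.
    if not archive_path.exists():
        fetch(f"{BASE_URL}/{archive_name}", str(archive_path))
    return archive_path


def build_parser() -> argparse.ArgumentParser:
    """One CLI for all versions instead of one program per version."""
    parser = argparse.ArgumentParser(description="Download Speech Commands")
    parser.add_argument("target_dir")
    parser.add_argument(
        "--version", choices=SUPPORTED_VERSIONS, default=SUPPORTED_VERSIONS[-1]
    )
    return parser
```

With this shape, callers who do not care about the version get the latest one, and the v0.01 case is a single extra argument rather than a second function and a second command.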
lhotse/recipes/speechcommands.py
Outdated
    )


def _prepare_train_valid(
Could you split this function into separate prepare_train and prepare_valid functions? I don't think you really need a generator here if you restructure the code, and the way to use this as-is is a little confusing.
OK, I will split it.
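As a rough illustration of the split requested above, the sketch below replaces a single generator that yields the train and validation subsets in turn with two plain functions. It works on lists of relative file paths only; the function signatures and the idea of passing the held-out lists as arguments are assumptions made for the example, and the real recipe would build lhotse manifests rather than return path lists.

```python
from typing import Iterable, List


def prepare_valid(all_files: Iterable[str], valid_list: Iterable[str]) -> List[str]:
    """Return the files that appear in the validation list."""
    valid = set(valid_list)
    return sorted(f for f in all_files if f in valid)


def prepare_train(
    all_files: Iterable[str],
    valid_list: Iterable[str],
    test_list: Iterable[str],
) -> List[str]:
    """Return the files that are in neither the validation nor the test list."""
    held_out = set(valid_list) | set(test_list)
    return sorted(f for f in all_files if f not in held_out)


# Hypothetical file names, used only to show how the two calls fit together.
files = ["yes/a.wav", "yes/b.wav", "no/c.wav", "no/d.wav"]
valid_files = prepare_valid(files, valid_list=["yes/b.wav"])
train_files = prepare_train(files, valid_list=["yes/b.wav"], test_list=["no/d.wav"])
```

Each function returns its result directly, so the caller no longer has to know the order in which a generator would have produced the two subsets.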
@pzelasko I have addressed them; please re-check.
LGTM!
Speech Commands is an audio dataset of spoken words designed to help train and evaluate keyword spotting systems.
paper: https://arxiv.org/pdf/1804.03209.pdf