Allow dash in SequentialJsonlWriter #967

Merged
merged 1 commit into from
Feb 7, 2023

Conversation

desh2608 (Collaborator) commented Feb 7, 2023

This makes piping possible for several CLI commands, such as trim-to-supervisions. An example piped workflow that combines the LibriSpeech train set, cuts it into shorter segments, and shuffles the result:

lhotse combine data/manifests/librispeech_cuts_train* - |\
  lhotse cut trim-to-alignments --type word --max-pause 0.2 - - |\
  shuf | gzip -c > data/manifests/librispeech_cuts_train_trimmed.jsonl.gz
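
For readers unfamiliar with the convention, here is a minimal Python sketch of how a JSONL writer can treat `-` as standard output. The names `open_jsonl_output` and `write_jsonl` are hypothetical, and this is not the actual `SequentialJsonlWriter` code from this PR; it only illustrates the idea under that assumption.

```python
import contextlib
import json
import sys
from typing import IO, Iterable, Iterator


@contextlib.contextmanager
def open_jsonl_output(path: str) -> Iterator[IO[str]]:
    """Yield a text stream for writing; "-" selects stdout (hypothetical helper)."""
    if path == "-":
        # Leave stdout open: the next process in the pipe is reading from it.
        yield sys.stdout
        sys.stdout.flush()
    else:
        with open(path, "w", encoding="utf-8") as f:
            yield f


def write_jsonl(items: Iterable[dict], path: str) -> None:
    """Write one JSON object per line to `path`, or to stdout when path is "-"."""
    with open_jsonl_output(path) as stream:
        for item in items:
            stream.write(json.dumps(item) + "\n")
```

Calling `write_jsonl(cuts, "-")` emits one JSON line per item on stdout, which is what lets a downstream command read the manifest from its own stdin.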

pzelasko (Collaborator) commented Feb 7, 2023

And so we have gone full cycle: bash -> python -> bash 😄

Thanks, LGTM!

pzelasko merged commit 311203a into lhotse-speech:master on Feb 7, 2023
pzelasko added this to the v1.13 milestone on Mar 21, 2023
desh2608 deleted the cli/trim branch on November 2, 2023 19:18