Option to save audio in the original format when exporting to shar #1422

Merged: 2 commits, Nov 23, 2024

@anteju (Collaborator) commented Nov 22, 2024

Added an option to use the original audio format when exporting to shar.

Example

Use --audio original when calling lhotse shar export.
For example:

lhotse shar export --num-jobs 1 --verbose --shard-size 2 --audio original --no-shuffle ${INPUT_MANIFEST} ${OUTPUT_DIR}

@anteju anteju requested a review from pzelasko November 22, 2024 03:10
@anteju anteju force-pushed the pr/shar-audio-original branch from 36e005e to abb7fcb Compare November 22, 2024 03:10
# generate 1kHz sine wave
f_sine = 1000
assert f_sine < sampling_rate / 2, "Sine wave frequency exceeds Nyquist frequency"
data = torch.sin(2 * np.pi * f_sine / sampling_rate * torch.arange(num_samples))
@anteju (Collaborator, Author) commented:

@pzelasko, I'm including a bugfix here: the sine frequency was not normalized by the sampling rate.
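
To illustrate why this normalization matters, here is a minimal sketch in plain Python. It uses `math` instead of the torch/numpy calls in the test, and the helper names `sine_wave` and `count_cycles` and the `normalize` flag are made up for this illustration. Without dividing by the sampling rate, every sample's phase is an integer multiple of 2π, so the signal comes out numerically zero rather than as a 1 kHz tone.

```python
import math


def sine_wave(f_sine, sampling_rate, num_samples, normalize=True):
    """Generate a sine wave as a list of floats.

    With normalize=True the frequency is divided by the sampling rate
    (cycles per sample), matching the fixed line in the test.
    With normalize=False the phase step is 2*pi*f_sine per sample, which is
    a whole number of cycles for integer f_sine, so the output is ~0.
    """
    assert f_sine < sampling_rate / 2, "Sine wave frequency exceeds Nyquist frequency"
    step = 2 * math.pi * f_sine
    if normalize:
        step /= sampling_rate
    return [math.sin(step * n) for n in range(num_samples)]


def count_cycles(signal):
    """Count upward zero crossings as a rough estimate of completed cycles."""
    return sum(1 for a, b in zip(signal, signal[1:]) if a < 0 <= b)


if __name__ == "__main__":
    sampling_rate = 16000
    good = sine_wave(1000, sampling_rate, sampling_rate)
    bad = sine_wave(1000, sampling_rate, sampling_rate, normalize=False)
    print("cycles in one second (normalized):", count_cycles(good))
    print("peak amplitude (not normalized):", max(abs(x) for x in bad))
```

One second of the normalized signal should contain roughly 1000 cycles, while the unnormalized variant stays close to zero amplitude.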

@@ -87,6 +87,7 @@ def test_tar_writer_pipe(tmp_path: Path):
),
],
)
# TODO: check if this should be removed?
def test_audio_tar_writer(tmp_path: Path, format: str):
@anteju (Collaborator, Author) commented:

@pzelasko, the test test_audio_tar_writer is defined twice in this file, so only the later definition actually runs.
This (top) version seems to be the older one, but please check whether one of them should be removed (or renamed if both should be kept).

@pzelasko (Collaborator) commented:

well spotted! yes, please remove it. thanks

@anteju anteju force-pushed the pr/shar-audio-original branch from 8dadd01 to a3e2bf6 Compare November 22, 2024 03:34
@pzelasko (Collaborator) left a comment:

Looks good, thanks! Let's remove that duplicated test you found and merge.

@pzelasko pzelasko added this to the v1.29.0 milestone Nov 22, 2024
@pzelasko pzelasko merged commit 36ce63e into lhotse-speech:master Nov 23, 2024
9 checks passed
yfyeung pushed a commit to yfyeung/lhotse that referenced this pull request Jan 8, 2025
…hotse-speech#1422)

* Option to save audio in the original format when exporting to shar

* Removed duplicate test