AudioBackend specific save_audio and info, managing missing SoX in torchaudio, Python 3.12 / PyTorch 2.2 support, using `libsndfile` as preferred audio backend (#1288)

* AudioBackend supports save_audio() [code cleanup]

* Fix for Path handling; skip some tests on older torchaudio

* Move info() implementation to each AudioBackend

* Update CI configurations, fix more tests

* Conditionally build kaldifeat, fix some more tests

* Fix more tests, remove dead code, bump version

* Remaining fixes, legacy OPUS reading mode env var

* Fixes for torchaudio==2.0.0

* Fixes for save_audio/save_audios

* Skip backends for audio saving that are not applicable (e.g. torchaudio backends when it's not installed)

* Prefer LibsndfileBackend as the default lhotse backend (except for some special cases) + fix CutSet.copy_data()
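
The backend-related bullets above (skipping backends that are not applicable, and preferring libsndfile as the default) can be pictured with a minimal sketch. Everything here is an assumption made for illustration: `BackendSketch`, `pick_backend`, the `installed` flag, and the format sets are hypothetical and are not Lhotse's actual classes or API.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class BackendSketch:
    """Hypothetical stand-in for an audio backend (not a Lhotse class)."""

    name: str
    installed: bool            # e.g. False when torchaudio is missing
    formats: FrozenSet[str]    # illustrative format list, not authoritative

    def is_applicable(self, fmt: str) -> bool:
        # A backend is skipped when its dependency is absent
        # or when it cannot handle the requested format.
        return self.installed and fmt in self.formats


def pick_backend(backends: List[BackendSketch], fmt: str) -> Optional[BackendSketch]:
    """Return the first applicable backend, honoring list order as priority."""
    for backend in backends:
        if backend.is_applicable(fmt):
            return backend
    return None


# The preferred backend is listed first; torchaudio is marked as not installed.
BACKENDS = [
    BackendSketch("libsndfile", True, frozenset({"wav", "flac"})),
    BackendSketch("torchaudio", False, frozenset({"wav", "flac", "mp3"})),
    BackendSketch("ffmpeg-subprocess", True, frozenset({"mp3", "opus"})),
]

chosen = pick_backend(BACKENDS, "mp3")
print(chosen.name if chosen else "no applicable backend")
```

Ordering the list by preference keeps the selection logic trivial: changing the default backend only requires reordering the list.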
pzelasko authored Feb 13, 2024
1 parent 2473491 commit 769c273
Showing 21 changed files with 642 additions and 412 deletions.
17 changes: 12 additions & 5 deletions .github/workflows/unit_tests.yml
@@ -17,13 +17,20 @@ jobs:
       matrix:
         include:
           - python-version: "3.8"
-            torch-install-cmd: "pip install torch==1.8.2+cpu torchaudio==0.8.2 -f https://download.pytorch.org/whl/lts/1.8/torch_lts.html"
+            torch-install-cmd: "pip install torch==1.12.1 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cpu"
+            extra_deps: kaldifeat
           - python-version: "3.9"
-            torch-install-cmd: "pip install torch==1.8.2+cpu torchaudio==0.8.2 -f https://download.pytorch.org/whl/lts/1.8/torch_lts.html"
+            torch-install-cmd: "pip install torch==2.0 torchaudio==2.0 --extra-index-url https://download.pytorch.org/whl/cpu"
+            extra_deps: kaldifeat
           - python-version: "3.10"
-            torch-install-cmd: "pip install torch==1.12.1 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cpu"
+            torch-install-cmd: "pip install torch==2.1 torchaudio==2.1 --extra-index-url https://download.pytorch.org/whl/cpu"
+            extra_deps: ""
           - python-version: "3.11"
-            torch-install-cmd: "pip install torch==2.0 torchaudio==2.0 --extra-index-url https://download.pytorch.org/whl/cpu"
+            torch-install-cmd: "pip install torch==2.2 torchaudio==2.2 --extra-index-url https://download.pytorch.org/whl/cpu"
+            extra_deps: ""
+          - python-version: "3.12"
+            torch-install-cmd: "pip install torch==2.2 torchaudio==2.2 --extra-index-url https://download.pytorch.org/whl/cpu"
+            extra_deps: ""
 
       fail-fast: false
 
@@ -50,7 +57,7 @@ jobs:
           # the torchaudio env var does nothing when torchaudio is installed, but doesn't require it's presence when it's not
           pip install '.[tests]'
           # Enable some optional tests
-          pip install h5py dill smart_open[http] kaldifeat kaldi_native_io webdataset==0.2.5 s3prl scipy nara_wpe pyloudnorm
+          pip install h5py dill smart_open[http] kaldi_native_io webdataset==0.2.5 s3prl scipy nara_wpe pyloudnorm ${{ matrix.extra_deps }}
       - name: Install sph2pipe
         run: |
           lhotse install-sph2pipe # Handle sphere files.
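
The workflow change above makes `kaldifeat` a per-matrix-entry extra instead of an unconditional dependency. As a rough illustration of how the `${{ matrix.extra_deps }}` substitution plays out, the sketch below rebuilds the optional-tests install command for each matrix entry. The helper name `optional_test_install_cmd` and the Python data layout are assumptions for illustration only; the real expansion is performed by GitHub Actions, not by Python code.

```python
from typing import Dict, List

# Packages installed for every matrix entry (copied from the updated workflow line).
BASE_DEPS: List[str] = [
    "h5py", "dill", "smart_open[http]", "kaldi_native_io",
    "webdataset==0.2.5", "s3prl", "scipy", "nara_wpe", "pyloudnorm",
]

# Python versions and extra_deps values as they appear in the new matrix.
MATRIX: List[Dict[str, str]] = [
    {"python-version": "3.8", "extra_deps": "kaldifeat"},
    {"python-version": "3.9", "extra_deps": "kaldifeat"},
    {"python-version": "3.10", "extra_deps": ""},
    {"python-version": "3.11", "extra_deps": ""},
    {"python-version": "3.12", "extra_deps": ""},
]


def optional_test_install_cmd(entry: Dict[str, str]) -> str:
    """Hypothetical helper: emulate the templated `pip install` line."""
    # An empty extra_deps string contributes no packages at all.
    extras = entry["extra_deps"].split()
    return " ".join(["pip", "install", *BASE_DEPS, *extras])


for entry in MATRIX:
    print(entry["python-version"], "->", optional_test_install_cmd(entry))
```

With this layout only the Python 3.8 and 3.9 jobs install `kaldifeat`; the remaining jobs get the base list unchanged.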
1 change: 1 addition & 0 deletions README.md
@@ -107,6 +107,7 @@ Lhotse uses several environment variables to customize it's behavior. They are a
 - `LHOTSE_AUDIO_BACKEND` - may be set to any of the values returned from CLI `lhotse list-audio-backends` to override the default behavior of trial-and-error and always use a specific audio backend.
 - `LHOTSE_AUDIO_LOADING_EXCEPTION_VERBOSE` - when set to `1` we'll emit full exception stack traces when every available audio backend fails to load a given file (they might be very large).
 - `LHOTSE_DILL_ENABLED` - when it's set to `1|True|true|yes`, we will enable `dill`-based serialization of `CutSet` and `Sampler` across processes (it's disabled by default even when `dill` is installed).
+- `LHOTSE_LEGACY_OPUS_LOADING` - (`=1`) reverts to a legacy OPUS loading mechanism that triggered a new ffmpeg subprocess for each OPUS file.
 - `LHOTSE_PREPARING_RELEASE` - used internally by developers when releasing a new version of Lhotse.
 - `TORCHAUDIO_USE_BACKEND_DISPATCHER` - when set to `1` and torchaudio version is below 2.1, we'll enable the experimental ffmpeg backend of torchaudio.
 - `RANK`, `WORLD_SIZE`, `WORKER`, and `NUM_WORKERS` are internally used to inform Lhotse Shar dataloading subprocesses.
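
The README entries describe two styles of boolean flag: `LHOTSE_LEGACY_OPUS_LOADING` is enabled with `=1`, while `LHOTSE_DILL_ENABLED` accepts `1|True|true|yes`. The diff does not show how Lhotse parses them, so the following is only a sketch under that caveat; `legacy_opus_loading` and `dill_enabled` are hypothetical helper names, not functions from the library.

```python
import os
from typing import Mapping

# Values documented as enabling LHOTSE_DILL_ENABLED.
TRUTHY = frozenset({"1", "True", "true", "yes"})


def legacy_opus_loading(env: Mapping[str, str] = os.environ) -> bool:
    """Hypothetical check: the legacy OPUS path is on only for the value '1'."""
    return env.get("LHOTSE_LEGACY_OPUS_LOADING") == "1"


def dill_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """Hypothetical check: disabled unless one of the documented values is set."""
    return env.get("LHOTSE_DILL_ENABLED", "") in TRUTHY


# Passing a plain dict instead of os.environ keeps the example side-effect free.
example_env = {"LHOTSE_LEGACY_OPUS_LOADING": "1", "LHOTSE_DILL_ENABLED": "yes"}
print(legacy_opus_loading(example_env), dill_enabled(example_env))
```

Taking the environment as a parameter makes such checks easy to unit test without mutating the process environment.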
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
-1.20.0
+1.21.0
2 changes: 2 additions & 0 deletions docs/getting-started.rst
@@ -127,6 +127,8 @@ Lhotse uses several environment variables to customize it's behavior. They are a
 
 * ``LHOTSE_DILL_ENABLED`` - when it's set to ``1|True|true|yes``, we will enable ``dill``-based serialization of ``CutSet`` and ``Sampler`` across processes (it's disabled by default even when ``dill`` is installed).
 
+* ``LHOTSE_LEGACY_OPUS_LOADING`` - (``=1``) reverts to a legacy OPUS loading mechanism that triggered a new ffmpeg subprocess for each OPUS file.
+
 * ``LHOTSE_PREPARING_RELEASE`` - used internally by developers when releasing a new version of Lhotse.
 
 * ``TORCHAUDIO_USE_BACKEND_DISPATCHER`` - when set to 1 and torchaudio version is below 2.1, we'll enable the experimental ffmpeg backend of torchaudio.
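
The ``TORCHAUDIO_USE_BACKEND_DISPATCHER`` entry combines two conditions: the variable must be set to 1 and the torchaudio version must be below 2.1. A minimal sketch of that combined check follows. It is an assumption about the logic, not code from Lhotse: `version_tuple` and `should_enable_dispatcher` are hypothetical names, and the parser only handles plain numeric `major.minor[.patch]` strings with an optional local suffix such as `+cpu`.

```python
from typing import Mapping, Tuple


def version_tuple(version: str) -> Tuple[int, int]:
    """Parse 'major.minor[.patch][+local]' into (major, minor)."""
    core = version.split("+", 1)[0]          # drop a local suffix like '+cpu'
    major, minor = core.split(".")[:2]       # ignore the patch component
    return int(major), int(minor)


def should_enable_dispatcher(env: Mapping[str, str], torchaudio_version: str) -> bool:
    """Hypothetical check: flag set to '1' AND torchaudio older than 2.1."""
    flag_set = env.get("TORCHAUDIO_USE_BACKEND_DISPATCHER") == "1"
    return flag_set and version_tuple(torchaudio_version) < (2, 1)


env = {"TORCHAUDIO_USE_BACKEND_DISPATCHER": "1"}
for candidate in ("0.12.1", "2.0.0+cpu", "2.1.0", "2.2"):
    print(candidate, should_enable_dispatcher(env, candidate))
```

Comparing integer tuples avoids the string-comparison pitfall where "0.12" would sort before "0.8".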