[Refactor] Drop direct dependency on librosa by NickCao · Pull Request #39079 · vllm-project/vllm

NickCao · 2026-04-06T14:18:42Z

Purpose

Drop dependency on librosa due to license concerns.

Test Plan

N/A. The load_audio/resample wrapper functions have been validated in existing code, and the melscale_fbanks function from torchaudio is numerically equivalent to its librosa counterpart.
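As background for the equivalence claim, here is a minimal stdlib-only sketch of the two mel scales involved. To my knowledge, librosa.filters.mel defaults to the Slaney scale with Slaney normalization, while torchaudio.functional.melscale_fbanks defaults to the HTK scale with no normalization. The two would therefore only match when the torchaudio call selects the Slaney options and the result is transposed. Whether this PR passes those arguments is not shown on this page, and the helper names below are hypothetical.

```python
import math

# Constants of the Slaney (Auditory Toolbox) mel scale.
F_SP = 200.0 / 3.0               # Hz per mel in the linear region
MIN_LOG_HZ = 1000.0              # frequency where the scale turns logarithmic
MIN_LOG_MEL = MIN_LOG_HZ / F_SP  # mel value at that frequency (about 15)
LOG_STEP = math.log(6.4) / 27.0  # step size in the logarithmic region


def hz_to_mel_slaney(freq_hz: float) -> float:
    """Slaney mel scale: linear below 1 kHz, logarithmic above."""
    if freq_hz < MIN_LOG_HZ:
        return freq_hz / F_SP
    return MIN_LOG_MEL + math.log(freq_hz / MIN_LOG_HZ) / LOG_STEP


def mel_to_hz_slaney(mel: float) -> float:
    """Inverse of hz_to_mel_slaney."""
    if mel < MIN_LOG_MEL:
        return mel * F_SP
    return MIN_LOG_HZ * math.exp(LOG_STEP * (mel - MIN_LOG_MEL))


def hz_to_mel_htk(freq_hz: float) -> float:
    """HTK mel scale, a single logarithmic formula."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


if __name__ == "__main__":
    # The two scales disagree, so the filter centre frequencies differ too.
    for freq in (500.0, 1000.0, 4000.0, 8000.0):
        print(freq, hz_to_mel_slaney(freq), hz_to_mel_htk(freq))
```

Because the scales place filter centres at different frequencies, a filterbank built with mismatched scale or normalization settings would not be numerically equivalent, which is why those arguments matter when swapping one library for the other.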

Test Result

N/A

Essential Elements of an Effective PR Description Checklist

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
The test plan, such as providing test command.
The test results, such as pasting the results comparison before and after, or e2e results
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
(Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

mergify · 2026-04-06T14:19:27Z

Documentation preview: https://vllm--39079.org.readthedocs.build/en/39079/

gemini-code-assist

Code Review

This pull request removes the librosa dependency from the codebase, replacing its functionality with internal utilities and torchaudio. Key changes include replacing librosa.load and librosa.get_duration with load_audio and get_audio_duration, as well as migrating mel-filterbank generation to torchaudio.functional.melscale_fbanks. Documentation, examples, and requirement files have been updated to reflect these changes and the shift toward soundfile and PyAV as the primary backends. I have no feedback to provide.
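To illustrate the kind of duration lookup the review summary mentions, here is a small sketch that uses only the Python standard library. It is not vLLM's get_audio_duration, whose signature is not shown on this page. The function names are hypothetical, and the sketch only handles uncompressed PCM WAV data.

```python
import io
import wave


def make_silent_wav(num_frames: int, sample_rate: int) -> bytes:
    """Build an in-memory mono 16-bit PCM WAV file filled with silence."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"\x00\x00" * num_frames)
    return buffer.getvalue()


def wav_duration_seconds(data: bytes) -> float:
    """Duration is the frame count divided by the sample rate."""
    with wave.open(io.BytesIO(data), "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


if __name__ == "__main__":
    clip = make_silent_wav(num_frames=8000, sample_rate=16000)
    print(wav_duration_seconds(clip))
```

Reading the duration from the header avoids decoding the samples, which is one reason a dedicated duration helper can be cheaper than loading the whole clip.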

robertgshaw2-redhat · 2026-04-06T14:27:20Z

Could you run a performance sanity check?

robertgshaw2-redhat · 2026-04-06T14:27:31Z

Thanks for making this change, it's long overdue.

DarkLight1337 · 2026-04-06T14:30:13Z

Actually for the main code we have already dropped the dependency: #37058

But it's nice to remove it from the example and testing code as well!

NickCao · 2026-04-06T14:32:38Z

Could you run a performance sanity check?

torchaudio.functional.melscale_fbanks is called in the __init__ function, not on the hot path. The remaining changes are in the tests/examples, and these wrapper functions (load_audio, etc.) are already used in the main code, so there should be no performance regressions.
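The reasoning above can be sketched as follows: a table built once in __init__ adds a fixed setup cost, while the per-request path only reuses it. The class below is a hypothetical stand-in with a toy filter table, not the processor touched by this PR.

```python
class FeatureExtractorSketch:
    """Hypothetical extractor that builds its filter table once."""

    def __init__(self, n_filters: int = 4, n_bins: int = 8) -> None:
        self.build_count = 0  # how many times the costly setup ran
        self.filters = self._build_filters(n_filters, n_bins)

    def _build_filters(self, n_filters: int, n_bins: int) -> list[list[float]]:
        # Stand-in for the one-time filterbank construction.
        self.build_count += 1
        return [
            [(i + 1) / (j + 1) for j in range(n_bins)]
            for i in range(n_filters)
        ]

    def __call__(self, spectrum: list[float]) -> list[float]:
        # Hot path: only reuses the cached table, never rebuilds it.
        return [
            sum(weight * value for weight, value in zip(row, spectrum))
            for row in self.filters
        ]


if __name__ == "__main__":
    extractor = FeatureExtractorSketch()
    for _ in range(3):
        extractor([1.0] * 8)
    print(extractor.build_count)
```

Under this pattern, changing which library builds the table affects construction time only, not the cost of each call.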

NickCao · 2026-04-06T14:45:53Z

Still pulled in by a third party dep....

__________________________________ test_wer_correctness[D4nt3/esb-datasets-earnings22-validation-tiny-filtered-model_config0] __________________________________
self = Audio(sampling_rate=16000, mono=True, decode=True, id=None)
value = {'bytes': b'RIFF&\xd1\x06\x00WAVEfmt \x10\x00\x00\x00\x01\x00\x01\x00\x80>\x00\x00\x00}\x00\x00\x02\x00\x10\x00data\x0...x01H\x02\xa2\x01\xbe\x00\xe4\xffn\x00]\x01\x9e\x01j\xffS\xfd\t\xfd\xd9\xff\x1c\x02D\x03\xf3\x00\x04\xfd', 'path': None}
token_per_repo_id = None
    def decode_example(
        self, value: dict, token_per_repo_id: Optional[Dict[str, Union[str, bool, None]]] = None
    ) -> dict:
        """Decode example audio file into audio data.
        Args:
            value (`dict`):
                A dictionary with keys:
                - `path`: String with relative audio file path.
                - `bytes`: Bytes of the audio file.
            token_per_repo_id (`dict`, *optional*):
                To access and decode
                audio files from private repositories on the Hub, you can pass
                a dictionary repo_id (`str`) -> token (`bool` or `str`)
        Returns:
            `dict`
        """
        if not self.decode:
            raise RuntimeError("Decoding is disabled for this feature. Please use Audio(decode=True) instead.")
        path, file = (value["path"], BytesIO(value["bytes"])) if value["bytes"] is not None else (value["path"], None)
        if path is None and file is None:
            raise ValueError(f"An audio sample should have one of 'path' or 'bytes' but both are None in {value}.")
        try:
>           import librosa
E           ModuleNotFoundError: No module named 'librosa'
/usr/local/lib/python3.12/dist-packages/datasets/features/audio.py:153: ModuleNotFoundError

NickCao · 2026-04-06T14:50:16Z

Dropped the commit changing requirements, let's handle this later.

NickCao · 2026-04-06T14:57:11Z

Still pulled in by a third party dep....

        try:
>           import librosa
E           ModuleNotFoundError: No module named 'librosa'
/usr/local/lib/python3.12/dist-packages/datasets/features/audio.py:153: ModuleNotFoundError

datasets drops the librosa dependency in favor of torchcodec in huggingface/datasets@161f99d, so we need to update it to 4.0.0+.

Isotr0py · 2026-04-06T15:03:20Z

datasets drops the librosa dependency in favor of torchcodec

But I think torchcodec is actually an optional requirement? https://github.com/huggingface/datasets/blob/161f99d94a1daf8380eabdb826048a0652510ee6/setup.py#L210-L212

NickCao · 2026-04-06T15:05:02Z

datasets drops the librosa dependency in favor of torchcodec

But I think torchcodec is actually an optional requirement? https://github.com/huggingface/datasets/blob/161f99d94a1daf8380eabdb826048a0652510ee6/setup.py#L210-L212

We are pinning to datasets 3:

requirements/test.in
74:# Newer versions of datasets require torchcoded, that makes the tests fail in CI because of a missing library.
76:datasets>=3.3.0,<=3.6.0

Thus still using librosa.
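A quick way to check an environment for this situation is sketched below, using only the standard library. The helper names are hypothetical, the version parsing is deliberately simplistic, and the 4.0.0 threshold is taken from the comment above rather than verified independently.

```python
from importlib import metadata, util
from typing import Optional


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be found, without importing it."""
    return util.find_spec(module_name) is not None


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def major_version(version: str) -> int:
    """Leading numeric component of a simple 'X.Y.Z' version string."""
    return int(version.split(".")[0])


if __name__ == "__main__":
    print("librosa importable:", is_importable("librosa"))
    datasets_version = installed_version("datasets")
    if datasets_version is None:
        print("datasets is not installed")
    elif major_version(datasets_version) < 4:
        # Per the discussion, datasets 3.x still imports librosa to decode audio.
        print("datasets", datasets_version, "may still need librosa")
    else:
        print("datasets", datasets_version, "should not need librosa")
```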

NickCao requested review from DarkLight1337, NickLucche, aarnphm, robertgshaw2-redhat, tjtanaa and ywang96 as code owners April 6, 2026 14:18

mergify Bot added the documentation, ci/build, multi-modality, and rocm labels Apr 6, 2026

github-project-automation Bot added this to AMD Apr 6, 2026

github-project-automation Bot moved this to Todo in AMD Apr 6, 2026

gemini-code-assist Bot reviewed Apr 6, 2026

NickCao force-pushed the drop-librosa branch 2 times, most recently from db5219a to 38068e7 Compare April 6, 2026 14:26

robertgshaw2-redhat added the ready label Apr 6, 2026

DarkLight1337 requested a review from Isotr0py April 6, 2026 14:29

NickCao force-pushed the drop-librosa branch from 38068e7 to 0ddb9ca Compare April 6, 2026 14:49

NickCao changed the title ~~[Refactor] Drop dependency on librosa~~ [Refactor] Drop direct dependency on librosa Apr 6, 2026

Isotr0py reviewed Apr 6, 2026

Comment thread vllm/transformers_utils/processors/cohere_asr.py

NickCao mentioned this pull request Apr 6, 2026

[Refactor] Remove dependency on librosa vllm-project/vllm-omni#2273

Merged

5 tasks

NickCao and others added 5 commits April 17, 2026 13:11

[Refactor] Replace librosa.load with vllm.multimodal.media.audio.load_audio

caa7ab7

Signed-off-by: Nick Cao <ncao@redhat.com>

[Refactor] Replace librosa.resample with vllm.multimodal.audio.AudioResampler

8fb60fb

Signed-off-by: Nick Cao <ncao@redhat.com>

[Refactor] Replace librosa.get_duration with vllm.multimodal.audio.get_audio_duration

5c99708

Signed-off-by: Nick Cao <ncao@redhat.com>

[Refactor] Replace librosa.filters.mel with torchaudio.functional.melscale_fbanks

c626ebf

Co-authored-by: Claude <noreply@anthropic.com> Signed-off-by: Nick Cao <ncao@redhat.com>

[Refactor] Update comments referencing librosa

d509a23

Signed-off-by: Nick Cao <ncao@redhat.com>

NickCao force-pushed the drop-librosa branch from 0ddb9ca to d509a23 Compare April 17, 2026 17:11

Merge branch 'main' into drop-librosa

bbbfb0d

ywang96 approved these changes Apr 18, 2026

ywang96 enabled auto-merge (squash) April 18, 2026 05:22

ywang96 merged commit 153ba7f into vllm-project:main Apr 18, 2026
55 of 56 checks passed

github-project-automation Bot moved this from Todo to Done in AMD Apr 18, 2026

bnellnm pushed a commit to neuralmagic/vllm that referenced this pull request Apr 20, 2026

[Refactor] Drop direct dependency on librosa (vllm-project#39079)

22c51fb

Signed-off-by: Nick Cao <ncao@redhat.com> Co-authored-by: Claude <noreply@anthropic.com>

baonudesifeizhai pushed a commit to baonudesifeizhai/vllm that referenced this pull request Apr 23, 2026

[Refactor] Drop direct dependency on librosa (vllm-project#39079)

a7c1f10

Signed-off-by: Nick Cao <ncao@redhat.com> Co-authored-by: Claude <noreply@anthropic.com>

whk-lab pushed a commit to whk-lab/vllm that referenced this pull request Apr 23, 2026

[Refactor] Drop direct dependency on librosa (vllm-project#39079)

aeaf342

Signed-off-by: Nick Cao <ncao@redhat.com> Co-authored-by: Claude <noreply@anthropic.com>

ywang96 merged 6 commits into vllm-project:main from NickCao:drop-librosa
