
[Refactor] Remove dependency on librosa#2273

Merged
Isotr0py merged 8 commits into vllm-project:main from NickCao:feature/replace-librosa-with-resampy
Apr 10, 2026

Conversation

Contributor

@NickCao NickCao commented Mar 27, 2026

Purpose

Replace librosa.load() with vllm.multimodal.media.audio.load_audio() and librosa.resample() with vllm.multimodal.audio.resample_audio_resampy(), reusing the functions introduced in vllm#37058.

See also: #1725

NOTE: there are still a few other references to librosa to be removed in followup PRs.
NOTE: Do not merge until the release of vllm 0.18.1 where these helper functions are introduced.
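One way to make the second note checkable at runtime is to probe for the two helpers before relying on them. The following is a minimal sketch, not part of this PR: the module and attribute names are the ones quoted in the Purpose section, while `has_helper`, `missing_helpers`, and `REQUIRED_HELPERS` are hypothetical names introduced only for illustration.

```python
import importlib


def has_helper(module_name: str, attr: str) -> bool:
    """Return True if `module_name` imports cleanly and exposes `attr`."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module, or one of its parent packages, is not installed.
        return False
    return hasattr(module, attr)


# Helper locations as named in the PR description (assumed unchanged upstream).
REQUIRED_HELPERS = [
    ("vllm.multimodal.media.audio", "load_audio"),
    ("vllm.multimodal.audio", "resample_audio_resampy"),
]


def missing_helpers(required=REQUIRED_HELPERS) -> list[str]:
    """Return dotted names of helpers that cannot be found."""
    return [f"{mod}.{attr}" for mod, attr in required if not has_helper(mod, attr)]
```

Calling `missing_helpers()` in an environment with an older vllm would presumably list both names; an empty list suggests the installed vllm already ships them.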

Test Plan

# Pytest tests
pytest tests/entrypoints/openai_api/test_serving_speech.py -v
pytest tests/entrypoints/openai_api/test_serving_speech_stream.py -v
pytest tests/model_executor/models/qwen3_tts/test_cuda_graph_decoder.py -v
pytest tests/engine/test_arg_utils.py -v
# CosyVoice3 E2E
python examples/offline_inference/cosyvoice3/verify_e2e_cosyvoice.py \
  --model Fun-CosyVoice3-0.5B \
  --tokenizer Fun-CosyVoice3-0.5B/CosyVoice-BlankEN \
  --audio-path "" \
  --prompt-text "Mary had a little lamb, its fleece was white as snow, and everywhere that Mary went, the lamb was sure to go." \
  --prompt "The quick brown fox jumps over the lazy dog. This is a test of clear speech synthesis."
# Qwen2.5-Omni E2E
python examples/offline_inference/qwen2_5_omni/end2end.py --query-type text
# Speaker embedding extraction
python examples/online_serving/qwen3_tts/speaker_embedding_interpolation.py \
  --model Qwen/Qwen3-TTS-12Hz-0.6B-Base --device cuda \
  extract --audio voice.wav --output embedding.json

Test Result

PASS


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan. Please provide the test scripts & test commands. Please state the reasons if your code doesn't require additional test scripts. For test file guidelines, please check the test style doc.
  • The test results. Please paste the results comparison before and after, or the e2e results.
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model. Please run mkdocs serve to sync the documentation editions to ./docs.
  • (Optional) Release notes update. If your change is user-facing, please update the release notes draft.

BEFORE SUBMITTING, PLEASE READ https://github.com/vllm-project/vllm-omni/blob/main/CONTRIBUTING.md (anything written below this line will be removed by GitHub Actions)

Collaborator

@lishunyang12 lishunyang12 left a comment


looks good overall, couple small things

Comment thread vllm_omni/model_executor/models/qwen3_tts/qwen3_tts_tokenizer.py Outdated
@NickCao NickCao force-pushed the feature/replace-librosa-with-resampy branch 2 times, most recently from 1180487 to 4ffbb8f Compare April 6, 2026 13:15
@NickCao NickCao marked this pull request as ready for review April 6, 2026 13:15
@NickCao NickCao requested a review from hsliuustc0106 as a code owner April 6, 2026 13:15

@NickCao NickCao requested a review from lishunyang12 April 6, 2026 13:15
Contributor Author

NickCao commented Apr 6, 2026

The main branch now targets vllm 0.19.0, so this PR can be merged.
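The reasoning here is a version comparison: the helpers were said to arrive in vllm 0.18.1, and 0.19.0 is newer. A small dependency-free sketch of that check follows. The function names are hypothetical, and the parsing is deliberately simplified: it only looks at the leading numeric release segment, so a pre-release such as `0.18.1rc1` would be treated like `0.18.1`. The `packaging` library handles such cases properly.

```python
import re


def release_tuple(version: str) -> tuple[int, ...]:
    """Extract the leading numeric release segment, e.g. '0.19.0rc1' -> (0, 19, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"no numeric release segment in {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare release segments, padding the shorter one with zeros."""
    a, b = release_tuple(installed), release_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


# The versions mentioned in this thread.
print(meets_minimum("0.19.0", "0.18.1"))
```

In a real environment the installed version string could be obtained with `importlib.metadata.version("vllm")`.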

Contributor Author

NickCao commented Apr 6, 2026

OK, it seems the remaining librosa functions also have counterparts in torchaudio, so I'm expanding the scope of this PR to drop the dependency on librosa completely.

@NickCao NickCao force-pushed the feature/replace-librosa-with-resampy branch from 4ffbb8f to 008c0fc Compare April 6, 2026 19:26
@NickCao NickCao changed the title [Refactor] Replace librosa.load/resample with upstream vllm audio utilities [Refactor] Remove dependency on librosa Apr 6, 2026
Contributor Author

NickCao commented Apr 6, 2026

Done, the dependency on librosa is now fully dropped. See also vllm-project/vllm#39079
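A claim like "fully dropped" can be spot-checked by scanning the tree for remaining imports. The sketch below is one possible way to do that with the standard library's `ast` module; it is not the check used in this PR, and `imports_of` and `scan_tree` are hypothetical names. It only finds static `import` statements, so dynamic imports, requirement files, and documentation would still need a separate text search.

```python
import ast
from pathlib import Path


def imports_of(source: str, package: str = "librosa") -> list[int]:
    """Return line numbers of statements importing `package` or a submodule."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0:
            # level == 0 means an absolute import; relative imports are skipped.
            names = [node.module or ""]
        else:
            continue
        if any(n == package or n.startswith(package + ".") for n in names):
            hits.append(node.lineno)
    return sorted(hits)


def scan_tree(root: str, package: str = "librosa") -> dict[str, list[int]]:
    """Map each .py file under `root` to the lines that import `package`."""
    report = {}
    for path in sorted(Path(root).rglob("*.py")):
        try:
            hits = imports_of(path.read_text(encoding="utf-8"), package)
        except (SyntaxError, UnicodeDecodeError):
            continue  # Skip files that are not parseable Python.
        if hits:
            report[str(path)] = hits
    return report
```

Running `scan_tree("vllm_omni")` from the repository root should return an empty dict if no static imports remain.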

@NickCao NickCao force-pushed the feature/replace-librosa-with-resampy branch 2 times, most recently from 2d7454e to c1a598e Compare April 7, 2026 14:08
@tzhouam tzhouam added the ready label (label to trigger buildkite CI) Apr 9, 2026
@tzhouam tzhouam self-requested a review April 9, 2026 07:30
Collaborator

@tzhouam tzhouam left a comment


Please also update the Dockerfiles, thanks.

@NickCao NickCao force-pushed the feature/replace-librosa-with-resampy branch from df97f2b to 8204fe1 Compare April 10, 2026 01:53
Contributor Author

NickCao commented Apr 10, 2026

> Please also update the Dockerfiles, thanks.

Done. I manually inspected the wheels (for x86_64 Linux), and they do contain the native libraries.
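The manual inspection described here can be approximated in a few lines, since a wheel is a zip archive. This is a rough sketch under stated assumptions rather than the procedure actually used: `native_libraries` is a hypothetical helper, and the filename pattern only covers Linux-style shared objects (`.so`, optionally followed by a numeric version), matching the x86_64 Linux case mentioned in the comment.

```python
import re
import zipfile

# Linux shared objects: names ending in ".so" or ".so.<digits>[.<digits>...]".
_SHARED_OBJECT = re.compile(r"\.so(\.\d+)*$")


def native_libraries(wheel) -> list[str]:
    """List shared-object members of a wheel (a path or a binary file object)."""
    with zipfile.ZipFile(wheel) as archive:
        return sorted(
            name for name in archive.namelist() if _SHARED_OBJECT.search(name)
        )
```

Given a wheel downloaded with `pip download --no-deps <package>`, calling `native_libraries("<file>.whl")` lists the bundled shared objects. Which library names appear depends on the specific wheel, so the output has to be read by hand.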

@tzhouam tzhouam self-requested a review April 10, 2026 03:00
Collaborator

@tzhouam tzhouam left a comment


LGTM

@tzhouam tzhouam added the nightly-test label (label to trigger buildkite nightly test CI) Apr 10, 2026
@tzhouam tzhouam requested a review from gcanlin April 10, 2026 07:43
Collaborator

tzhouam commented Apr 10, 2026

@gcanlin Please have a look. The CI failure seems to be caused by the main branch instead of this PR.

Collaborator

gcanlin commented Apr 10, 2026

It seems the failure is related to this PR?

Collaborator

tzhouam commented Apr 10, 2026

> It seems the failure is related to this PR?

@gcanlin Similar errors can be found in the main branch nightly tests:
https://buildkite.com/vllm/vllm-omni/builds/6235/steps/canvas
[image]

Comment thread vllm_omni/utils/audio.py
NickCao and others added 8 commits April 10, 2026 08:30
…_audio

Signed-off-by: Nick Cao <ncao@redhat.com>
…le_audio_resampy

Signed-off-by: Nick Cao <ncao@redhat.com>
…scale_fbanks

Signed-off-by: Nick Cao <ncao@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
Signed-off-by: Nick Cao <ncao@redhat.com>
Uses of librosa have been replaced with wrapper functions from vllm
(using soundfile and pyav internally). Both the soundfile and pyav wheels
bundle the native libraries (libsndfile and ffmpeg), so no
additional installation is required on the host, making these docs
outdated and misleading.

Signed-off-by: Nick Cao <ncao@redhat.com>
Signed-off-by: Nick Cao <ncao@redhat.com>
These libraries are already bundled in the pyav and soundfile wheels

Signed-off-by: Nick Cao <ncao@redhat.com>
Signed-off-by: Nick Cao <ncao@redhat.com>
@NickCao NickCao force-pushed the feature/replace-librosa-with-resampy branch from f2b6b6b to 2560d54 Compare April 10, 2026 12:38
Comment thread docker/Dockerfile.cuda
Member

@Isotr0py Isotr0py left a comment


LGTM. Thanks!

@gcanlin gcanlin removed the nightly-test label (label to trigger buildkite nightly test CI) Apr 10, 2026
Collaborator

@linyueqian linyueqian left a comment


LGTM

@Isotr0py Isotr0py merged commit 2bc183f into vllm-project:main Apr 10, 2026
8 checks passed
@tjtanaa tjtanaa mentioned this pull request Apr 12, 2026
5 tasks
Comment thread docker/Dockerfile.xpu
apt-get install -y --no-install-recommends --fix-missing \
curl \
espeak-ng \
ffmpeg \
Collaborator


@xuechendi Could you check whether XPU has the same issue as #2708?

Contributor


Thanks, LGTM

daixinning pushed a commit to daixinning/vllm-omni that referenced this pull request Apr 13, 2026
Signed-off-by: Nick Cao <ncao@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>

Labels

ready label to trigger buildkite CI

7 participants