[Refactor] Drop direct dependency on librosa #39079
ywang96 merged 6 commits into vllm-project:main
Documentation preview: https://vllm--39079.org.readthedocs.build/en/39079/
Code Review
This pull request removes the librosa dependency from the codebase, replacing its functionality with internal utilities and torchaudio. Key changes include replacing librosa.load and librosa.get_duration with load_audio and get_audio_duration, as well as migrating mel-filterbank generation to torchaudio.functional.melscale_fbanks. Documentation, examples, and requirement files have been updated to reflect these changes and the shift toward soundfile and PyAV as the primary backends. I have no feedback to provide.
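As a rough illustration of what a `librosa.get_duration` replacement has to compute, here is a minimal sketch that uses only Python's standard-library `wave` module. This is an assumption-laden stand-in, not vLLM's actual `get_audio_duration` (which, per the review above, builds on soundfile and PyAV); the names `wav_duration_seconds` and `make_test_tone` are hypothetical, and the sketch handles PCM WAV input only.

```python
import io
import math
import struct
import wave


def wav_duration_seconds(source) -> float:
    """Return the duration of a PCM WAV file in seconds (frames / sample rate).

    Hypothetical helper for illustration; not part of vLLM.
    """
    with wave.open(source, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def make_test_tone(seconds: float, sample_rate: int = 16000) -> io.BytesIO:
    """Build an in-memory mono 16-bit WAV containing a 440 Hz sine wave."""
    n_frames = int(seconds * sample_rate)
    samples = (
        int(0.3 * 32767 * math.sin(2 * math.pi * 440 * i / sample_rate))
        for i in range(n_frames)
    )
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(b"".join(struct.pack("<h", s) for s in samples))
    buf.seek(0)  # rewind so the buffer can be read back
    return buf


duration = wav_duration_seconds(make_test_tone(1.5))
print(f"{duration:.2f} s")
```

The duration is just the frame count divided by the sample rate, so no decoding of the audio payload is needed for this format.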
db5219a to 38068e7
Could you run a performance sanity check?
Thanks for making this change; it's long overdue.
Actually, for the main code we have already dropped the dependency in #37058. But it's nice to remove it from the example and testing code as well!
torchaudio.functional.melscale_fbanks in the
Still pulled in by a third-party dep...
Dropped the commit changing requirements; let's handle this later.
But I think |
We are pinning to datasets 3, thus still using librosa.
…_audio Signed-off-by: Nick Cao <ncao@redhat.com>
…esampler Signed-off-by: Nick Cao <ncao@redhat.com>
…t_audio_duration Signed-off-by: Nick Cao <ncao@redhat.com>
…scale_fbanks Co-authored-by: Claude <noreply@anthropic.com> Signed-off-by: Nick Cao <ncao@redhat.com>
Signed-off-by: Nick Cao <ncao@redhat.com>
Signed-off-by: Nick Cao <ncao@redhat.com> Co-authored-by: Claude <noreply@anthropic.com>
Signed-off-by: Nick Cao <ncao@redhat.com> Co-authored-by: Claude <noreply@anthropic.com>
Signed-off-by: Nick Cao <ncao@redhat.com> Co-authored-by: Claude <noreply@anthropic.com>
Purpose
Drop dependency on librosa due to license concerns.
Test Plan
N/A. The load_audio/resample wrapper functions have been validated in existing code, and the melscale_fbanks function from torchaudio is numerically equivalent to its librosa counterpart.
Test Result
N/A