[Refactor] Remove sox from dependencies#2745
Conversation
Drop-in replacement for sox.Transformer().norm(db_level=...). Scales audio so peak amplitude reaches a target dB level. Co-authored-by: Claude <noreply@anthropic.com> Signed-off-by: Nick Cao <ncao@redhat.com>
Replace sox.Transformer().norm(db_level=-6) with peak_normalize, removing the last runtime usage of pysox from the codebase. Co-authored-by: Claude <noreply@anthropic.com> Signed-off-by: Nick Cao <ncao@redhat.com>
Update ROCm installation docs and Qwen3-TTS README to no longer list sox as a required dependency. Co-authored-by: Claude <noreply@anthropic.com> Signed-off-by: Nick Cao <ncao@redhat.com>
Drop sox>=1.5.0 from requirements/common.txt and remove sox/libsox-fmt-all from CI, CUDA, and ROCm Dockerfiles. Co-authored-by: Claude <noreply@anthropic.com> Signed-off-by: Nick Cao <ncao@redhat.com>
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
|
PR #2745 - [Refactor] Remove sox from dependencies OVERALL: NO BLOCKERS Correctness: PASS Summary: Refactor to remove sox dependency, replace with native peak_normalize. 9 files, 286 add, 86 del. Tests added, doc updated. Gates pass. No blockers. |
|
BLOCKER scan:
OVERALL: NO BLOCKERS VERDICT: COMMENT Good refactoring. Removing sox dependency simplifies the installation process. The peak_normalize() implementation is correct and the unit tests verify the edge cases (silence, peak normalization). One note: The db_level parameter defaults to -6.0, which matches the previous sox default. Consider documenting this in the function docstring for future reference. |
gcanlin
left a comment
There was a problem hiding this comment.
The code Looks clean to me. Running nightly-test.
lishunyang12
left a comment
There was a problem hiding this comment.
Clean removal of the sox / pysox dependency. The changes are correct and complete:
-
peak_normalizeimplementation invllm_omni/utils/audio.pyis a faithful pure-NumPy replacement forsox.Transformer().norm(db_level=...). The math (target / peakscaling) matches sox's peak normalization behavior, and the PR description includes a one-off verification script confirming numerical equivalence withinatol=1e-7. -
All references removed consistently: Python dependency (
requirements/common.txt), system packages (sox,libsox-fmt-all) from all three Dockerfiles (CI, CUDA, ROCm), documentation (ROCm install guide, Qwen3 TTS example and README), and theimport sox/sox.Transformerusage inspeech_vq.py. -
Tests cover the key edge cases: silence (all-zero input returns all-zero) and peak-reaches-target (output peak matches the requested dB level within tolerance).
-
No behavioral change:
sox_norm()method signature and call sites are preserved; only the internal implementation swaps from sox to the newpeak_normalize.
One minor observation: the sox_norm method in SpeechVQ could be replaced by a direct call to peak_normalize at the call site since it's now a trivial wrapper, but that's a follow-up cleanup, not a blocker.
LGTM.
|
@NickCao Could you please check whether this failed CI is related to this PR? https://buildkite.com/vllm/vllm-omni/builds/6682/steps/canvas?sid=019d8eb6-2af0-4470-9871-73e31b225185&tab=output |
The error is: Looks unrelated to me. |
|
@NickCao Thanks! Looks great. Let's merge it ASAP. |
Signed-off-by: Nick Cao <ncao@redhat.com> Co-authored-by: Claude <noreply@anthropic.com>
Signed-off-by: Nick Cao <ncao@redhat.com> Co-authored-by: Claude <noreply@anthropic.com>
Signed-off-by: Nick Cao <ncao@redhat.com> Co-authored-by: Claude <noreply@anthropic.com>
PLEASE FILL IN THE PR DESCRIPTION HERE ENSURING ALL CHECKLIST ITEMS (AT THE BOTTOM) HAVE BEEN CONSIDERED.
Purpose
#1725
Test Plan
Test Result
All PASS
Essential Elements of an Effective PR Description Checklist
supported_models.mdandexamplesfor a new model. Please runmkdocs serveto sync the documentation editions to./docs.BEFORE SUBMITTING, PLEASE READ https://github.com/vllm-project/vllm-omni/blob/main/CONTRIBUTING.md (anything written below this line will be removed by GitHub Actions)