[Bugfix] Fix qwen3-omni audio truncation issue #26815
DarkLight1337 merged 4 commits into vllm-project:main
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Code Review
This pull request addresses an audio truncation issue in the Qwen3-Omni model where inputs were always limited to 30 seconds. The changes introduce a conditional truncation mechanism based on the truncation argument in mm_kwargs, which correctly allows for processing longer audio clips. The associated refactoring, which includes hoisting the feature_extractor and hop_length initializations and removing redundant code, is clean and necessary for the fix. The implementation is sound and effectively resolves the reported bug.
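The conditional truncation described in the review can be sketched roughly as follows. This is only an illustration under stated assumptions: `maybe_truncate_audio` is a hypothetical helper, the audio is a plain list of samples, and the default of `False` for a missing `truncation` key is assumed here; it is not the actual vLLM or Transformers code.

```python
from typing import Any, List, Mapping, Sequence

MAX_AUDIO_SECONDS = 30  # the 30 s limit mentioned in this PR


def maybe_truncate_audio(
    samples: Sequence[float],
    sample_rate: int,
    mm_kwargs: Mapping[str, Any],
) -> List[float]:
    """Hypothetical helper: cut audio to 30 s only when explicitly requested."""
    # Assumption: a missing "truncation" key means "do not truncate".
    if not mm_kwargs.get("truncation", False):
        return list(samples)
    max_samples = MAX_AUDIO_SECONDS * sample_rate
    return list(samples[:max_samples])


# A 45-second clip at 16 kHz, represented as silence.
clip = [0.0] * (16_000 * 45)
full = maybe_truncate_audio(clip, 16_000, {})
cut = maybe_truncate_audio(clip, 16_000, {"truncation": True})
```

With this shape, a caller who wants the old 30-second behavior opts in through `truncation`, while longer clips pass through unchanged by default.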
On second thought, I think we can still add a patch here so that users don't need to wait for the Transformers patch release.
return x

# NOTE: WhisperFeatureExtractor cannot handle empty list of audios
feature_extractor = self.info.get_feature_extractor()
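The NOTE in the diff excerpt above points at a common guard: fetch the extractor up front, but only call it when there is audio to process. A minimal sketch of that pattern, assuming a hypothetical `extract_audio_features` wrapper and a stand-in extractor (neither is part of vLLM or Transformers):

```python
from typing import Callable, List, Sequence


def extract_audio_features(
    audios: Sequence[Sequence[float]],
    extractor: Callable[[Sequence[Sequence[float]]], List[int]],
) -> List[int]:
    """Hypothetical wrapper: skip the extractor when no audio was supplied."""
    # Assumption: the extractor fails on an empty list, so return early.
    if not audios:
        return []
    return extractor(audios)


def fake_extractor(audios: Sequence[Sequence[float]]) -> List[int]:
    """Stand-in extractor that rejects empty input and returns clip lengths."""
    if not audios:
        raise ValueError("empty list of audios")
    return [len(audio) for audio in audios]


no_audio = extract_audio_features([], fake_extractor)
one_clip = extract_audio_features([[0.0, 0.1, 0.2, 0.3]], fake_extractor)
```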
Can you add a comment so we can revert once it's fixed on transformers side?
Purpose
truncation to False in Qwen3Omni to avoid default truncation huggingface/transformers#41473, Qwen3-omni will still truncate audio to 30s.
Test Plan
Test Result