[Test] Fix 4 broken Qwen3-TTS async chunk unit tests#2351
linyueqian merged 2 commits into vllm-project:main
Fix test assertions and mocks that fell out of sync with source code changes in `qwen3_tts.py` across PRs vllm-project#1930, vllm-project#1852, and vllm-project#2104.

- `test_flush_on_finish`: `finished` is now a plain bool, not a tensor; remove `.item()` call
- `test_ic_load_change_mid_request`: IC is cached per request since vllm-project#1930; update expected emission frames to match current logic
- `test_non_async_processor_prepends_ref_code_and_sets_trim_context`: add missing `finished=True` and `token_ids` to mock (required since vllm-project#2104)
- `test_non_async_processor_filters_out_of_range_codec_values`: same fix

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: linyueqian <linyueqian@outlook.com>
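To illustrate the `test_flush_on_finish` item: calling `.item()` on a built-in `bool` raises `AttributeError`, which is presumably why the old assertion broke once `finished` stopped being a tensor. The sketch below is only an illustration; `FakeScalar` and `is_finished` are hypothetical names that do not come from `qwen3_tts.py`, and the PR itself simply removes the `.item()` call.

```python
class FakeScalar:
    """Hypothetical stand-in for a tensor-like scalar that exposes .item()."""

    def __init__(self, value):
        self._value = value

    def item(self):
        return self._value


def is_finished(flag):
    """Return a plain bool for either a built-in bool or a tensor-like scalar."""
    # Tensor-like scalars expose .item(); a built-in bool does not.
    if hasattr(flag, "item"):
        return bool(flag.item())
    return bool(flag)


# A plain bool has no .item(), so the old-style call fails.
try:
    True.item()
    item_error = None
except AttributeError as exc:
    item_error = str(exc)

plain_result = is_finished(True)                 # plain bool path
tensor_result = is_finished(FakeScalar(False))   # tensor-like path
```

A test that only needs the current behavior can assert on the bool directly, as this PR does; a tolerant helper like the one above is only worth having if both representations must be supported.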
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 37828fae97
    p4 = _call(tm, "new_req", n_frames=16)
    assert p4 is not None
Validate IC=16 for high-load new request
The new check in test_ic_load_change_mid_request only asserts that p4 is emitted, but that does not prove the cached/dynamic IC value is 16 as the test comment claims. With chunk_frames=25, frame 16 also emits for IC values 2/4/8, so a regression in load-based IC selection could slip through while this test still passes. Please assert an IC-dependent field (for example left_context_size) so the high-load behavior is actually verified.
Add negative assertion: frame 2 must NOT emit under IC=16 (it would emit under IC=2), proving the load-based IC selection is correct.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: linyueqian <linyueqian@outlook.com>
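The reasoning behind the negative assertion can be sketched with a toy emission rule. This is an assumption made for illustration, not the logic in `qwen3_tts.py`: before the first full chunk (`chunk_frames=25`), a frame is taken to emit when its index is a multiple of the initial chunk size (IC). The function `emits` and the candidate list are hypothetical.

```python
def emits(n_frames, ic, chunk_frames=25):
    """Toy rule (assumption): emit on multiples of ic until chunk_frames is reached."""
    if n_frames < chunk_frames:
        return n_frames % ic == 0
    return n_frames % chunk_frames == 0


candidates = [2, 4, 8, 16]  # hypothetical set of IC values

# Positive check: frame 16 emits for every candidate, so it cannot tell them apart.
emit_at_16 = [ic for ic in candidates if emits(16, ic)]

# Negative check: frame 2 emits only when IC=2, so "no emission at frame 2"
# distinguishes IC=16 from IC=2.
emit_at_2 = [ic for ic in candidates if emits(2, ic)]
```

Under this toy rule the two checks together match the statements in the review and the commit message: frame 16 alone is consistent with IC values 2/4/8/16, while frame 2 separates IC=2 from IC=16.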
34165a5 to 4ea0e8b
Signed-off-by: linyueqian <linyueqian@outlook.com>
Summary
- Fix 4 broken tests in `test_qwen3_tts_async_chunk.py`; the breakage was introduced by source changes in PRs [Bug][Qwen3TTS][Streaming] remove dynamic initial chunk and only compute on initial request #1930, [Optim][Qwen3TTS] big boost model throughput+latency high concurrency #1852, and [Fix] Qwen3 TTS audio handling for long ref_audio #2104 without corresponding test updates
- `test_flush_on_finish`: `finished` is now a plain `bool`, not a tensor; removed `.item()` call
- `test_ic_load_change_mid_request`: IC is cached per request since [Bug][Qwen3TTS][Streaming] remove dynamic initial chunk and only compute on initial request #1930; updated expected emission frames
- `test_non_async_processor_*` (x2): added missing `finished=True` and `token_ids` to mocks (required since [Fix] Qwen3 TTS audio handling for long ref_audio #2104 added `talker_output.finished` check)

Test plan
- Tests in `test_qwen3_tts_async_chunk.py` pass (was 32 pass / 4 fail)

Note: these tests are not currently run in CI; I add the marker.
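As a rough illustration of the `test_non_async_processor_*` fix, the sketch below shows how a mock that predates a new `finished` check fails, and how adding `finished=True` and `token_ids` repairs it. Everything here is hypothetical: `collect_codes`, `CODEBOOK_SIZE`, and the mock fields other than `finished` and `token_ids` are stand-ins, not the processor or test code from the repository.

```python
from types import SimpleNamespace

CODEBOOK_SIZE = 8  # hypothetical upper bound for valid codec values


def collect_codes(talker_output):
    """Hypothetical processor: act only on finished output, drop out-of-range codes."""
    if not talker_output.finished:
        return None
    return [t for t in talker_output.token_ids if 0 <= t < CODEBOOK_SIZE]


# A mock written before the `finished` check existed lacks the new attributes.
stale_mock = SimpleNamespace(ref_code=[1, 2, 3])

# The repaired mock supplies both fields the processor now reads.
fixed_mock = SimpleNamespace(ref_code=[1, 2, 3], finished=True, token_ids=[1, 99, 2])

try:
    collect_codes(stale_mock)
    stale_error = None
except AttributeError as exc:
    stale_error = exc

result = collect_codes(fixed_mock)
```

With `SimpleNamespace` a missing attribute fails loudly. Mock objects that auto-create attributes can hide the same drift, so spelling out the required fields in the mock keeps the test honest.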