[ROCm][CI] Fix engine teardown and text normalization to stabilize voxtral test #37138

Merged
DarkLight1337 merged 5 commits into vllm-project:main from ROCm:akaratza_stabilize_voxtral
Mar 16, 2026
@AndreasKaratzas AndreasKaratzas commented Mar 16, 2026

Fixes two independent failures in test_voxtral_realtime_forward and test_voxtral_realtime_generator observed on ROCm CI runs.

ROCm non-determinism (test_voxtral_realtime_forward)

  • Adds ROCM_ENGINE_KWARGS to tests/utils.py as the Python-API equivalent of the existing ROCM_EXTRA_ARGS (CLI flags).
  • Applies max_num_seqs=1 and enable_prefix_caching=False on ROCm via **ROCM_ENGINE_KWARGS in ENGINE_CONFIG, eliminating batch variance and prefix-cache interference that caused non-deterministic outputs even at temperature=0.0.
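
A minimal sketch of how the ROCm-only overrides could be merged into the engine configuration. The `rocm_engine_kwargs` helper, its `is_rocm` parameter, and the placeholder model name are assumptions made for illustration; only `ROCM_ENGINE_KWARGS`, `ENGINE_CONFIG`, `max_num_seqs`, and `enable_prefix_caching` come from the description above.

```python
def rocm_engine_kwargs(is_rocm: bool) -> dict:
    """Return engine overrides on ROCm and nothing elsewhere (hypothetical helper)."""
    if not is_rocm:
        return {}
    # One sequence at a time and no prefix cache, to remove batch variance.
    return {"max_num_seqs": 1, "enable_prefix_caching": False}


# Assumption: the real test utilities detect the platform themselves;
# here the flag is passed explicitly so the sketch runs anywhere.
ROCM_ENGINE_KWARGS = rocm_engine_kwargs(is_rocm=True)

ENGINE_CONFIG = {
    "model": "placeholder/model-name",  # hypothetical, not the model used by the test
    **ROCM_ENGINE_KWARGS,  # expands to nothing on non-ROCm platforms
}
```

Because the overrides are an empty dict off ROCm, the same `ENGINE_CONFIG` expression can be used unchanged on every platform.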

Missing text normalization (test_voxtral_realtime_forward)

  • The model occasionally transcribes "OBS" as "a base hit" and "oh, my" as "oh my", which are acoustically valid alternatives.
  • test_voxtral_realtime_generator already handled this inline. That logic is now extracted into a shared _normalize() helper and applied to both tests, so the two can no longer drift apart.

Engine teardown / OOM (test_voxtral_realtime_generator)

  • The engine fixture had no teardown, so the LLM still held GPU memory when the async engine's subprocess tried to start. This caused the error "Engine core initialization failed" with "Failed core proc(s): {}".
  • Fixed by converting the fixture to a yield fixture that calls llm.llm_engine.shutdown() and torch.cuda.empty_cache() after the test.
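
The yield-fixture pattern can be sketched without a GPU as follows. `FakeEngine` is a hypothetical stand-in for the real LLM object, and the `try`/`finally` wrapper is a design choice of this sketch rather than something the PR states. In the real test module the generator would carry `@pytest.fixture`, and the teardown would be the `llm.llm_engine.shutdown()` and `torch.cuda.empty_cache()` calls named above.

```python
from typing import Iterator


class FakeEngine:
    """Hypothetical stand-in for the engine; tracks whether it was shut down."""

    def __init__(self) -> None:
        self.alive = True

    def shutdown(self) -> None:
        self.alive = False


def engine() -> Iterator[FakeEngine]:
    # Setup: everything before the yield runs before the test body.
    llm = FakeEngine()
    try:
        yield llm  # the test receives this object
    finally:
        # Teardown: runs after the test, even if the test failed.
        llm.shutdown()
```

Driving the generator by hand shows the order of events: `next(gen)` performs the setup, and `gen.close()` triggers the teardown.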

Assertion clarity (both tests)

  • Replaced single-list assert texts == EXPECTED_TEXT with per-element assertions that print got / expected on failure, removing the need for -vv to see the actual diff.
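
A per-element comparison along these lines might look like the sketch below. The function name `assert_texts_match` and the exact message layout are assumptions; the PR description only states that each failure prints the got and expected values.

```python
def assert_texts_match(texts: list[str], expected: list[str]) -> None:
    """Compare outputs one by one so a failure names the offending sample."""
    assert len(texts) == len(expected), (
        f"got {len(texts)} outputs, expected {len(expected)}"
    )
    for i, (got, exp) in enumerate(zip(texts, expected)):
        assert got == exp, (
            f"sample {i} mismatch:\n"
            f"  got:      {got!r}\n"
            f"  expected: {exp!r}"
        )
```

Using `!r` in the message makes whitespace and punctuation differences visible, which matters for transcription text.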

cc @kenroche

…ation

Signed-off-by: Andreas Karatzas <akaratza@amd.com>
@mergify mergify bot added multi-modality Related to multi-modality (#4194) rocm Related to AMD ROCm labels Mar 16, 2026
@github-project-automation github-project-automation bot moved this to Todo in AMD Mar 16, 2026
@AndreasKaratzas AndreasKaratzas added the ready ONLY add when PR is ready to merge/full CI is needed label Mar 16, 2026
@DarkLight1337 DarkLight1337 enabled auto-merge (squash) March 16, 2026 03:45
@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces several important fixes to stabilize the voxtral tests on ROCm. The changes include adding ROCm-specific engine arguments to ensure deterministic outputs, refactoring text normalization into a shared helper function, and improving assertion messages for better debugging. A key part of this PR is the fix for an engine teardown issue that caused OOM errors, which is addressed by converting fixtures to yield fixtures with cleanup logic.

My review focuses on the correctness of this new teardown logic. I've identified a critical issue in the shutdown sequence for the synchronous engine fixture that will likely prevent the resource leak from being fixed. Please see my detailed comment.

…ation

Signed-off-by: Andreas Karatzas <akaratza@amd.com>
auto-merge was automatically disabled March 16, 2026 03:49

Head branch was pushed to by a user without write access

@DarkLight1337 DarkLight1337 enabled auto-merge (squash) March 16, 2026 03:57
@DarkLight1337 DarkLight1337 merged commit d4c5786 into vllm-project:main Mar 16, 2026
21 of 22 checks passed
@github-project-automation github-project-automation bot moved this from Todo to Done in AMD Mar 16, 2026
@AndreasKaratzas AndreasKaratzas deleted the akaratza_stabilize_voxtral branch March 16, 2026 06:40
Lucaskabela pushed a commit to Lucaskabela/vllm that referenced this pull request Mar 17, 2026
…xtral test (vllm-project#37138)

Signed-off-by: Andreas Karatzas <akaratza@amd.com>
wendyliu235 pushed a commit to wendyliu235/vllm-public that referenced this pull request Mar 18, 2026
…xtral test (vllm-project#37138)

Signed-off-by: Andreas Karatzas <akaratza@amd.com>
fxdawnn pushed a commit to fxdawnn/vllm that referenced this pull request Mar 19, 2026
…xtral test (vllm-project#37138)

Signed-off-by: Andreas Karatzas <akaratza@amd.com>