[ROCm][CI] Fix engine teardown and text normalization to stabilize voxtral test#37138
Conversation
…ation Signed-off-by: Andreas Karatzas <akaratza@amd.com>
Code Review
This pull request introduces several important fixes to stabilize the voxtral tests on ROCm. The changes include adding ROCm-specific engine arguments to ensure deterministic outputs, refactoring text normalization into a shared helper function, and improving assertion messages for better debugging. A key part of this PR is the fix for an engine teardown issue that caused OOM errors, which is addressed by converting fixtures to yield fixtures with cleanup logic.
My review focuses on the correctness of this new teardown logic. I've identified a critical issue in the shutdown sequence for the synchronous engine fixture that will likely prevent the resource leak from being fixed. Please see my detailed comment.
…xtral test (vllm-project#37138) Signed-off-by: Andreas Karatzas <akaratza@amd.com>
Fixes two independent failures in `test_voxtral_realtime_forward` and `test_voxtral_realtime_generator` observed on ROCm CI runs.

### ROCm non-determinism (`test_voxtral_realtime_forward`)

- Added `ROCM_ENGINE_KWARGS` to `tests/utils.py` as the Python-API equivalent of the existing `ROCM_EXTRA_ARGS` (CLI flags).
- Set `max_num_seqs=1` and `enable_prefix_caching=False` on ROCm via `**ROCM_ENGINE_KWARGS` in `ENGINE_CONFIG`, eliminating the batch variance and prefix-cache interference that caused non-deterministic outputs even at `temperature=0.0`.
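The kwargs-merging step above can be sketched as follows. Only `ROCM_ENGINE_KWARGS`, `max_num_seqs`, and `enable_prefix_caching` come from the PR; the `make_engine_config` helper, the ROCm check, and the base config values are illustrative stand-ins, not the actual `tests/utils.py` code:

```python
# Python-API counterpart of the CLI-flag list ROCM_EXTRA_ARGS (sketch).
ROCM_ENGINE_KWARGS = {
    "max_num_seqs": 1,               # one sequence per step -> no batch variance
    "enable_prefix_caching": False,  # no prefix-cache interference between runs
}

def make_engine_config(on_rocm: bool) -> dict:
    """Hypothetical helper: spread the ROCm kwargs into the engine config."""
    config = {"dtype": "bfloat16"}   # placeholder base config, illustrative only
    if on_rocm:
        # Mirrors **ROCM_ENGINE_KWARGS in ENGINE_CONFIG from the PR.
        config = {**config, **ROCM_ENGINE_KWARGS}
    return config
```

Keeping the ROCm-only keys in a single dict keeps the CUDA path untouched while making the ROCm overrides greppable in one place.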
### Missing text normalization (`test_voxtral_realtime_forward`)

- The model sometimes transcribes "OBS" as "a base hit" and "oh, my" as "oh my", which are acoustically valid alternatives.
- `test_voxtral_realtime_generator` already handled this inline. Extracted the logic into a shared `_normalize()` helper and applied it to both tests to eliminate the drift.
### Engine teardown / OOM (`test_voxtral_realtime_generator`)

- The `engine` fixture had no teardown, so the `LLM` still held GPU memory when the async engine's subprocess tried to start, causing `Engine core initialization failed` with `Failed core proc(s): {}`.
- Added `llm.llm_engine.shutdown()` and `torch.cuda.empty_cache()` after the test.
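The yield-fixture teardown pattern can be sketched as follows. `FakeEngine` is a stand-in so the shape is runnable without a GPU; in the actual fixture the teardown lines are `llm.llm_engine.shutdown()` and `torch.cuda.empty_cache()`:

```python
class FakeEngine:
    """Stand-in for vllm.LLM so the pattern runs without a GPU (sketch)."""
    def __init__(self):
        self.shut_down = False

    def shutdown(self):
        self.shut_down = True

# In the real test file this generator is decorated with @pytest.fixture.
def engine():
    llm = FakeEngine()
    yield llm        # the test body runs while the generator is suspended here
    # Teardown: resumes after the test finishes, so GPU memory is released
    # before the async engine's subprocess starts.
    llm.shutdown()   # real fixture: llm.llm_engine.shutdown()
    #                  real fixture also calls: torch.cuda.empty_cache()
```

pytest drives the generator one step past the `yield` once the test completes, which is what guarantees the ordering: sync engine freed first, async engine started second.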
### Assertion clarity (both tests)

- Replaced `assert texts == EXPECTED_TEXT` with per-element assertions that print got/expected values on failure, removing the need for `-vv` to see the actual diff.

cc @kenroche
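The per-element assertion style described in the last section might look like this sketch (the helper name and message format are illustrative, not the PR's exact code):

```python
def assert_transcripts_equal(texts: list[str], expected: list[str]) -> None:
    """Compare transcripts element-wise so a failure pinpoints the index
    and shows got/expected without needing pytest -vv.  Sketch only."""
    assert len(texts) == len(expected), (
        f"got {len(texts)} outputs, expected {len(expected)}"
    )
    for i, (got, want) in enumerate(zip(texts, expected)):
        assert got == want, f"output {i}: got {got!r}, expected {want!r}"
```

With a single `texts == EXPECTED_TEXT` comparison, a failing run only reports that two lists differ; the element-wise form puts the offending index and both strings directly in the assertion message.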