[ROCm][CI] Fix flaky Cohere/OpenAI embedding parity test by AndreasKaratzas · Pull Request #37616 · vllm-project/vllm

AndreasKaratzas · 2026-03-19T23:57:24Z

Follow-up for:

[ROCm][CI] Cleaning and restructuring amd-ci legacy pipeline #34839

Stabilizes the Cohere test that was failing due to batch invariance issues on ROCm. Addresses the failure in mi325_1: Entrypoints Integration (Pooling).

Motivation: https://buildkite.com/vllm/amd-ci/builds/6701/steps/canvas?sid=019d07a7-1a2e-4d29-91e7-9eb765bc4904&tab=output

[Feature][Scheduler] Add split prefix caching feature to eliminate bf16 GEMM tiling divergence across cache-hit/miss paths #34046
[Bug][ROCm]: Prefix caching produces different output on first request (cache miss) vs subsequent requests (cache hit) #33123

cc @kenroche

Signed-off-by: Andreas Karatzas <akaratza@amd.com>

gemini-code-assist

Code Review

This pull request aims to fix a flaky test for Cohere/OpenAI embedding parity on ROCm by adding ROCM_EXTRA_ARGS to the test server's configuration. This introduces arguments to disable prefix caching and limit the maximum number of sequences to one on ROCm platforms. While this change successfully stabilizes the test, I have a concern that limiting sequences to one effectively disables batch processing, which undermines the purpose of the test_batch_parity test. My review includes a suggestion to handle this more explicitly to maintain test integrity.
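A minimal sketch of the platform-conditional server arguments described in the review summary above. The actual value of ROCM_EXTRA_ARGS is not shown in this thread, so the flag list below is an assumption inferred from "disable prefix caching and limit the maximum number of sequences to one"; `build_server_args` and its parameters are hypothetical names used only for illustration.

```python
# Hypothetical reconstruction: the PR's real ROCM_EXTRA_ARGS value is not
# visible in this thread. The flags are inferred from the review summary.
ROCM_EXTRA_ARGS = ["--no-enable-prefix-caching", "--max-num-seqs", "1"]


def build_server_args(base_args: list[str], is_rocm: bool) -> list[str]:
    """Return the server CLI args, appending the ROCm-only flags when needed."""
    # Copy so the caller's list is not mutated.
    args = list(base_args)
    if is_rocm:
        args.extend(ROCM_EXTRA_ARGS)
    return args


if __name__ == "__main__":
    # The base args here are placeholders, not the test's real configuration.
    print(build_server_args(["--dtype", "bfloat16"], is_rocm=True))
```

In the real test the platform check would presumably come from vLLM's platform helper rather than a boolean parameter. Note that with the maximum number of sequences set to one, requests run one at a time, which is the concern the review raises about test_batch_parity.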

tests/entrypoints/pooling/embed/test_cohere_openai_parity.py

AndreasKaratzas · 2026-03-20T00:01:07Z

Testing MI325 to see if issue is resolved (added rocm and ready labels).

Signed-off-by: Andreas Karatzas <akaratza@amd.com>

AndreasKaratzas · 2026-03-20T17:25:52Z

Test has been confirmed green: https://buildkite.com/vllm/amd-ci/builds/6732/steps/canvas?sid=019d0bd9-2a24-4529-a7c3-4c16a3f66397&tab=output

DarkLight1337 · 2026-03-21T03:57:20Z

vllm/entrypoints/pooling/base/serving.py

        await self._prepare_generators(ctx)
-        await self._collect_batch(ctx)
+        try:
+            await self._collect_batch(ctx)


Why is this needed? We now use app-level error handlers to convert error responses.

@DarkLight1337 The app-level Exception handler at api_server.py:270 handles dimensions=-1 correctly (returns 400), but for the immediately following dimensions=16 request on the same connection, the same ValueError from pooling_params.verify() escapes the Starlette ExceptionMiddleware and crashes the ASGI app. The client gets APIConnectionError instead of BadRequestError.

https://buildkite.com/vllm/amd-ci/builds/6711/steps/canvas?sid=019d088b-f229-4de8-923d-b4c48a62c6fb&tab=output

cc @andyxning
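The pattern under discussion, catching the validation error inside the handler instead of relying on the app-level exception handler, can be sketched as below. This is an illustration under stated assumptions, not the PR's code: the diff above shows only the opening `try:`, so the `except ValueError` clause, the `ErrorResponse` type, and the stand-in `collect_batch` are all hypothetical.

```python
import asyncio
from dataclasses import dataclass
from http import HTTPStatus
from typing import Union

# Hypothetical: pretend the model only supports this output size.
SUPPORTED_DIMENSIONS = {8}


@dataclass
class ErrorResponse:
    message: str
    code: int


async def collect_batch(dimensions: int) -> list[float]:
    """Stand-in for the step that validates pooling params and gathers results."""
    if dimensions not in SUPPORTED_DIMENSIONS:
        raise ValueError(f"unsupported dimensions: {dimensions}")
    return [0.0] * dimensions


async def handle_request(dimensions: int) -> Union[list[float], ErrorResponse]:
    """Convert a validation error into a 400 response inside the handler."""
    try:
        return await collect_batch(dimensions)
    except ValueError as exc:
        # Assumed mapping: a bad parameter becomes a client error, so the
        # exception never reaches the ASGI middleware stack.
        return ErrorResponse(message=str(exc), code=HTTPStatus.BAD_REQUEST.value)


if __name__ == "__main__":
    # Two consecutive bad requests, mirroring dimensions=-1 then dimensions=16.
    for dims in (-1, 16):
        print(asyncio.run(handle_request(dims)))
```

With this shape, both bad requests receive an error object carrying a 400 code, regardless of whether the app-level handler fires for the second request.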

[ROCm][CI] Fix flaky Cohere/OpenAI embedding parity test

de7d9d7

Signed-off-by: Andreas Karatzas <akaratza@amd.com>

mergify bot added the rocm Related to AMD ROCm label Mar 19, 2026

github-project-automation bot added this to AMD Mar 19, 2026

github-project-automation bot moved this to Todo in AMD Mar 19, 2026

gemini-code-assist bot reviewed Mar 19, 2026

tests/entrypoints/pooling/embed/test_cohere_openai_parity.py

AndreasKaratzas added the ready ONLY add when PR is ready to merge/full CI is needed label Mar 20, 2026

[ROCm][CI] Fix flaky Cohere/OpenAI embedding parity test

21ad38c

Signed-off-by: Andreas Karatzas <akaratza@amd.com>

mergify bot added the frontend label Mar 20, 2026

AndreasKaratzas marked this pull request as ready for review March 20, 2026 17:25

AndreasKaratzas requested a review from noooop as a code owner March 20, 2026 17:25

AndreasKaratzas requested a review from DarkLight1337 March 20, 2026 21:43

DarkLight1337 reviewed Mar 21, 2026

AndreasKaratzas wants to merge 2 commits into vllm-project:main from ROCm:akaratza_fixcohere_openai

noooop Mar 21, 2026

3 participants
