Limit gpu utils and lower max BS on test_transcription_api_correctness.py by ekagra-ranjan · Pull Request #41649 · vllm-project/vllm

ekagra-ranjan · 2026-05-04T16:17:51Z

#41478 added a fix to lower the batch size (BS) in the test. CI passed in that PR but failed later in https://buildkite.com/vllm/ci/builds/64258/canvas?sid=019df193-a942-4d4b-aeb0-15f160336dfa&tab=output. This PR adds a more aggressive limit on maximum memory usage, since the CI machine is an 18GB MIG H200.

Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>

claude

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

gemini-code-assist

Code Review

This pull request modifies the transcription correctness tests to prevent OOM errors on 18GB GPUs. Specifically, it reduces the MAX_SEQS_FOR_TRANSCRIPTION_TEST from 32 to 8 and introduces a GPU_UTIL_FOR_TRANSCRIPTION_TEST constant set to 0.5, which is now passed as a --gpu_memory_utilization argument to the test server. I have no feedback to provide.
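As a rough illustration of the change summarized above, the sketch below shows how the two constants could be turned into server CLI arguments. It is not the code from the PR: the helper names `build_server_args` and `memory_budget_gb` are hypothetical, the mapping of `MAX_SEQS_FOR_TRANSCRIPTION_TEST` to `--max-num-seqs` is an assumption, and the flag is written in vLLM's dashed form `--gpu-memory-utilization` rather than the underscore spelling quoted in the review.

```python
# Values taken from the review summary above.
MAX_SEQS_FOR_TRANSCRIPTION_TEST = 8
GPU_UTIL_FOR_TRANSCRIPTION_TEST = 0.5


def build_server_args(max_num_seqs: int, gpu_util: float) -> list[str]:
    """Hypothetical helper: build CLI args that cap batch size and GPU memory."""
    if max_num_seqs < 1:
        raise ValueError("max_num_seqs must be at least 1")
    if not 0.0 < gpu_util <= 1.0:
        raise ValueError("gpu_util must be in the range (0, 1]")
    return [
        "--max-num-seqs", str(max_num_seqs),        # assumed mapping for the BS cap
        "--gpu-memory-utilization", str(gpu_util),  # fraction of device memory to use
    ]


def memory_budget_gb(total_gb: float, gpu_util: float) -> float:
    """Hypothetical helper: approximate memory budget implied by the fraction."""
    return total_gb * gpu_util


server_args = build_server_args(
    MAX_SEQS_FOR_TRANSCRIPTION_TEST, GPU_UTIL_FOR_TRANSCRIPTION_TEST
)
print(server_args)
print(memory_budget_gb(18, GPU_UTIL_FOR_TRANSCRIPTION_TEST))
```

Under these assumptions, a 0.5 fraction on the 18GB slice mentioned in the PR description leaves the test server a budget of roughly 9GB.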

…s.py (vllm-project#41649) Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>

…s.py (vllm-project#41649) Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com> Co-authored-by: hongbolv <33214277+hongbolv@users.noreply.github.com>

…s.py (vllm-project#41649) Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com> Signed-off-by: Ifta Khairul Alam Adil <ikaadil007@gmail.com>

ekagra-ranjan added 2 commits May 4, 2026 16:15

prevent OOM

e90f4cb

Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>

prevent OOM

901acc2

Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>

ekagra-ranjan requested review from DarkLight1337, NickLucche, aarnphm and robertgshaw2-redhat as code owners May 4, 2026 16:17

claude Bot reviewed May 4, 2026

gemini-code-assist Bot reviewed May 4, 2026

DarkLight1337 approved these changes May 4, 2026

DarkLight1337 enabled auto-merge (squash) May 4, 2026 16:20

github-actions Bot added the ready label May 4, 2026

DarkLight1337 merged commit 321fa2d into vllm-project:main May 4, 2026
17 checks passed

DarkLight1337 merged 2 commits into vllm-project:main from ekagra-ranjan:er-cohere-asr-ci-oom-2

2 participants
