[Bugfix]: fixed ServerDisconnectedError in benchmark test (reapply #1683, fixes #1374) #1841
Conversation
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1e902af223
Force-pushed from f33fd41 to 4083e17.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 10997e6453
@amy-why-3459 PTAL
LGTM
hsliuustc0106 left a comment:
Review Blocked - Gate Failure
pre-commit: FAILED ❌
Please fix the pre-commit issues before review:
`pre-commit run --all-files`
Once pre-commit passes, I'll review the ServerDisconnectedError retry logic.
Force-pushed from e928439 to f6aeb1d.
@codex review
Force-pushed from 6a0b521 to 2498f04.
Force-pushed from f6aeb1d to fddc222.
💡 Codex Review
This commit rewrites vllm_omni/benchmarks/patch/patch.py as UTF-16 (starts with BOM 0xFF 0xFE and embedded NUL bytes), which makes the module unloadable by Python (ValueError: source code string cannot contain null bytes when compiling/importing), so any benchmark path importing this patch will fail before execution.
vllm-omni/tests/perf/tests/test.json, line 1 (at fddc222):
tests/perf/tests/test.json is also converted to UTF-16 in this commit, but the perf harness reads it with open(..., encoding="utf-8") in tests/perf/scripts/run_benchmark.py, so loading the config now raises UnicodeDecodeError and prevents the nightly perf suite from starting.
Force-pushed from fddc222 to d3a4941.
@amy-why-3459 @R2-Y PTAL
LGTM, thanks
@hsliuustc0106 @Gaohan123 @david6666666 Please add a ready label.
Head branch was pushed to by a user without write access
Purpose
This PR fixes #1374, where `vllm bench serve --omni` occasionally fails with `aiohttp.client_exceptions.ServerDisconnectedError` under high concurrency (e.g. `max-concurrency=10`), even though the Omni server is still healthy and continues to return 200 OK.
Root cause: during long, high-concurrency runs the server side can close an idle keep-alive connection between requests; when the aiohttp client reuses that pooled connection, the request fails with ServerDisconnectedError even though the server is still healthy.
Fixes in this PR:
- Client-side retry for transient HTTP transport errors: in `async_request_openai_chat_omni_completions` (`vllm_omni/benchmarks/patch/patch.py`), the HTTP request is wrapped in a bounded retry loop, so a dropped connection is retried a limited number of times instead of failing the benchmark request outright (a sketch of the idea follows this list).
- In `tests/perf/tests/test.json`, restore the perf matrix that was previously reduced in eeb393f (which temporarily lowered `max_concurrency` and `num_prompts` to avoid #1374, "[Bug]: [Benchmark] Server disconnected error"):
  - `num_prompts`: [10, 40, 100]
  - `max_concurrency`: [1, 4, 10]

  This effectively recovers the original high-concurrency perf cases for Qwen3-Omni (both normal and async-chunk).
- In `.buildkite/test-nightly.yml`, for the 🌕 Omni Model Perf Test & Test Case Statistics step, set:

  This change is scoped to the nightly perf Buildkite job only, to reduce idle-timeout-induced disconnects during long-running, high-concurrency benchmarks, without changing the default keep-alive behavior for normal users.
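The sketch below shows the general shape of such a bounded retry around an aiohttp request. It is illustrative only: the helper name `post_with_retry`, the constants, and the backoff policy are placeholders, not the actual code in `patch.py`; the real change lives inside `async_request_openai_chat_omni_completions` and preserves its existing streaming and response handling.

```python
import asyncio

import aiohttp

# Illustrative limits only; the real patch may use different names and values.
MAX_TRANSPORT_RETRIES = 3
RETRY_BACKOFF_SECONDS = 0.5


async def post_with_retry(session: aiohttp.ClientSession, url: str, payload: dict) -> dict:
    """POST with a bounded retry on transient transport errors (e.g. the
    server dropping an idle keep-alive connection between requests)."""
    last_exc = None
    for attempt in range(1, MAX_TRANSPORT_RETRIES + 1):
        try:
            async with session.post(url, json=payload) as resp:
                resp.raise_for_status()
                return await resp.json()
        except (aiohttp.ServerDisconnectedError, aiohttp.ClientOSError) as exc:
            # Transient transport failure: the server may still be healthy,
            # so back off briefly and retry instead of failing the request.
            last_exc = exc
            if attempt < MAX_TRANSPORT_RETRIES:
                await asyncio.sleep(RETRY_BACKOFF_SECONDS * attempt)
    # Retries exhausted: surface the last transport error to the caller.
    raise last_exc
```

Only connection-level errors are retried in this sketch; non-2xx responses propagate immediately, which matches the symptom in #1374 where the server stays healthy and keeps returning 200 OK.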
Test Plan
Run the benchmark with `max-concurrency=50` and `num-prompts=10000`.
Test Result
Passed: all 10000 requests received successful responses.