[Core] Avoid using extra thread in `UniProcExecutor` by njhill · Pull Request #40891 · vllm-project/vllm

njhill · 2026-04-26T00:37:17Z

Clear performance benefit, especially in low latency / high concurrency case.

Benchmark

--tensor-parallel-size 1 --distributed-executor-backend uni on a single NVIDIA GB200 GPU.
Model: Qwen/Qwen3-0.6B. Each side is mean ± population std across 3 timed runs sharing one server process; each run uses its own seed (1, 2, 3) and is preceded by a fresh warmup batch.
Δ = relative change of with-mean vs. without-mean (✓ = improvement, ✗ = regression).
256 in / 2048 out; ignore-eos, no prefix cache.

Metric	Concurrency 1 (32 prompts)			Concurrency 512 (1024 prompts)
	Without	With	Δ	Without	With	Δ
Output throughput (tok/s)	631.21 ±0.64	648.73 ±14.60	+2.78% ✓	17221.96 ±168.39	18974.27 ±176.49	+10.17% ✓
P50 TTFT (ms)	19.00 ±0.16	19.21 ±0.19	+1.10% ✗	910.12 ±36.45	851.30 ±69.16	-6.46% ✓
P90 TTFT (ms)	19.48 ±0.27	19.88 ±0.32	+2.05% ✗	1309.86 ±49.79	1113.85 ±85.18	-14.96% ✓
P50 TPOT (ms)	1.572 ±0.001	1.530 ±0.032	-2.66% ✓	28.946 ±0.199	26.383 ±0.352	-8.85% ✓
P90 TPOT (ms)	1.588 ±0.005	1.558 ±0.033	-1.88% ✓	29.512 ±0.214	26.935 ±0.070	-8.73% ✓
Mean ITL (ms)	1.576 ±0.002	1.534 ±0.035	-2.68% ✓	29.112 ±0.099	26.563 ±0.243	-8.75% ✓
Mean E2EL (ms)	3244.52 ±3.29	3158.44 ±71.05	-2.65% ✓	60292.61 ±232.44	55010.05 ±510.73	-8.76% ✓

claude

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

gemini-code-assist

Code Review

This pull request replaces the ThreadPoolExecutor in UniProcExecutor with a custom AsyncOutputFuture class to manage asynchronous model outputs. Review feedback identifies a thread-safety issue in the AsyncOutputFuture.result method that could cause multiple calls to get_output(), violating its contract. Suggestions include implementing a double-checked locking pattern, using NotImplementedError for the timeout parameter, and adding the necessary threading import.

@zhuohan123

@zhuohan123's idea Signed-off-by: Nick Hill <nickhill123@gmail.com>

) Signed-off-by: Nick Hill <nickhill123@gmail.com>

) Signed-off-by: Nick Hill <nickhill123@gmail.com> Signed-off-by: Libin Tang <libin.tang@intel.com>

claude Bot reviewed Apr 26, 2026

View reviewed changes

mergify Bot added the v1 label Apr 26, 2026

gemini-code-assist Bot reviewed Apr 26, 2026

View reviewed changes

Comment thread vllm/v1/executor/uniproc_executor.py

Comment thread vllm/v1/executor/uniproc_executor.py

njhill force-pushed the uniproc-thread branch from a24dd60 to 364f54a Compare May 6, 2026 19:52

[Core] Avoid extra thread in UniProcExecutor

5b3d457

@zhuohan123's idea Signed-off-by: Nick Hill <nickhill123@gmail.com>

njhill force-pushed the uniproc-thread branch from 364f54a to 5b3d457 Compare May 7, 2026 02:22

njhill added ready ONLY add when PR is ready to merge/full CI is needed labels May 7, 2026

njhill requested review from zhuohan123 May 7, 2026 02:25

zhuohan123 approved these changes May 7, 2026

View reviewed changes

zhuohan123 merged commit 10ebb40 into vllm-project:main May 7, 2026
49 of 52 checks passed

whytem pushed a commit to whytem/vllm that referenced this pull request May 8, 2026

[Core] Avoid using extra thread in UniProcExecutor (vllm-project#40891

99ef6cd

) Signed-off-by: Nick Hill <nickhill123@gmail.com>

libinta pushed a commit to libinta/vllm that referenced this pull request May 8, 2026

[Core] Avoid using extra thread in UniProcExecutor (vllm-project#40891

8330bd8

) Signed-off-by: Nick Hill <nickhill123@gmail.com> Signed-off-by: Libin Tang <libin.tang@intel.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Core] Avoid using extra thread in `UniProcExecutor`#40891

[Core] Avoid using extra thread in `UniProcExecutor`#40891
zhuohan123 merged 1 commit intovllm-project:mainfrom
njhill:uniproc-thread

njhill commented Apr 26, 2026 •

edited

Loading

Uh oh!

claude Bot left a comment

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

njhill commented Apr 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Benchmark

Uh oh!

claude Bot left a comment

Choose a reason for hiding this comment

Claude Code Review

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

njhill commented Apr 26, 2026 •

edited

Loading