[Core][WIP] Check for GPU<->CPU sync during CI by njhill · Pull Request #40561 · vllm-project/vllm

njhill · 2026-04-21T22:43:50Z

vLLM now uses asynchronous scheduling by default, and it applies in the majority of cases. Performance relies on the absence of any GPU<->CPU synchronizations on the main CUDA stream, but such syncs can be opaque, and it is easy for them to creep in accidentally.

This change adds a VLLM_GPU_SYNC_CHECK env var which enables torch.cuda.set_sync_debug_mode for the model forward pass and sampler, so that we can easily check for such syncs.

I'm first trying to enable it globally in CI to flush out syncs that need to be fixed, as well as unavoidable ones where the check needs to be suppressed. I will then probably split the fixes into separate PR(s).

Update

Started to open separate PRs fixing identified sync points:


gemini-code-assist

Code Review

This pull request introduces a GPU-CPU synchronization check mechanism via the VLLM_GPU_SYNC_CHECK environment variable, which is set to "error" by default in the Dockerfiles. The check is applied to the sample_tokens and execute_model methods in the V1 GPU worker using a new decorator. Feedback indicates that the with_gpu_sync_check decorator should be improved to restore the previous synchronization mode rather than resetting to default and should check the environment variable at runtime to support dynamic disabling.
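The two review suggestions above (restore the previous sync debug mode rather than resetting to default, and read the environment variable at call time) can be illustrated with a minimal sketch. This is not the PR's actual `with_gpu_sync_check` implementation: the factory signature, the injected `get_mode`/`set_mode` callables, and the choice to pass the env var value straight through as the mode are assumptions made for illustration. With PyTorch, the callables would be `torch.cuda.get_sync_debug_mode` and `torch.cuda.set_sync_debug_mode`.

```python
import functools
import os
from typing import Any, Callable


def with_gpu_sync_check(
    get_mode: Callable[[], Any],
    set_mode: Callable[[Any], None],
    env_var: str = "VLLM_GPU_SYNC_CHECK",
) -> Callable:
    """Hypothetical decorator factory (illustrative, not the PR's code).

    get_mode/set_mode are injected so the sketch runs without a GPU; with
    PyTorch they would be torch.cuda.get_sync_debug_mode and
    torch.cuda.set_sync_debug_mode.
    """

    def decorator(fn: Callable) -> Callable:
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            # Read the env var on every call so the check can be
            # enabled or disabled dynamically at runtime.
            mode = os.environ.get(env_var, "")
            if not mode:
                return fn(*args, **kwargs)
            # Remember whatever mode was active before this call.
            previous = get_mode()
            set_mode(mode)
            try:
                return fn(*args, **kwargs)
            finally:
                # Restore the previous mode instead of forcing "default",
                # so nested or pre-configured modes are preserved.
                set_mode(previous)

        return wrapper

    return decorator
```

Under these assumptions, a worker method would be wrapped as `@with_gpu_sync_check(torch.cuda.get_sync_debug_mode, torch.cuda.set_sync_debug_mode)`, and setting `VLLM_GPU_SYNC_CHECK=error` would make synchronizing CUDA operations inside the wrapped call raise an error.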

Signed-off-by: Nick Hill <nickhill123@gmail.com>

njhill requested review from ProExpertProg, WoosukKwon, gshtras, hmellor, houseroad, mgoin, robertgshaw2-redhat, tjtanaa, tlrmchlsmth, yewentao256 and youkaichao as code owners April 21, 2026 22:43

claude Bot reviewed Apr 21, 2026


mergify Bot added the ci/build, rocm, and v1 labels Apr 21, 2026

github-project-automation Bot added this to AMD Apr 21, 2026

github-project-automation Bot moved this to Todo in AMD Apr 21, 2026

gemini-code-assist Bot reviewed Apr 21, 2026


Comment thread vllm/v1/worker/utils.py Outdated

njhill force-pushed the sync-check branch from 3c59f62 to 4a8fe98 Compare April 21, 2026 22:51

njhill added and removed the ready label Apr 21, 2026

njhill requested review from 22quinn, LucasWilkinson, MatthewBonanni, jeejeelee and pavanimajety as code owners April 22, 2026 14:54

njhill added and removed the ready label Apr 22, 2026

njhill added 26 commits April 30, 2026 09:00

tokwise pooler

855d3c2

Signed-off-by: Nick Hill <nickhill123@gmail.com>

idefics3

8e722ae

Signed-off-by: Nick Hill <nickhill123@gmail.com>

more mamba_attn

6fa40cf

Signed-off-by: Nick Hill <nickhill123@gmail.com>

fix preemptive lazy_init

6a1fd14

Signed-off-by: Nick Hill <nickhill123@gmail.com>

gemma3 mm

69be8d1

Signed-off-by: Nick Hill <nickhill123@gmail.com>

minor formatting

7a6bba8

Signed-off-by: Nick Hill <nickhill123@gmail.com>

move inductor lazy init to util method

a9e33b1

Signed-off-by: Nick Hill <nickhill123@gmail.com>

fast prefill

df26b02

Signed-off-by: Nick Hill <nickhill123@gmail.com>

lora load adapter

e5b45df

Signed-off-by: Nick Hill <nickhill123@gmail.com>

idefics2

ccb7c03

Signed-off-by: Nick Hill <nickhill123@gmail.com>

phi4mm_audio

e509ea8

Signed-off-by: Nick Hill <nickhill123@gmail.com>

temp

24afeb6

Signed-off-by: Nick Hill <nickhill123@gmail.com>

temp2

bd27f57

Signed-off-by: Nick Hill <nickhill123@gmail.com>

temp3

fb51bda

Signed-off-by: Nick Hill <nickhill123@gmail.com>

fix custom lp

e44ac81

Signed-off-by: Nick Hill <nickhill123@gmail.com>

temp4

ceda006

Signed-off-by: Nick Hill <nickhill123@gmail.com>

temp5

a7f931f

Signed-off-by: Nick Hill <nickhill123@gmail.com>

h2d util

9822304

Signed-off-by: Nick Hill <nickhill123@gmail.com>

use async_tensor_h2d utility function

93ac29e

Signed-off-by: Nick Hill <nickhill123@gmail.com>

avoid circular import

be7b548

Signed-off-by: Nick Hill <nickhill123@gmail.com>

typo

4805d7b

Signed-off-by: Nick Hill <nickhill123@gmail.com>

qwen recompute_mrope_positions; qwen3_vl updates

ddc75a8

Signed-off-by: Nick Hill <nickhill123@gmail.com>

switch gpu_sync_allowed count to first_only bool

610ff40

Signed-off-by: Nick Hill <nickhill123@gmail.com>

remove now-redundant guards in grouped_topk_router.py

b3007c0

Signed-off-by: Nick Hill <nickhill123@gmail.com>

qwen3_asr

bcbbb93

Signed-off-by: Nick Hill <nickhill123@gmail.com>

post-rebase fixups

a926b48

Signed-off-by: Nick Hill <nickhill123@gmail.com>

njhill force-pushed the sync-check branch from 74d9ecd to a926b48 Compare April 30, 2026 17:20

This was referenced May 1, 2026

[Perf][1/n] Eliminate various GPU<->CPU syncs #41429

Open

[Perf][2/n] Eliminate GPU<->CPU syncs in pooling code #41433

Merged

[Perf][3/n] Eliminate GPU<->CPU syncs in attention impls #41434

Merged

njhill wants to merge 71 commits into vllm-project:main from njhill:sync-check


3 participants
