[Bugfix] Add missing auto_create_handle_loop to communicator methods by Kangyan-Zhou · Pull Request #19610 · sgl-project/sglang

Kangyan-Zhou · 2026-03-01T05:14:27Z

Summary

Several communicator methods in TokenizerCommunicatorMixin (e.g., get_internal_state, flush_cache, get_load) were missing the self.auto_create_handle_loop() call that ensures the ZMQ response-receiving loop is running.
Without this call, these methods hang indefinitely when invoked before any inference request has been processed, because the handle_loop asyncio task is never started. This happens when the server is launched with --skip-server-warmup, which skips the single forward run normally performed for server warmup.

Motivation

In PD disaggregation mode, the router calls /server_info (which invokes get_internal_state()) immediately after a worker pod starts, before any inference request arrives. Since neither generate_request() nor any other method that calls auto_create_handle_loop() has run at that point, the handle_loop that receives scheduler responses via ZMQ is never created, causing get_internal_state() to wait forever.

This also affects flush_cache, get_load, get_loads, set_internal_state, dumper_control, and the HiCache management endpoints — any communicator method called on an idle server will hang.
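The hang can be reproduced in isolation with a toy model. The sketch below is not sglang's code: ToyManager, inbox, pending, and probe are hypothetical names, and an asyncio.Queue stands in for the ZMQ socket. It only illustrates why a reply that has already arrived is never delivered when nothing has started the receiving task.

```python
import asyncio


class ToyManager:
    """Toy stand-in for a tokenizer manager (hypothetical, not sglang's class)."""

    def __init__(self):
        self.inbox = asyncio.Queue()  # stands in for the ZMQ socket from the scheduler
        self.pending = None           # future awaited by the in-flight communicator call
        self.loop_task = None         # response-receiving task, created lazily

    async def handle_loop(self):
        # Receive replies and hand each one to the waiting caller.
        while True:
            reply = await self.inbox.get()
            if self.pending is not None and not self.pending.done():
                self.pending.set_result(reply)

    def auto_create_handle_loop(self):
        # Start the receive loop only if it is not running yet.
        if self.loop_task is None:
            self.loop_task = asyncio.get_running_loop().create_task(self.handle_loop())

    async def get_internal_state(self, ensure_loop):
        if ensure_loop:
            self.auto_create_handle_loop()
        self.pending = asyncio.get_running_loop().create_future()
        # The "scheduler" answers right away, but the reply only lands in the inbox.
        self.inbox.put_nowait({"status": "ok"})
        return await self.pending


async def probe(ensure_loop, timeout=0.2):
    manager = ToyManager()
    try:
        return await asyncio.wait_for(manager.get_internal_state(ensure_loop), timeout)
    except asyncio.TimeoutError:
        return "timed out"
    finally:
        if manager.loop_task is not None:
            manager.loop_task.cancel()


def run_probe(ensure_loop):
    return asyncio.run(probe(ensure_loop))


if __name__ == "__main__":
    print(run_probe(ensure_loop=False))  # reply sits in the inbox, caller times out
    print(run_probe(ensure_loop=True))   # receive loop delivers the reply
```

In the toy model the timeout is only there so the demonstration terminates; without it, the first call would wait forever, which matches the behavior described above.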

Fix

Add self.auto_create_handle_loop() to the 10 communicator methods that were missing it, matching the pattern already used by generate_request(), slow_down(), and other working methods.

The call is idempotent — it returns immediately if the handle_loop is already running.
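A minimal sketch of the idempotent guard, under stated assumptions: LazyLoopOwner and tasks_created are hypothetical, the method bodies are placeholders, and the real auto_create_handle_loop() may check different state. The point is only that calling the guard at the top of every communicator method creates the receive task at most once.

```python
import asyncio


class LazyLoopOwner:
    """Hypothetical owner of a lazily started receive loop (not sglang's class)."""

    def __init__(self):
        self.loop_task = None
        self.tasks_created = 0  # bookkeeping for the demonstration only

    async def handle_loop(self):
        # Placeholder receive loop: parks until cancelled.
        await asyncio.Event().wait()

    def auto_create_handle_loop(self):
        # Idempotent: return immediately if the loop task already exists.
        if self.loop_task is not None:
            return
        self.loop_task = asyncio.get_running_loop().create_task(self.handle_loop())
        self.tasks_created += 1

    async def flush_cache(self):
        self.auto_create_handle_loop()  # the call this kind of fix adds
        return "flush requested"

    async def get_load(self):
        self.auto_create_handle_loop()  # same guard, same single task
        return "load requested"


async def demo():
    owner = LazyLoopOwner()
    await owner.flush_cache()
    await owner.get_load()
    await owner.flush_cache()
    created = owner.tasks_created
    owner.loop_task.cancel()  # tidy up the parked task before the event loop closes
    return created


def count_loop_tasks():
    return asyncio.run(demo())


if __name__ == "__main__":
    print(count_loop_tasks())
```

Because the guard is cheap and safe to repeat, adding it to every method avoids having to reason about which endpoint happens to be called first on a fresh server.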

Test plan

Verified on a live PD disaggregation cluster (2 engines + 4 decoders) that /server_info hangs before the fix
Confirmed that triggering a generate request (which calls auto_create_handle_loop) unblocks the pending /server_info
After applying the fix, /server_info responds immediately on freshly started pods without any prior inference traffic

🤖 Generated with Claude Code

The handle_loop asyncio task in TokenizerManager is responsible for receiving responses from schedulers via ZMQ and dispatching them to the appropriate _Communicator. However, handle_loop is lazily started by auto_create_handle_loop() and several communicator methods were missing this call.

This caused /server_info (and other endpoints like /flush_cache, /get_loads) to hang indefinitely when called on freshly-started servers that had not yet processed any inference request -- because no inference request had triggered auto_create_handle_loop() yet, the scheduler responses were never received. This is particularly critical for PD disaggregation setups where the sglang router's service discovery calls /server_info as the very first interaction with worker pods during the discover_metadata step, before any generate request is sent.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

gemini-code-assist · 2026-03-01T05:14:32Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

Kangyan-Zhou requested review from Ying1123, hnyls2002, merrymercy and xiezhq-hermann as code owners March 1, 2026 05:14

Kangyan-Zhou merged commit 98224de into sgl-project:main Mar 1, 2026
54 of 62 checks passed

Kangyan-Zhou added a commit to Kangyan-Zhou/sglang that referenced this pull request Mar 4, 2026

[Bugfix] Add missing auto_create_handle_loop to communicator methods (s…

b3d4c85

…gl-project#19610) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

magicYang1573 pushed a commit to magicYang1573/sglang that referenced this pull request Mar 9, 2026

[Bugfix] Add missing auto_create_handle_loop to communicator methods (s…

e9cf25d

…gl-project#19610) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

Wangzheee pushed a commit to Wangzheee/sglang that referenced this pull request Mar 21, 2026

[Bugfix] Add missing auto_create_handle_loop to communicator methods (s…

923e7ab

…gl-project#19610) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

Kangyan-Zhou merged 1 commit into sgl-project:main from Kangyan-Zhou:fix/server-info-handle-loop