[Bug][PD][NIXL] always send aux on is_last; only expects_state when truthy#25699
Merged
ShangmingCai merged 2 commits intoMay 19, 2026
Merged
Conversation
…ruthy Fixes sgl-project#25698. Two asymmetries introduced in sgl-project#24932 (PD hybrid state refactor) hang NIXL disagg for every dense model (LLaMA, Qwen3, etc.): 1. NixlKVManager.transfer_worker gates the aux RDMA write inside `if kv_chunk.is_last and kv_chunk.state_indices:`. For dense models state_indices is [] (falsy) so the whole block short-circuits and send_aux is never called. Decode never gets the {room}_aux notif. Split into two: gate state on state_indices, gate aux on is_last only (matches v0.5.11 behavior and matches mooncake/conn.py:1344-1357). 2. NixlKVReceiver.send_metadata sets expects_state=True whenever state_indices is not None, but decode receives state_indices=[] (non-None, empty) for dense models, so expects_state flips on and is_done() waits forever for a state notif prefill never sends. Switch to a truthy check to match the prefill side at line ~692. Verified end-to-end on Qwen/Qwen3-0.6B (dense), 2x L40S, tp1+tp1, single host. Mooncake unaffected (independent if blocks; no expects_state field).
Contributor
|
Warning You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again! |
ispobock
approved these changes
May 19, 2026
Collaborator
|
/tag-and-rerun-ci |
Collaborator
|
@ishandhanani could you fix the lint? |
Collaborator
Author
Yep. Will do in an hour when I get online again |
| # Mark that we expect state data if state_indices was provided | ||
| if state_indices is not None: | ||
| # Mark that we expect state data if state_indices was provided. | ||
| # Match the prefill-side truthy check (line ~697): an empty list |
Collaborator
There was a problem hiding this comment.
It would be better not to include the code line in the comment, let me fix this and fix the lint
Collaborator
@ishandhanani Let me do it for you. Since nixl is not in the CI, we could just merge. And feel free to add a nixl test in |
4 tasks
Kangyan-Zhou
pushed a commit
that referenced
this pull request
May 19, 2026
Co-authored-by: Shangming Cai <csmthu@gmail.com>
5 tasks
6 tasks
Shunkangz
pushed a commit
to Shunkangz/sglang
that referenced
this pull request
May 27, 2026
…ruthy (sgl-project#25699) Co-authored-by: Shangming Cai <csmthu@gmail.com>
alphabetc1
pushed a commit
to alphabetc1/sglang
that referenced
this pull request
Jun 4, 2026
…ruthy (sgl-project#25699) Co-authored-by: Shangming Cai <csmthu@gmail.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Motivation
Fixes #25698.
Two asymmetries introduced in #24932 (
[PD] Refactor hybrid state transfer) hang every NIXL P/D disagg request for dense models (no Mamba / SWA / NSA — i.e. LLaMA, Qwen3, Gemma, GPT-OSS, …) onv0.5.12. Bisects cleanly to commitd7f4761a4.NixlKVManager.transfer_workergates the aux RDMA write insideif kv_chunk.is_last and kv_chunk.state_indices:. For dense models,state_indicesis[](falsy from the prefill scheduler's emptystate_types), so the whole branch short-circuits andsend_auxis never called. Decode never receives the{room}_auxnotification → hangs.NixlKVReceiver.send_metadataflipsexpects_state=Truewheneverstate_indices is not None. Decode receivesstate_indices=[](non-None, empty) for dense models, soexpects_stateflips on. The corresponding prefill check at line ~692 is truthy (if kv_chunk.state_indices:) and (correctly) never sends a state notif, sois_done()waits forever for a notif that doesn't come.Either bug alone hangs the request; both fixes are required for end-to-end disagg on dense models.
Mooncake is unaffected —
mooncake/conn.py:1344-1357already gates state and aux on independentifblocks, and mooncake has noexpects_statefield (per-chunk return-code polling instead). Mooncake users onv0.5.12can keep running disagg; NIXL users currently can't.Modifications
python/sglang/srt/disaggregation/nixl/conn.py:is_lastandstate_indicesgates intransfer_workerso the aux RDMA write always runs onis_last, and state only runs whenstate_indicesis non-empty. Matches thev0.5.11shape and matchesmooncake/conn.py:1344-1357.NixlKVReceiver.send_metadatafromif state_indices is not None:toif state_indices:soexpects_stateonly flips on when there's actually state to expect.Accuracy Tests
End-to-end manual verification on
Qwen/Qwen3-0.6B(dense Qwen3) with NIXL UCX backend on 2× L40S, single host, tp1+tp1:Before either fix
n_handles=1(aux skipped) and decodeis_donestaysFalseforever. After both fixes the request returns tokens cleanly in <2s.I do not have access to a multi-node setup to exercise heterogeneous-TP staging here, but neither hunk changes any code path that runs only with
enable_staging=True,state_indicestruthy, MLA, ordecode_tp_size != attn_tp_size— they only restore the dense-LLM path to itsv0.5.11behavior.Benchmarking and Profiling
No perf change expected — same RDMA ops on the wire; no allocation or branch-count changes for the stateful paths (Mamba/SWA/NSA), and the dense-LLM path now actually completes instead of looping in
pop_transferreduntilSGLANG_DISAGGREGATION_WAITING_TIMEOUT.Checklist
CI States
Latest PR Test (Base): ❌ Run #26071445981
Latest PR Test (Extra): ❌ Blocked --
run-ciis required first.