add remote dp rank for disaggregation. #18230

Closed
huitianbai wants to merge 3 commits into sgl-project:main from huitianbai:disagg

Conversation

@huitianbai

This PR adds a remote DP rank parameter to requests, which helps with routing in PD disaggregation.
In PD disaggregation mode, the decode node currently uses the dp_rank from the request as prefill_dp_rank. That rank number belongs to the decode node itself, so the handshake with the prefill node fails.
This closes issue #17560.
Related dynamo discussion: ai-dynamo/dynamo#5638
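
To make the mismatch described above concrete, here is a minimal sketch in Python. It is not sglang's code: the `Req` class, the `remote_dp_rank` field name, the helper functions, and the DP sizes are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed sizes for this example only; real deployments configure these.
PREFILL_DP_SIZE = 2
DECODE_DP_SIZE = 4


@dataclass
class Req:
    # DP rank of the worker handling the request on the local (decode) side.
    dp_rank: Optional[int] = None
    # Hypothetical field: DP rank of the worker on the remote (prefill) side.
    remote_dp_rank: Optional[int] = None


def prefill_rank_from_local(req: Req) -> Optional[int]:
    """Reuse the decode-side rank as the prefill rank (the behavior described as broken)."""
    return req.dp_rank


def prefill_rank_from_remote(req: Req) -> Optional[int]:
    """Use the separately supplied remote rank instead."""
    return req.remote_dp_rank


def is_valid_prefill_rank(rank: Optional[int]) -> bool:
    """A prefill rank is usable only if it indexes an existing prefill DP worker."""
    return rank is not None and 0 <= rank < PREFILL_DP_SIZE


# The router placed this request on decode rank 3 and prefill rank 1.
req = Req(dp_rank=3, remote_dp_rank=1)
```

In this sketch the decode-side rank 3 does not exist among the two prefill workers, so reusing it cannot lead to a successful handshake. Even when the number happens to be in range, it may point at the wrong prefill worker, which is why the two ranks are carried separately.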

@huitianbai
Author

@ishandhanani What should I do next in this PR?

@ishandhanani
Collaborator

ishandhanani commented Feb 10, 2026

@huitianbai can you confirm that this change works with the sgl-router (no need to wire anything up I just want to make sure that this doesn't break anything) and with the dynamo router?

@huitianbai
Author

huitianbai commented Feb 11, 2026

@huitianbai can you confirm that this change works with the sgl-router (no need to wire anything up I just want to make sure that this doesn't break anything) and with the dynamo router?

I tested a qwen3-8b P/D disaggregation example with sgl-router on an A800 server. It works fine. @ishandhanani

@huitianbai
Author

huitianbai commented Feb 11, 2026

@huitianbai can you confirm that this change works with the sgl-router (no need to wire anything up I just want to make sure that this doesn't break anything) and with the dynamo router?

I also tested with the dynamo router. This PR does not break anything, but I found another bug.

Sometimes sglang crashes with "Overflow when unpacking long long".
This happens at https://github.com/sgl-project/sglang/blob/main/python/sglang/srt/disaggregation/utils.py#L237
In sglang, the bootstrap_room metadata is stored as int64:

    self.bootstrap_room = torch.zeros(
        (size, 8), dtype=torch.int64, device=device
    )

In dynamo, bootstrap_room is generated as a u64 (https://github.com/ai-dynamo/dynamo/blob/main/lib/llm/src/kv_router/prefill_router.rs#L322):

    let bootstrap_room: u64 = rand::rng().random();

A random u64 can exceed the int64 maximum (2^63 - 1), which causes the "Overflow when unpacking long long" error.

I fixed it locally with:

    let mut bootstrap_room: u64 = rand::rng().random();
    bootstrap_room = bootstrap_room >> 1;

(Not a graceful solution.)

I think we should limit the bootstrap_room range to avoid the overflow.
@ishandhanani
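
The range mismatch described in this comment can be reproduced without torch or dynamo. The sketch below uses only the Python standard library. `fits_int64` and `random_bootstrap_room` are hypothetical helper names, and `struct.pack` with the `q` (signed 64-bit) format stands in for writing the value into an int64 tensor.

```python
import random
import struct

INT64_MAX = 2**63 - 1   # largest value a signed 64-bit integer can hold
UINT64_MAX = 2**64 - 1  # largest value an unsigned 64-bit integer can hold


def fits_int64(value: int) -> bool:
    """Return True if value can be packed as a signed 64-bit integer."""
    try:
        struct.pack("<q", value)
        return True
    except (struct.error, OverflowError):
        return False


def random_bootstrap_room(rng: random.Random) -> int:
    """Draw 63 random bits, so the result always lies in [0, INT64_MAX]."""
    return rng.getrandbits(63)


# A full-range u64 value from the upper half does not fit into int64.
too_big = UINT64_MAX
# Shifting right by one bit, as in the local fix, brings it back into range.
shifted = too_big >> 1
# Masking off the top bit is an equivalent way to limit the range.
masked = too_big & INT64_MAX
```

Roughly half of all uniformly random u64 values are above `INT64_MAX`, so the crash would be expected to show up intermittently. Drawing 63 bits directly has the same effect as the `>> 1` fix while making the intended range explicit.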

@ishandhanani
Collaborator

/tag-and-rerun-ci

@ishandhanani
Collaborator

ishandhanani commented Feb 13, 2026

PR LGTM - let's go ahead and get this in.

@huitianbai - do you want to put one up for dynamo and the dynamo bootstrap room as well?

@huitianbai
Author

huitianbai commented Feb 13, 2026

PR LGTM - let's go ahead and get this in.

@huitianbai - do you want to put one up for dynamo and the dynamo bootstrap room as well?

Ok, I will.

@ishandhanani
Collaborator

Can you take a look at #19168

@huitianbai
Author

Can you take a look at #19168

I think my PR is not required. @ishandhanani

@huitianbai
Author

See #19168

@huitianbai huitianbai closed this Feb 24, 2026