Expose Model Parallelism Information by JD-ETH · Pull Request #16860 · sgl-project/sglang

JD-ETH · 2026-01-10T06:05:02Z

Motivation

We want to enable model shards to be initialized outside of sglang in a way that's identical to sglang's ModelLoader.
The external initialization workflow will look something like this:

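# Make the server args visible through sglang's global state before loading.
sglang.srt.server_args._global_server_args = server_args
# Fetch the parallelism layout (tp/pp/ep/attn_tp/attn_dp) for the given rank.
model_parallelism_info = engine.get_parallelism_config(rank)
# Build the shard inside a context that reproduces that rank's parallelism setup.
with ParallelismContext(RankParallelismConfig.from_dict(model_parallelism_info)):
    model = get_model(
        model_config=model_config,
        load_config=load_config,
        device_config=device_config,
    )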

The primary use case is to enable train -> inference weight transfer. With #14997, we already have a transfer engine and weight registration, but the transfer can only happen between weights of identical shape. One option is to instantiate the model shards outside sglang, register the weights, and send them to sglang via the existing registrations.

I will, however, leave draft_tp for future work. The test currently verifies TP, EP, and MoE with DP attention; the CI test covers only the simplest TP case.
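
To illustrate the shape constraint: under tensor parallelism each rank holds only a slice of a weight, so a full trainer-side tensor and an inference-side shard differ in shape. A minimal sketch, assuming a plain even split along one dimension. The helper `shard_shape` is hypothetical and not part of sglang, and real layers (fused QKV projections, for instance) may be partitioned differently:

```python
from typing import Tuple


def shard_shape(full_shape: Tuple[int, ...], tp_size: int, dim: int) -> Tuple[int, ...]:
    """Shape of one tensor-parallel shard when `dim` is split evenly across ranks.

    Hypothetical helper for illustration only; it is not an sglang API.
    """
    if tp_size <= 0:
        raise ValueError("tp_size must be positive")
    if full_shape[dim] % tp_size != 0:
        raise ValueError(
            f"dimension {dim} ({full_shape[dim]}) is not divisible by tp_size={tp_size}"
        )
    shape = list(full_shape)
    shape[dim] //= tp_size
    return tuple(shape)


full = (4096, 1024)                          # full weight as held by the trainer
shard = shard_shape(full, tp_size=4, dim=0)  # slice held by a single inference rank
print(shard, shard == full)
```

With `tp_size=4` the shard is `(1024, 1024)` while the full weight is `(4096, 1024)`, which is why the sender needs to build shards with the same layout as the receiving rank before an identical-shape transfer is possible.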

Modifications

Whenever the transfer engine is enabled, we additionally expose the full parallelism configuration (tp/pp/ep/attn_tp/attn_dp) through the HTTP API. The info is propagated to the _global_states at initialization time, just as the RDMA weight registration info is, so there is no runtime performance impact.

A small fix was made to pass dp_parallel_controller's scheduler_infos back to the global engine instance. This allows remote_weight_info APIs to work with dp>1.

Accuracy Tests

python -m sglang.launch_server --model-path qwen/qwen2.5-0.5b-instruct --remote-instance-weight-loader-start-seed-via-transfer-engine
curl http://localhost:30000/get_parallelism_config
returns:
{"rank":0,"tp_size":1,"tp_rank":0,"pp_size":1,"pp_rank":0,"ep_size":1,"ep_rank":0,"attn_tp_size":1,"attn_tp_rank":0,"attn_dp_size":1,"attn_dp_rank":0,"world_size":1,"global_rank":0} etc
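
A client could parse this response into a typed record before using it. The sketch below is an assumption-laden illustration: `ParallelismInfo` is a hypothetical stand-in that mirrors the JSON keys shown above, not the PR's `RankParallelismConfig`, and the range check on ranks is a sanity check I am assuming is reasonable rather than something the PR states.

```python
import json
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ParallelismInfo:
    """Hypothetical mirror of the JSON keys in the sample response (not an sglang class)."""

    rank: int
    tp_size: int
    tp_rank: int
    pp_size: int
    pp_rank: int
    ep_size: int
    ep_rank: int
    attn_tp_size: int
    attn_tp_rank: int
    attn_dp_size: int
    attn_dp_rank: int
    world_size: int
    global_rank: int

    @classmethod
    def from_json(cls, text: str) -> "ParallelismInfo":
        data = json.loads(text)
        names = [f.name for f in fields(cls)]
        missing = [n for n in names if n not in data]
        if missing:
            raise ValueError(f"missing keys: {missing}")
        # Unknown keys are ignored so the sketch tolerates extra fields.
        return cls(**{n: int(data[n]) for n in names})

    def check_ranks(self) -> None:
        """Assumed sanity check: each *_rank lies in [0, matching *_size)."""
        for prefix in ("tp", "pp", "ep", "attn_tp", "attn_dp"):
            size = getattr(self, f"{prefix}_size")
            rank = getattr(self, f"{prefix}_rank")
            if not 0 <= rank < size:
                raise ValueError(f"{prefix}_rank={rank} out of range for {prefix}_size={size}")


# Sample body copied from the curl output above.
response = (
    '{"rank":0,"tp_size":1,"tp_rank":0,"pp_size":1,"pp_rank":0,'
    '"ep_size":1,"ep_rank":0,"attn_tp_size":1,"attn_tp_rank":0,'
    '"attn_dp_size":1,"attn_dp_rank":0,"world_size":1,"global_rank":0}'
)
info = ParallelismInfo.from_json(response)
info.check_ranks()
print(info.tp_size, info.world_size)
```

Failing loudly on a missing key keeps a client from silently building a shard with a wrong layout if the response schema changes.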

test_parallelism_context_integration.py shows the intended use case and verifies that the model parameters match.
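
As a rough idea of what a "parameters match" check can look like (this is not the content of test_parallelism_context_integration.py, which is not shown here), one can compare name -> shape maps from the externally built shard and the in-engine shard. The helper `diff_param_shapes` and the parameter names below are hypothetical:

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def diff_param_shapes(expected: Dict[str, Shape], actual: Dict[str, Shape]) -> List[str]:
    """Return human-readable differences between two name -> shape mappings."""
    problems = []
    for name in sorted(expected.keys() | actual.keys()):
        if name not in actual:
            problems.append(f"{name}: missing in actual")
        elif name not in expected:
            problems.append(f"{name}: unexpected in actual")
        elif expected[name] != actual[name]:
            problems.append(f"{name}: {expected[name]} != {actual[name]}")
    return problems


# Made-up parameter names and shapes, for illustration only.
engine_side = {"layers.0.qkv.weight": (1024, 896), "layers.0.o.weight": (896, 896)}
external_side = {"layers.0.qkv.weight": (1024, 896), "layers.0.o.weight": (896, 448)}
print(diff_param_shapes(engine_side, external_side))
```

With PyTorch modules, such a map can be built as `{name: tuple(p.shape) for name, p in model.named_parameters()}`. Matching shapes are necessary for the transfer but do not by themselves show that values match.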

Checklist

Format your code according to the Format code with pre-commit.
Add unit tests according to the Run and add unit tests.
Update documentation according to Write documentations.
Provide accuracy and speed benchmark results according to Test the accuracy and Benchmark the speed.
Follow the SGLang code style guidance.

Review Process

Ping Merge Oncalls to start the PR flow. See the PR Merge Process.
Get approvals from CODEOWNERS and other reviewers.
Trigger CI tests with comments or contact authorized users to do so.
- /tag-run-ci-label, /rerun-failed-ci, /tag-and-rerun-ci
After green CI and required approvals, ask Merge Oncalls to merge.

gemini-code-assist · 2026-01-10T06:05:05Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

stmatengss · 2026-01-10T06:33:19Z

/tag-and-rerun-ci

JD-ETH · 2026-01-19T22:43:09Z

/rerun-failed-ci

zhaochenyang20 · 2026-01-19T23:31:06Z

add the commands of HTTPS;
modify sglang document;

zhaochenyang20 · 2026-01-19T23:31:24Z

/rerun-failed-ci

JD-ETH · 2026-01-19T23:32:12Z

add to run suite

python/sglang/srt/entrypoints/http_server.py

Conflicts resolved:
- model_runner.py: keep both RankParallelismConfig and use_symmetric_memory imports
- scheduler.py: add parallelism_config_info to get_init_info() (upstream refactored init info into this method)
- http_server.py: add parallelism_config_info to _GlobalState in _setup_and_run_http_server() (upstream extracted server startup into this function)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

python/sglang/srt/entrypoints/engine.py

JD-ETH · 2026-03-19T05:26:43Z

Refactored into a new PR without all the piping: #20907

JD-ETH requested review from Fridge003, Ying1123, hnyls2002 and merrymercy as code owners January 10, 2026 06:05

JD-ETH requested review from CatherineSue, JustinTong0323, ch-wan, ispobock, slin1237, xiezhq-hermann and yizhang2077 as code owners January 10, 2026 06:05

github-actions bot added the run-ci label Jan 10, 2026

JD-ETH added 4 commits January 18, 2026 17:59

implementing parallelism config within remote instance initialization

7cc4823

naming issue

1cb0dfa

update

d7d6883

add test

14e5c06

JD-ETH force-pushed the feature/parallelism_context_for_model_replica branch from 648bd9d to 14e5c06 Compare January 18, 2026 18:05

JD-ETH mentioned this pull request Jan 18, 2026

[Feature][WIP]: Enable RDMA weight transfer for RL use cases #17311

Open

3 tasks

break circular import

8ef294d

slin1237 requested changes Jan 19, 2026

python/sglang/srt/entrypoints/http_server.py (outdated)

JD-ETH and others added 4 commits January 21, 2026 17:02

move to ci

efcd5c3

move to ci

5817cb0

refactor in progres

faf048c

add new parallelisms

c01816e

amysaq2023 reviewed Mar 12, 2026

python/sglang/srt/entrypoints/engine.py (outdated)

address comment

db0938f

ShangmingCai assigned ShangmingCai and Fridge003 Mar 16, 2026

JD-ETH closed this Mar 19, 2026

JD-ETH mentioned this pull request Mar 19, 2026

Expose Model Parallelism Information #20907

Open

2 tasks

JD-ETH wants to merge 11 commits into sgl-project:main from JD-ETH:feature/parallelism_context_for_model_replica

7 participants