fix: support PP2+CP8+TP8 (PP with context parallelism) #19548

Merged
whybeyoung merged 7 commits into sgl-project:main from whybeyoung:fix-cp8pp2tp8
Mar 16, 2026

Conversation

@whybeyoung
Collaborator

@whybeyoung whybeyoung commented Feb 28, 2026

As #19504 (comment) mentioned:
I fixed it on H20 * 8 * 2.

  • scheduler_pp_mixin: only TP0+CP0 rank does pyobj send/recv to next PP stage; after TP broadcast, add CP broadcast so all CP ranks get data.
  • server_args: set attn_cp_size=tp_size in NSA prefill CP path; allow PP with CP when enable_nsa_prefill_context_parallel is set.

Ref: 98e9ecb (fix pp new cp)

CC @ShangmingCai @Fridge003 @xu-yfei

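The scheduler_pp_mixin change above can be sketched as a pure-Python simulation of the recv-then-broadcast pattern: only the (TP0, CP0) rank receives the pyobj from the previous PP stage, it is broadcast along TP, and then (the fix) also along CP. Function and variable names here are illustrative assumptions, not the actual sglang scheduler code.

```python
# Hypothetical sketch: simulate the (tp, cp) rank grid with a dict
# instead of real torch.distributed process groups.

def distribute_pp_payload(payload, tp_size, cp_size):
    """Deliver a PP-stage pyobj to every (tp, cp) rank.

    Step 1: only rank (tp=0, cp=0) holds the payload after the PP recv.
    Step 2: broadcast from tp=0 within the cp=0 group (TP broadcast).
    Step 3: broadcast from cp=0 within each TP group (CP broadcast --
            without this, cp > 0 ranks never see the payload and hang).
    """
    grid = {(tp, cp): None for tp in range(tp_size) for cp in range(cp_size)}
    grid[(0, 0)] = payload                       # step 1: PP recv on (0, 0)

    for tp in range(tp_size):                    # step 2: TP broadcast
        grid[(tp, 0)] = grid[(0, 0)]

    for tp in range(tp_size):                    # step 3: CP broadcast (the fix)
        for cp in range(cp_size):
            grid[(tp, cp)] = grid[(tp, 0)]
    return grid


ranks = distribute_pp_payload({"batch": 1}, tp_size=8, cp_size=8)
assert all(v == {"batch": 1} for v in ranks.values())
```

Dropping step 3 reproduces the reported symptom: every cp > 0 rank is left with `None` and the pipeline stalls waiting for data that never arrives.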

Collaborator

@ShangmingCai ShangmingCai left a comment


LGTM for the broadcast part

Collaborator

@ShangmingCai ShangmingCai left a comment


need to fix lint

@ShangmingCai
Collaborator

/tag-and-rerun-ci

@yiakwy-xpu-ml-framework-team
Contributor

Yes, I just manually copied the modification and PP2+CP works as expected (otherwise, it does not return generated outputs).

The diff under review in `_handle_context_parallelism` (server_args):

```diff
@@ -2142,7 +2117,8 @@ def _handle_context_parallelism(self):
         assert (
             self.tp_size % (self.dp_size * self.attn_cp_size) == 0
         ), "tp_size must be divisible by dp_size * attn_cp_size"
-        assert self.pp_size == 1, "PP is not supported with context parallelism"
+        if not self.enable_nsa_prefill_context_parallel:
+            assert self.pp_size == 1, "PP is not supported with context parallelism"
```

Contributor comment on the `attn_cp_size` assertion:

Can we clarify `attn_cp_size`? Currently CP is used as `attn_dp_size = tp / dp`, which can be confusing.

Contributor comment on the `pp_size` assertion:

This should be removed.

With TP=2, we can support CP for H800/H20x8.

For H200, there is no such constraint.
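A minimal sketch of the validation flow that the diff above changes, assuming the described behavior (CP spans the whole TP group on the NSA prefill path, and PP is only allowed with CP on that path). The `Args` class and its defaults are illustrative assumptions, not the actual sglang `ServerArgs`.

```python
from dataclasses import dataclass


@dataclass
class Args:
    # Hypothetical stand-in for ServerArgs; field names mirror the diff.
    tp_size: int = 8
    dp_size: int = 1
    pp_size: int = 1
    attn_cp_size: int = 1
    enable_nsa_prefill_context_parallel: bool = False

    def handle_context_parallelism(self):
        if self.enable_nsa_prefill_context_parallel:
            # NSA prefill CP path: CP spans the whole TP group.
            self.attn_cp_size = self.tp_size
        assert self.tp_size % (self.dp_size * self.attn_cp_size) == 0, (
            "tp_size must be divisible by dp_size * attn_cp_size"
        )
        if not self.enable_nsa_prefill_context_parallel:
            # PP + CP is only allowed on the NSA prefill CP path.
            assert self.pp_size == 1, "PP is not supported with context parallelism"


# PP2 + CP8 + TP8 now passes when the NSA prefill CP path is enabled.
args = Args(pp_size=2, enable_nsa_prefill_context_parallel=True)
args.handle_context_parallelism()
```

With the flag off, `pp_size=2` still trips the assertion, which matches the conservative behavior the original check enforced.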

@ShangmingCai
Collaborator

/rerun-failed-ci

@whybeyoung whybeyoung enabled auto-merge (squash) March 15, 2026 00:46
@whybeyoung whybeyoung merged commit 289cbcf into sgl-project:main Mar 16, 2026
89 of 94 checks passed
Wangzheee pushed a commit to Wangzheee/sglang that referenced this pull request Mar 21, 2026
0-693 pushed a commit to 0-693/sglang that referenced this pull request Mar 25, 2026