fix: support PP2+CP8+TP8 (PP with context parallelism) #19548
Merged
whybeyoung merged 7 commits into sgl-project:main on Mar 16, 2026
- scheduler_pp_mixin: only TP0+CP0 rank does pyobj send/recv to next PP stage; after TP broadcast, add CP broadcast so all CP ranks get data.
- server_args: set attn_cp_size=tp_size in NSA prefill CP path; allow PP with CP when enable_nsa_prefill_context_parallel is set.

Ref: 98e9ecb (fix pp new cp)
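The receive path described above can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the code from `scheduler_pp_mixin`: ranks are modeled as a hypothetical `(tp_rank, cp_rank)` grid, the function name `simulate_pp_recv` and its `cp_broadcast` flag are invented for illustration, and plain dictionary writes stand in for the real collective calls.

```python
from typing import Dict, Optional, Tuple

# (tp_rank, cp_rank): a hypothetical 2-D rank layout used only for this sketch.
Rank = Tuple[int, int]


def simulate_pp_recv(
    payload: str, tp_size: int, cp_size: int, cp_broadcast: bool = True
) -> Dict[Rank, Optional[str]]:
    """Return what each rank holds after a PP stage receives a Python object."""
    held: Dict[Rank, Optional[str]] = {
        (tp, cp): None for tp in range(tp_size) for cp in range(cp_size)
    }

    # Step 1: only the TP0+CP0 rank takes part in the point-to-point transfer.
    held[(0, 0)] = payload

    # Step 2: TP broadcast. Inside each CP index, TP rank 0 is the source.
    # For cp != 0 the source is still empty, so nothing useful arrives there.
    for cp in range(cp_size):
        for tp in range(tp_size):
            held[(tp, cp)] = held[(0, cp)]

    # Step 3: CP broadcast. Inside each TP index, CP rank 0 is the source.
    if cp_broadcast:
        for tp in range(tp_size):
            for cp in range(cp_size):
                held[(tp, cp)] = held[(tp, 0)]

    return held


# With a TP group of size 1 and 8 CP ranks, the TP broadcast alone reaches nobody else.
fixed = simulate_pp_recv("batch-0", tp_size=1, cp_size=8)
broken = simulate_pp_recv("batch-0", tp_size=1, cp_size=8, cp_broadcast=False)
starved = [rank for rank, value in broken.items() if value is None]
```

In this model, skipping step 3 leaves every rank with `cp_rank != 0` without data, which is consistent with the hang reported later in the conversation, although the real group layout may differ.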
ShangmingCai (Collaborator) approved these changes on Feb 28, 2026 and left a comment:
LGTM for the broadcast part
f47f8a5 to c6306a8
whybeyoung (Collaborator) commented on Mar 2, 2026:
/tag-and-rerun-ci
Contributor:
Yes, I manually copied the modification and PP2+CP works as expected (otherwise, it never returns generated outputs).
@@ -2142,7 +2117,8 @@ def _handle_context_parallelism(self):
        assert (
            self.tp_size % (self.dp_size * self.attn_cp_size) == 0
Contributor:
Can we clarify attn_cp_size? Currently CP is used as attn_dp_size = tp / dp, which can be confusing.
yiakwy-xpu-ml-framework-team approved these changes on Mar 2, 2026
python/sglang/srt/server_args.py (outdated)
          ), "tp_size must be divisible by dp_size * attn_cp_size"
-         assert self.pp_size == 1, "PP is not supported with context parallelism"
+         if not self.enable_nsa_prefill_context_parallel:
+             assert self.pp_size == 1, "PP is not supported with context parallelism"
Contributor:
This should be removed.
With TP=2, we can support CP on H800/H20 x8.
For H200, there is no such constraint.
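For readers following the review thread, the checks shown in the outdated diff can be restated as a standalone function. This is a minimal sketch that mirrors only the assertions visible in the diff: the function name `check_parallel_config` is hypothetical, and the real `_handle_context_parallelism` method operates on `self` and may contain further logic not shown here.

```python
def check_parallel_config(
    tp_size: int,
    dp_size: int,
    attn_cp_size: int,
    pp_size: int,
    enable_nsa_prefill_context_parallel: bool,
) -> None:
    """Hypothetical standalone version of the assertions shown in the diff."""
    # The TP group must split evenly across DP and attention-CP ranks.
    assert (
        tp_size % (dp_size * attn_cp_size) == 0
    ), "tp_size must be divisible by dp_size * attn_cp_size"

    # As shown in the diff, PP is rejected only when the NSA prefill CP path is off.
    if not enable_nsa_prefill_context_parallel:
        assert pp_size == 1, "PP is not supported with context parallelism"


# The PP2+CP8+TP8 shape from the PR title, with attn_cp_size equal to tp_size.
check_parallel_config(
    tp_size=8,
    dp_size=1,
    attn_cp_size=8,
    pp_size=2,
    enable_nsa_prefill_context_parallel=True,
)
```

With the flag disabled, the same call raises `AssertionError`, which is the guard the reviewer above suggests removing altogether.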
Collaborator:
/rerun-failed-ci
Wangzheee pushed a commit to Wangzheee/sglang that referenced this pull request on Mar 21, 2026
0-693 pushed a commit to 0-693/sglang that referenced this pull request on Mar 25, 2026
As mentioned in #19504 (comment), I fixed it on H20 * 8 * 2.
Ref: 98e9ecb (fix pp new cp)
CC @ShangmingCai @Fridge003 @xu-yfei