Fix NSA CP positions mismatch in eagle NextN model #19367

Merged
Fridge003 merged 1 commit into main from fix/nextn-cp-positions-split
Feb 26, 2026

@alisonshao
Collaborator

Summary

  • When context parallelism (CP) is enabled, the eagle NextN model (deepseek_nextn.py) splits hidden_states across CP ranks but does not split positions, causing a shape mismatch in the NSA indexer's rotary embedding
  • The rotary embedding uses positions.size(0) as batch size to reshape k_rope, but k_rope has fewer elements because hidden_states was already CP-split
  • Adds the missing cp_split_and_rebuild_position call, matching what the main model (deepseek_v2.py:2707) already does
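
The mismatch described above can be reproduced in miniature. This is only a sketch: `split_for_rank` is a hypothetical helper that uses plain contiguous chunking, standing in for sglang's actual CP split (including `cp_split_and_rebuild_position`), whose real layout is not shown in this PR. The token count and rank values are illustrative.

```python
# Toy model of the mismatch. Plain contiguous chunking stands in for
# sglang's real CP split, whose exact layout is not shown in this PR.

def split_for_rank(seq, cp_size, cp_rank):
    """Hypothetical helper: return this rank's contiguous slice of seq."""
    chunk = len(seq) // cp_size  # assumes len(seq) is divisible by cp_size
    return seq[cp_rank * chunk:(cp_rank + 1) * chunk]

num_tokens, cp_size, cp_rank = 8, 4, 1
hidden_states = [[float(t)] * 4 for t in range(num_tokens)]  # one row per token
positions = list(range(num_tokens))

# hidden_states is split across CP ranks.
local_hidden = split_for_rank(hidden_states, cp_size, cp_rank)

# Before the fix: positions keeps its full length, so the row counts disagree.
mismatch = len(positions) != len(local_hidden)

# After the fix: positions is split the same way as hidden_states.
local_positions = split_for_rank(positions, cp_size, cp_rank)
```

Whatever split layout the real code uses, the point is the same: any tensor indexed per token must go through the same split as `hidden_states`, or downstream reshapes that take their batch size from `positions` will see the wrong row count.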

Error

RuntimeError: shape '[8, -1, 64]' is invalid for input of size 64

at rotary_embedding/base.py:292 during eagle speculative decoding with ATTN_CP4 TP4.

Full trace: https://github.com/sgl-project/sglang/actions/runs/22376709996/job/64768484941
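
The arithmetic behind the error message can be checked by hand. The sketch below mimics how a reshape to `[batch, -1, last]` resolves the `-1` entry; `infer_middle_dim` is a hypothetical helper, not part of sglang or PyTorch. It assumes the leading 8 in the message is `positions.size(0)`, as the summary describes, and the `batch=1` case is just the row count that 64 elements of width 64 imply.

```python
def infer_middle_dim(numel, batch, last):
    """Mimic how a reshape to [batch, -1, last] resolves the -1 entry.

    Returns the inferred size, or None when numel is not a multiple
    of batch * last (the case that raises the RuntimeError).
    """
    known = batch * last
    if known == 0 or numel % known != 0:
        return None
    return numel // known

# Unsplit positions: batch of 8, but k_rope holds only 64 elements.
full_positions = infer_middle_dim(numel=64, batch=8, last=64)

# Batch size consistent with the CP-split k_rope: 64 elements = 1 row of 64.
split_positions = infer_middle_dim(numel=64, batch=1, last=64)
```

With a batch of 8 the reshape would need a multiple of 512 elements, so 64 is rejected; with a batch that matches the split tensor the `-1` resolves cleanly.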

Test plan

  • Run eagle speculative decoding with NSA + CP enabled (ATTN_CP4 TP4)

When context parallelism (CP) is enabled, the eagle NextN model splits
hidden_states across CP ranks but not positions, causing a shape mismatch
in the NSA indexer's rotary embedding. Split positions as well, matching
what the main DeepseekV2 model (deepseek_v2.py:2707) already does.
@gemini-code-assist
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@alisonshao
Collaborator Author

/tag-and-rerun-ci

@Fridge003 Fridge003 merged commit 0fd44ff into main Feb 26, 2026
234 of 278 checks passed
@Fridge003 Fridge003 deleted the fix/nextn-cp-positions-split branch February 26, 2026 04:14
@alisonshao
Collaborator Author

Previously, self.cp_size was incorrectly set to the TP size, so nsa_use_prefill_cp likely never triggered the CP split path correctly. PR #19062 fixed it to use the real CP size, which activated the CP split for hidden_states. Because positions was never split (the call has been missing since the original #12065), the mismatch now surfaces as this runtime error.

klhhhhh pushed a commit to klhhhhh/sglang that referenced this pull request Feb 26, 2026
magicYang1573 pushed a commit to magicYang1573/sglang that referenced this pull request Mar 9, 2026
Wangzheee pushed a commit to Wangzheee/sglang that referenced this pull request Mar 21, 2026