Followup fix for Custom AR V2 in non NVL scenarios by b8zhong · Pull Request #24742 · sgl-project/sglang

b8zhong · 2026-05-09T02:28:04Z

gemini-code-assist · 2026-05-09T02:28:08Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

* main: (87 commits)
  [Fix] Disable FlashInfer allreduce fusion under deterministic inference (sgl-project#24629)
  fix: STANDALONE spec-decode hidden-size mismatch crash (sgl-project#24217)
  Followup fix for Custom AR V2 in non NVL scenarios (sgl-project#24742)
  Fix reduce_scatterv producer contract for SUM_LEN (sgl-project#24785)
  [NPU]Documentation update for communications quantization feature (sgl-project#24668)
  [Session R3] Add routed_experts_start_len for absolute routing slice control (sgl-project#24851)
  [Model] Add MiniCPM-V 4.6 support (sgl-project#24855)
  Support Intern-S2-Preview (sgl-project#24875)
  [PD] Unify dsv4 dispatch with swa (sgl-project#24888)
  Optimize MHC pipeline: DeepGemm, fused norm, fused hc_head (sgl-project#24775)
  Fix PD bootstrap failure handling (sgl-project#24772)
  [Spec] Cleanup idle stub and shape-check patterns (sgl-project#24881)
  [Bug] Add dsv4 state_type branch to mooncake disaggregation (sgl-project#24878)
  [Spec V1] Split draft-extend phase from `EagleDraftInput` into new `EagleDraftExtendInput` (sgl-project#24859)
  [Gemma4] Optimize Gemm4 with fused Q/K/V RMSNorm + per-expert FP8 ckpt loader (sgl-project#24696)
  [spec decoding] support kimi-k2.5-eagle3-mla (sgl-project#24826)
  [SPEC V2] fix: skip stale state updates in spec-v2 overlap (sgl-project#23456)
  [RL] Call torch.cuda.empty_cache() for `in-place` pause mode to avoid OOM (sgl-project#24854)
  [diffusion] CI: add cache-dit CI tests (sgl-project#19213)
  [Utils] Make request dump robust to unpicklable server_args and large meta_info (sgl-project#24767)
  ...

# Conflicts:
#   python/sglang/srt/utils/common.py

4b2b57f

b8zhong requested review from ch-wan, merrymercy and yizhang2077 as code owners May 9, 2026 02:28

b8zhong added the run-ci label May 9, 2026

ch-wan assigned DarkSharpness May 9, 2026

ch-wan approved these changes May 10, 2026

ch-wan merged commit 8acb027 into main May 10, 2026
351 of 416 checks passed

ch-wan deleted the brayden/custom-ar-followup-fix branch May 10, 2026 23:57

ch-wan merged 1 commit into main from brayden/custom-ar-followup-fix

3 participants
