Fix Mistral Large 3 nightly test by b8zhong · Pull Request #25407 · sgl-project/sglang

b8zhong · 2026-05-15T13:13:59Z

python /sgl-workspace/sglang/test/registered/8-gpu-models/test_mistral_large3.py


============================================================
Mistral-Large-3 Results Summary
Dataset: gsm8k
Baseline: 0.85
============================================================

Model 1: mistralai/Mistral-Large-3-675B-Instruct-2512-NVFP4
  Performance: PASS, output: 3773.9 tok/s
  Accuracy: PASS
  Score: 0.957

============================================================
OVERALL: ALL TESTS PASSED
============================================================

.
----------------------------------------------------------------------
Ran 1 test in 2366.479s

OK

CI States

Latest PR Test: ❌ Missing run-ci label — add it to run CI tests.
Latest PR Test (Extra): ❌ Blocked — run-ci is required first.

gemini-code-assist

Code Review

This pull request modifies the quantization process in the MoE scheme to ensure that the input scale tensor passed to fp4_quantize has a shape of [1], which is a requirement for the cute-dsl backend. Feedback was provided regarding a potential edge case where slicing with [:1] could result in an empty tensor (shape [0]) if the source tensor is empty, specifically in distributed environments where a rank might have no local experts.
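The edge case in the review follows from ordinary slice semantics: `[:1]` clamps to the available length instead of raising, so an empty input yields an empty result rather than one of shape [1]. The sketch below is illustrative only and uses plain Python lists as stand-ins for 1-D tensors, which slice the same way. The helper names and the `default=1.0` fallback are hypothetical and do not come from the PR.

```python
# Plain Python lists stand in for 1-D tensors here. Slicing a 1-D tensor
# clamps out-of-range bounds the same way, but no tensor library is used.

def take_first(scale):
    """Mimic `scale[:1]`: return at most the first element."""
    return scale[:1]


def take_first_or_default(scale, default=1.0):
    """Hypothetical guard: fall back to a default when the slice is empty.

    The default value is an assumption for illustration, not the PR's fix.
    """
    head = scale[:1]
    return head if len(head) == 1 else [default]


per_expert_scales = [0.5, 0.5, 0.5]  # a rank that owns three local experts
no_local_experts = []                # a rank that owns no local experts

print(len(take_first(per_expert_scales)))       # length 1, as the backend expects
print(len(take_first(no_local_experts)))        # length 0, the flagged edge case
print(take_first_or_default(no_local_experts))  # guarded variant always has length 1
```

Whether a fallback value, an early return, or skipping the quantize call is the right handling for a rank with no local experts is not settled in the thread. The guard above only shows where such a check would sit.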

…5407

The Mistral-Large-3 B200 nightly partition has been red because of TWO independent regressions sharing the same job. Keeping them in one document is misleading: different root causes, different fixes, different PRs.

This split:
- Creates mistral_large3_tp8_mtp_b200_bisect_report.md with all TP8+MTP-specific content (root cause d2c1034 / PR sgl-project#24436, the _resolve_speculative_algorithm_alias crash on Mistral-native-format drafts, the AutoConfig.from_pretrained ValueError, the empirical one-commit bisect d2c1034 vs f1395af, the proposed try/except fix, the maintainer-ready server log block, and the CI-visibility table).
- Strips the same content out of mistral_large3_nvfp4_b200_bisect_report.md, replacing it with cross-references in the header, Open Items, follow-up note, and TL;DR.
- Adds a PR sgl-project#25407 verification section to BOTH documents (NVFP4 doc records that PR sgl-project#25407 fixes its issue with gsm8k 0.957; TP8+MTP doc records that PR sgl-project#25407 explicitly does NOT touch server_args.py and the failure remains identical).

Run summary on PR sgl-project#25407 head e3fb4ee (1574s wall time, 8x B200, flashinfer 0.6.11.post1, sglang-kernel 0.4.2.post2+cu130, torch 2.11.0):
- TP8 PASS gsm8k 0.953
- TP8+MTP FAIL unchanged ValueError (server_args.py:329)
- NVFP4 PASS gsm8k 0.957

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Jiminator · 2026-05-15T19:55:31Z

/tag-and-rerun-ci

Jiminator · 2026-05-15T22:27:16Z

I think this is fine as a quick hotfix, but it might be best to fix it at the source in compressed_tensors_w4a4_nvfp4_moe.py in process_weights_after_loading(). Is the expand operation necessary now that FI expects a tensor with numel = 1?

Co-authored-by: b8zhong <b8zhong@users.noreply.github.com>

e3fb4ee

b8zhong requested review from AniZpZ, BBuf, Edwardf0t1, FlamingoPg, HaiShaw and ch-wan as code owners May 15, 2026 13:14

github-actions Bot added the blackwell SM100/SM120 label May 15, 2026

gemini-code-assist Bot reviewed May 15, 2026

Comment thread ...lang/srt/layers/quantization/compressed_tensors/schemes/compressed_tensors_w4a4_nvfp4_moe.py

b8zhong assigned Fridge003 and Jiminator May 15, 2026

github-actions Bot added the run-ci label May 15, 2026

Fridge003 approved these changes May 16, 2026

Fridge003 merged commit d523ae1 into sgl-project:main May 16, 2026
260 of 320 checks passed

Fridge003 pushed a commit that referenced this pull request May 16, 2026

Fix Mistral Large 3 nightly test (#25407)

a004d0a

Co-authored-by: b8zhong <b8zhong@users.noreply.github.com>

Fridge003 merged 1 commit into sgl-project:main from bzhng-development:brayden/fix-so-random

3 participants
