[Misc] Fix breaking weight checker test by Fridge003 · Pull Request #24553 · sgl-project/sglang

Fridge003 · 2026-05-06T22:36:17Z

Motivation

Modifications

Accuracy Tests

Speed Tests and Profiling

Checklist

Format your code according to the "Format code with pre-commit" guide.
Add unit tests according to the "Run and add unit tests" guide.
Update documentation according to the "Write documentations" guide.
Provide accuracy and speed benchmark results according to the "Test the accuracy" and "Benchmark the speed" guides.
Follow the SGLang code style guidance.

Review and Merge Process

Ping Merge Oncalls to start the process. See the PR Merge Process.
Get approvals from CODEOWNERS and other reviewers.
Trigger CI tests with comments or contact authorized users to do so.
- Common commands include /tag-and-rerun-ci, /tag-run-ci-label, /rerun-failed-ci
After green CI and required approvals, ask Merge Oncalls or people with Write permission to merge the PR.

gemini-code-assist · 2026-05-06T22:36:21Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

Fridge003 · 2026-05-06T22:36:56Z

/rerun-test test_weight_checker_e2e.py

github-actions · 2026-05-06T22:37:30Z

✅ 1-gpu-h100 (1 test): View workflow run

cd test/ && python3 registered/rl/test_weight_checker_e2e.py

Step 1: experimentally re-enable suite=stage-b-test-1-gpu-small (32GB 5090) while keeping --mem-fraction-static 0.7, to verify the mem-frac tweak alone resolves the snapshot/reset/compare CPU->GPU round-trip OOM that sgl-project#24553 sidestepped by switching to *-large. If CI on this commit passes, the next commit moves the suite to nightly-1-gpu (still keeping the lower mem-frac as defense).
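As a rough illustration of why lowering `--mem-fraction-static` could leave room for the snapshot/reset/compare round-trip, the sketch below splits a GPU's memory into a static share and the remaining headroom. It assumes the flag acts as a plain fraction of total device memory, which simplifies SGLang's actual accounting. The function name `static_memory_split` is hypothetical, and the 32 GB and 0.7 figures are taken from the commit message above.

```python
def static_memory_split(total_gb: float, mem_fraction_static: float) -> tuple[float, float]:
    """Return (static_gb, headroom_gb) for a GPU with total_gb of memory.

    Assumption: mem_fraction_static is treated as a simple fraction of total
    device memory reserved for static allocations. Real accounting in SGLang
    may differ; this is only back-of-the-envelope arithmetic.
    """
    if not 0.0 < mem_fraction_static <= 1.0:
        raise ValueError("mem_fraction_static must be in (0, 1]")
    static_gb = total_gb * mem_fraction_static
    headroom_gb = total_gb - static_gb
    return static_gb, headroom_gb


if __name__ == "__main__":
    # 32 GB card with --mem-fraction-static 0.7, as in the commit message.
    static_gb, headroom_gb = static_memory_split(32.0, 0.7)
    print(f"static: {static_gb:.1f} GB, headroom: {headroom_gb:.1f} GB")
```

Under this simplification, a lower fraction leaves more memory outside the static pool for temporary copies made during the weight comparison. Whether that headroom is sufficient is exactly what the experimental CI run is meant to verify.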

* main: (894 commits)
  [Bug Fix] Fix RunAI streamer: corrupted weights, missing quant init, and broken URIs for multimodal models (sgl-project#22715)
  [Kernel] Deprecate DeepGemm in sgl kernel and apply custom wheel sgl-deep-gemm (sgl-project#24268)
  propagate pytest exit code from test __main__ entries (sgl-project#24487)
  [R3] Avoid implicit CUDA sync in routed experts DP slicing (sgl-project#24550)
  Add ChatCompletionRequest-style support to /v1/tokenize (sgl-project#23981)
  Support Triton MLA FP8 KV cache (sgl-project#20479)
  [diffusion] chore: align LTX-2 with official (sgl-project#24313)
  Expand support matrix for pypi wheel release (sgl-project#24565)
  [codex] Optimize Z-Image packed QKV (sgl-project#24117)
  [Misc] Fix breaking weight checker test (sgl-project#24553)
  [LoRA] Fix qkv_proj LoRA buffer sizing when tp_size > num_key_value_heads (sgl-project#24420)
  ci: bump test_mimo_models.py est_time 330 → 610 (sgl-project#24551)
  [CI] Temporarily disable marco/mcdse-2b-v1 in test_embedding_models (sgl-project#24279)
  Improve metrics, observability, and PD deploy tooling (sgl-project#24521)
  Fix diffusion fallback guards and validation (sgl-project#23335)
  [PD] Prevent update_status to Failed from cleared entries (sgl-project#24539)
  [CP] Register KV cache allgather buffer with symmetric memory (sgl-project#24040)
  Support getting checksums in weight checker (sgl-project#24537)
  Refactor buffer patterns in weight checker (sgl-project#24538)
  Add unit and end-to-end tests for weight checker (sgl-project#24536)
  ...

# Conflicts:
#   python/sglang/srt/managers/scheduler.py
#   python/sglang/srt/model_executor/model_runner.py

upd

b132963

Fridge003 merged commit 9e1336d into main May 6, 2026
62 of 68 checks passed

Fridge003 deleted the fix-ci branch May 6, 2026 22:42

fzyzcjy mentioned this pull request May 7, 2026

Fix weight_checker e2e OOM on 32GB GPU + move to nightly #24559

Merged

5 tasks

LLThomas pushed a commit to LLThomas/sglang that referenced this pull request May 8, 2026

[Misc] Fix breaking weight checker test (sgl-project#24553)

bfea764

LucQueen pushed a commit to LucQueen/sglang that referenced this pull request May 12, 2026

[Misc] Fix breaking weight checker test (sgl-project#24553)

0b057dd

Fridge003 merged 1 commit into main from fix-ci
