Skip to content

[Misc] Fix breaking weight checker test#24553

Merged
Fridge003 merged 1 commit into
mainfrom
fix-ci
May 6, 2026
Merged

[Misc] Fix breaking weight checker test#24553
Fridge003 merged 1 commit into
mainfrom
fix-ci

Conversation

@Fridge003
Copy link
Copy Markdown
Collaborator

Motivation

Modifications

Accuracy Tests

Speed Tests and Profiling

Checklist

Review and Merge Process

  1. Ping Merge Oncalls to start the process. See the PR Merge Process.
  2. Get approvals from CODEOWNERS and other reviewers.
  3. Trigger CI tests with comments or contact authorized users to do so.
    • Common commands include /tag-and-rerun-ci, /tag-run-ci-label, /rerun-failed-ci
  4. After green CI and required approvals, ask Merge Oncalls or people with Write permission to merge the PR.

@gemini-code-assist
Copy link
Copy Markdown
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@Fridge003
Copy link
Copy Markdown
Collaborator Author

/rerun-test test_weight_checker_e2e.py

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 6, 2026

1-gpu-h100 (1 test): View workflow run

cd test/ && python3 registered/rl/test_weight_checker_e2e.py

@Fridge003 Fridge003 merged commit 9e1336d into main May 6, 2026
62 of 68 checks passed
@Fridge003 Fridge003 deleted the fix-ci branch May 6, 2026 22:42
fzyzcjy added a commit to fzyzcjy/sglang that referenced this pull request May 6, 2026
Step 1: experimentally re-enable suite=stage-b-test-1-gpu-small (32GB
5090) while keeping --mem-fraction-static 0.7, to verify the mem-frac
tweak alone resolves the snapshot/reset/compare CPU->GPU round-trip
OOM that sgl-project#24553 sidestepped by switching to *-large.

If CI on this commit passes, the next commit moves the suite to
nightly-1-gpu (still keeping the lower mem-frac as defense).
ltcs11 added a commit to ltcs11/sglang that referenced this pull request May 7, 2026
* main: (894 commits)
  [Bug Fix] Fix RunAI streamer: corrupted weights, missing quant init, and broken URIs for multimodal models (sgl-project#22715)
  [Kernel] Deprecate DeepGemm in sgl kernel and apply custom wheel sgl-deep-gemm (sgl-project#24268)
  propagate pytest exit code from test __main__ entries (sgl-project#24487)
  [R3] Avoid implicit CUDA sync in routed experts DP slicing (sgl-project#24550)
  Add ChatCompletionRequest-style support to /v1/tokenize (sgl-project#23981)
  Support Triton MLA FP8 KV cache (sgl-project#20479)
  [diffusion] chore: align LTX-2 with official (sgl-project#24313)
  Expand support matrix for pypi wheel release (sgl-project#24565)
  [codex] Optimize Z-Image packed QKV (sgl-project#24117)
  [Misc] Fix breaking weight checker test (sgl-project#24553)
  [LoRA] Fix qkv_proj LoRA buffer sizing when tp_size > num_key_value_heads (sgl-project#24420)
  ci: bump test_mimo_models.py est_time 330 → 610 (sgl-project#24551)
  [CI] Temporarily disable marco/mcdse-2b-v1 in test_embedding_models (sgl-project#24279)
  Improve metrics, observability, and PD deploy tooling (sgl-project#24521)
  Fix diffusion fallback guards and validation (sgl-project#23335)
  [PD] Prevent update_status to Failed from cleared entries (sgl-project#24539)
  [CP] Register KV cache allgather buffer with symmetric memory (sgl-project#24040)
  Support getting checksums in weight checker (sgl-project#24537)
  Refactor buffer patterns in weight checker (sgl-project#24538)
  Add unit and end-to-end tests for weight checker (sgl-project#24536)
  ...

# Conflicts:
#	python/sglang/srt/managers/scheduler.py
#	python/sglang/srt/model_executor/model_runner.py
LLThomas pushed a commit to LLThomas/sglang that referenced this pull request May 8, 2026
LucQueen pushed a commit to LucQueen/sglang that referenced this pull request May 12, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant