ci: bump test_mimo_models.py est_time 330 → 610 #24551
Merged
PR #23811 added a second test class (TestMiMoV2 — XiaomiMiMo/MiMo-V2.5, TP=8 DP=2, MMMU + GSM8K + EAGLE spec) to test_mimo_models.py without bumping est_time. The file now spins up two full servers and runs ~600 s on h200; staying at 330 s makes the partitioner consistently overload shard 0 of stage-c-test-8-gpu-h200, which keeps hitting the 30-min "Run test" wall (e.g. runs 25428444359, 25411981650). Same value already proposed by the auto-bump bot in #24331; this PR is a focused subset to unblock the H200 timeouts now.
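The effect of a stale estimate can be illustrated with a toy partitioner. This is a minimal sketch, assuming a greedy longest-first assignment to the lightest shard; the actual CI partitioner may work differently. Every file name and time other than test_mimo_models.py (330 s estimated, 610 s proposed, ~600 s real) is hypothetical.

```python
import heapq

def partition(est_times, num_shards):
    """Greedy sketch: longest estimated file first, onto the lightest shard."""
    heap = [(0, i) for i in range(num_shards)]  # (estimated load, shard index)
    heapq.heapify(heap)
    shards = [[] for _ in range(num_shards)]
    for name, est in sorted(est_times.items(), key=lambda kv: (-kv[1], kv[0])):
        load, idx = heapq.heappop(heap)
        shards[idx].append(name)
        heapq.heappush(heap, (load + est, idx))
    return shards

def worst_real_runtime(shards, real_times):
    """Real wall time of the slowest shard, in seconds."""
    return max(sum(real_times[name] for name in shard) for shard in shards)

# Hypothetical real runtimes (seconds); only the MiMo figure comes from the PR.
real = {
    "test_a.py": 700, "test_b.py": 650, "test_c.py": 600,
    "test_d.py": 500, "test_e.py": 450, "test_mimo_models.py": 600,
}

stale = dict(real, **{"test_mimo_models.py": 330})  # estimate before this PR
bumped = dict(real, **{"test_mimo_models.py": 610})  # estimate after this PR

worst_stale = worst_real_runtime(partition(stale, 2), real)
worst_bumped = worst_real_runtime(partition(bumped, 2), real)
print(worst_stale, worst_bumped)
```

In this toy setup the stale estimate places the MiMo file last, on a shard that looks light on paper, and that shard's real runtime reaches 1850 s, past a 30-min (1800 s) limit. With the bumped estimate both shards finish in 1750 s.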
Kangyan-Zhou approved these changes May 6, 2026
Fridge003 pushed a commit that referenced this pull request May 6, 2026
ltcs11 added a commit to ltcs11/sglang that referenced this pull request May 7, 2026
* main: (894 commits)
  [Bug Fix] Fix RunAI streamer: corrupted weights, missing quant init, and broken URIs for multimodal models (sgl-project#22715)
  [Kernel] Deprecate DeepGemm in sgl kernel and apply custom wheel sgl-deep-gemm (sgl-project#24268)
  propagate pytest exit code from test __main__ entries (sgl-project#24487)
  [R3] Avoid implicit CUDA sync in routed experts DP slicing (sgl-project#24550)
  Add ChatCompletionRequest-style support to /v1/tokenize (sgl-project#23981)
  Support Triton MLA FP8 KV cache (sgl-project#20479)
  [diffusion] chore: align LTX-2 with official (sgl-project#24313)
  Expand support matrix for pypi wheel release (sgl-project#24565)
  [codex] Optimize Z-Image packed QKV (sgl-project#24117)
  [Misc] Fix breaking weight checker test (sgl-project#24553)
  [LoRA] Fix qkv_proj LoRA buffer sizing when tp_size > num_key_value_heads (sgl-project#24420)
  ci: bump test_mimo_models.py est_time 330 → 610 (sgl-project#24551)
  [CI] Temporarily disable marco/mcdse-2b-v1 in test_embedding_models (sgl-project#24279)
  Improve metrics, observability, and PD deploy tooling (sgl-project#24521)
  Fix diffusion fallback guards and validation (sgl-project#23335)
  [PD] Prevent update_status to Failed from cleared entries (sgl-project#24539)
  [CP] Register KV cache allgather buffer with symmetric memory (sgl-project#24040)
  Support getting checksums in weight checker (sgl-project#24537)
  Refactor buffer patterns in weight checker (sgl-project#24538)
  Add unit and end-to-end tests for weight checker (sgl-project#24536)
  ...

# Conflicts:
#   python/sglang/srt/managers/scheduler.py
#   python/sglang/srt/model_executor/model_runner.py
LLThomas pushed a commit to LLThomas/sglang that referenced this pull request May 8, 2026
LucQueen pushed a commit to LucQueen/sglang that referenced this pull request May 12, 2026
Summary
- PR #23811 added a second test class (TestMiMoV2: XiaomiMiMo/MiMo-V2.5, TP=8 DP=2, MMMU + GSM8K + EAGLE spec) to test_mimo_models.py without bumping est_time.
- The file now spins up two full servers and runs ~600 s on h200 against est_time=330, which causes the auto-partitioner to consistently overload shard 0 of stage-c-test-8-gpu-h200 and hit the 30-min "Run test" wall (e.g. runs 25428444359, 25411981650).

Test plan
- On main, confirm stage-c-test-8-gpu-h200 (0) no longer hits the 30-min wall.