ci: bump test_mimo_models.py est_time 330 → 610 #24551

Merged
Kangyan-Zhou merged 1 commit into main from alison/bump-mimo-est-time
May 6, 2026

@alisonshao
Collaborator

@alisonshao alisonshao commented May 6, 2026

Summary

  • PR #23811 ([Feature] Xiaomi MiMo-V2.5 day0 support) added a second test class (TestMiMoV2: XiaomiMiMo/MiMo-V2.5, TP=8 DP=2, MMMU + GSM8K + EAGLE spec) to test_mimo_models.py without bumping est_time.
  • The file now runs ~500-640 s on h200 while est_time is still 330, so the auto-partitioner consistently overloads shard 0 of stage-c-test-8-gpu-h200 and the shard hits the 30-min "Run test" wall (e.g. runs 25428444359, 25411981650).
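
To illustrate why a stale est_time skews sharding, here is a minimal sketch. It assumes the partitioner balances shards greedily by estimated time (longest file first, into the currently lightest shard); the real partitioner may work differently. Every file name and timing except test_mimo_models.py's 330 and 610 is hypothetical.

```python
from typing import Dict, List


def partition_by_est_time(est_times: Dict[str, int], num_shards: int) -> List[List[str]]:
    """Greedy balancing (an assumed strategy, not necessarily the real one):
    place the longest remaining file into the currently lightest shard."""
    shards: List[List[str]] = [[] for _ in range(num_shards)]
    loads = [0] * num_shards
    # Sort by descending estimate; break ties by name for determinism.
    for name, est in sorted(est_times.items(), key=lambda kv: (-kv[1], kv[0])):
        idx = loads.index(min(loads))
        shards[idx].append(name)
        loads[idx] += est
    return shards


def actual_loads(shards: List[List[str]], actual: Dict[str, int]) -> List[int]:
    """Wall time each shard really needs, given measured per-file times."""
    return [sum(actual[name] for name in shard) for shard in shards]


# Hypothetical measured runtimes in seconds; only the mimo entry comes from this PR.
actual = {
    "test_mimo_models.py": 610,
    "test_a.py": 600,
    "test_b.py": 500,
    "test_c.py": 300,
    "test_d.py": 250,
    "test_e.py": 200,
}

# Before: the partitioner still believes the mimo file takes 330 s.
stale_est = dict(actual, **{"test_mimo_models.py": 330})
before = actual_loads(partition_by_est_time(stale_est, 2), actual)

# After: est_time matches the measured runtime.
after = actual_loads(partition_by_est_time(actual, 2), actual)
```

In this toy setup the slowest shard drops from 1360 s to 1300 s once the estimate is corrected. The gap grows with the size of the underestimate and the number of files sharing the shard.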

Test plan

  • Trigger a scheduled-style run on main and confirm stage-c-test-8-gpu-h200 (0) no longer hits the 30-min wall.

PR #23811 added a second test class (TestMiMoV2 — XiaomiMiMo/MiMo-V2.5,
TP=8 DP=2, MMMU + GSM8K + EAGLE spec) to test_mimo_models.py without
bumping est_time. The file now spins up two full servers and runs ~600 s
on h200; staying at 330 s makes the partitioner consistently overload
shard 0 of stage-c-test-8-gpu-h200, which keeps hitting the 30-min
"Run test" wall (e.g. runs 25428444359, 25411981650).

Same value already proposed by the auto-bump bot in #24331; this PR
is a focused subset to unblock the H200 timeouts now.
@gemini-code-assist
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@Kangyan-Zhou Kangyan-Zhou merged commit e72246c into main May 6, 2026
61 of 67 checks passed
@Kangyan-Zhou Kangyan-Zhou deleted the alison/bump-mimo-est-time branch May 6, 2026 21:35
ltcs11 added a commit to ltcs11/sglang that referenced this pull request May 7, 2026
* main: (894 commits)
  [Bug Fix] Fix RunAI streamer: corrupted weights, missing quant init, and broken URIs for multimodal models (sgl-project#22715)
  [Kernel] Deprecate DeepGemm in sgl kernel and apply custom wheel sgl-deep-gemm (sgl-project#24268)
  propagate pytest exit code from test __main__ entries (sgl-project#24487)
  [R3] Avoid implicit CUDA sync in routed experts DP slicing (sgl-project#24550)
  Add ChatCompletionRequest-style support to /v1/tokenize (sgl-project#23981)
  Support Triton MLA FP8 KV cache (sgl-project#20479)
  [diffusion] chore: align LTX-2 with official (sgl-project#24313)
  Expand support matrix for pypi wheel release (sgl-project#24565)
  [codex] Optimize Z-Image packed QKV (sgl-project#24117)
  [Misc] Fix breaking weight checker test (sgl-project#24553)
  [LoRA] Fix qkv_proj LoRA buffer sizing when tp_size > num_key_value_heads (sgl-project#24420)
  ci: bump test_mimo_models.py est_time 330 → 610 (sgl-project#24551)
  [CI] Temporarily disable marco/mcdse-2b-v1 in test_embedding_models (sgl-project#24279)
  Improve metrics, observability, and PD deploy tooling (sgl-project#24521)
  Fix diffusion fallback guards and validation (sgl-project#23335)
  [PD] Prevent update_status to Failed from cleared entries (sgl-project#24539)
  [CP] Register KV cache allgather buffer with symmetric memory (sgl-project#24040)
  Support getting checksums in weight checker (sgl-project#24537)
  Refactor buffer patterns in weight checker (sgl-project#24538)
  Add unit and end-to-end tests for weight checker (sgl-project#24536)
  ...

# Conflicts:
#	python/sglang/srt/managers/scheduler.py
#	python/sglang/srt/model_executor/model_runner.py
LLThomas pushed a commit to LLThomas/sglang that referenced this pull request May 8, 2026
LucQueen pushed a commit to LucQueen/sglang that referenced this pull request May 12, 2026
2 participants