Enable DeepGemm fast warmup in CI to prevent cold-cache timeouts #18823

Merged
Kangyan-Zhou merged 1 commit into main from ci/enable-deepgemm-fast-warmup on Feb 15, 2026

@alisonshao
Collaborator

Summary

  • Enable SGLANG_JIT_DEEPGEMM_FAST_WARMUP=true globally in the CI workflow to prevent DeepGemm compilation timeouts when runners restart and the JIT cache (~/.cache/deep_gemm) is lost
  • Without this, a cold cache triggers a full warmup that compiles up to 128K kernel variants (~30 min), which exceeds CI timeouts
  • Fast warmup (introduced in #18111, "[DeepGemm] Add a flag for fast warmup") reduces compilation to ~3K variants (<3 min), with negligible perf impact for CI testing
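
As a rough sanity check on the figures above, the sketch below scales the full-warmup time by the fraction of variants compiled. It assumes "128K" and "3K" mean roughly 128,000 and 3,000, and that compile time grows about linearly with the number of variants. Neither assumption is stated in this PR, and the function name is hypothetical.

```python
# Back-of-envelope check of the warmup figures quoted in the summary.
# Assumptions (not from the PR): "128K" ~ 128_000, "3K" ~ 3_000, and
# compile time scales roughly linearly with the number of kernel variants.

FULL_VARIANTS = 128_000   # approximate upper bound for a full warmup
FAST_VARIANTS = 3_000     # approximate count with fast warmup enabled
FULL_WARMUP_MIN = 30.0    # approximate full-warmup duration in minutes


def estimate_fast_warmup_minutes(full_variants, fast_variants, full_minutes):
    """Scale the full-warmup time by the fraction of variants compiled."""
    per_variant_min = full_minutes / full_variants
    return per_variant_min * fast_variants


reduction = FULL_VARIANTS / FAST_VARIANTS
fast_minutes = estimate_fast_warmup_minutes(
    FULL_VARIANTS, FAST_VARIANTS, FULL_WARMUP_MIN
)

print(f"variant reduction: ~{reduction:.0f}x")
print(f"estimated fast warmup: ~{fast_minutes:.2f} min")
```

Under these assumptions the estimate lands well below the <3 min bound quoted above, so the two figures are at least mutually consistent. Fixed startup overhead would push the real number higher.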

Context

Previously, SGLANG_DG_CACHE_DIR was set so that CI workers shared a single DeepGemm cache directory. When runners are reprovisioned, this cache is lost. Setting the fast warmup flag through the workflow env means a cold cache is handled by the workflow definition itself, without requiring machine-level configuration.
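
Because the flag is an ordinary environment variable, any process started by the workflow can see it without per-machine setup. The sketch below shows one way a boolean flag like this could be read. The helper name `env_flag_enabled` and the accepted truthy spellings are assumptions for illustration, not SGLang's actual parsing logic.

```python
import os

FLAG = "SGLANG_JIT_DEEPGEMM_FAST_WARMUP"


def env_flag_enabled(name, environ=None):
    """Return True if the named variable is set to a truthy value.

    Hypothetical helper: the accepted spellings ("1", "true") are an
    assumption, not SGLang's real behavior.
    """
    environ = os.environ if environ is None else environ
    return environ.get(name, "false").strip().lower() in ("1", "true")


# With the workflow env applied, the flag is visible to every job step.
print(env_flag_enabled(FLAG, {FLAG: "true"}))
# On a machine where nothing sets it, the flag is simply off.
print(env_flag_enabled(FLAG, {}))
```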

Test plan

  • Verify 8-GPU H200/H20 CI jobs (stage-c) that run DeepSeek models pass without timeout
  • Confirm DeepGemm warmup completes in <3 min on cold cache
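
The second check could be scripted as a simple timing assertion. This is only a sketch: `fake_warmup` is a placeholder for whatever actually triggers DeepGemm warmup on a cold cache, and the 3-minute budget is taken from the test plan above.

```python
import time

WARMUP_BUDGET_S = 3 * 60  # "<3 min" from the test plan


def timed(fn):
    """Run fn() and return the elapsed wall-clock time in seconds."""
    start = time.monotonic()
    fn()
    return time.monotonic() - start


def within_budget(elapsed_s, budget_s=WARMUP_BUDGET_S):
    """True if the measured warmup time is strictly under the budget."""
    return elapsed_s < budget_s


def fake_warmup():
    # Placeholder workload; replace with the real cold-cache warmup trigger.
    time.sleep(0.01)


elapsed = timed(fake_warmup)
print(f"warmup took {elapsed:.2f}s, within budget: {within_budget(elapsed)}")
```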

When CI runners restart, the DeepGemm cache is lost, causing a full
warmup that compiles up to 128K kernel variants (~30 min), which
exceeds CI timeouts. Enable SGLANG_JIT_DEEPGEMM_FAST_WARMUP to
reduce compilation to ~3K variants (<3 min warmup).
@gemini-code-assist
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@alisonshao
Collaborator Author

/rerun-stage stage-c-test-deepep-4-gpu

@github-actions
Contributor

✅ Triggered stage-c-test-deepep-4-gpu to run independently (skipping dependencies).

@github-actions
Contributor

🔗 View workflow run

@alisonshao
Collaborator Author

/tag-and-rerun-ci

@alisonshao
Collaborator Author

/rerun-stage stage-c-test-4-gpu-b200

@github-actions
Contributor

✅ Triggered stage-c-test-4-gpu-b200 to run independently (skipping dependencies).

@github-actions
Contributor

🔗 View workflow run

@Kangyan-Zhou Kangyan-Zhou merged commit f760320 into main Feb 15, 2026
362 of 377 checks passed
@Kangyan-Zhou Kangyan-Zhou deleted the ci/enable-deepgemm-fast-warmup branch February 15, 2026 16:02
magicYang1573 pushed a commit to magicYang1573/sglang that referenced this pull request Mar 9, 2026

2 participants