Merged
2 changes: 1 addition & 1 deletion .github/workflows/pr-test-xeon.yml
@@ -115,7 +115,7 @@ jobs:
timeout-minutes: 36
run: |
docker exec -w /sglang-checkout/ ci_sglang_xeon \
- bash -c "source /opt/.venv/bin/activate && cd ./test && python3 run_suite.py --hw cpu --suite stage-b-test-cpu"
+ bash -c "source /opt/.venv/bin/activate && cd ./test/srt && python3 run_suite.py --suite per-commit-cpu --timeout-per-file 1500"

- name: Change permission
timeout-minutes: 2
1 change: 0 additions & 1 deletion .pre-commit-config.yaml
@@ -92,7 +92,6 @@ repos:
entry: python3 scripts/ci/check_registered_tests.py
language: system
files: ^test/registered/.*\.py$
- exclude: ^test/registered/.*/utils\.py$
pass_filenames: false
- id: check-no-docs-changes
name: reject changes under legacy docs/
60 changes: 30 additions & 30 deletions docs_new/index.mdx
@@ -83,7 +83,7 @@ It is designed to deliver low-latency and high-throughput inference across a wid
}}
>
<a
- href="https://lmsys.org/blog/2026-04-29-p2p-update/"
+ href="https://lmsys.org/blog/2026-04-25-deepseek-v4/"
target="_blank"
rel="noopener noreferrer"
style={{
@@ -104,8 +104,8 @@ It is designed to deliver low-latency and high-throughput inference across a wid
}}
>
<img
- src="https://lmsys.org/images/blog/p2p-update/p2p-overview.png"
- alt="Updating 1T parameters in seconds \u2014 P2P weight transfer in Large Scale Distributed RL"
+ src="https://lmsys.org/images/blog/deepseek_v4/benchmark_vs_oss.png"
+ alt="DeepSeek-V4 on Day 0: From Fast Inference to Verified RL with SGLang and Miles"
style={{
width: "100%",
height: "100%",
@@ -124,7 +124,7 @@ It is designed to deliver low-latency and high-throughput inference across a wid
fontSize: "0.98rem",
}}
>
- {"Updating 1T parameters in seconds \u2014 P2P weight transfer in Large Scale Distributed RL"}
+ {"DeepSeek-V4 on Day 0: From Fast Inference to Verified RL with SGLang and Miles"}
</p>
<p
style={{
@@ -133,12 +133,12 @@ It is designed to deliver low-latency and high-throughput inference across a wid
opacity: 0.75,
}}
>
- {"April 29, 2026"}
+ {"April 25, 2026"}
</p>
</div>
</a>
<a
- href="https://lmsys.org/blog/2026-04-25-deepseek-v4/"
+ href="https://lmsys.org/blog/2026-04-10-sglang-hisparse/"
target="_blank"
rel="noopener noreferrer"
style={{
@@ -159,8 +159,8 @@ It is designed to deliver low-latency and high-throughput inference across a wid
}}
>
<img
- src="https://lmsys.org/images/blog/deepseek_v4/benchmark_vs_oss.png"
- alt="DeepSeek-V4 on Day 0: From Fast Inference to Verified RL with SGLang and Miles"
+ src="https://lmsys.org/images/blog/hisparse/hisparse_overview.png"
+ alt="HiSparse: Turbocharging Sparse Attention with Hierarchical Memory"
style={{
width: "100%",
height: "100%",
@@ -179,7 +179,7 @@ It is designed to deliver low-latency and high-throughput inference across a wid
fontSize: "0.98rem",
}}
>
- {"DeepSeek-V4 on Day 0: From Fast Inference to Verified RL with SGLang and Miles"}
+ {"HiSparse: Turbocharging Sparse Attention with Hierarchical Memory"}
</p>
<p
style={{
@@ -188,12 +188,12 @@ It is designed to deliver low-latency and high-throughput inference across a wid
opacity: 0.75,
}}
>
- {"April 25, 2026"}
+ {"April 10, 2026"}
</p>
</div>
</a>
<a
- href="https://lmsys.org/blog/2026-04-10-sglang-hisparse/"
+ href="https://lmsys.org/blog/2026-03-25-gtc2026/"
target="_blank"
rel="noopener noreferrer"
style={{
@@ -214,8 +214,8 @@ It is designed to deliver low-latency and high-throughput inference across a wid
}}
>
<img
- src="https://lmsys.org/images/blog/hisparse/hisparse_overview.png"
- alt="HiSparse: Turbocharging Sparse Attention with Hierarchical Memory"
+ src="https://lmsys.org/images/blog/gtc2026/happyhour-crowd.jpg"
+ alt="Highlights of SGLang at NVIDIA GTC 2026"
style={{
width: "100%",
height: "100%",
@@ -234,7 +234,7 @@ It is designed to deliver low-latency and high-throughput inference across a wid
fontSize: "0.98rem",
}}
>
- {"HiSparse: Turbocharging Sparse Attention with Hierarchical Memory"}
+ {"Highlights of SGLang at NVIDIA GTC 2026"}
</p>
<p
style={{
@@ -243,12 +243,12 @@ It is designed to deliver low-latency and high-throughput inference across a wid
opacity: 0.75,
}}
>
- {"April 10, 2026"}
+ {"March 31, 2026"}
</p>
</div>
</a>
<a
- href="https://lmsys.org/blog/2026-03-25-gtc2026/"
+ href="https://lmsys.org/blog/2026-03-25-eep-partial-failure-tolerance/"
target="_blank"
rel="noopener noreferrer"
style={{
@@ -269,8 +269,8 @@ It is designed to deliver low-latency and high-throughput inference across a wid
}}
>
<img
- src="https://lmsys.org/images/blog/gtc2026/happyhour-crowd.jpg"
- alt="Highlights of SGLang at NVIDIA GTC 2026"
+ src="https://lmsys.org/images/blog/eep-partial-failure-tolerance/figure.png"
+ alt="Elastic EP in SGLang: Achieving Partial Failure Tolerance for DeepSeek MoE Deployments"
style={{
width: "100%",
height: "100%",
@@ -289,7 +289,7 @@ It is designed to deliver low-latency and high-throughput inference across a wid
fontSize: "0.98rem",
}}
>
- {"Highlights of SGLang at NVIDIA GTC 2026"}
+ {"Elastic EP in SGLang: Achieving Partial Failure Tolerance for DeepSeek MoE Deployments"}
</p>
<p
style={{
@@ -298,12 +298,12 @@ It is designed to deliver low-latency and high-throughput inference across a wid
opacity: 0.75,
}}
>
- {"March 31, 2026"}
+ {"March 25, 2026"}
</p>
</div>
</a>
<a
- href="https://lmsys.org/blog/2026-03-25-eep-partial-failure-tolerance/"
+ href="https://lmsys.org/blog/2026-03-17-rocm-miles-rl-amd/"
target="_blank"
rel="noopener noreferrer"
style={{
@@ -324,8 +324,8 @@ It is designed to deliver low-latency and high-throughput inference across a wid
}}
>
<img
- src="https://lmsys.org/images/blog/eep-partial-failure-tolerance/figure.png"
- alt="Elastic EP in SGLang: Achieving Partial Failure Tolerance for DeepSeek MoE Deployments"
+ src="https://lmsys.org/images/blog/rocm_miles_rl/fig_1.png"
+ alt="ROCm Support for Miles: Large-Scale RL Post-Training on AMD Instinct\u2122 GPUs"
style={{
width: "100%",
height: "100%",
@@ -344,7 +344,7 @@ It is designed to deliver low-latency and high-throughput inference across a wid
fontSize: "0.98rem",
}}
>
- {"Elastic EP in SGLang: Achieving Partial Failure Tolerance for DeepSeek MoE Deployments"}
+ {"ROCm Support for Miles: Large-Scale RL Post-Training on AMD Instinct\u2122 GPUs"}
</p>
<p
style={{
@@ -353,12 +353,12 @@ It is designed to deliver low-latency and high-throughput inference across a wid
opacity: 0.75,
}}
>
- {"March 25, 2026"}
+ {"March 17, 2026"}
</p>
</div>
</a>
<a
- href="https://lmsys.org/blog/2026-03-17-rocm-miles-rl-amd/"
+ href="https://lmsys.org/blog/2026-03-11-run-nvidia-nemotron-3-super/"
target="_blank"
rel="noopener noreferrer"
style={{
@@ -379,8 +379,8 @@ It is designed to deliver low-latency and high-throughput inference across a wid
}}
>
<img
- src="https://lmsys.org/images/blog/rocm_miles_rl/fig_1.png"
- alt="ROCm Support for Miles: Large-Scale RL Post-Training on AMD Instinct\u2122 GPUs"
+ src="https://lmsys.org/images/blog/nemotron-3-super/figure_1.svg"
+ alt="SGLang Adds Day-0 Support for NVIDIA Nemotron 3 Super for building High-Efficiency Multi-Agent Systems"
style={{
width: "100%",
height: "100%",
@@ -399,7 +399,7 @@ It is designed to deliver low-latency and high-throughput inference across a wid
fontSize: "0.98rem",
}}
>
- {"ROCm Support for Miles: Large-Scale RL Post-Training on AMD Instinct\u2122 GPUs"}
+ {"SGLang Adds Day-0 Support for NVIDIA Nemotron 3 Super for building High-Efficiency Multi-Agent Systems"}
</p>
<p
style={{
@@ -408,7 +408,7 @@ It is designed to deliver low-latency and high-throughput inference across a wid
opacity: 0.75,
}}
>
- {"March 17, 2026"}
+ {"March 11, 2026"}
</p>
</div>
</a>
4 changes: 2 additions & 2 deletions scripts/ci/check_registered_tests.py
@@ -22,11 +22,11 @@ def main() -> int:
ci_register = importlib.util.module_from_spec(spec)
spec.loader.exec_module(ci_register)

- # Same filter as run_suite.py: skip conftest.py, __init__.py, and utils.py
+ # Same filter as run_suite.py: skip conftest.py and __init__.py
files = sorted(
f
for f in glob.glob("test/registered/**/*.py", recursive=True)
- if os.path.basename(f) not in ("conftest.py", "__init__.py", "utils.py")
+ if os.path.basename(f) not in ("conftest.py", "__init__.py")
)
if not files:
return 0
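
Note on the hunk above: the filter change can be checked in isolation. The following is a minimal sketch, not the script itself. `filter_registered` and the sample paths are hypothetical names introduced for illustration (the script inlines the filter as a generator expression over `glob.glob`). It shows that, once `utils.py` is dropped from the skip tuple, a `utils.py` under `test/registered/` is passed on to the registration check.

```python
import os

# Hypothetical helper mirroring the inline filter in check_registered_tests.py.
# The skip tuple below is the post-change one from the diff.
SKIPPED_BASENAMES = ("conftest.py", "__init__.py")


def filter_registered(paths, skipped=SKIPPED_BASENAMES):
    """Return the sorted paths whose basename is not in the skip list."""
    return sorted(p for p in paths if os.path.basename(p) not in skipped)


# Illustrative paths only; these files are not taken from the repository.
sample_paths = [
    "test/registered/cpu/utils.py",
    "test/registered/conftest.py",
    "test/registered/cpu/__init__.py",
    "test/registered/cpu/test_example.py",
]

# Post-change behavior: utils.py is no longer skipped.
checked_after = filter_registered(sample_paths)

# Pre-change behavior: utils.py was skipped as well.
checked_before = filter_registered(
    sample_paths, skipped=("conftest.py", "__init__.py", "utils.py")
)
```

With the post-change tuple, any helper module named `utils.py` under `test/registered/` would need to satisfy whatever the registration check requires, which is consistent with the `exclude` pattern being removed from `.pre-commit-config.yaml` in the same PR.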
59 changes: 0 additions & 59 deletions test/registered/cpu/test_activation.py

This file was deleted.

30 changes: 0 additions & 30 deletions test/registered/cpu/test_binding.py

This file was deleted.
