[AMD CI] Migrate and Add More Testcases by bingxche · Pull Request #17116 · sgl-project/sglang

bingxche · 2026-01-15T07:16:38Z

Motivation

Cleans up and reorganizes AMD CI test infrastructure.

Modifications

Renamed suite: stage-a-test-1 → stage-a-test-1-amd (5 files)
Renamed suite: stage-b-test-small-1-gpu → stage-b-test-small-1-gpu-amd (2 files)
Added 3 test cases:
test/registered/core/test_deterministic.py
test/registered/hicache/test_hicache_storage_file_backend.py
test_hicache_storage_3fs_backend.py
Removed 2 legacy jobs: unit-test-backend-1-gpu-amd and unit-test-backend-8-gpu-amd
Migrated 6 tests to test/registered/:
2 from test/srt/: test_deepseek_v3_basic.py, test_deepseek_v3_mtp.py (8-GPU)
4 from per-commit-amd: test_int4fp8_moe.py, test_rope_rocm.py, test_bench_typebaseddispatcher.py, test_type_based_dispatcher.py
Added 1 new perf job (performance-test-1-gpu-part-3-amd): 4 test methods in test_bench_serving.py
Extended performance-test-1-gpu-part-2-amd: 4 new test methods in test_bench_serving.py (LoRA latency ×2, VLM throughput, VLM latency)
Increased server launch timeout for test/registered/amd/test_deepseek_r1_mxfp4_8gpu.py.
Added 4 sgl-kernel tests: test_amd_deterministic_custom_allreduce.py, test_amd_nccl_allreduce_determinism.py, test_moe_topk_sigmoid.py, test_torch_defaults_reset.py
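
The suite renames listed above can be pictured as a mapping from test files to suite names. The following is a minimal sketch under stated assumptions: the registry dictionary, the `rename_suites` and `group_by_suite` helpers, and the `example_*.py` file names are hypothetical illustrations, not SGLang's actual CI registration API. Only the old and new suite names come from this PR.

```python
from collections import defaultdict

# Old suite name -> new suite name (names taken from the PR description).
RENAMES = {
    "stage-a-test-1": "stage-a-test-1-amd",
    "stage-b-test-small-1-gpu": "stage-b-test-small-1-gpu-amd",
}


def rename_suites(registry, renames):
    """Return a new registry with old suite names replaced by new ones."""
    return {path: renames.get(suite, suite) for path, suite in registry.items()}


def group_by_suite(registry):
    """Invert the registry so each suite maps to a sorted list of test files."""
    grouped = defaultdict(list)
    for path, suite in registry.items():
        grouped[suite].append(path)
    return {suite: sorted(paths) for suite, paths in grouped.items()}


# Hypothetical registry: the file names are placeholders, not files from this PR.
registry = {
    "example_a.py": "stage-a-test-1",
    "example_b.py": "stage-b-test-small-1-gpu",
    "example_c.py": "some-other-suite",
}

renamed = rename_suites(registry, RENAMES)
print(group_by_suite(renamed))
```

Suites that are not in the rename map are left untouched, which is the behavior one would want if only the AMD-specific suites are being renamed.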

Accuracy Tests

Benchmarking and Profiling

Checklist

Format your code according to "Format code with pre-commit".
Add unit tests according to "Run and add unit tests".
Update documentation according to "Write documentations".
Provide accuracy and speed benchmark results according to "Test the accuracy" and "Benchmark the speed".
Follow the SGLang code style guidance.

Review Process

Ping Merge Oncalls to start the PR flow. See the PR Merge Process.
Get approvals from CODEOWNERS and other reviewers.
Trigger CI tests with comments or contact authorized users to do so.
- /tag-run-ci-label, /rerun-failed-ci, /tag-and-rerun-ci
After green CI and required approvals, ask Merge Oncalls to merge.

gemini-code-assist · 2026-01-15T07:16:42Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

bingxche · 2026-01-15T07:32:37Z

please help review

cc @yctseng0211 @michael-amd

michael-amd · 2026-01-15T07:41:22Z

Are test/srt/test_deepseek_v3_basic.py and test/srt/test_deepseek_v3_mtp.py shared with NV?

bingxche · 2026-01-15T07:43:00Z

Are test/srt/test_deepseek_v3_basic.py and test/srt/test_deepseek_v3_mtp.py shared with NV?

Yes. For now I copied these two test files to /test/registered/amd to avoid breaking their CI.
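
The copy-instead-of-move approach described in this reply can be sketched as follows. This is only an illustration under stated assumptions: the `copy_tests` helper is hypothetical, and the demo runs inside a temporary directory with placeholder file bodies rather than against the real repository.

```python
import shutil
import tempfile
from pathlib import Path


def copy_tests(src_dir: Path, dst_dir: Path, names: list[str]) -> list[Path]:
    """Copy (not move) the named files so the originals stay in place."""
    dst_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in names:
        # shutil.copy2 returns the destination path and preserves metadata.
        copied.append(Path(shutil.copy2(src_dir / name, dst_dir / name)))
    return copied


# Demo in a throwaway directory that mimics the layout mentioned in the thread.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    src = root / "test" / "srt"
    dst = root / "test" / "registered" / "amd"
    src.mkdir(parents=True)

    names = ["test_deepseek_v3_basic.py", "test_deepseek_v3_mtp.py"]
    for name in names:
        (src / name).write_text("# placeholder test body\n")

    copied = copy_tests(src, dst, names)
    originals_intact = all((src / name).exists() for name in names)
    copies_exist = all(path.exists() for path in copied)
    print(originals_intact, copies_exist)
```

Because the originals remain under test/srt/, any CI job that still references the old paths keeps working while the AMD suites pick up the copies.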

michaelzhang-ai · 2026-01-16T00:19:54Z

Resolved the merge conflict in test/registered/core/test_deterministic.py.
Upstream moved it to stage-b-test-large-1-gpu.

…nto migrate-amd-ci

* fix(ci): recover from corrupted MMMU parquet cache (sgl-project#17256)
* [diffusion] feat: support default 4-step inference for Flux2-Klein distilled models (sgl-project#17225)
  Signed-off-by: Lancer <maruixiang6688@gmail.com>
* Add runner utilization report workflow (sgl-project#17234)
* cli: support sglang version (sgl-project#17250)
* Use swa radix cache and memory pool for gpt-oss model (sgl-project#17261)
* [VLM][Reland] Refactor load_mm_data to improve performance (sgl-project#16152)
  Co-authored-by: luoyuan.luo <luoyuan.luo@antgroup.com>
* [Tiny] Improve docs (sgl-project#17264)
* [diffusion] fix: set guidance_scale default to None (sgl-project#17182)
* Tiny fix comment typo (sgl-project#17287)
* [SPEC_V2] Enable cudagraph draft_extend for trtllm_mla_backend and Acclen Fix for DP under cudagraph mode (sgl-project#16974)
* Add kl test for swa radix cache (sgl-project#17281)
* fix: Handle multiple named chat templates in HuggingFace tokenizers (sgl-project#17236)
  Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
* Move radix cache related tests (sgl-project#17295)
* [Refactor] Add `-fp4-gemm-backend` to replace `SGLANG_FLASHINFER_FP4_GEMM_BACKEND` (sgl-project#16534)
  Co-authored-by: Vincent Zhong <207368749+vincentzed@users.noreply.github.com>
* [Bugfix] Fix PD accuracy when MTP is not configured on the prefill node (sgl-project#17212)
  Co-authored-by: Shangming Cai <csmthu@gmail.com>
* [Diffusion] Apply jit qk_norm to flux1 (sgl-project#17296)
* [Refactor] Split out deepseek v2 weight loader function into mixin (sgl-project#16649)
* [NPU]Support GPT-OSS for NPU (sgl-project#14197)
* [jit-kernel] Add CuTe DSL GDN Decode Kernel (sgl-project#15631)
  Co-authored-by: Jinyan Chen <jinyanc@nvidia.com>
* [GLM 4.7] Add RTX 6000 Pro aka sm120 (sgl-project#17235)
  Co-authored-by: root <root@ubuntu-nvidia.localdomain>
* Update CODEOWNERS for multimodal_gen (sgl-project#17308)
  Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com>
* [Feature] overlap LoRA weight loading with compute (sgl-project#15512)
* [PD] Optimize MHA models pp util calculation logic (sgl-project#17306)
* [Minor] Correct sglang version when installing from source (sgl-project#17315)
* Use dsv3 optimized routing `fused_topk_deepseek` instead of `moe_fused_gate` (sgl-project#15347)
* [DeepSeek v3.2] Opt MTP decode cuda batch sizes and nsa implementation (sgl-project#16961)
* Update code sync scripts (sgl-project#17319)
* [Auto Sync] Update tokenizer_manager.py (20260119) (sgl-project#17317)
  Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* support new qwen3_coder_detector (sgl-project#16744)
  Co-authored-by: liugaoji.lgj <liugaoji.lgj@alibaba-inc.com>
* Fix kernel selection in biased_grouped_topk_gpu (sgl-project#17325)
* KV Cache Events with Attention DP bug fix (sgl-project#16030) (sgl-project#16412)
* [Perf] fuse q, k norm for Flux2Attention (sgl-project#17241)
  Co-authored-by: Minglei Zhu <zminglei@linkedin.com>
* [CI] Add partition to stage-b-test-large-1-gpu (11->12) (sgl-project#17245)
* fix(ci): rate limit and permission errors in trace publishing (sgl-project#17238)
* Revert "[Perf] fuse q, k norm for Flux2Attention (sgl-project#17241)" (sgl-project#17332)
* Migrate performance, accuracy, and quantization tests to CI registry (sgl-project#17177)
  Co-authored-by: Kangyan-Zhou <zky314343421@gmail.com>
* Inclusion of nvfp4 blockscale in EPLB Rebalance (sgl-project#17158)
* [Refactor] Set `fp4-gemm-backend=auto` on SM100 and rename `fp4-gemm-backend` with `flashinfer_` prefix (sgl-project#17309)
* [Diffusion] Apply qknorm to flux2 and apply lightx2v rms_norm_one_pass kernel(without residual) (sgl-project#17305)
  Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
* Fix v32 continue_final_message not work (sgl-project#16567)
* Evict swa kv cache during decoding (sgl-project#17220)
* [RadixTree][1/N Refactor]: Support unified match_prefix params (sgl-project#17142)
  Co-authored-by: yizhang2077 <1109276519@qq.com>
  Co-authored-by: pansicheng <sicheng.pan.chn@gmail.com>
* [AMD CI] Migrate and Add More Testcases (sgl-project#17116)
  Co-authored-by: yctseng0211 <yctseng@amd.com>
* [AMD] CI - add partitions for stage-b-test-small-1-gpu-amd (sgl-project#17345)
* Restore deepseek_v2.py to main's code, except the utils
* Ran `pre-commit`

---------

Signed-off-by: Lancer <maruixiang6688@gmail.com>
Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
Co-authored-by: Hudson Xing <1277646412@qq.com>
Co-authored-by: Lancer <402430575@qq.com>
Co-authored-by: Alison Shao <54658187+alisonshao@users.noreply.github.com>
Co-authored-by: Mick <mickjagger19@icloud.com>
Co-authored-by: Ke Bao <ispobaoke@gmail.com>
Co-authored-by: Yuan Luo <yuan.luo@hotmail.com>
Co-authored-by: luoyuan.luo <luoyuan.luo@antgroup.com>
Co-authored-by: Mohammad Miadh Angkad <mangkad.bsdsba2027@aim.edu>
Co-authored-by: Changyi Yang <112288487+ChangyiYang@users.noreply.github.com>
Co-authored-by: YAMY <74099316+YAMY1234@users.noreply.github.com>
Co-authored-by: Xinyuan Tong <115166877+JustinTong0323@users.noreply.github.com>
Co-authored-by: b8zhong <b8zhong@uwaterloo.ca>
Co-authored-by: Vincent Zhong <207368749+vincentzed@users.noreply.github.com>
Co-authored-by: Ch3ngY1 <91232537+Ch3ngY1@users.noreply.github.com>
Co-authored-by: Shangming Cai <csmthu@gmail.com>
Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com>
Co-authored-by: Jerry Ji <jerryjilol@gmail.com>
Co-authored-by: Todobe <43903496+Todobe@users.noreply.github.com>
Co-authored-by: Jinyan Chen <93358689+liz-badada@users.noreply.github.com>
Co-authored-by: Jinyan Chen <jinyanc@nvidia.com>
Co-authored-by: Koushik Dutta <koush@koushikdutta.com>
Co-authored-by: root <root@ubuntu-nvidia.localdomain>
Co-authored-by: Glen Liu <62917497+glenliu21@users.noreply.github.com>
Co-authored-by: Baizhou Zhang <sobereddiezhang@gmail.com>
Co-authored-by: Lee Nau <lnau@nvidia.com>
Co-authored-by: Yongfei Xu <xuyongfei.xyf@antgroup.com>
Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Gaoji Liu <34803073+attack204@users.noreply.github.com>
Co-authored-by: liugaoji.lgj <liugaoji.lgj@alibaba-inc.com>
Co-authored-by: yudian0504 <138860534+yudian0504@users.noreply.github.com>
Co-authored-by: Kartik Ramesh <kartikx2000@gmail.com>
Co-authored-by: Minglei Zhu <mingleizhu1122@gmail.com>
Co-authored-by: Minglei Zhu <zminglei@linkedin.com>
Co-authored-by: Kangyan-Zhou <zky314343421@gmail.com>
Co-authored-by: Shu Wang <shuw@nvidia.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: ybyang <10629930+whybeyoung@users.noreply.github.com>
Co-authored-by: zhangheng <hzh0425@apache.org>
Co-authored-by: yizhang2077 <1109276519@qq.com>
Co-authored-by: pansicheng <sicheng.pan.chn@gmail.com>
Co-authored-by: Bingxu Chen <Bingxu.Chen@amd.com>
Co-authored-by: yctseng0211 <yctseng@amd.com>

bingxche added 6 commits January 15, 2026 06:03

migrate 1 gpu tests

64466fc

delete old unit-test-backend-1-gpu-amd job

9d7c6c3

rename suite name to stage-a-test-1-amd

e4b3114

migrate 1-gpu- test files

4763ead

migrate 8-gpu mi325x test files

d654e10

add performance-test-1-gpu test files

6733fc4

github-actions bot added the quant (LLM Quantization), amd, and deepseek labels Jan 15, 2026

bingxche added the run-ci label Jan 15, 2026

Merge branch 'main' into migrate-amd-ci

8bd74d1

bingxche marked this pull request as ready for review January 15, 2026 07:32

bingxche requested review from Fridge003, Kangyan-Zhou, ispobock and merrymercy as code owners January 15, 2026 07:32

bingxche changed the title from "Migrate amd ci" to "[AMD CI] Migrate and Add More Testcases" Jan 15, 2026

add test_deterministic

01407b9

michaelzhang-ai approved these changes Jan 15, 2026

add two hicache tests

d7575d1

github-actions bot added the hicache (Hierarchical Caching for SGLang) label Jan 15, 2026

yctseng0211 and others added 3 commits January 15, 2026 03:14

add two hicache tests

b0ba109

Update estimated time of test_hicache_storage_3fs_backend.py

5015574

Merge branch 'main' into migrate-amd-ci

31898fd

bingxche added 2 commits January 16, 2026 09:59

Merge branch 'main' into migrate-amd-ci

8707db5

fix amd suite name

1427cb1

github-actions bot added the lora label Jan 16, 2026

bingxche and others added 6 commits January 16, 2026 15:14

Increased server launch timeout

f8b3d46

Merge branch 'main' into migrate-amd-ci

02fe7d4

add 4 sgl-kernel tests for amd

32e7731

Merge branch 'migrate-amd-ci' of https://github.com/bingxche/sglang i…

15caee5

…nto migrate-amd-ci

Merge branch 'main' into migrate-amd-ci

ab22556

Merge branch 'main' into migrate-amd-ci

6e5c359

HaiShaw approved these changes Jan 19, 2026

HaiShaw merged commit 2ea02f0 into sgl-project:main Jan 19, 2026
94 of 101 checks passed

bingxche deleted the migrate-amd-ci branch January 20, 2026 02:15

HaiShaw merged 20 commits into sgl-project:main from bingxche:migrate-amd-ci


5 participants
