[AMD] Add Kimi-K2, DeepSeek-V3.2 tests to nightly CI by michaelzhang-ai · Pull Request #17523 · sgl-project/sglang

michaelzhang-ai · 2026-01-21T19:29:22Z

Motivation

Add Kimi-K2 and DeepSeek-V3.2 accuracy and performance tests for the MI325 (MI30x) platform, update MI35x tests, consolidate test jobs, and fix various CI failures.
In total, this adds 9 unique tests to AMD CI. (https://github.com/sgl-project/sglang/actions/runs/21423272318?pr=17523)

Nightly CI passes: https://github.com/sgl-project/sglang/actions/runs/21422385034
Please help review. @yctseng0211 @bingxche

Modifications

New MI325 and MI355 Tests:

nightly-8-gpu-deepseek-v32: Basic accuracy + perf
nightly-8-gpu-deepseek-v32-mtp: MTP (EAGLE speculative) accuracy + perf
nightly-8-gpu-kimi-k2: Kimi-K2-Instruct-0905 accuracy

CI Fixes:

Increase MI35x MTP perf timeout: 5400s server launch timeout
Add accuracy logging (accuracy={acc:.3f} threshold={threshold} {status}) to all eval tests
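
The log line format above can be sketched as a small helper. This is a minimal sketch, not the code from this PR: the function name `format_accuracy_log`, the pass rule (`acc >= threshold`), and the failing status text are assumptions. Only the format string and the `✅ PASS` status come from this description.

```python
def format_accuracy_log(acc: float, threshold: float) -> str:
    """Build one accuracy log line in the format quoted above."""
    # Assumption: a run passes when accuracy meets or exceeds the threshold.
    # The failing status text is hypothetical; the source only shows PASS.
    status = "✅ PASS" if acc >= threshold else "❌ FAIL"
    return f"accuracy={acc:.3f} threshold={threshold} {status}"


# Values taken from the Kimi-K2 row in the accuracy table.
print(format_accuracy_log(0.953, 0.94))
```

Printing a single line per eval keeps the result easy to grep in nightly logs.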

Removed:

nightly-8-gpu-deepseek-r1 job (redundant with MI35x tests)

Accuracy Tests

Kimi-K2 Model (MI325)

Model	TP	Accuracy	Threshold	Status
moonshotai/Kimi-K2-Instruct-0905	8	0.953	0.94	✅ PASS

Benchmarking and Profiling

DeepSeek-V3.2 Models (MI325)

Model	Variant	TP	Accuracy	Threshold	Status
deepseek-ai/DeepSeek-V3.2	basic	8	0.950	0.93	✅ PASS

TestNightlyDeepseekV32BasicPerformance

deepseek-ai/DeepSeek-V3.2 (basic) [MI325]

batch size	input len	latency (s)	input throughput (tok/s)	output throughput (tok/s)	ITL (ms)
1	4096	12.30	2818.94	47.22	21.18
8	4096	17.35	8301.86	305.59	26.18
16	4096	21.19	11183.46	534.36	29.94
64	4096	33.67	20786.93	1556.37	41.12
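
As a sanity check on the table above, the reported output throughput appears to match batch size divided by ITL. The sketch below recomputes it from the table's own numbers. The relation is an observation about these four rows, not a documented definition from the benchmark tool, and the helper name is hypothetical.

```python
# Rows copied from the table: (batch size, output throughput tok/s, ITL ms).
rows = [
    (1, 47.22, 21.18),
    (8, 305.59, 26.18),
    (16, 534.36, 29.94),
    (64, 1556.37, 41.12),
]


def estimated_output_throughput(batch_size: int, itl_ms: float) -> float:
    """Tokens per second if every request emits one token per ITL interval."""
    return batch_size / (itl_ms / 1000.0)


for batch_size, reported, itl_ms in rows:
    estimate = estimated_output_throughput(batch_size, itl_ms)
    rel_diff = abs(estimate - reported) / reported
    print(
        f"bs={batch_size:>2} reported={reported:.2f} "
        f"estimate={estimate:.2f} rel_diff={rel_diff:.4%}"
    )
```

For these rows the estimate and the reported value agree to well under 0.1%, which suggests the two columns are consistent with each other.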

Checklist

Format your code according to the Format code with pre-commit.
Add unit tests according to the Run and add unit tests.
Update documentation according to Write documentations.
Provide accuracy and speed benchmark results according to Test the accuracy and Benchmark the speed.
Follow the SGLang code style guidance.

Review Process

Ping Merge Oncalls to start the PR flow. See the PR Merge Process.
Get approvals from CODEOWNERS and other reviewers.
Trigger CI tests with comments or contact authorized users to do so.
- /tag-run-ci-label, /rerun-failed-ci, /tag-and-rerun-ci
After green CI and required approvals, ask Merge Oncalls to merge.

gemini-code-assist · 2026-01-21T19:29:34Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

…test This update introduces a new function, _run_benchmark_with_timeout, to manage server launch timeouts during benchmark execution. The test_bench_one_batch method has been modified to utilize this new function, enhancing the robustness of the performance testing process.
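
One way to bound a benchmark run with a timeout is sketched below. This is not the `_run_benchmark_with_timeout` function from this PR, whose body is not shown here. The helper name `run_with_timeout`, its signature, and its return convention are all hypothetical. It relies only on the standard library's `subprocess.run` timeout support.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return (returncode, stdout), or (None, "") on timeout."""
    try:
        # subprocess.run kills the child and raises TimeoutExpired
        # if it has not finished within timeout_s seconds.
        proc = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        return None, ""
    return proc.returncode, proc.stdout


# Usage: a stand-in command that finishes quickly.
code, out = run_with_timeout([sys.executable, "-c", "print('ok')"], timeout_s=30)
print(code, out.strip())
```

Returning `None` instead of raising lets a test report a timeout as its own failure mode, separate from a non-zero exit code.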

yctseng0211 · 2026-01-27T15:50:23Z

will be fixed by #17633

- Introduced a new job in the nightly workflow for Kimi-K2 accuracy testing.
- Added a new test script for evaluating Kimi-K2 with the GSM8K benchmark.
- Updated workflow triggers to include pull requests affecting the nightly test configuration.

… DeepSeek
- Excluded MI35x performance jobs from CI checks to prevent blocking on non-critical failures.
- Adjusted model score threshold for Mixtral-8x7B to 0.57.
- Added a watchdog timeout of 1200 seconds to performance test scripts for DeepSeek V32 Basic and MTP on both AMD and MI35x platforms.

.github/workflows/nightly-test-amd.yml

michaelzhang-ai · 2026-01-28T04:47:51Z

Nightly pass: https://github.com/sgl-project/sglang/actions/runs/21422385034. cc: @HaiShaw

HaiShaw

consider to move non-mi35x TCs to mi30x subdir, later.

michaelzhang-ai · 2026-01-28T19:50:31Z

> consider to move non-mi35x TCs to mi30x subdir, later.

reorg folder: #17895

Co-authored-by: YC Tseng <yctseng@amd.com>

[CI] Add new accuracy tests for MI35x DeepSeek-V3.2 DP and TP+MTP configurations

c0a57c8

github-actions bot added amd deepseek labels Jan 21, 2026

michaelzhang-ai marked this pull request as ready for review January 21, 2026 22:57

michaelzhang-ai requested review from Fridge003, Kangyan-Zhou, ispobock and merrymercy as code owners January 21, 2026 22:57

yctseng0211 added the run-ci label Jan 22, 2026

michaelzhang-ai and others added 2 commits January 21, 2026 21:52

Merge branch 'main' into add_mtp_accuracy_test

0f9be7e

Enhance nightly test workflow to trigger on pull requests to the main branch

7892300

michaelzhang-ai changed the title ~~[CI] Add tests for MI35x DeepSeek-V3.2 DP and TP+MTP~~ [CI] Add tests for MI35x DeepSeek-V3.2 DP and MTP Jan 22, 2026

michaelzhang-ai added 8 commits January 22, 2026 20:59

Disable nightly accuracy test for MI35x and adjust speed threshold in DeepSeek evaluation test

4b88787

Implement model configuration prefetching in DeepSeek evaluation test to optimize performance

6b09b24

Increase timeout for performance test MI35x in nightly workflow to accommodate longer execution times

ab92a48

Remove model configuration prefetching from DeepSeek evaluation test to streamline setup process

10c6619

Enhance accuracy test output by adding detailed print statements for model evaluation results across multiple test files.

39a8c28

Enable nightly accuracy test for MI35x by removing the temporary disable condition in the workflow configuration.

d856ce5

[CI] Rename V3.2 DP job to DP+TC and add TC test

9066463

michaelzhang-ai changed the title ~~[CI] Add tests for MI35x DeepSeek-V3.2 DP and MTP~~ [CI] Add DeepSeek-V3.2 to MI325 and MI355 nightly test Jan 24, 2026

michaelzhang-ai added 2 commits January 24, 2026 00:01

[CI] Fix env inheritance in V3.2 tests (use os.environ.copy)

5fb6d75

[CI] Remove DeepSeek-R1 job from nightly AMD workflow

be7d4cc
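
The env inheritance fix named in commit 5fb6d75 can be illustrated with a short sketch. The commit title only says `os.environ.copy` is used. The failure mode described here (passing a dict that holds only the overrides, which drops inherited variables such as `PATH`) is an assumption about what was fixed, and the helper and variable names are hypothetical.

```python
import os
import subprocess
import sys


def build_child_env(overrides):
    """Return a copy of the parent environment with overrides applied."""
    env = os.environ.copy()  # start from everything the parent already has
    env.update(overrides)    # then layer the test-specific settings on top
    return env


# Usage: the child sees the override, and the parent process is left unchanged.
env = build_child_env({"HYPOTHETICAL_TEST_FLAG": "1"})
result = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['HYPOTHETICAL_TEST_FLAG'])"],
    env=env,
    capture_output=True,
    text=True,
)
print(result.stdout.strip())
```

Copying first and updating second means a launcher only has to list the variables it wants to change.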

michaelzhang-ai changed the title ~~[CI] Add DeepSeek-V3.2 to MI325 and MI355 nightly test~~ [AMD] Add DeepSeek-V3.2 to MI325 and MI355 nightly test Jan 24, 2026

michaelzhang-ai added 4 commits January 25, 2026 13:54

[CI] Remove DeepSeek-V3.2 DP+TC job (OOM on MI325)

a006982

[CI] Lower accuracy thresholds for Qwen2 FP8 models

bcca382

Merge upstream/main - keep DeepSeek-R1 removed

3c2fc49

[CI] Fix script paths: scripts/ci/amd_ci_* -> scripts/ci/amd/amd_ci_*

e2d39b7

[CI] Update nightly test workflow to remove pull request trigger

c564fea

michaelzhang-ai changed the title ~~[AMD] Add DeepSeek-V3.2 to MI325 and MI355 nightly test~~ [AMD] Add DeepSeek-V3.2 accuracy and performance tests to MI325 nightly CI Jan 27, 2026

michaelzhang-ai added 3 commits January 27, 2026 15:03

Merge remote-tracking branch 'upstream/main' into add_mtp_accuracy_test

8c8ff29

Disable nightly accuracy test for MI35x due to shared memory limit issue. Update AMD failing models list to include GLM-4.1V.

2ea74ae

github-actions bot added the Multi-modal multi-modal language model label Jan 27, 2026

michaelzhang-ai changed the title ~~[AMD] Add DeepSeek-V3.2 accuracy and performance tests to MI325 nightly CI~~ [AMD] Add Kimi-K2, DeepSeek-V3.2 tests to nightly CI Jan 27, 2026

michaelzhang-ai added 2 commits January 27, 2026 20:21

Remove pull request trigger from nightly test workflow for AMD, maintaining scheduled execution only.

5914d57

bingxche reviewed Jan 28, 2026

.github/workflows/nightly-test-amd.yml (resolved)

Merge branch 'main' into add_mtp_accuracy_test

1e4a287

michaelzhang-ai requested a review from bingxche January 28, 2026 07:11

HaiShaw approved these changes Jan 28, 2026

HaiShaw merged commit f8636fb into sgl-project:main Jan 28, 2026
139 of 157 checks passed

charlesHsuGG pushed a commit to charlesHsuGG/sglang that referenced this pull request Jan 30, 2026

[AMD] Add Kimi-K2, DeepSeek-V3.2 tests to nightly CI (sgl-project#17523)

0796149

Co-authored-by: YC Tseng <yctseng@amd.com>

Chen-0210 pushed a commit to Chen-0210/sglang that referenced this pull request Jan 30, 2026

[AMD] Add Kimi-K2, DeepSeek-V3.2 tests to nightly CI (sgl-project#17523)

b4072a2

Co-authored-by: YC Tseng <yctseng@amd.com>

sfiisf pushed a commit to sfiisf/sglang that referenced this pull request Feb 5, 2026

[AMD] Add Kimi-K2, DeepSeek-V3.2 tests to nightly CI (sgl-project#17523)

93461d9

Co-authored-by: YC Tseng <yctseng@amd.com>

Johnsonms pushed a commit to Johnsonms/sglang that referenced this pull request Feb 14, 2026

[AMD] Add Kimi-K2, DeepSeek-V3.2 tests to nightly CI (sgl-project#17523)

6c0fa6d

Co-authored-by: YC Tseng <yctseng@amd.com>

HaiShaw merged 24 commits into sgl-project:main from michaelzhang-ai:add_mtp_accuracy_test

4 participants
