Turn off PREBUILD aiter in MI355 #13963

Merged
HaiShaw merged 2 commits into sgl-project:main from 1am9trash:mi355-jit-docker-image
Nov 26, 2025

1am9trash commented Nov 26, 2025

Motivation

aiter's PREBUILD mode currently has errors that cause accuracy to drop to 0% on the DS-MXFP4 model. The errors have been fixed, but the fixes have not yet been merged into aiter main. Until they are, we can use JIT build so that the Docker image runs correctly.

Modifications

Turn off the BUILD_AITER_ALL flag in rocm.dockerfile.

Accuracy Tests

Built the image locally, then ran the sglang server with JIT build.
Machine: MI355 * 8
Model: DeepSeek-R1-MXFP4-Preview

root@smci355-ccs-aus-m15-21:/sgl-workspace/sglang# python3 benchmark/gsm8k/bench_sglang.py --num-questions 2000 --parallel 2000 --port 8000
100%|██████████████████████████████████████████████████████████████| 1319/1319 [01:58<00:00, 11.14it/s]
Accuracy: 0.937
Invalid: 0.000
Latency: 118.575 s
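
Since the failure mode described in the motivation is accuracy collapsing to 0%, a check like the one above can be automated. The following is a minimal sketch, assuming the benchmark prints its metrics as `Name: value` lines exactly as shown in the output above. The helper names `parse_metrics` and `passes_sanity_check` and the 0.9 threshold are hypothetical, chosen for illustration; they are not part of sglang or aiter.

```python
import re

# Metric lines copied from the benchmark output above.
LOG = """\
Accuracy: 0.937
Invalid: 0.000
Latency: 118.575 s
"""


def parse_metrics(text: str) -> dict[str, float]:
    """Collect 'Name: value' lines into a dict with lowercase keys.

    Lines that do not start with 'word: number' (shell prompts,
    progress bars) are ignored.
    """
    pairs = re.findall(r"^(\w+):\s*([0-9.]+)", text, flags=re.MULTILINE)
    return {name.lower(): float(value) for name, value in pairs}


def passes_sanity_check(metrics: dict[str, float], min_accuracy: float = 0.9) -> bool:
    """Return True when accuracy meets the (hypothetical) threshold.

    Any threshold well above zero would catch the 0% accuracy regression
    described in the motivation; 0.9 is an assumption, not a project rule.
    """
    return metrics.get("accuracy", 0.0) >= min_accuracy


if __name__ == "__main__":
    metrics = parse_metrics(LOG)
    print(metrics)
    print("sanity check passed:", passes_sanity_check(metrics))
```

In practice the text passed to `parse_metrics` would be the captured stdout of `bench_sglang.py` rather than a hard-coded string.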

github-actions bot added the amd label Nov 26, 2025
HaiShaw merged commit 5a8adca into sgl-project:main Nov 26, 2025
36 of 42 checks passed