Add aiter attention support in prefill-attention-backend of gpt-oss by kkHuang-amd · Pull Request #18282 · sgl-project/sglang

kkHuang-amd · 2026-02-05T04:00:42Z

Motivation

Improve attention performance on the ROCm platform.

Modifications

aiter_backend.py: support sliding window attention with sink.
server_args.py: add "aiter" to the supported list.
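For readers unfamiliar with the term, the sketch below shows what "sliding window attention with sink" computes for a single head, in plain Python. It is a reference for the semantics only, not the aiter kernel or the code in aiter_backend.py. The function name `swa_sink_attention` is hypothetical, and two conventions are assumptions: the window counts the current token, and the sink is one extra logit that joins the softmax denominator without contributing a value vector.

```python
import math

def swa_sink_attention(q, k, v, window, sink_logit, scale=None):
    """Reference single-head causal sliding-window attention with a sink logit.

    q, k, v: lists of float vectors, one per token (hypothetical toy layout).
    window: how many most recent keys, including the current token, a query sees.
    sink_logit: extra logit added to the softmax denominator; it has no value vector.
    """
    d = len(q[0])
    scale = 1.0 / math.sqrt(d) if scale is None else scale
    out = []
    for i, qi in enumerate(q):
        lo = max(0, i - window + 1)  # causal bound plus sliding-window bound
        keys = range(lo, i + 1)
        logits = [scale * sum(a * b for a, b in zip(qi, k[j])) for j in keys]
        m = max(max(logits), sink_logit)  # subtract the max for numerical stability
        weights = [math.exp(x - m) for x in logits]
        # The sink absorbs probability mass, so the weights sum to less than 1.
        denom = sum(weights) + math.exp(sink_logit - m)
        out.append([
            sum(w * v[j][c] for w, j in zip(weights, keys)) / denom
            for c in range(len(v[0]))
        ])
    return out
```

Passing `sink_logit=float("-inf")` removes the sink term and reduces this to ordinary causal sliding-window attention, which makes it easy to compare the two.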

Accuracy Tests

Server command

SGLANG_USE_AITER=1 python3 -m sglang.launch_server --model-path /dockerx/data/models/openai/gpt-oss-120b/ --tp 8 --trust-remote-code --chunked-prefill-size 130172 --max-running-requests 128 --mem-fraction-static 0.85 --prefill-attention-backend aiter --decode-attention-backend triton --port 8000
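For scripted runs, the same launch can be built from Python. This is a minimal sketch that only assembles the argument list and environment from the command above; the helper name `build_launch` is hypothetical, and nothing is started unless the result is passed to `subprocess.Popen`.

```python
import os
import subprocess

def build_launch(model_path, port=8000):
    """Return (argv, env) mirroring the server command above; nothing is started."""
    # SGLANG_USE_AITER=1 is set in the environment, as in the shell command.
    env = dict(os.environ, SGLANG_USE_AITER="1")
    argv = [
        "python3", "-m", "sglang.launch_server",
        "--model-path", model_path,
        "--tp", "8",
        "--trust-remote-code",
        "--chunked-prefill-size", "130172",
        "--max-running-requests", "128",
        "--mem-fraction-static", "0.85",
        "--prefill-attention-backend", "aiter",   # backend added by this PR
        "--decode-attention-backend", "triton",
        "--port", str(port),
    ]
    return argv, env

argv, env = build_launch("/dockerx/data/models/openai/gpt-oss-120b/")
# To actually start the server: subprocess.Popen(argv, env=env)
```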

Client command

/sglang# python3 benchmark/gsm8k/bench_sglang.py --num-questions 2000 --parallel 2000 --port 8000
100%|██████████| 1319/1319 [00:50<00:00, 26.00it/s]
Accuracy: 0.829
Invalid: 0.014
Latency: 86.113 s
Output throughput: 5072.316 token/s

Benchmarking and Profiling

Checklist

Format your code according to the Format code with pre-commit.
Add unit tests according to the Run and add unit tests.
Update documentation according to Write documentations.
Provide accuracy and speed benchmark results according to Test the accuracy and Benchmark the speed.
Follow the SGLang code style guidance.

Review Process

Ping Merge Oncalls to start the PR flow. See the PR Merge Process.
Get approvals from CODEOWNERS and other reviewers.
Trigger CI tests with comments or contact authorized users to do so.
- /tag-run-ci-label, /rerun-failed-ci, /tag-and-rerun-ci
After green CI and required approvals, ask Merge Oncalls to merge.

…gl-project#18282) Co-authored-by: wunhuang <wunhuang@amd.com>

aiter MHA prefill support sliding window attention with sink

5efdefd

kkHuang-amd requested review from Fridge003, Qiaolin-Yu, hebiao064, ispobock and merrymercy as code owners February 5, 2026 04:00

kkHuang-amd added the amd and run-ci labels Feb 5, 2026

kkHuang-amd added 2 commits February 5, 2026 12:01

Merge branch 'main' into gpt-oss-with-aiter-prefill-swa

00afda2

Merge branch 'main' into gpt-oss-with-aiter-prefill-swa

d2fa223

kkHuang-amd requested a review from HaiShaw as a code owner March 2, 2026 00:55

HaiShaw approved these changes Mar 2, 2026

HaiShaw merged commit 15af26d into sgl-project:main Mar 2, 2026
157 of 172 checks passed

kkHuang-amd deleted the gpt-oss-with-aiter-prefill-swa branch March 2, 2026 08:26

Kangyan-Zhou pushed a commit to Kangyan-Zhou/sglang that referenced this pull request Mar 4, 2026

Add aiter attention support in prefill-attention-backend of gpt-oss (s…

3a092dc

…gl-project#18282) Co-authored-by: wunhuang <wunhuang@amd.com>

magicYang1573 pushed a commit to magicYang1573/sglang that referenced this pull request Mar 9, 2026

Add aiter attention support in prefill-attention-backend of gpt-oss (s…

f8a1d0e

…gl-project#18282) Co-authored-by: wunhuang <wunhuang@amd.com>

Wangzheee pushed a commit to Wangzheee/sglang that referenced this pull request Mar 21, 2026

Add aiter attention support in prefill-attention-backend of gpt-oss (s…

3f00ecb

…gl-project#18282) Co-authored-by: wunhuang <wunhuang@amd.com>

HaiShaw merged 3 commits into sgl-project:main from HaiShaw:gpt-oss-with-aiter-prefill-swa
