[AMD] Fix Qwen3-Coder-Next: Add missing k_scale/v_scale args to extend_attention_fwd in aiter_backend by michaelzhang-ai · Pull Request #19736 · sgl-project/sglang

michaelzhang-ai · 2026-03-03T02:12:53Z

Summary

Fix the crash "TypeError: extend_attention_fwd() missing 1 required positional argument: 'v_scale'" in aiter_backend on the non-MLA speculative decoding (target_verify / draft_extend) paths. Failing run: https://github.com/sgl-project/sglang/actions/runs/22648816293/job/65643197339#step:5:6363
Add the missing k_scale=1.0 and v_scale=1.0 arguments to the extend_attention_fwd call.

Motivation

#18882 added k_scale and v_scale as required positional parameters to extend_attention_fwd in triton_ops/extend_attention.py and updated triton_backend.py, but missed the call site in aiter_backend.py for non-MLA target_verify/draft_extend paths (used by hybrid models like Qwen3-Coder-Next with MTP).
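
The failure mode described above can be reproduced in isolation. The sketch below is illustrative only: extend_attention_fwd_stub and its first three parameters are hypothetical stand-ins, not the real signature in triton_ops/extend_attention.py. Only the trailing k_scale and v_scale parameters, and the 1.0 values passed by the fix, are taken from this PR.

```python
# Hypothetical stand-in for the real kernel wrapper. The real function takes
# many more arguments; only k_scale and v_scale mirror the PR description.
def extend_attention_fwd_stub(q_extend, k_extend, v_extend, k_scale, v_scale):
    return {"k_scale": k_scale, "v_scale": v_scale}


def call_site_before_fix():
    # Call written against the old signature: no scale arguments are passed.
    return extend_attention_fwd_stub("q", "k", "v")


def call_site_after_fix():
    # Call updated as in this PR: pass neutral scales of 1.0.
    return extend_attention_fwd_stub("q", "k", "v", 1.0, 1.0)


error_message = None
try:
    call_site_before_fix()
except TypeError as exc:
    # Python rejects the call before the function body runs.
    error_message = str(exc)

fixed_result = call_site_after_fix()
print(error_message)
print(fixed_result)
```

The exact wording of the TypeError depends on how many positional arguments the stale call passes, so the stub's message may not match the nightly log word for word.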

Fixes

Nightly (AMD ROCm) failure: https://github.com/sgl-project/sglang/actions/runs/22648816293/job/65643197339#step:5:6363

TypeError: extend_attention_fwd() missing 1 required positional argument: 'v_scale'

Test plan

Re-run nightly-accuracy-8-gpu-mi35x / nightly-accuracy-8-gpu-mi35x-rocm720 to confirm Qwen3-Coder-Next MTP passes:
https://github.com/sgl-project/sglang/actions/runs/22649193163/job/65644407193
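
As a complement to re-running the nightly job, a call-site mismatch of this kind could in principle be caught without a GPU by binding the intended arguments against the function signature. This is a minimal sketch under stated assumptions, not part of this PR: check_call_matches_signature and kernel_stub are hypothetical names, and the stub signature is invented for illustration.

```python
import inspect


def check_call_matches_signature(func, *args, **kwargs):
    """Return None if the arguments bind to func's signature, else the error text."""
    try:
        # bind() validates arguments against the signature without calling func.
        inspect.signature(func).bind(*args, **kwargs)
    except TypeError as exc:
        return str(exc)
    return None


# Hypothetical stand-in; only k_scale and v_scale mirror the PR description.
def kernel_stub(q, k, v, k_scale, v_scale):
    raise RuntimeError("never executed by the signature check")


stale_error = check_call_matches_signature(kernel_stub, "q", "k", "v")
fixed_error = check_call_matches_signature(kernel_stub, "q", "k", "v", 1.0, 1.0)
print(stale_error)
print(fixed_error)
```

Because bind() never executes the function body, a check like this only verifies that the argument list fits the signature. It says nothing about numerical correctness, which still requires the accuracy run above.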

gemini-code-assist · 2026-03-03T02:12:57Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

yichiche

This PR fixes a crash in the CUDA graph replay path for non-MLA backends by properly initializing the forward metadata. Previously, custom_mask and mask_indptr were not consistently set outside the MLA path, which could lead to invalid memory access during replay. LGTM.

HaiShaw · 2026-03-03T04:37:39Z

/tag-and-rerun-ci

…d call

#18882 added k_scale and v_scale as required positional parameters to extend_attention_fwd and updated triton_backend.py, but missed updating the call site in aiter_backend.py for non-MLA target_verify/draft_extend paths. This caused Qwen3-Coder-Next MTP to crash with:

TypeError: extend_attention_fwd() missing 1 required positional argument: 'v_scale'

Fixes: https://github.com/sgl-project/sglang/actions/runs/22636393480/job/65600256830

kkHuang-amd

LGTM

…d_attention_fwd in aiter_backend (sgl-project#19736)

michaelzhang-ai requested review from bingxche, yctseng0211 and yichiche March 3, 2026 02:15

yichiche approved these changes Mar 3, 2026

michaelzhang-ai marked this pull request as ready for review March 3, 2026 03:00

michaelzhang-ai requested review from Fridge003, HaiShaw, Qiaolin-Yu, hebiao064, ispobock and merrymercy as code owners March 3, 2026 03:00

github-actions bot added the run-ci label Mar 3, 2026

HaiShaw assigned kkHuang-amd Mar 3, 2026

michaelzhang-ai force-pushed the amd/fix-aiter-mla-replay-cuda-graph-v2 branch from 6aedf5d to 9cf021b Compare March 4, 2026 00:22

michaelzhang-ai changed the title ~~[AMD] Fix Qwen3-Coder-Next: AiterAttnBackend crash for non-MLA backends~~ [AMD] Fix Qwen3-Coder-Next: Add missing k_scale/v_scale args to extend_attention_fwd in aiter_backend Mar 4, 2026

HaiShaw approved these changes Mar 4, 2026

kkHuang-amd approved these changes Mar 4, 2026

HaiShaw merged commit c6850ac into main Mar 4, 2026
175 of 198 checks passed

HaiShaw deleted the amd/fix-aiter-mla-replay-cuda-graph-v2 branch March 4, 2026 06:01

Kangyan-Zhou pushed a commit to Kangyan-Zhou/sglang that referenced this pull request Mar 4, 2026

[AMD] Fix Qwen3-Coder-Next: Add missing k_scale/v_scale args to exten…

a7085ab

…d_attention_fwd in aiter_backend (sgl-project#19736)

qeternity pushed a commit to qeternity/sglang that referenced this pull request Mar 6, 2026

[AMD] Fix Qwen3-Coder-Next: Add missing k_scale/v_scale args to exten…

5e5cc50

…d_attention_fwd in aiter_backend (sgl-project#19736)

This was referenced Mar 8, 2026

[AMD] Add Claude skills for AMD CI workflows michaelzhang-ai/sglang#8

Closed

[AMD] Add Claude skills for AMD CI workflows #20116

Draft

magicYang1573 pushed a commit to magicYang1573/sglang that referenced this pull request Mar 9, 2026

[AMD] Fix Qwen3-Coder-Next: Add missing k_scale/v_scale args to exten…

58e1f6d

…d_attention_fwd in aiter_backend (sgl-project#19736)

Wangzheee pushed a commit to Wangzheee/sglang that referenced this pull request Mar 21, 2026

[AMD] Fix Qwen3-Coder-Next: Add missing k_scale/v_scale args to exten…

a997017

…d_attention_fwd in aiter_backend (sgl-project#19736)

HaiShaw merged 1 commit into main from amd/fix-aiter-mla-replay-cuda-graph-v2

4 participants
