Add SM120 (Blackwell desktop) MXFP4 support by amittell · Pull Request #16975 · sgl-project/sglang

amittell · 2026-01-12T18:47:33Z

SM120 (RTX PRO 6000, RTX 5090) doesn't support persistent kernels the same way SM100 does. This adds SM120-specific configuration following the same pattern as vLLM PR vllm-project/vllm#31089:

- Use StridedLayout instead of TMA block layout
- Set is_persistent=False and num_stages=1
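The two settings above amount to an architecture-dependent choice of kernel options. A minimal sketch of that selection follows. The dataclass, the function name, and the non-SM120 defaults are hypothetical illustrations, not the code in this PR. Only the SM120 values (strided layout, `is_persistent=False`, `num_stages=1`) come from the description.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Mxfp4KernelConfig:
    """Hypothetical container; field names are illustrative only."""
    weight_layout: str          # "strided" or "tma_block"
    is_persistent: bool         # whether to use a persistent kernel
    num_stages: Optional[int]   # None means "keep the kernel's default"


def select_mxfp4_kernel_config(capability: Tuple[int, int]) -> Mxfp4KernelConfig:
    """Pick kernel options from a (major, minor) compute capability.

    On a real system the tuple could come from
    torch.cuda.get_device_capability().
    """
    major, _minor = capability
    if major == 12:
        # SM120 (RTX PRO 6000, RTX 5090): values taken from the PR description.
        return Mxfp4KernelConfig(
            weight_layout="strided", is_persistent=False, num_stages=1
        )
    # Assumption: other architectures keep the TMA block layout and
    # persistent kernels, with num_stages left at its default.
    return Mxfp4KernelConfig(
        weight_layout="tma_block", is_persistent=True, num_stages=None
    )


sm120 = select_mxfp4_kernel_config((12, 0))
sm100 = select_mxfp4_kernel_config((10, 0))
```

Keeping the branch in one place means the SM120 overrides stay together and other architectures are left untouched.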

Tested with GPT-OSS-120B on RTX PRO 6000:

- 4K context: 151 tok/s
- 131K context: 57 tok/s

Fixes #13061, related to #9707, #12695



SM120 (Blackwell desktop) doesn't support flashinfer_mxfp4 due to persistent kernel limitations. This adds:

1. server_args.py: Auto-select triton_kernel for SM120 + MXFP4
2. mxfp4.py: Use StridedLayout for SM120 with triton_kernel backend

Tested on RTX PRO 6000 with GPT-OSS-120B: the server starts and runs without needing to manually specify --moe-runner-backend.
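The backend auto-selection described in this commit message could look roughly like the sketch below. The function name, its parameters, and the `"auto"` fallback value are assumptions made for illustration. Only the rule "SM120 + MXFP4 selects triton_kernel when the user did not pass --moe-runner-backend" is taken from the text, and the actual logic in server_args.py may differ.

```python
from typing import Optional, Tuple


def resolve_moe_runner_backend(
    requested: Optional[str],
    quantization: Optional[str],
    capability: Tuple[int, int],
) -> str:
    """Hypothetical helper: choose a MoE runner backend.

    requested    -- value of --moe-runner-backend, or None if not given
    quantization -- e.g. "mxfp4"
    capability   -- (major, minor) compute capability of the GPU
    """
    if requested is not None:
        # An explicit user choice always wins.
        return requested
    if quantization == "mxfp4" and capability[0] == 12:
        # SM120 cannot use flashinfer_mxfp4, so fall back to triton_kernel.
        return "triton_kernel"
    # Assumption: placeholder for whatever default applies elsewhere.
    return "auto"


chosen = resolve_moe_runner_backend(None, "mxfp4", (12, 0))
explicit = resolve_moe_runner_backend("flashinfer_mxfp4", "mxfp4", (12, 0))
other = resolve_moe_runner_backend(None, "mxfp4", (10, 0))
```

Checking the explicit flag first keeps the override non-surprising: a user who asks for a specific backend still gets it, even on SM120.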

b8zhong · 2026-03-02T01:49:32Z

@amittell Could you help fix the merge conflicts? Thanks!

b8zhong · 2026-03-02T22:26:59Z

Hi, I added you as a co-author in #19718. Thanks!

amittell requested review from AniZpZ, BBuf, Edwardf0t1, FlamingoPg and ch-wan as code owners January 12, 2026 18:47

amittell added 2 commits January 12, 2026 13:55

fix: apply black formatting

f31f171

This was referenced Jan 29, 2026

fix(attention): add SM120 block size configuration for extend attention #17908

Closed

[Bug] Kimi k2 crashes sglang after first request on sm120 #14322

Closed

This was referenced Mar 2, 2026

SM120 Performance Optimization Plan #19637

Open

Support triton_kernels for GPT-OSS on SM120 #19718

Merged

b8zhong closed this Mar 2, 2026

amittell wants to merge 3 commits into sgl-project:main from amittell:sm120-mxfp4-support
