fix: trtllm mha attention auto-selection on sm120 #14842

Merged
Fridge003 merged 1 commit into sgl-project:main from bzhng-development:fix-sm120-trtllm-mha
Dec 12, 2025

Conversation

@b8zhong
Collaborator

@b8zhong b8zhong commented Dec 10, 2025

TRTLLM MHA does not support SM120. On SM120, FlashInfer is the best-performing attention backend, so auto-selection should pick it instead.

Fix #14814
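
For illustration only, the selection rule described above could be sketched roughly as follows. This is not the code from this PR: `select_attention_backend` and its `fallback` parameter are hypothetical names, and the assumption that `trtllm_mha` is chosen on SM100 is mine rather than something stated in this thread.

```python
def select_attention_backend(major: int, minor: int, fallback: str = "flashinfer") -> str:
    """Hypothetical helper sketching attention backend auto-selection.

    `major` and `minor` are the CUDA compute capability numbers,
    e.g. (12, 0) for an SM120 device such as the RTX 5090.
    """
    sm = major * 10 + minor

    # Per this PR: TRTLLM MHA does not support SM120, and FlashInfer
    # is the best-performing backend there.
    if sm == 120:
        return "flashinfer"

    # Assumption (not stated in this thread): trtllm_mha is only
    # auto-selected on SM100.
    if sm == 100:
        return "trtllm_mha"

    # Everything else uses whatever default the caller supplies.
    return fallback


if __name__ == "__main__":
    print(select_attention_backend(12, 0))
    print(select_attention_backend(10, 0))
```

Matching on the exact SM number, rather than a range such as "major version at least 10", is what keeps SM120 devices from being routed to a backend that does not support them.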

@b8zhong b8zhong requested review from Fridge003 and fzyzcjy December 10, 2025 19:04
@gemini-code-assist
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@b8zhong
Collaborator Author

b8zhong commented Dec 10, 2025

/tag-and-rerun-ci

Also, this case can't be covered by CI (there is no SM120 device)

@Fridge003 Fridge003 merged commit fe6d38d into sgl-project:main Dec 12, 2025
91 of 134 checks passed
BenYao21 pushed a commit to minleminzui/sglang that referenced this pull request Dec 13, 2025
Co-authored-by: Brayden Zhong <b8zhong@users.noreply.github.com>
@b8zhong b8zhong deleted the fix-sm120-trtllm-mha branch December 13, 2025 06:10
Prozac614 pushed a commit to Prozac614/sglang that referenced this pull request Dec 17, 2025
Co-authored-by: Brayden Zhong <b8zhong@users.noreply.github.com>
YChange01 pushed a commit to YChange01/sglang that referenced this pull request Jan 13, 2026
Co-authored-by: Brayden Zhong <b8zhong@users.noreply.github.com>
Development

Successfully merging this pull request may close these issues.

[Bug] RTX 5090, the attention_backend is automatically set to 'trtllm_mha', but a ValueError is raised during SM version detection.

3 participants