[CUDA] PagedAttention: add SM<80 fp16 fallback via memory-efficient attention #28200

Merged
tianleiwu merged 10 commits into microsoft:main from elwhyjay:feature/paged-attention-mea-fallback
Apr 28, 2026

[CUDA] PagedAttention: early-return on empty query input (token_count…

7375578
GitHub Advanced Security / lintrunner succeeded Apr 25, 2026 in 1s

No new alerts in code changed by this pull request