
[CUDA] PagedAttention: add SM<80 fp16 fallback via memory-efficient attention #28200

Merged
tianleiwu merged 10 commits into microsoft:main from elwhyjay:feature/paged-attention-mea-fallback on Apr 28, 2026

[CUDA] PagedAttention: early-return on empty query input (token_count…

7375578
GitHub Advanced Security / CodeQL completed Apr 25, 2026 in 2s

1 configuration not found

Warning: Code scanning cannot determine the alerts introduced by this pull request, because 1 configuration present on refs/heads/main was not found:

API upload

  • ❓  <default>
