[Performance] Support MQA/GQA in decode phase by using FlashAttention #2744

Closed
zhaoyang-star wants to merge 11 commits into vllm-project:main from zhaoyang-star:fa_decode

Commits

Commits on Jan 17, 2024

Commits on Jan 23, 2024

Commits on Feb 2, 2024

Commits on Feb 4, 2024

Commits on Feb 5, 2024