[Attention][TurboQuant] Optimize k8v4 decode attention with GQA head grouping #40792

Open
hoseung2 wants to merge 3 commits into vllm-project:main from hoseung2:turboquant-fp8-gqa

Commits

Commits on Apr 24, 2026