[Attention][TurboQuant] Optimize k8v4 decode attention with GQA head grouping #40792
Open
hoseung2 wants to merge 3 commits into vllm-project:main from
Commits
Commits on Apr 24, 2026
- hoseung-kim committed
- hoseung-kim committed
- hoseung-kim committed