[Attention][TurboQuant] Optimize k8v4 decode attention with GQA head grouping #40792

Open

hoseung2 wants to merge 3 commits into vllm-project:main from hoseung2:turboquant-fp8-gqa

Commits on Apr 24, 2026

perf: turboquant GQA head grouping
hoseung-kim committed
test: add turboquant gqa grouping test
hoseung-kim committed
refactor: remove dead code, unused variables
hoseung-kim committed