CUDA backend: 3-bit uniform KV cache (turbo3, 4.6x compression, 96% f16 speed)#15

Closed
nalditopr wants to merge 95 commits into TheTom:feature/turboquant-kv-cache from nalditopr:feature/tq3-reference-approach

Commits

Commits on Mar 26, 2026

Commits on Mar 27, 2026

Commits on Mar 28, 2026

Commits on Mar 29, 2026