[CUDA] Increase number of output elements per-thread block if the K-dimension is small #20635

Open
gaugarg-nv wants to merge 3 commits into ggml-org:master from gaugarg-nv:small_k_optimization

Commits

Commits on Mar 18, 2026