opencl: use flat variants of gemv for very large M by lhez · Pull Request #24006 · ggml-org/llama.cpp

lhez · 2026-06-02T04:44:43Z

Overview

Profiling shows that the gemv-noshuffle kernels for Q4_K and Q6_K are slow when M is very large (the sizes seen in vocab). The flat variants, by contrast, are faster. This PR uses the flat GEMV variants for such large M.
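The rerouting described above can be sketched as a small dispatch function. This is only an illustration under stated assumptions, not the code in this PR: the enum names, `select_gemv_kernel`, and the `kLargeMThreshold` value are all hypothetical, and the actual cutoff used by the OpenCL backend is not stated here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical kernel families; illustrative names, not the backend's identifiers.
enum class GemvKernel { NoShuffle, Flat };

// Only the two quantization types named in the PR are modeled explicitly.
enum class QuantType { Q4_K, Q6_K, Other };

// Assumed cutoff for "very large M"; the real value is not given in the PR text.
const int64_t kLargeMThreshold = 65536;

// Pick a GEMV variant for an M x K weight matrix multiplied by a K-vector.
// Q4_K and Q6_K switch to the flat variant once M reaches the threshold;
// everything else stays on the default path in this sketch.
inline GemvKernel select_gemv_kernel(QuantType type, int64_t m) {
    const bool affected = (type == QuantType::Q4_K || type == QuantType::Q6_K);
    if (affected && m >= kLargeMThreshold) {
        return GemvKernel::Flat;
    }
    return GemvKernel::NoShuffle;
}
```

With these assumed values, a vocab-sized M such as 262144 would select the flat variant for Q4_K, while a typical hidden-size M such as 4096 would stay on the noshuffle path.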

Additional information

Requirements

I have read and agree with the contributing guidelines
AI usage disclosure: Yes. Asked Claude to profile gemma-4 and Qwen3.5 non-MoE models, and this was identified as a result.

lhez · 2026-06-02T17:53:39Z

@ggml-org/maintainers Can I please get another approval?

opencl: use flat variants of gemv for very large M

a2c94a9

github-actions bot added the ggml (changes relating to the ggml tensor library for machine learning) and OpenCL (issues specific to the OpenCL backend) labels Jun 2, 2026

lhez marked this pull request as ready for review June 2, 2026 06:42

lhez requested a review from a team as a code owner June 2, 2026 06:42

max-krasnyansky approved these changes Jun 2, 2026

CISC approved these changes Jun 2, 2026

lhez merged commit 63e66fd into ggml-org:master Jun 2, 2026
25 of 26 checks passed

arichiardi pushed a commit to arichiardi/llama.cpp that referenced this pull request Jun 2, 2026

opencl: use flat variants of q4_K and q6_K gemv for very large M (ggm…

fc487d9

…l-org#24006)

jimbothigpen pushed a commit to jimbothigpen/llama.cpp that referenced this pull request Jun 6, 2026

opencl: use flat variants of q4_K and q6_K gemv for very large M (ggm…

b3a3941

…l-org#24006) (cherry picked from commit 63e66fd)

lhez merged 1 commit into ggml-org:master from qualcomm:lh/gemv-large-m-reroute

3 participants
