ggml-webgpu: only use subgroup-matrix path when head dims are divisib… by ArberSephirotheca · Pull Request #23020 · ggml-org/llama.cpp

ArberSephirotheca · 2026-05-13T18:22:25Z

Overview

Previously, WebGPU FlashAttention selected the subgroup matrix path whenever subgroup matrix support was available. However, this fails when a head dimension is not a multiple of the device's subgroup matrix shape. For example, Jetson Thor’s smallest supported subgroup matrix shape is 16x16x16, which is incompatible with head dimensions such as 40 and 72.
This change adds a shape guard before selecting the subgroup matrix path. Specifically, it requires:
head_dim_qk % sg_mat_k == 0 and head_dim_v % sg_mat_n == 0.
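
As a rough illustration of the guard described above, here is a minimal sketch. It is not the code from this PR: the struct `SubgroupMatrixShape` and the function `can_use_subgroup_matrix_path` are hypothetical names chosen for this example. Only `head_dim_qk`, `head_dim_v`, `sg_mat_k`, and `sg_mat_n` come from the description.

```cpp
#include <cstdint>

// Hypothetical container for the subgroup matrix shape a device reports
// (not an actual ggml-webgpu type).
struct SubgroupMatrixShape {
    uint32_t sg_mat_m;
    uint32_t sg_mat_n;
    uint32_t sg_mat_k;
};

// Hypothetical helper: returns true only when subgroup matrices are supported
// and both head dimensions are whole multiples of the matching tile dimension.
inline bool can_use_subgroup_matrix_path(bool                        subgroup_matrix_supported,
                                         const SubgroupMatrixShape & shape,
                                         uint32_t                    head_dim_qk,
                                         uint32_t                    head_dim_v) {
    // Without support (or with a degenerate shape) the path is never eligible.
    if (!subgroup_matrix_supported || shape.sg_mat_k == 0 || shape.sg_mat_n == 0) {
        return false;
    }
    // The shape guard from the description.
    return head_dim_qk % shape.sg_mat_k == 0 && head_dim_v % shape.sg_mat_n == 0;
}
```

With a 16x16x16 shape, head dimensions of 40 or 72 fail this check (40 % 16 = 8, 72 % 16 = 8), so a caller would presumably select a different FlashAttention path, while head dimensions such as 64 or 128 pass.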

Requirements

I have read and agree with the contributing guidelines
AI usage disclosure: Yes

I used an AI agent to help me understand why the tests are failing on my Jetson Thor machine.

reeselevine · 2026-05-13T19:45:39Z

Nice, I wonder if this is the same failure I'm seeing right now as I try to enable the NVIDIA CI: https://github.com/ggml-org/llama.cpp/actions/runs/25816362883/job/75845993489?pr=22976#step:4:13081

ArberSephirotheca · 2026-05-13T20:08:33Z

Yeah, very likely. These tests also failed on my Jetson Thor because they have hsv = 40, which is not divisible by 16.

ggml-webgpu: only use subgroup-matrix path when head dims are divisib…

c6f97cf

…le by sg_mat_k / sg_mat_n

ArberSephirotheca requested a review from a team as a code owner May 13, 2026 18:22

CISC approved these changes May 13, 2026

reeselevine approved these changes May 13, 2026

github-actions bot added the ggml (changes relating to the ggml tensor library for machine learning) and WebGPU labels May 13, 2026

reeselevine merged commit 4c1c3ac into ggml-org:master May 13, 2026
47 checks passed

xxmustafacooTR pushed a commit to xxPlayground/llama-cpp-turboquant that referenced this pull request May 14, 2026

ggml-webgpu: only use subgroup-matrix path when head dims are divisib…

4a65ee5

…le by sg_mat_k / sg_mat_n (ggml-org#23020)

dandm1 pushed a commit to dandm1/llama.cpp that referenced this pull request May 16, 2026

ggml-webgpu: only use subgroup-matrix path when head dims are divisib…

3cf4c11

…le by sg_mat_k / sg_mat_n (ggml-org#23020)

rsenthilkumar6 pushed a commit to rsenthilkumar6/llama.cpp that referenced this pull request May 19, 2026

ggml-webgpu: only use subgroup-matrix path when head dims are divisib…

1785043

…le by sg_mat_k / sg_mat_n (ggml-org#23020)

baramofme pushed a commit to baramofme/llama-cpp-turboquant that referenced this pull request May 23, 2026

ggml-webgpu: only use subgroup-matrix path when head dims are divisib…

3942931

…le by sg_mat_k / sg_mat_n (ggml-org#23020)

winstonma pushed a commit to winstonma/llama.cpp that referenced this pull request May 27, 2026

ggml-webgpu: only use subgroup-matrix path when head dims are divisib…

451b348

…le by sg_mat_k / sg_mat_n (ggml-org#23020)

fewtarius pushed a commit to fewtarius/llama.cpp that referenced this pull request May 30, 2026

ggml-webgpu: only use subgroup-matrix path when head dims are divisib…

bec8f49

…le by sg_mat_k / sg_mat_n (ggml-org#23020)

reeselevine merged 1 commit into ggml-org:master from ArberSephirotheca:webgpu-fattn-sgmat-dim-guard
