
vulkan: split mul_mat into multiple dispatches to avoid overflow#19509

Merged
0cc4m merged 2 commits into ggml-org:master from jeffbolznv:mul_mat_batch_overflow on Feb 18, 2026
Conversation

@jeffbolznv (Contributor)

The batch dimensions can be greater than the max workgroup count limit, in which case we need to split into multiple dispatches and pass the base index through a push constant.

Fall back for the less common p021 and nc variants.

Fixes #19471.

@github-actions bot added the labels testing (Everything test related), Vulkan (Issues specific to the Vulkan backend), and ggml (changes relating to the ggml tensor library for machine learning) on Feb 11, 2026
@jeffbolznv (Contributor, Author)

The new tests are failing on multiple backends; I'll move them to a separate PR so this isn't blocked.

    while (base_work_group_z < batch) {
        uint32_t groups_z = std::min(batch - base_work_group_z, ctx->device->properties.limits.maxComputeWorkGroupCount[2]);

        ggml_pipeline_request_descriptor_sets(ctx, pipeline, 1);
Contributor:

Why request the descriptor sets in the loop and not before? It's not gonna retrigger pipeline compile of course, but will ping the descriptor pools more than necessary.

Contributor Author:

ok, moved it out of the loop for now. Eventually, I'd like to not have to explicitly call this anywhere.

Contributor:

It should be possible to automatically request one when grabbing the pipeline and to allocate the sets on demand before dispatch.

Contributor Author:

Yeah, the main catch right now is that some shaders won't get their wg_denoms initialized until after this call.

@0cc4m 0cc4m merged commit d0061be into ggml-org:master Feb 18, 2026
74 of 78 checks passed
liparetejas pushed a commit to liparetejas/llama.cpp that referenced this pull request Feb 23, 2026
bartowski1182 pushed a commit to bartowski1182/llama.cpp that referenced this pull request Mar 2, 2026
ArberSephirotheca pushed a commit to ArberSephirotheca/llama.cpp that referenced this pull request Mar 3, 2026

Development

Successfully merging this pull request may close these issues.

Vulkan: GGML_ASSERT failed on Kimi-Linear-48B with large context - maxComputeWorkGroupCount exceeded

2 participants