test: mul_mat tests with huge batch size #19519
Conversation
@reeselevine can you address the WebGPU failure? @JohannesGaessler or @am17an can you address the CUDA failure? For context, in #19471 with a larger
#19535 should fix the WebGPU failures.
Are these only for the F16 data type? For large batch sizes the CUDA code falls back to cuBLAS; I think that should be a relatively simple change compared to doing it for quantized data types.
In the failing model, everything was GGML_TYPE_F32. The GGML_TYPE_F16 came from me copy/pasting another test case. We could add both if there's an interesting difference in the code paths.
As long as it's F16, BF16 or F32, I think #19538 will fix it (it passes these tests).
Force-pushed from a3de448 to f6c10e6.
Tests for #19471.
The Vulkan fix is in #19509.