Add missing `buffer` set in allreduce fallback !COMPUTE clear #23480
Merged
Without this, at least the Vulkan backend will skip the `* 0` for !COMPUTE tensors, causing corrupt output.
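The failure mode described above can be illustrated with a small standalone sketch. It is not the actual ggml or Vulkan code: `Buffer`, `Tensor`, `backend_scale`, and `allreduce_fallback` are hypothetical stand-ins, and the sketch assumes the backend check in question skips any op whose tensor has no `buffer` set. Under that assumption, the `* 0` clear never runs and stale values leak into the reduced sum.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT the real ggml types.
struct Buffer {};

struct Tensor {
    std::vector<float> data;
    Buffer * buffer = nullptr; // nullptr models a tensor whose `buffer` was never set
    bool compute = true;       // false models a !COMPUTE tensor
};

// Models a backend that treats a tensor without a buffer as "nothing to do"
// (assumed behaviour). Returns true if the scale was actually executed.
bool backend_scale(Tensor & t, float s) {
    if (t.buffer == nullptr) {
        return false; // op skipped, data left untouched
    }
    for (float & v : t.data) {
        v *= s;
    }
    return true;
}

// Fallback allreduce sketch: clear the contribution of !COMPUTE tensors
// via `* 0`, then sum all parts element-wise.
std::vector<float> allreduce_fallback(std::vector<Tensor> & parts, Buffer * buf, bool set_buffer) {
    assert(!parts.empty());
    std::vector<float> sum(parts[0].data.size(), 0.0f);
    for (Tensor & t : parts) {
        assert(t.data.size() == sum.size());
        if (!t.compute) {
            if (set_buffer) {
                t.buffer = buf; // the "missing buffer set" this sketch models
            }
            backend_scale(t, 0.0f); // silently skipped when buffer is nullptr
        }
        for (std::size_t i = 0; i < sum.size(); ++i) {
            sum[i] += t.data[i];
        }
    }
    return sum;
}
```

With `set_buffer == false`, the !COMPUTE tensor keeps whatever values it held and those are added into the result. With `set_buffer == true`, the clear runs and only the computed part contributes.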
JohannesGaessler
approved these changes
May 22, 2026
JohannesGaessler
left a comment
Contributor
This should be correct. Can you describe the failure mode when this is not set?
Contributor
Author
The check at llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp (line 13213 in 9c92e96).

Admittedly, this currently doesn't accomplish much against upstream Vulkan, because trying to actually use tensor parallelism on Vulkan immediately segfaults due to the meta backend not handling split buffers. At some point I'll get around to upstreaming my patch for that, but once I do, this will also be required :).
Contributor
@ggerganov @gaugarg-nv can either one of you please review this PR?
gaugarg-nv
approved these changes
May 28, 2026
Contributor
Looks good to me.
gabe-l-hart
added a commit
to gabe-l-hart/llama.cpp
that referenced
this pull request
May 29, 2026
* origin/master:
  vocab : support tokenizer for LFM2.5-8B-A1B (ggml-org#23826)
  graph : ensure DS32 kq_mask_lid is F32 (ggml-org#23864)
  server: remove obsolete scripts (ggml-org#23870)
  ci : update macos release to use macos-26 runner (ggml-org#23878)
  download: add option to skip_download (ggml-org#23059)
  mtmd: Add DeepSeekOCR 2 Support (ggml-org#20975)
  CUDA: Check PTX version on host side to guard PDL dispatch (ggml-org#23530)
  server: bump timeout to 3600s (ggml-org#23842)
  model : support for DeepseekV32ForCausalLM with generic DeepSeek Sparse Attention (DSA) implementation (ggml-org#23346)
  llama: use f16 mask for FA to save VRAM (ggml-org#23764)
  sync : ggml
  ggml : bump version to 0.13.1 (ggml/1523)
  ngram-mod : Add missing include (ggml-org#23857)
  llama: add llm_graph_input_mtp (ggml-org#23643)
  app : move licences to llama-app (ggml-org#23824)
  cuda : disables launch_fattn PDL enrollment due to compiler bug (ggml-org#23825)
  meta : Add missing `buffer` set in allreduce fallback !COMPUTE clear (ggml-org#23480)
fewtarius
pushed a commit
to fewtarius/llama.cpp
that referenced
this pull request
May 30, 2026
…gml-org#23480) Without this at least the vulkan backend will skip the `* 0` for !COMPUTE tensors, causing corrupt output.
turbo-tan
pushed a commit
to turbo-tan/llama.cpp-tq3
that referenced
this pull request
Jun 2, 2026
…gml-org#23480) Without this at least the vulkan backend will skip the `* 0` for !COMPUTE tensors, causing corrupt output.