Add missing `buffer` set in allreduce fallback !COMPUTE clear #23480

Merged
ggerganov merged 1 commit into ggml-org:master from TheBlueMatt:2026-05-ar-fallback
May 29, 2026

@TheBlueMatt

Contributor

Without this, at least the Vulkan backend will skip the `* 0` for !COMPUTE tensors, causing corrupt output.
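To make the impact concrete, here is a minimal sketch of why a skipped clear matters in a sum-style allreduce. It assumes, as one reading of the description above, that a tensor without the COMPUTE flag is expected to contribute zeros to the reduction. Every name below (`toy_allreduce_sum`, `toy_reduce_with_idle_device`) and all of the values are hypothetical illustrations, not part of ggml.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: element-wise sum over per-device partial results.
static std::vector<float> toy_allreduce_sum(const std::vector<std::vector<float>> & parts) {
    std::vector<float> out(parts.empty() ? 0 : parts[0].size(), 0.0f);
    for (const auto & p : parts) {
        assert(p.size() == out.size()); // all devices must hold the same shape
        for (std::size_t i = 0; i < out.size(); ++i) {
            out[i] += p[i];
        }
    }
    return out;
}

// Device 0 holds a real partial result; device 1 did no work for this tensor
// and still holds stale values unless the clear (multiply by 0) actually ran.
static std::vector<float> toy_reduce_with_idle_device(bool clear_ran) {
    std::vector<float> active = {1.0f, 2.0f};
    std::vector<float> idle   = {100.0f, -50.0f}; // stale leftovers (made-up values)
    if (clear_ran) {
        for (float & v : idle) {
            v *= 0.0f; // the `* 0` clear
        }
    }
    return toy_allreduce_sum({active, idle});
}
```

With the clear applied the reduction returns only the active device's values. When the clear is skipped, the stale values are added into every element, which is consistent with the corrupt output described here.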

  • I have read and agree with the contributing guidelines
  • AI usage disclosure: YES, Claude found this bug while debugging unrelated issues with tensor parallelism.

@github-actions github-actions Bot added the ggml changes relating to the ggml tensor library for machine learning label May 21, 2026

@JohannesGaessler JohannesGaessler left a comment

Contributor


This should be correct. Can you describe the failure mode when this is not set?

@TheBlueMatt

Contributor Author

The check at

    if (ggml_is_empty(node) || ggml_op_is_empty(node->op) || !node->buffer) {

will skip the SCALE op, leading to gibberish output (Chinese from English prompts, repeated tokens, etc.). I found this when a driver issue sent my Vulkan allreduce down the fallback path and corrupted the output.
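The failure mode can be sketched in isolation. The example below is a toy model rather than ggml code: `toy_tensor`, `toy_op`, `toy_compute_node`, `toy_make_clear_node`, and `toy_demo` are all hypothetical stand-ins, and the guard only mirrors the shape of the check quoted above. It shows a SCALE-by-zero "clear" node being silently skipped when its `buffer` field was never set, leaving stale data in place.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT ggml's real types.
enum class toy_op { NONE, SCALE };

struct toy_tensor {
    toy_op             op     = toy_op::NONE;
    float              factor = 1.0f;    // multiplier used by SCALE
    std::vector<float> data;             // payload, modified in place
    const void *       buffer = nullptr; // backing-buffer handle; null means "not set"
};

// Mirrors the shape of the quoted guard: empty tensors, no-op nodes, and nodes
// without a buffer are skipped. Returns true only if the op actually ran.
static bool toy_compute_node(toy_tensor & node) {
    if (node.data.empty() || node.op == toy_op::NONE || !node.buffer) {
        return false;
    }
    for (float & v : node.data) {
        v *= node.factor;
    }
    return true;
}

// Builds a "clear" node (SCALE by 0) over stale data, optionally setting `buffer`.
static toy_tensor toy_make_clear_node(bool set_buffer) {
    static const int dummy_buffer = 0; // any non-null address works for this sketch
    toy_tensor t;
    t.op     = toy_op::SCALE;
    t.factor = 0.0f;
    t.data   = {3.0f, -1.5f, 7.0f}; // stale values (made up)
    if (set_buffer) {
        t.buffer = &dummy_buffer;
    }
    return t;
}

// Usage sketch: the node without a buffer keeps its stale data.
static void toy_demo() {
    toy_tensor missing = toy_make_clear_node(false);
    toy_tensor fixed   = toy_make_clear_node(true);
    assert(!toy_compute_node(missing) && missing.data[0] == 3.0f); // skipped, not cleared
    assert(toy_compute_node(fixed) && fixed.data[0] == 0.0f);      // ran, cleared
}
```

In this toy model the only difference between the two nodes is whether `buffer` is set, which corresponds to the one-field fix the PR title describes.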

Admittedly, this doesn't accomplish much against upstream Vulkan right now, because actually using tensor parallelism on Vulkan immediately segfaults: the meta backend does not handle split buffers. At some point I'll get around to upstreaming my patch for that, and once I do, this fix will also be required :).

@JohannesGaessler

Contributor

@ggerganov @gaugarg-nv can either one of you please review this PR?

@gaugarg-nv

Contributor

Looks good to me.

@ggerganov ggerganov merged commit 33c718d into ggml-org:master May 29, 2026
47 of 49 checks passed
gabe-l-hart added a commit to gabe-l-hart/llama.cpp that referenced this pull request May 29, 2026
meta : Add missing `buffer` set in allreduce fallback !COMPUTE clear (ggml-org#23480)
fewtarius pushed a commit to fewtarius/llama.cpp that referenced this pull request May 30, 2026
turbo-tan pushed a commit to turbo-tan/llama.cpp-tq3 that referenced this pull request Jun 2, 2026

4 participants