UPSTREAM PR #19286: completion : simplify batch (embd) processing #1151

Open
loci-dev wants to merge 2 commits into main from
loci/pr-19286-completion-embd-processing-simplification
Conversation


@loci-dev loci-dev commented Feb 3, 2026

Note

Source pull request: ggml-org/llama.cpp#19286

This commit simplifies the processing of embd by removing the for loop that iterates over it in increments of params.n_batch. It also removes the clamping of n_eval, since the size of embd is always at most params.n_batch.

The motivation is to clarify the code: read in isolation, the current for loop suggests it can process multiple batches, which is misleading.


loci-review bot commented Feb 3, 2026

No meaningful performance changes were detected across 115468 analyzed functions in the following binaries: build.bin.llama-cvector-generator, build.bin.llama-tts, build.bin.libllama.so, build.bin.libmtmd.so, build.bin.llama-bench, build.bin.llama-gguf-split, build.bin.llama-llava-cli, build.bin.llama-minicpmv-cli, build.bin.llama-quantize, build.bin.llama-gemma3-cli, build.bin.libggml.so, build.bin.libggml-base.so, build.bin.libggml-cpu.so, build.bin.llama-tokenize, build.bin.llama-qwen2vl-cli.

🔎 Full breakdown: Loci Inspector.
💬 Questions? Tag @loci-dev.


@loci-dev loci-dev force-pushed the main branch 10 times, most recently from 823244c to bab7d39 Compare February 19, 2026 02:17
@loci-dev loci-dev force-pushed the main branch 10 times, most recently from a92fe2a to 6495042 Compare February 27, 2026 02:17
@loci-dev loci-dev force-pushed the main branch 9 times, most recently from 4298c74 to 0db6c47 Compare March 7, 2026 02:16
@loci-dev loci-dev force-pushed the main branch 8 times, most recently from 56aaa36 to 21147c2 Compare March 13, 2026 02:17
@loci-dev loci-dev force-pushed the main branch 10 times, most recently from 945fa3a to 0e8e1d6 Compare March 20, 2026 02:17