
UPSTREAM PR #19785: jinja: correct stats for tojson and string filters#1198

Open
loci-dev wants to merge 1 commit into main from loci/pr-19785-xsn-jinja_tojson_stats
Conversation

@loci-dev

Note

Source pull request: ggml-org/llama.cpp#19785

Target: fixes ggml-org/llama.cpp#18675

@pwilkin please give this a try (see the added test case for more info)

@loci-review

loci-review bot commented Feb 22, 2026

Overview

Analysis of 111,678 functions across llama.cpp versions reveals minimal performance impact from a single commit adding Jinja template statistics tracking. Modified: 72 functions (0.06%), New: 2, Removed: 0, Unchanged: 111,604 (99.93%).

Power Consumption Changes:

  • build.bin.llama-cvector-generator: -0.026%
  • build.bin.llama-tts: -0.035%
  • build.bin.libllama.so, build.bin.libmtmd.so, build.bin.llama-bench, build.bin.libggml.so, build.bin.libggml-cpu.so, build.bin.libggml-base.so, build.bin.llama-tokenize, build.bin.llama-gemma3-cli, build.bin.llama-gguf-split, build.bin.llama-llava-cli, build.bin.llama-minicpmv-cli, build.bin.llama-quantize, build.bin.llama-qwen2vl-cli: 0.000%

Function Analysis

Intentional Regressions (Template Statistics Tracking):

  • value_array_t::operator() and value_object_t::operator() (llama-tts, llama-cvector-generator): Response time +5,588 to +5,650 ns (+30.5 to +30.9%), throughput time +45 ns (+32.8%). Added conditional is_get_stats checks and recursive mark_used() calls for template variable usage tracking. The overhead only manifests when stats collection is explicitly enabled; it is negligible in normal operation.

Compiler-Driven Improvements:

  • std::vector::_S_max_size (cvector-generator): Response time -203 to -208 ns (-56.2 to -56.7%), throughput time -203 to -208 ns (-62.6 to -63.4%)
  • std::make_move_iterator (cvector-generator): Response time -168 ns (-58.4%), throughput time -168 ns (-68.4%)
  • nlohmann::json::iterator_input_adapter_factory::create (llama-tts): Response time -56 ns (-28.7%), throughput time -56 ns (-43.0%)
  • jinja::string::is_uppercase (llama-tts): Response time -194 ns (-24.3%), throughput time -194 ns (-58.3%)

These functions had no source code changes; the improvements are attributed to compiler optimization differences between builds.

Other analyzed functions showed minor changes from compilation unit effects or measurement variance, with no practical runtime impact.

Additional Findings

No changes to performance-critical inference components: matrix multiplication (70-90% of inference time), attention mechanisms, KV cache operations, or GPU kernels remain unchanged. Template processing occurs during initialization, not in the token generation hot path. The opt-in statistics feature provides valuable debugging capabilities with acceptable overhead confined to non-critical code paths.

🔎 Full breakdown: Loci Inspector.
💬 Questions? Tag @loci-dev.

@loci-dev loci-dev force-pushed the main branch 10 times, most recently from 8c889a6 to 13648e6 on March 2, 2026 at 02:17
@loci-dev loci-dev force-pushed the main branch 8 times, most recently from 17452e3 to 551dfb5 on March 10, 2026 at 02:17
@loci-dev loci-dev force-pushed the main branch 9 times, most recently from 910a8a6 to 3c7b997 on March 17, 2026 at 02:18
@loci-dev loci-dev force-pushed the main branch 2 times, most recently from 5ac00d6 to 998dd7a on March 18, 2026 at 02:17
