UPSTREAM PR #19239: jinja : add missing 'in' test to template engine (#19004) #1117
The jinja template parser was missing the 'in' test from
global_builtins(), causing templates using reject("in", ...),
select("in", ...), or 'x is in(y)' to fail with
"selectattr: unknown test 'in'".
This broke tool-calling for Qwen3-Coder and any other model
whose chat template uses the 'in' test.
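The failure mode described above can be illustrated with a minimal sketch. This is not llama.cpp's code: `TestTable`, `make_tests`, and this `reject` are hypothetical names, and plain integers stand in for template values. It only shows why a filter that resolves a test by name fails when the name was never registered.

```cpp
#include <algorithm>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a table of built-in tests keyed by name.
using TestFn = std::function<bool(int, const std::vector<int>&)>;
using TestTable = std::map<std::string, TestFn>;

// Build the table; with_in = false models the state before the fix.
TestTable make_tests(bool with_in) {
    TestTable tests;
    if (with_in) {
        tests["in"] = [](int value, const std::vector<int>& haystack) {
            return std::find(haystack.begin(), haystack.end(), value) != haystack.end();
        };
    }
    return tests;
}

// reject(name, arg): drop every item for which the named test passes.
std::vector<int> reject(const TestTable& tests, const std::vector<int>& items,
                        const std::string& name, const std::vector<int>& arg) {
    auto it = tests.find(name);
    if (it == tests.end()) {
        // An unregistered name cannot be resolved, so the filter fails.
        throw std::runtime_error("unknown test '" + name + "'");
    }
    std::vector<int> kept;
    for (int value : items) {
        if (!it->second(value, arg)) {
            kept.push_back(value);
        }
    }
    return kept;
}
```

With the 'in' entry registered, `reject(make_tests(true), {1, 2, 3, 4}, "in", {2, 4})` keeps 1 and 3. Without it, the same call throws.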
Added test_is_in supporting array, string, and object containment
checks, mirroring the existing 'in' operator logic in runtime.cpp.
Includes test cases for all three containment types plus
reject/select filter usage.
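The three containment cases can be sketched as follows. This is an illustration under assumptions, not the actual test_is_in: the `Container` variant and the `is_in` function are hypothetical. It also assumes the usual Jinja/Python semantics, where a string check is a substring check and an object check looks at keys rather than values.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <variant>
#include <vector>

// Hypothetical value model: an array of strings, a string, or a string-keyed object.
using Array = std::vector<std::string>;
using Object = std::map<std::string, std::string>;
using Container = std::variant<Array, std::string, Object>;

bool is_in(const std::string& needle, const Container& haystack) {
    // Array: true if any element equals the needle.
    if (const auto* arr = std::get_if<Array>(&haystack)) {
        return std::find(arr->begin(), arr->end(), needle) != arr->end();
    }
    // String: true if the needle occurs as a substring.
    if (const auto* str = std::get_if<std::string>(&haystack)) {
        return str->find(needle) != std::string::npos;
    }
    // Object: true if the needle is one of the keys (assumed semantics).
    const auto& obj = std::get<Object>(haystack);
    return obj.count(needle) > 0;
}
```

Under these assumptions, `is_in("ell", Container{std::string("hello")})` is true, while `is_in("v", Container{Object{{"k", "v"}}})` is false because only keys are checked.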
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Overview
Analysis of commit 421ffab adding Jinja2 'in' operator to llama.cpp's template engine. Examined 115,443 functions across 15 binaries: 118 modified (0.10%), 116 new, 57 removed, 115,152 unchanged.
Power Consumption Changes:
All changes are within measurement noise (<0.1%), indicating negligible energy impact.
Function Analysis
Template Engine Functions (Modified/New):
STL Template Functions (Compiler Artifacts):
JSON Parsing Functions (Compiler Artifacts):
Jinja Array Builtin:
PEG Parser Functions (Debug Only):
Other analyzed functions showed negligible changes.
Additional Findings
Core Inference Operations: Zero changes to critical paths: llama_decode(), GEMM operations, attention mechanisms, KV cache, quantization kernels, and all GPU backends (CUDA, Metal, HIP, Vulkan, SYCL) remain completely unaffected. Template processing occurs during preprocessing, isolated from inference pipeline.
Cumulative Impact: Template evaluation overhead increased ~655μs (+4-6%), representing 0.06-0.6% of total inference time (105-1023ms). Initialization overhead increased 11.76μs (+0.0001% of 1-10 seconds). All performance variations are compiler optimization artifacts affecting non-critical paths.
Justification: The Jinja 'in' operator addition provides valuable template expressiveness for chat templates and structured output. Performance changes are acceptable given: (1) zero impact on inference hot path, (2) negligible absolute overhead, (3) most changes are compiler artifacts without source modifications, (4) functional enhancement justifies minor preprocessing overhead.
🔎 Full breakdown: Loci Inspector.
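The percentages quoted in the cumulative impact figures can be checked with a few lines of arithmetic. The helper below is a hypothetical convenience, not part of the analysis tooling. It converts an overhead in microseconds into a share of a total time in milliseconds.

```cpp
// Share of a total run time (in ms) taken by an overhead (in μs), in percent.
double overhead_percent(double overhead_us, double total_ms) {
    return overhead_us / (total_ms * 1000.0) * 100.0;
}
```

For the reported ~655μs, `overhead_percent(655.0, 105.0)` is about 0.62 and `overhead_percent(655.0, 1023.0)` is about 0.064, which matches the quoted 0.06-0.6% range. For the 11.76μs initialization figure, the quoted 0.0001% corresponds to the 10-second end of the range. Against 1 second it would be about 0.0012%.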
Note
Source pull request: ggml-org/llama.cpp#19239
Hey all!
Note: Fixes #19004
Context:
The jinja template parser was missing the 'in' test from global_builtins(), causing templates using reject("in", ...), select("in", ...), or 'x is in(y)' to fail with:
selectattr: unknown test 'in'
This broke tool-calling for Qwen3-Coder and would break for any other model whose chat template uses the 'in' test.
What I did:
Added test_is_in supporting array, string, and object containment checks, mirroring the existing 'in' operator logic in runtime.cpp.
Local Test environment:
Comments:
Appreciate the look - let me know if you have any questions!