(internal) llamacpp-llm: chat template tests with Qwen3 model by mialso · Pull Request #1036 · tetherto/qvac

mialso · 2026-03-20T08:52:55Z

asana task

Description

improve tests, as per comment

test_chat_template_utils.cpp: cover the isQwen3Model branch logic that was previously untested

Reorder TextLlmContext members so threadpools are declared before llamaInit_. C++ destroys members in reverse declaration order, so llamaInit_ (which calls llama_free) now runs while threadpools are still alive, preventing use-after-free when llama_free accesses attached threadpool pointers. Made-with: Cursor
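The ordering rule behind this fix can be illustrated with a minimal sketch. The types below (`FakeThreadpool`, `FakeLlamaInit`, `ContextSketch`) are hypothetical stand-ins, not the real `TextLlmContext` members; they only log when their destructors run.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records destruction events so the order can be inspected afterwards.
inline std::vector<std::string>& destructionLog() {
  static std::vector<std::string> log;
  return log;
}

// Hypothetical stand-in for a threadpool handle.
struct FakeThreadpool {
  ~FakeThreadpool() { destructionLog().push_back("threadpool"); }
};

// Hypothetical stand-in for the member whose destructor frees the context.
struct FakeLlamaInit {
  ~FakeLlamaInit() { destructionLog().push_back("llamaInit"); }
};

// Members are destroyed in reverse declaration order:
// llamaInit_ goes first, while threadpool_ is still alive.
struct ContextSketch {
  FakeThreadpool threadpool_;  // declared first, destroyed last
  FakeLlamaInit llamaInit_;    // declared last, destroyed first
};

// Builds and destroys a ContextSketch and returns the observed order.
inline std::vector<std::string> observeDestructionOrder() {
  destructionLog().clear();
  { ContextSketch ctx; }
  return destructionLog();
}
```

Calling `observeDestructionOrder()` yields `llamaInit` followed by `threadpool`, which is the order the commit relies on.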

…exit The ThreadPoolDeleter was doing ggml backend registry lookups during destruction, which is fragile during process teardown when the registry may already be torn down. Additionally, threadpools attached to llama_context could be freed before the context itself, causing use-after-free. Fix: cache ggml_threadpool_free fn pointer at construction time, and add explicit destructor that detaches threadpools before freeing them. Made-with: Cursor
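A rough sketch of the two ideas in this commit message, under stated assumptions: `PoolSketch`, `ContextHandleSketch`, `CachedDeleter`, and `PoolOwnerSketch` are hypothetical names, and `freePoolSketch` stands in for the function pointer that the real code resolves once at construction. No ggml or llama.cpp calls are made here.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

struct PoolSketch {};  // hypothetical threadpool
using FreeFn = void (*)(PoolSketch*);

// Event log used to observe teardown order.
inline std::vector<std::string>& teardownEvents() {
  static std::vector<std::string> events;
  return events;
}

// Stand-in for the free function that is looked up once, up front.
inline void freePoolSketch(PoolSketch* pool) {
  teardownEvents().push_back("free");
  delete pool;
}

// The deleter carries the cached pointer, so destruction needs no lookup.
struct CachedDeleter {
  explicit CachedDeleter(FreeFn fn) : freeFn(fn) {}
  void operator()(PoolSketch* pool) const {
    if (pool != nullptr && freeFn != nullptr) freeFn(pool);
  }
  FreeFn freeFn;
};

// Hypothetical context that keeps a raw pointer to an attached pool.
struct ContextHandleSketch {
  PoolSketch* attachedPool = nullptr;
};

class PoolOwnerSketch {
 public:
  explicit PoolOwnerSketch(FreeFn fn)
      : pool_(new PoolSketch, CachedDeleter(fn)) {
    context_.attachedPool = pool_.get();
  }
  // Explicit destructor: detach first, then free the pool.
  ~PoolOwnerSketch() {
    context_.attachedPool = nullptr;
    teardownEvents().push_back("detach");
    pool_.reset();
  }
  PoolOwnerSketch(const PoolOwnerSketch&) = delete;
  PoolOwnerSketch& operator=(const PoolOwnerSketch&) = delete;

 private:
  ContextHandleSketch context_;
  std::unique_ptr<PoolSketch, CachedDeleter> pool_;
};
```

Destroying a `PoolOwnerSketch` logs `detach` before `free`, so the context never holds a dangling pool pointer in this model.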

When a prefill run leaves nPast_ > 0 and the next run is a non-cached single-shot, the stale KV cache and dynamic-tools bookkeeping (nPastBeforeTools_, nConversationOnlyTokens_) caused token duplication and incorrect cache trimming. Clear state eagerly when shouldResetAfterInference is true and nPast_ is non-zero. Made-with: Cursor

…_end When tools_at_end is true and a session continues without explicit save between turns, old tool+response tokens remained in the KV cache. New tool tokens were appended, causing conflicting tool definitions. Add a guard in processPrompt() that trims from nPastBeforeTools_ to nPast_ before eval when stale tool tokens are detected. Includes new dynamic-tools integration tests covering changing tools, same tools, and single-shot regression. Made-with: Cursor
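The guard described above can be modelled with a plain vector standing in for the KV cache. This is only a sketch: `CacheSketch` and `trimStaleTools` are hypothetical names, and treating a boundary of 0 as "no boundary recorded" is an assumption made for the example, not something the PR states.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct CacheSketch {
  std::vector<int> tokens;           // stand-in for the KV cache contents
  std::size_t nPastBeforeTools = 0;  // assumed: 0 means no boundary recorded

  std::size_t nPast() const { return tokens.size(); }

  // Drops everything from the recorded tool boundary to nPast and
  // returns how many stale tokens were removed.
  std::size_t trimStaleTools() {
    if (nPastBeforeTools == 0 || nPastBeforeTools >= nPast()) return 0;
    const std::size_t removed = nPast() - nPastBeforeTools;
    tokens.resize(nPastBeforeTools);
    nPastBeforeTools = 0;  // boundary is consumed by the trim
    return removed;
  }
};
```

With three conversation tokens followed by three tool tokens and a boundary of 3, one call removes the three stale tokens and a second call is a no-op.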

…e selection race The toolsAtEnd flag was set via setToolsAtEnd() after context creation, but getChatTemplateForModel() was called during construction, so it always saw toolsAtEnd=0 and selected the wrong Qwen3 template. Pass the flag through createContext() into the TextLlmContext and MtmdLlmContext constructors so the correct template is selected from the start. Also restore the conditional template selection in ChatTemplateUtils that was previously hardcoded.
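A minimal sketch of why the flag has to arrive through the constructor. `chooseTemplateSketch`, `TemplateContextSketch`, and the template labels are hypothetical; the real selection lives in `getChatTemplateForModel()`, whose signature is not shown in this PR.

```cpp
#include <cassert>
#include <string>

// Hypothetical selection logic: the dynamic-tools template is only
// chosen for Qwen3 models with tools placed at the end.
inline std::string chooseTemplateSketch(bool isQwen3, bool toolsAtEnd) {
  if (isQwen3 && toolsAtEnd) return "qwen3-tools-dynamic";
  if (isQwen3) return "qwen3-default";
  return "model-default";
}

class TemplateContextSketch {
 public:
  // The flag is a constructor argument, so selection sees its real value.
  // A setter called after construction would be too late to affect it.
  TemplateContextSketch(bool isQwen3, bool toolsAtEnd)
      : templateName_(chooseTemplateSketch(isQwen3, toolsAtEnd)) {}

  const std::string& templateName() const { return templateName_; }

 private:
  std::string templateName_;
};
```

In this model the template is fixed once at construction, which mirrors the race the commit describes: any value set afterwards cannot change the choice.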

Add stripInternalBlocks() helper to testToolRemoval.js and benchToolsPlacement.js to remove <tool_call> and <think> blocks from assistant responses before including them in conversation history. Prevents model from pattern-matching on old tool calls and hallucinating removed tools. Also extend benchToolsPlacement to 20 turns and add HTML chart.

@jesusmb1995

Implement all 10 reviewer requests from PR #706 (jesusmb1995, gianni-cor).

| # | Reviewer | Request | Result |
|---|---------|---------|--------|
| R1 | @jesusmb1995 | Extract DynamicToolsState class | Done - new class in LlmContext.hpp with toolsAtEnd_, nConversationOnlyTokens_, nPastBeforeTools_, recordToolBoundary(), reset() |
| R2 | @jesusmb1995 | Collapse 3 virtual methods into single dynamicToolsState() accessor | Done - removed setToolsAtEnd, getNPastBeforeTools, setNPastBeforeTools virtuals; added dynamicToolsState() non-virtual accessor on base class |
| R3 | @gianni-cor | Remove redundant setToolsAtEnd() after createContext() | Done - removed the 4-line block in LlamaModel::init() |
| R4 | @gianni-cor | Add assert: nConversationOnlyTokens_ <= inputTokens.size() | Done - added in TextLlmContext::tokenizeChat |
| R5 | @gianni-cor | Reset nConversationOnlyTokens_ in TextLlmContext::resetState | Done - both contexts now call dynamicToolsState().reset() which resets both values |
| R6 | @gianni-cor | Guard tools_at_end for non-Qwen3 models | Done - architecture check after config parsing, logs warning and disables flag |
| R7 | @gianni-cor | Fix off-by-A trim error (disable add_generation_prompt) | Done - both TextLlmContext and MtmdLlmContext save/restore add_generation_prompt=false during no-tools tokenization |
| R8 | @gianni-cor | Add cold-start reset in MtmdLlmContext::tokenizeChat | Done - dynamicToolsState().reset() added at cold-start path |
| R9 | @gianni-cor | Cap firstMsgTokens_ after post-eval trim | Done - setFirstMsgTokens(getNPast()) if inflated after trim |
| R10 | @gianni-cor | Remove duplicate toolsAtEnd_ from LlamaModel | Done - runtime code in processPromptImpl queries dynamicToolsState().toolsAtEnd() instead of state_->toolsAtEnd_ |

Made-with: Cursor
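For orientation, the class extracted under R1 might look roughly like the sketch below. Only the member and method names come from the commit message; the signature of `recordToolBoundary()`, the accessor names, and the choice to leave `toolsAtEnd_` untouched in `reset()` are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>

class DynamicToolsStateSketch {
 public:
  explicit DynamicToolsStateSketch(bool toolsAtEnd) : toolsAtEnd_(toolsAtEnd) {}

  bool toolsAtEnd() const { return toolsAtEnd_; }
  std::size_t nPastBeforeTools() const { return nPastBeforeTools_; }
  std::size_t nConversationOnlyTokens() const { return nConversationOnlyTokens_; }

  // Hypothetical signature: records where the tool block starts.
  // The assert mirrors the check requested in R4.
  void recordToolBoundary(std::size_t nPastBeforeTools,
                          std::size_t conversationOnlyTokens,
                          std::size_t totalTokens) {
    assert(conversationOnlyTokens <= totalTokens);
    nPastBeforeTools_ = nPastBeforeTools;
    nConversationOnlyTokens_ = conversationOnlyTokens;
  }

  // Clears both counters; the configuration flag is assumed to persist.
  void reset() {
    nPastBeforeTools_ = 0;
    nConversationOnlyTokens_ = 0;
  }

 private:
  bool toolsAtEnd_;
  std::size_t nConversationOnlyTokens_ = 0;
  std::size_t nPastBeforeTools_ = 0;
};
```

Keeping the flag and both counters in one object is what lets R2, R5, and R10 replace several setters and duplicated fields with a single accessor.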

- N3: Save/restore inputs.use_jinja around no-tools tokenization to prevent getPrompt() Jinja fallback from corrupting the flag.
- N4: Remove dead Jinja template variables (ns.multi_step_tool, ns.last_query_index) from Qwen3ToolsDynamicTemplate.
- N5: Add missing assert(conversationOnlyTokens <= totalTokens) in MtmdLlmContext::tokenizeChat, matching TextLlmContext.
- N6: Document Qwen3-only model support in tools-at-end.md.
- N7: Merge duplicate if(nPast_==0 && !isCacheLoaded) blocks in TextLlmContext::tokenizeChat.
- N8: Remove unnecessary save/restore of inputs.tools and inputs.add_generation_prompt (locals not read after).

Also: merge main into feature branch, move dynamic-tools changelog to separate 0.13.1 entry.

Made-with: Cursor
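The save/restore pattern named in N3 can be sketched with a small RAII guard. Everything here is hypothetical (`ScopedRestore`, `InputsSketch`, `tokenizeWithoutToolsSketch`); the commit does not say whether the real code used a guard or manual assignments, and later commits in this PR switch to resetting use_jinja from params_ instead.

```cpp
#include <cassert>

// Restores a variable to its original value when the guard leaves scope.
template <typename T>
class ScopedRestore {
 public:
  explicit ScopedRestore(T& target) : target_(target), saved_(target) {}
  ~ScopedRestore() { target_ = saved_; }
  ScopedRestore(const ScopedRestore&) = delete;
  ScopedRestore& operator=(const ScopedRestore&) = delete;

 private:
  T& target_;
  T saved_;
};

// Hypothetical stand-in for the chat template inputs.
struct InputsSketch {
  bool use_jinja = true;
};

// Hypothetical stand-in for a call that flips the flag as a side effect.
inline void tokenizeWithoutToolsSketch(InputsSketch& inputs) {
  inputs.use_jinja = false;
}

// Runs the side-effecting call under a guard and reports the flag afterwards.
inline bool flagAfterGuardedCall(InputsSketch& inputs) {
  {
    ScopedRestore<bool> guard(inputs.use_jinja);
    tokenizeWithoutToolsSketch(inputs);
  }
  return inputs.use_jinja;
}
```

Whatever the call does to the flag, the caller sees the value it had before the guarded region.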

mialso · 2026-03-24T23:52:58Z

Closing in favor of #1121, since this has unsigned commits.

github-actions · 2026-03-24T23:53:26Z

Tier-based Approval Status

**PR Tier:** TIER1

**Current Status:** ❌ PENDING

**Requirements:**
- 1 Team Member approval ❌ (0/1)
- 1 Team Lead OR Management approval ❌ (0/1)



---
*This comment is automatically updated when reviews change.*

mialso and others added 30 commits March 13, 2026 09:20

(improvement) llamacpp-llm: Qwen3 dynamic tools template

bb2d249

(improvement) llamacpp-llm: add llm config tools flag

b9ed672

(improvement) llamacpp-llm: use template based on tools param

48cedbd

(improvement) llamacpp-llm: count tools token offset with tokenizer

3bf572d

(improvement) llamacpp-llm: track n-past, run Qwen3 tests, fix reset

d63ad92

(improvement) llamacpp-llm: save cache with respect to tools flag

9a75956

(fix) llamacpp-llm: add Qwen3ToolsDynamicTemplate.cpp to production C…

83f2d8b

…MakeLists The new source file was added to the test CMakeLists but missing from the addon and cli_tool targets, causing an undefined symbol linker error on CI win64 builds. Made-with: Cursor

chore: retrigger CI for CMakeLists fix

9f97519

Made-with: Cursor

Revert "(fix) llamacpp-llm: fix use-after-free SIGSEGV on process exi…

f3adb55

…t (linux)" This reverts commit 7d9c237.

Revert "(fix) llamacpp-llm: robust threadpool teardown to prevent SIG…

7bd6fc2

…SEGV on exit" This reverts commit 4e66b38.

(fix) llamacpp-llm: dynamic tools cache trim, tmp template, debugs

d305c92

(fix) llamacpp-llm: use correct template in tests

b6dae3a

(chore) llamacpp-llm: move qwen3 cache tests to own file

31b2069

(improvement) llamacpp-llm: simplify nPastBeforeTools reset, multi-tu…

e2b660b

…rn cache tests

(improvement) llamacpp-llm: simply nPastBeforeTools tracking, no trim…

47292a5

… on save

(chore) llamacpp-llm: remove redundant getters and cleanup

aedadda

(internal) llamacpp-llm: run Qwen3 context tests

f13b1aa

(chore) cleanup

c1e85c2

(chore) fix lint errors in examples

f2fe2a5

(chore) fix remaining lint errors in benchToolsPlacement

63b31e2

(chore) fix indentation in benchToolsPlacement ternary

9384335

Merge remote-tracking branch 'origin/main' into feature/llm-dynamic-t…

52d6706

…ools

Merge branch 'main' into feature/llm-dynamic-tools

4bab03b

(chore) llamacpp-llm: remove unused example files

04cb86f

olyasir and others added 25 commits March 16, 2026 15:10

Merge branch 'main' into feature/llm-dynamic-tools

00a72f6

(chore) llamacpp-llm: changelog and version bump

c52e076

refactor(llamacpp-llm): remove toolsAtEnd_ from ReloadableState, sing…

44da74e

…le source of truth in DynamicToolsState Made-with: Cursor

Merge branch 'main' into feature/llm-dynamic-tools

4dbb387

fix(llamacpp-llm): use dts.reset() after post-eval trim for full stat…

4161c77

…e cleanup Made-with: Cursor

(draft) llamacpp-llm: dynamic tools cache tokens test debug

27e6a5c

(internal) llamacpp-llm: dynamic tools token count and cache match test

181b98a

Revert "(internal) llamacpp-llm: dynamic tools token count and cache …

a03ad49

…match test" This reverts commit 181b98a.

Revert "(draft) llamacpp-llm: dynamic tools cache tokens test debug"

a58893b

This reverts commit 27e6a5c.

Merge branch 'main' into feature/llm-dynamic-tools

047debf

Merge branch 'main' into feature/llm-dynamic-tools

661cbb1

Made-with: Cursor

style(llamacpp-llm): apply clang-format to all PR-touched C++ files

a4086e8

Made-with: Cursor

style(llamacpp-llm): fix remaining clang-format-19 brace-init formatting

11b186b

Made-with: Cursor

chore: remove accidentally committed binary file

1bb6556

The file packages/ocr-onnx/big_and_clear_watermarks.png was unintentionally staged during merge conflict resolution. Made-with: Cursor

chore(llm): bump version to 0.14.0

02a327a

Made-with: Cursor

chore: remove working artifacts from feature branch

7d33988

Made-with: Cursor

chore: remove accidentally committed sdk model history file

2ddac41

Made-with: Cursor

doc: add dynamic-tools examples to README

79dab19

Made-with: Cursor

fix(llm): reset use_jinja from params_ instead of save/restore

22603f9

Made-with: Cursor

fix(llm): reset use_jinja before second getPrompt call

b9a54ec

Made-with: Cursor

Merge branch 'main' into feature/llm-dynamic-tools

306f401

Merge branch 'main' into feature/llm-dynamic-tools

11991ae

(internal) llamacpp-llm: chat template tests with Qwen3 model

7d729bc

Base automatically changed from feature/llm-dynamic-tools to main March 21, 2026 09:09

mialso closed this Mar 24, 2026

mialso deleted the internal/llm-dynamic-tools-tests-improved branch April 15, 2026 15:52

mialso wants to merge 57 commits into main from internal/llm-dynamic-tools-tests-improved

3 participants
