Open
Changes from 8 commits
20 commits
c6c4f7c
Update chat.cpp to support (at least) qwen3 + tool_choice = required
ExtReMLapin Aug 11, 2025
42937a5
refactored changes to follow string tern op
ExtReMLapin Aug 11, 2025
5796938
fixing editorconfig-checker CI (trailing whitespace)
ExtReMLapin Aug 12, 2025
de07a43
Merge branch 'ggml-org:master' into fix_qwen_reasoning_tool_calling_r…
ExtReMLapin Aug 25, 2025
79e4a7b
hermes 2 pro tool calling, better support for thinking (thinking tag …
Aug 25, 2025
dbae921
qwen hermes tool calling : fixed grammar rules names
Aug 25, 2025
86493dd
fixed really weird grammar crash `Unexpected empty grammar stack afte…
Aug 25, 2025
bb5e352
also apply the hotcrashfix here, just in case
Aug 25, 2025
6d5f561
reverted changes done to grammar_lazy for hermes 2
Aug 26, 2025
352274e
if there is enable_thinking enabled but hermes model doesn't support …
Aug 26, 2025
0e55830
fix thinking-content eating closing think tag | ref #8953
Aug 26, 2025
e62cd70
removed `?` from grammar as it doesn't crash on linux, probably worth…
Aug 26, 2025
2f28a1c
Merge branch 'ggml-org:master' into fix_qwen_reasoning_tool_calling_r…
ExtReMLapin Aug 28, 2025
310701b
fixed crash with "auto" mode, trigger was missing
Aug 28, 2025
5688afa
Merge branch 'ggml-org:master' into fix_qwen_reasoning_tool_calling_r…
ExtReMLapin Sep 1, 2025
dc75a57
Merge branch 'ggml-org:master' into fix_qwen_reasoning_tool_calling_r…
ExtReMLapin Oct 20, 2025
9381f69
Merge branch 'ggml-org:master' into fix_qwen_reasoning_tool_calling_r…
ExtReMLapin Nov 3, 2025
73010a1
Applied @ochafik's suggested code after testing locally, no regression
ExtReMLapin Nov 4, 2025
6441ad4
updated qwen3 0.6B chat template with official one
Nov 4, 2025
cc18ecc
added tests
Nov 4, 2025
29 changes: 25 additions & 4 deletions common/chat.cpp
@@ -1767,8 +1767,11 @@ static common_chat_params common_chat_params_init_hermes_2_pro(const common_chat
     }
 
     if (!inputs.tools.is_null()) {
+        auto supports_thinking = tmpl.source().find("<think>") != std::string::npos;
+        // you should not be able to call enable_thinking if <think> is not supported
+        GGML_ASSERT(!extra_context["enable_thinking"] || extra_context["enable_thinking"] == supports_thinking);
         // (content)?(<tool_call>{"name": "foo", "arguments": {"a": 1}}</tool_call>)*
-        data.grammar_lazy = inputs.tool_choice != COMMON_CHAT_TOOL_CHOICE_REQUIRED;
+        data.grammar_lazy = true;
         data.grammar = build_grammar([&](const common_grammar_builder & builder) {
             std::vector<std::string> tool_rules;
             std::vector<std::string> tool_call_alts;
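
The guard added in this hunk combines two checks: whether the template source mentions `<think>`, and whether `enable_thinking` is consistent with that. A minimal standalone sketch follows, assuming `enable_thinking` has already been reduced to a plain `bool`. The helper names `template_supports_thinking` and `thinking_request_is_valid` are hypothetical and do not exist in `common/chat.cpp`.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: true when the chat template source mentions "<think>".
static bool template_supports_thinking(const std::string & tmpl_source) {
    return tmpl_source.find("<think>") != std::string::npos;
}

// Hypothetical helper mirroring the asserted condition:
// enabling thinking is only valid when the template supports it.
static bool thinking_request_is_valid(bool enable_thinking, bool supports_thinking) {
    return !enable_thinking || supports_thinking;
}
```

Returning a `bool` instead of asserting would let a caller reject the request rather than abort the process. That is a design choice of this sketch, not something the PR does.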
@@ -1820,9 +1823,27 @@ static common_chat_params common_chat_params_init_hermes_2_pro(const common_chat
             tool_call_alts.push_back(
                 "( \"```\\n\" | \"```json\\n\" | \"```xml\\n\" ) space " + wrappable_tool_call + " space \"```\" space ");
             auto tool_call = builder.add_rule("tool_call", string_join(tool_call_alts, " | "));
-            builder.add_rule("root",
-                std::string(data.thinking_forced_open ? "( \"</think>\" space )? " : "") +
-                (inputs.parallel_tool_calls ? "(" + tool_call + ")+" : tool_call));
+
+            builder.add_rule("thinking-start", "\"<think>\"");
+            builder.add_rule("thinking-content", "[^\\x00]*");
+            builder.add_rule("thinking-end", "\"</think>\" space");
+
+            // thinking grammar logic depends on whether thinking_forced_open was set to true (so already opened, and maybe closed) and on whether thinking is allowed at all
+            std::string thinking_grammar_logic = ""; // thinking tag was closed or not supported/wanted
+            if (extra_context["enable_thinking"]) {
+                if (data.thinking_forced_open) {
+                    // thinking tag was already opened by the user so we don't need to add it again
+                    thinking_grammar_logic = "(thinking-content thinking-end)? ";
+                }
+                else
+                {
+                    thinking_grammar_logic = "(thinking-start thinking-content thinking-end)? ";
+                }
+            }
+
+
+            builder.add_rule("root", thinking_grammar_logic + (inputs.parallel_tool_calls ? "(" + tool_call + ")+" : tool_call));
+
             // Trigger on some common known "good bad" outputs (only from the start and with a json that's about a specific argument name to avoid false positives)
             data.grammar_triggers.push_back({
                 COMMON_GRAMMAR_TRIGGER_TYPE_PATTERN_FULL,
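
To make the new `root` rule easier to review, its string composition can be isolated from the grammar builder. The sketch below is a standalone approximation with the three flags passed in as plain `bool` values. `build_root_rule` is a hypothetical name, and the `tool_call_rule` argument stands in for the rule name returned by `builder.add_rule("tool_call", ...)`.

```cpp
#include <cassert>
#include <string>

// Hypothetical standalone version of the root-rule composition in this hunk.
static std::string build_root_rule(bool enable_thinking,
                                   bool thinking_forced_open,
                                   bool parallel_tool_calls,
                                   const std::string & tool_call_rule) {
    // Empty prefix: thinking is disabled, so no thinking block is allowed.
    std::string thinking_prefix;
    if (enable_thinking) {
        // If the prompt already opened the tag, only the content and closing tag remain.
        thinking_prefix = thinking_forced_open
            ? "(thinking-content thinking-end)? "
            : "(thinking-start thinking-content thinking-end)? ";
    }
    // One tool call, or one-or-more when parallel tool calls are requested.
    return thinking_prefix +
           (parallel_tool_calls ? "(" + tool_call_rule + ")+" : tool_call_rule);
}
```

For example, `build_root_rule(true, false, true, "tool_call")` yields `(thinking-start thinking-content thinking-end)? (tool_call)+`, which should match the string the hunk passes to `builder.add_rule("root", ...)` for the same flags.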