Autoparser - complete refactoring of parser architecture #1376

Open
firecoperana wants to merge 2 commits into main from fcp/auto_parser2

Conversation

@firecoperana
Collaborator

@firecoperana firecoperana commented Mar 6, 2026

Port the new Autoparser and optional argument reshuffle capability PR from mainline

ggml-org/llama.cpp#18675 and ggml-org/llama.cpp#20171

Continues #1369

@ikawrakow Can you merge the PEG parser PR first and then this one? This is a large PR and I don't want to squash them into one commit.

@firecoperana
Collaborator Author

Streaming for tool call is enabled.

@ikawrakow
Owner

@ikawrakow Can you merge the PEG parser PR first and then this one? This is a large PR and I don't want to squash them into one commit.

So is the PEG parser PR. Are we confident that there are no regressions?

@firecoperana
Collaborator Author

I see some new issues with the auto parser mentioned in mainline. In the PEG parser PR, PEG is just added without being used; it's mostly the new jinja template engine changes that matter. Those have been in mainline for a while, so most bugs should have been fixed.

@hksdpc255
Contributor

hksdpc255 commented Mar 8, 2026

These changes in mainline llama.cpp appear to work well. All models except MiroThinker (which was added by me) have already been tested by upstream developers.

In principle, this should not introduce regressions, unless there are additional unmerged differences between mainline and ik_llama.cpp. I'll test this branch.

@hksdpc255
Contributor

hksdpc255 commented Mar 9, 2026

This breaks MiroThinker, but it seems to be an upstream bug.

log:

error: the supplied chat template is not supported: /var/llm/MiroThinker.jinja
error: invalid parameter for argument: --chat-template-file
common_chat_templates_init: error: Index 1 out of bounds for array of size 0
common_chat_templates_init: failed to initialize chat template
common_chat_templates_init: please consider disabling jinja via --no-jinja, or using another chat template
common_chat_verify_template: failed to apply template: std::exception

@firecoperana
Collaborator Author

There are more issues showing up in mainline. I will wait until they are fully resolved.

@firecoperana firecoperana force-pushed the fcp/auto_parser2 branch 2 times, most recently from 796aa23 to 59233a7 Compare March 10, 2026 22:54
@sayap
Contributor

sayap commented Mar 11, 2026

In examples/server/server-common.cpp, can we change parallel_tool_calls to default to true? It is harder than I thought to set this as a request parameter in various agentic coding tools 😬

This capability is from late 2024, so all the newer models should already support this.
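For reference, `parallel_tool_calls` is a standard per-request field in the OpenAI-compatible chat completions body. An illustrative request (the model name and tool definition are placeholders, not taken from this repo):

```json
{
  "model": "local-model",
  "messages": [
    { "role": "user", "content": "Weather in Paris and Tokyo?" }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "parameters": {
          "type": "object",
          "properties": { "city": { "type": "string" } },
          "required": ["city"]
        }
      }
    }
  ],
  "parallel_tool_calls": true
}
```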

@firecoperana
Collaborator Author

I could add a command line arg to enable it.

@hksdpc255
Contributor

hksdpc255 commented Mar 12, 2026

@firecoperana I'm planning to send PRs to upstream llama.cpp adding support for MiroThinker with the refactored chat template. I have tested that this patch works. You can merge it.

diff --git a/common/chat.cpp b/common/chat.cpp
index b799912a..7a76f8a9 100644
--- a/common/chat.cpp
+++ b/common/chat.cpp
@@ -1278,6 +1278,116 @@ static common_chat_params common_chat_params_init_kimi_k2(const common_chat_temp
     return data;
 }
 
+// MiroThinker - uses MCP style toolcalling
+static common_chat_params common_chat_params_init_mirothinker(const common_chat_template &    tmpl,
+                                                          const autoparser::templates_params & inputs) {
+    common_chat_params data;
+
+    data.prompt             = common_chat_template_direct_apply(tmpl, inputs);
+    data.format             = COMMON_CHAT_FORMAT_PEG_NATIVE;
+    data.supports_thinking  = true;
+    data.thinking_start_tag = "<think>";
+    data.thinking_end_tag   = "</think>";
+    data.preserved_tokens  = {
+        "<think>",
+        "</think>",
+    };
+
+    auto has_tools         = inputs.tools.is_array() && !inputs.tools.empty();
+    auto extract_reasoning = inputs.reasoning_format != COMMON_REASONING_FORMAT_NONE;
+    auto include_grammar   = has_tools && inputs.tool_choice != COMMON_CHAT_TOOL_CHOICE_NONE;
+
+    auto parser = build_chat_peg_parser([&](common_chat_peg_builder & p) {
+        // MiroThinker Thinking format:
+        // - Reasoning: <think>{reasoning}</think>
+        // - Content: text after reasoning
+        // - Tool calls section:
+        //   <use_mcp_tool>
+        //   <server_name>{server_name}</server_name>
+        //   <tool_name>{tool_name}</tool_name>
+        //   <arguments>
+        //   {json_args}
+        //   </arguments>
+        //   ...
+        //   </use_mcp_tool>
+
+        auto reasoning = extract_reasoning ? p.optional("<think>" + p.reasoning(p.until("</think>")) + "</think>") : p.eps();
+
+        // Tool call markers
+        const std::string SECTION_BEGIN = "<use_mcp_tool>";
+        const std::string SECTION_END   = "</use_mcp_tool>";
+        const std::string CALL_BEGIN    = "<server_name>";
+        const std::string ARGS_BEGIN    = "<arguments>";
+        const std::string CALL_END      = "</arguments>";
+
+        auto end = p.end();
+
+        // Content only parser (no tools)
+        if (!has_tools || inputs.tool_choice == COMMON_CHAT_TOOL_CHOICE_NONE) {
+            return reasoning + p.content(p.rest()) + end;
+        }
+
+        // Build tool call parsers for each available function
+        // Function name format is: <tool_name>{tool_name}</tool_name>
+        // We need to match: {what_ever}</server_name>{spaces}<tool_name>{tool_name}</tool_name>
+        auto tool_choice = p.choice();
+        foreach_function(inputs.tools, [&](const json & tool) {
+            const auto & function = tool.at("function");
+            std::string  name     = function.at("name");
+            const auto & schema   = function.at("parameters");
+
+            // Match: {what_ever}</server_name>{spaces}<tool_name>{tool_name}</tool_name>
+            auto tool_parser = p.tool(
+                p.tool_open(
+                    p.until("</server_name>") +
+                    p.literal("</server_name>") +
+                    p.space() +
+                    p.literal("<tool_name>") +
+                    p.tool_name(p.literal(name)) +
+                    p.literal(ARGS_BEGIN)
+                ) + p.space() +
+                p.tool_args(p.schema(p.json(), "tool-" + name + "-schema", schema)) +
+                p.space() + p.tool_close(p.literal(CALL_END))
+            );
+
+            tool_choice |= p.rule("tool-" + name, tool_parser);
+        });
+
+        // Tool calls section: <use_mcp_tool> tool_calls </use_mcp_tool>
+        auto min_calls  = inputs.tool_choice == COMMON_CHAT_TOOL_CHOICE_REQUIRED ? 1 : 0;
+        auto max_calls  = inputs.parallel_tool_calls ? -1 : 1;
+        auto tool_calls = p.trigger_rule("tool-calls",
+            p.literal(SECTION_BEGIN) + p.space() +
+            p.rule("tool-call", p.repeat(CALL_BEGIN + tool_choice, min_calls, max_calls) +
+            p.space() + p.literal(SECTION_END))
+        );
+
+        auto content_before_tools = p.content(p.until(SECTION_BEGIN));
+
+        return reasoning + content_before_tools + tool_calls + end;
+    });
+
+    data.parser = parser.save();
+
+    if (include_grammar) {
+        data.grammar_lazy = inputs.tool_choice == COMMON_CHAT_TOOL_CHOICE_AUTO;
+        data.grammar      = build_grammar([&](const common_grammar_builder & builder) {
+            foreach_function(inputs.tools, [&](const json & tool) {
+                const auto & function = tool.at("function");
+                auto         schema   = function.at("parameters");
+                builder.resolve_refs(schema);
+            });
+            parser.build_grammar(builder, data.grammar_lazy);
+        });
+
+        data.grammar_triggers = {
+            { COMMON_GRAMMAR_TRIGGER_TYPE_WORD, "<use_mcp_tool>" }
+        };
+    }
+
+    return data;
+}
+
 // LFM2 format:
 // - Reasoning: <think>{reasoning}</think> (optional, only if enable_thinking is true)
 // - Content: text after reasoning (optional)
@@ -1517,6 +1627,14 @@ static common_chat_params common_chat_templates_apply_jinja(const struct common_
         return common_chat_params_init_kimi_k2(tmpl, params);
     }
 
+    // MiroThinker - uses MCP style toolcalling <use_mcp_tool> ... </use_mcp_tool>
+    // Detection: template has "</use_mcp_tool>" and "</server_name>"
+    if (src.find("</use_mcp_tool>") != std::string::npos &&
+        src.find("</server_name>") != std::string::npos) {
+        LOG_DBG("Using specialized template: MiroThinker\n");
+        return common_chat_params_init_mirothinker(tmpl, params);
+    }
+
     // LFM2 - uses <|tool_list_start|>/<|tool_list_end|> markers and <|tool_call_start|>[name(args)]<|tool_call_end|> format
     // Detection: template has "<|tool_list_start|>" and "<|tool_list_end|>" markers
     if (src.find("<|tool_list_start|>") != std::string::npos &&

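For reference, a MiroThinker-style MCP tool call that the parser above is built to match looks roughly like this (a sketch reconstructed from the format comments in the patch; the server name, tool name, and arguments are made up):

```
<think>The user wants the weather, so I should call the weather tool.</think>
<use_mcp_tool>
<server_name>weather</server_name>
<tool_name>get_weather</tool_name>
<arguments>
{"city": "Paris"}
</arguments>
</use_mcp_tool>
```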
pwilkin and others added 2 commits March 13, 2026 08:00
Autoparser: add optional argument reshuffle capability

Autoparser: True streaming (#20177)

* Relax atomicity constraint for nicer, more pleasent, True Streaming parsing

* Whitespace

* Remove redundant atomics

Revert to OAI-compatible args (#20213)

* Revert to OAI-compatible args

* Apply workaround::func_args_not_string

Fix structured outputs (#20223)

* Fix structured outputs

* Update common/chat-auto-parser-generator.cpp

Co-authored-by: Aldehir Rojas <hello@alde.dev>

---------

Co-authored-by: Aldehir Rojas <hello@alde.dev>

Fix compile bug (#20203)

* Fix compile bug

* Update common/chat-auto-parser-helpers.cpp

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
# Conflicts:
#	common/chat-auto-parser-helpers.cpp

common : gracefully handle incomplete output (#20191)

* common : handle incomplete UTF-8 at end of input in PEG parser

* cont : if reached end prematurely, emit needs_more_input to propagate partial output

* cont: refactor peg parse context to add lenient flag

* cont : remove partial flag, keep lenient flag

PEG parser for LFM2 (#20251)

* PEG parser for LFM2

* Simplify using python_value()

common: map developer role to system (#20215)

* Map developer role to system
* Simplify
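The developer-to-system mapping is a small message rewrite: templates that predate the newer "developer" role still understand "system". A standalone Python sketch of the idea (an illustration, not the actual chat.cpp code):

```python
def normalize_roles(messages):
    # Rewrite the OpenAI "developer" role to "system" so older chat
    # templates, which only know "system", can render the message.
    return [
        {**m, "role": "system"} if m.get("role") == "developer" else m
        for m in messages
    ]

msgs = [{"role": "developer", "content": "be terse"},
        {"role": "user", "content": "hi"}]
print(normalize_roles(msgs)[0]["role"])  # -> system
```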

common: consolidate PEG string parsers (#20263)

* common : consolidate PEG string parsers
* cont : fix json_string_content()

examples : fix empty items in json_schema_to_grammar.py [no ci] (#19968)

* Fix logic for retrieving schema items in `json_schema_to_grammar.py`

If `schema['items']` is `{}` and `'prefixItems' not in schema`, then since `{}` is falsy, the original code here will raise an error.

I think if `schema['items']` is `{}`, then items should just be `{}`.

* Apply suggestion from @CISC

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Add tests for arrays with empty items

Add two unit tests to `tests/test-json-schema-to-grammar.cpp` that validate handling of arrays when 'items' is an empty schema and when 'prefixItems' is present alongside an empty 'items'. Both tests expect the same generated grammar, ensuring the JSON Schema->grammar conversion treats an empty 'items' schema (and the presence of 'prefixItems') correctly and covering this edge case.

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
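The falsy-`{}` pitfall this commit describes is easy to reproduce. A standalone Python sketch (hypothetical helpers to illustrate the bug pattern, not the actual `json_schema_to_grammar.py` code):

```python
def get_items_buggy(schema):
    # Bug pattern: truthiness conflates a missing "items" with an empty
    # items schema, because {} is falsy in Python.
    items = schema.get("items") or schema.get("prefixItems")
    if not items:
        raise ValueError("no items schema")
    return items

def get_items_fixed(schema):
    # Fix pattern: check key presence, so items == {} is preserved
    # (an empty schema means "any value is allowed", not "no schema").
    if "items" in schema:
        return schema["items"]
    if "prefixItems" in schema:
        return schema["prefixItems"]
    raise ValueError("no items schema")

print(get_items_fixed({"type": "array", "items": {}}))  # -> {}
```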

Reduce level of content parser warning message to avoid log spam on non-debug verbosity (#20347)

do not return if template parse failed

add arg to enable parallel tool call

common : fix incorrect uses of stoul (#20313)
# Conflicts:
#	common/arg.cpp
#	src/llama-grammar.cpp

@firecoperana
Copy link
Collaborator Author

Added support for MiroThinker with the new template engine, and added --parallel-tool-calls to enable parallel tool calls.
