common/grammar : replace problematic backtracking regex [\s\S]* #18342

Merged
aldehir merged 6 commits into ggml-org:master from aldehir:fix-regex-stack
Jan 3, 2026

@aldehir
Collaborator

@aldehir aldehir commented Dec 24, 2025

There are several regex patterns scattered throughout the codebase that use [\s\S]* or [\s\S]*?. This is problematic because backtracking engines such as std::regex match these characters recursively, which causes stack overflows on large inputs.

Minimal reproducing example

#include <iostream>
#include <regex>
#include <string>

int main() {
    std::string text = std::string(50000, 'a') + "b";
    std::regex pattern("[\\s\\S]*b"); // [\s\S]* is matched recursively
    
    std::smatch match;
    if (std::regex_match(text, match, pattern)) {
        std::cout << "Match\n";
    }
    
    return 0;
}

This PR replaces these instances with alternative solutions.

Fixes #17902, #18188

Details

src/llama-grammar.cpp

These patterns exist because the grammar sampler uses std::regex_match(). With a full match, the only way to search for a needle is to wrap it as [\s\S]*?<pat>[\s\S]*.

To address this, I check whether the pattern is wrapped in anchors (^$). If not, std::regex_search() is used instead. Most engines search by iteratively matching at each position, starting from the beginning. This is an improvement over recursively matching characters.
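The dispatch described above could look roughly like the sketch below. This is a minimal illustration, not the code from the PR: `is_anchored` and `trigger_matches` are hypothetical names, and the anchor check is a plain textual test that does not account for an escaped trailing `\$`.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the PR): true when the pattern is wrapped in ^...$.
// Assumption: a plain textual check; an escaped trailing "\$" is not handled.
static bool is_anchored(const std::string & pattern) {
    return pattern.size() >= 2 && pattern.front() == '^' && pattern.back() == '$';
}

// Hypothetical helper (not from the PR): anchored patterns must match the whole
// input, everything else is searched for anywhere in the input.
static bool trigger_matches(const std::string & text, const std::string & pattern, std::smatch & match) {
    const std::regex re(pattern);
    if (is_anchored(pattern)) {
        return std::regex_match(text, match, re);
    }
    // No [\s\S]*? prefix or [\s\S]* suffix is needed to find the needle.
    return std::regex_search(text, match, re);
}
```

With this shape, a bare needle such as `<tool_call>` is found anywhere in the input, while `^<tool_call>$` only succeeds when the input is exactly that string.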

common/sampling.cpp

To accommodate the grammar change, COMMON_TRIGGER_TYPE_PATTERN_FULL now wraps the pattern as ^<pat>$. The problematic patterns are removed for COMMON_TRIGGER_TYPE_PATTERN and COMMON_TRIGGER_TYPE_WORD. This maintains existing semantics to remain compatible with chat parsing implementations.

Additionally, COMMON_TRIGGER_TYPE_PATTERN no longer adds an implicit capture group. The capture semantics now align with COMMON_TRIGGER_TYPE_PATTERN_FULL and can be used to determine the start of the input fed to the grammar sampler.
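As a rough sketch of how capture groups might determine where grammar input begins: the helper below is hypothetical, and it assumes that the first non-empty capture group marks the start, with the start of the whole match as the fallback. The PR text does not spell out this rule, so treat it as an illustration only.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper (not from the PR): returns the offset where grammar-constrained
// input would begin, or std::string::npos when the trigger pattern is not found.
// Assumption: the first non-empty capture group marks the start; without one,
// the start of the whole match is used.
static size_t find_grammar_start(const std::string & text, const std::string & pattern) {
    const std::regex re(pattern);
    std::smatch match;
    if (!std::regex_search(text, match, re)) {
        return std::string::npos;
    }
    for (size_t i = 1; i < match.size(); ++i) {
        if (match.length(i) > 0) {
            return static_cast<size_t>(match.position(i));
        }
    }
    return static_cast<size_t>(match.position(0));
}
```

Under this assumption, a pattern that adds its own capturing group for grouping purposes would shift the start offset, which is why plain grouping would need the non-capturing `(?:...)` form.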

common/chat.cpp

As an example, I removed [\s\S]*? from the Hermes 2 Pro implementation. I also removed "(?:<think>[\\s\\S]*?</think>\\s*)?" since it's optional and non-capturing—it would be consumed by [\s\S]*? anyway and isn't needed when using std::regex_search().

I updated existing COMMON_TRIGGER_TYPE_PATTERN patterns to use non-capturing groups, since it now follows the same capture semantics as COMMON_TRIGGER_TYPE_PATTERN_FULL.

common/regex-partial.cpp

The generated reverse regex pattern contains a trailing [\s\S]*. This is replaced by anchoring the generated pattern as ^<pat> and using std::regex_constants::match_continuous to enforce matching at the start of the reverse iterator.
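The anchored reverse match can be sketched as follows. The helper name `matches_at_end` is hypothetical, and the reversed pattern is supplied by the caller here; in this sketch it is written by hand rather than produced by the generator in common/regex-partial.cpp.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the PR): checks whether the input, read from its
// last character backwards, begins with a match of `reversed_pattern`.
static bool matches_at_end(const std::string & input, const std::string & reversed_pattern) {
    const std::regex re("^" + reversed_pattern);
    std::match_results<std::string::const_reverse_iterator> match;
    // match_continuous restricts the search to matches starting at the first
    // iterator, so no trailing [\s\S]* is needed to absorb the rest of the input.
    return std::regex_search(input.crbegin(), input.crend(), match, re,
                             std::regex_constants::match_continuous);
}
```

For instance, the hand-written reversed pattern `(?:(?:c)?b)?a` accepts inputs that end in `a`, `ab`, or `abc`, that is, in a non-empty prefix of `abc`.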

Models Tested

I tested content, reasoning, tool calling, and agentic scenarios with my own test suite on the following models:

  • Qwen3-Next-80B-A3B-Instruct
  • Qwen3-VL-8B-Instruct
  • Qwen3-VL-8B-Thinking
    • Reasoning + tool_choice = required is broken but this is an existing issue
  • GPT-OSS-20B

@ochafik If you have some time, I'd appreciate your opinion on these changes.

No AI was used to code, only to review changes.

@ggerganov
Member

@aldehir A bit off-topic, but I played with your test suite and noticed that Qwen3 models are failing some of the agentic tests:

./bin/llama-server -hf ggml-org/Qwen3-0.6B-GGUF

./llm-serve-test --base-url http://localhost:8080/v1 --model qwen3-0.6b --filter agentic

Agentic
  ✗ agentic_tool_call - turn 2 request failed: unexpected status 500: {"error":{"code":500,"message":"Trying to call method 'lstrip' on null at row 41, column 128:\n   
  ✗ agentic_reasoning_in_template - /apply-template failed: unexpected status 500: {"error":{"code":500,"message":"Trying to call method 'lstrip' on null at row 41, column 128:\n
  ✓ agentic_reasoning_not_in_user_template (303ms)
Results: 1/3 passed

Seems like we are calling content.lstrip() on null content. If I patch the jinja template like this, the tests pass:

41c41
<                 {{- '<|im_start|>' + message.role + '\n<think>\n' + reasoning_content.strip('\n') + '\n</think>\n\n' + content.lstrip('\n') }}
---
>                 {{- '<|im_start|>' + message.role + '\n<think>\n' + reasoning_content.strip('\n') + '\n</think>\n\n' + content }}

Is this a bug in the chat template of these models?

@aldehir
Collaborator Author

aldehir commented Dec 29, 2025

> Is this a bug in the chat template of these models?

Ah, looks like Minja is getting a false negative on the "requires non-null content" check. Since that code path is only hit when reasoning_content is passed in, it makes sense why it would miss it.

I can submit a Minja PR, or we can convert null content to an empty string by default. I don't know why we don't already, seems like it makes the most sense.

Member

@ggerganov ggerganov left a comment

Approving based on feedback #18188 (comment)

@aldehir aldehir merged commit cef1d23 into ggml-org:master Jan 3, 2026
68 of 71 checks passed
@thomasjfox
Contributor

Just wanted to report: Commit works fine with tool calling using MiniMax M2.1.

Thanks a lot! 🙏 The backtracking regex has bitten me quite a few times in the past.

blime4 referenced this pull request in blime4/llama.cpp Feb 5, 2026
…8342)

* grammar : add support for std::regex_search() with trigger patterns

* common : update hermes2 pro trigger to search instead of match

* common : use regex_search with anchoring for partial matching

* common : adjust regex partial tests to use new pattern

* grammar : check pattern directly instead of adding a type

* common : adjust existing patterns to match new semantics
Labels

testing Everything test related

Development

Successfully merging this pull request may close these issues.

Eval bug: core dump due to regex stack overflow during load / security test of an agent using llama-server

3 participants