[Bugfix] Fix Kimi-K2 tool parser streaming regex for multiple tool calls by karanb192 · Pull Request #32504 · vllm-project/vllm

karanb192 · 2026-01-17T05:43:25Z

Summary

Fix the streaming regex patterns in Kimi-K2 tool parser to correctly handle multiple concatenated tool calls
The bug was that the streaming regexes used .+ which greedily matched across tool call boundaries, causing multiple tool calls to be incorrectly merged into a single malformed entry
The fix changes .+ to [^<]+ in stream_tool_call_portion_regex and stream_tool_call_name_regex to match the non-streaming regex behavior

Related Issue

Root Cause

The streaming tool call regexes used .+ pattern which matches ANY character including <. When multiple tool calls were concatenated without spacing (e.g., <|tool_call_end|><|tool_call_begin|>), the greedy .+ would match across tool call boundaries, capturing content like:

functions.get_weather:0<|tool_call_argument_begin|>{"lat":34}<|tool_call_end|><|tool_call_begin|>functions.get_news:1

This resulted in a garbled tool call ID and mismatched arguments as shown in the issue.

Fix

Changed .+ to [^<]+ which excludes < characters. This ensures the regex stops at the first < (the start of the next special marker), preventing over-matching. This matches the behavior of the non-streaming regex which already used [^<]+.

Test Plan

Added test_streaming_regex_does_not_match_across_tool_boundaries: Unit test that directly validates the regex patterns correctly stop at < characters
Added test_streaming_concatenated_tool_calls_issue_24478: Regression test that simulates the exact scenario from the bug report

Run tests:

pytest tests/tool_parsers/test_kimi_k2_tool_parser.py -v

Fix the streaming regex patterns in Kimi-K2 tool parser to correctly handle multiple concatenated tool calls. The bug was that the streaming regexes used `.+` which greedily matched across tool call boundaries, causing multiple tool calls to be incorrectly merged into a single malformed entry. The fix changes `.+` to `[^<]+` in `stream_tool_call_portion_regex` and `stream_tool_call_name_regex` to match the non-streaming regex behavior. This ensures the regex stops at the first `<` character (the start of the next special marker), preventing over-matching. Fixes vllm-project#24478 Signed-off-by: Karan Bansal <karan@karanbansal.com> Signed-off-by: Karan Bansal <karanb192@gmail.com>

gemini-code-assist

Code Review

This is an excellent bug fix. The pull request description clearly explains the root cause of the issue—a greedy regex .+ causing multiple tool calls to be merged—and the fix of using [^<]+ is correct and effective. The addition of two new tests, one unit test for the regex and one regression test for the streaming behavior, is very thorough and ensures this bug will not reappear. I have a couple of suggestions to make the new tests even more robust by asserting that a regex match occurs where expected, preventing silent test passes. Overall, great work!

gemini-code-assist · 2026-01-17T05:44:33Z

tests/tool_parsers/test_kimi_k2_tool_parser.py

+    if match:
+        # If it matches, it should only capture up to the first '<'
+        tool_id = match.group("tool_call_id")
+        # The tool_id should NOT contain special markers
+        assert "<|" not in tool_id, (
+            f"Regex over-matched across tool boundaries! "
+            f"Captured tool_id: {repr(tool_id)}"
+        )
+        assert tool_id == "functions.get_weather:0", (
+            f"Expected 'functions.get_weather:0', got: {repr(tool_id)}"
+        )


The if match: condition could mask a potential issue. If portion_regex.match() were to return None (for example, due to a future change in the regex), this test would pass silently without executing the assertions. It's better to explicitly assert that a match is found when one is expected.

Suggested change

if match:

# If it matches, it should only capture up to the first '<'

tool_id = match.group("tool_call_id")

# The tool_id should NOT contain special markers

assert "<|" not in tool_id, (

f"Regex over-matched across tool boundaries! "

f"Captured tool_id: {repr(tool_id)}"

)

assert tool_id == "functions.get_weather:0", (

f"Expected 'functions.get_weather:0', got: {repr(tool_id)}"

)

assert match is not None, "The regex is expected to match the problematic input."

# If it matches, it should only capture up to the first '<'

tool_id = match.group("tool_call_id")

# The tool_id should NOT contain special markers

assert "<|" not in tool_id, (

f"Regex over-matched across tool boundaries! "

f"Captured tool_id: {repr(tool_id)}"

)

assert tool_id == "functions.get_weather:0", (

f"Expected 'functions.get_weather:0', got: {repr(tool_id)}"

)

gemini-code-assist · 2026-01-17T05:44:33Z

tests/tool_parsers/test_kimi_k2_tool_parser.py

+    if name_match2:
+        # Should stop at '<' if using [^<]+
+        captured_id = name_match2.group("tool_call_id")
+        assert "<|" not in captured_id, (
+            f"Name regex over-matched! Captured: {repr(captured_id)}"
+        )


Similar to the previous comment, using if name_match2: can hide a potential failure if the regex doesn't match as expected. The test would pass silently. It's more robust to assert that a match is found.

Suggested change

if name_match2:

# Should stop at '<' if using [^<]+

captured_id = name_match2.group("tool_call_id")

assert "<|" not in captured_id, (

f"Name regex over-matched! Captured: {repr(captured_id)}"

)

assert name_match2 is not None, "The name_regex is expected to match."

# Should stop at '<' if using [^<]+

captured_id = name_match2.group("tool_call_id")

assert "<|" not in captured_id, (

f"Name regex over-matched! Captured: {repr(captured_id)}"

)

karanb192 · 2026-01-21T11:56:50Z

Hi @DarkLight1337 - could you please review this tool parser fix? It addresses issue #24478 where the Kimi-K2 parser had a streaming regex failure. Thanks!

DarkLight1337 · 2026-01-21T11:57:53Z

cc @chaunceyjiang

ehfd · 2026-03-01T12:56:59Z

@chaunceyjiang I think this is relevant.

…ewline in tool call ID (vllm-project#38441) The model occasionally emits a stray \n between <|tool_call_begin|> and the function name, e.g.: <|tool_call_begin|> functions.edit:15<|tool_call_argument_begin|>{...} Because Python regex does not match \n with . by default, both stream_tool_call_portion_regex and stream_tool_call_name_regex silently failed to match, causing the entire tool call to be dropped during streaming. Fix: - Add a leading \s* to both streaming regexes so any leading whitespace/newlines before the tool_call_id are consumed. - Compile both regexes with re.DOTALL so . inside the capture group spans newlines. This is distinct from PR vllm-project#37384 which only adds re.DOTALL (without leading \s*) to the portion regex and does not fix stream_tool_call_name_regex. Tests added: - test_stream_tool_call_portion_regex_handles_leading_newline: unit test that both regexes match inputs with a leading \n. - test_streaming_tool_call_with_newline_after_begin_token: end-to-end streaming simulation reproducing the exact scenario in the issue. Why this is not a duplicate: checked open PRs vllm-project#37384, vllm-project#37445, vllm-project#32504, vllm-project#24847, vllm-project#26918, vllm-project#36891. None add the leading \s* prefix to handle whitespace/newlines preceding the tool_call_id capture group, and none fix stream_tool_call_name_regex with re.DOTALL. Co-authored-by: GitHub Copilot

…ewline in tool call ID (vllm-project#38441) The model occasionally emits a stray \n between <|tool_call_begin|> and the function name, e.g.: <|tool_call_begin|> functions.edit:15<|tool_call_argument_begin|>{...} Because Python regex does not match \n with . by default, both stream_tool_call_portion_regex and stream_tool_call_name_regex silently failed to match, causing the entire tool call to be dropped during streaming. Fix: - Add a leading \s* to both streaming regexes so any leading whitespace/newlines before the tool_call_id are consumed. - Compile both regexes with re.DOTALL so . inside the capture group spans newlines. This is distinct from PR vllm-project#37384 which only adds re.DOTALL (without leading \s*) to the portion regex and does not fix stream_tool_call_name_regex. Tests added: - test_stream_tool_call_portion_regex_handles_leading_newline: unit test that both regexes match inputs with a leading \n. - test_streaming_tool_call_with_newline_after_begin_token: end-to-end streaming simulation reproducing the exact scenario in the issue. Why this is not a duplicate: checked open PRs vllm-project#37384, vllm-project#37445, vllm-project#32504, whitespace/newlines preceding the tool_call_id capture group, and none fix stream_tool_call_name_regex with re.DOTALL. Co-authored-by: GitHub Copilot Signed-off-by: saif <contact@saifmb.com>

mergify bot added the bug Something isn't working label Jan 17, 2026

gemini-code-assist bot reviewed Jan 17, 2026

View reviewed changes

wangln19 mentioned this pull request Jan 21, 2026

fix: preserve native tool call ID in multi-turn tool calling #32768

Merged

5 tasks

ehfd mentioned this pull request Feb 14, 2026

[Bug]: The content of response from Kimi-K2.5 is empty. #33654

Open

1 task

mergify bot added the tool-calling label Mar 20, 2026

github-project-automation bot added this to Tool Calling Mar 20, 2026

saifmb0 mentioned this pull request Mar 28, 2026

[Bugfix][Tool Parser] Fix Kimi-K2 streaming regex to handle leading newline before tool call ID #38443

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Bugfix] Fix Kimi-K2 tool parser streaming regex for multiple tool calls#32504

[Bugfix] Fix Kimi-K2 tool parser streaming regex for multiple tool calls#32504
karanb192 wants to merge 1 commit intovllm-project:mainfrom
karanb192:fix/kimi-k2-multiple-tool-calls

karanb192 commented Jan 17, 2026

Uh oh!

gemini-code-assist bot left a comment

Uh oh!

gemini-code-assist bot Jan 17, 2026

Uh oh!

gemini-code-assist bot Jan 17, 2026

Uh oh!

karanb192 commented Jan 21, 2026

Uh oh!

DarkLight1337 commented Jan 21, 2026

Uh oh!

ehfd commented Mar 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

karanb192 commented Jan 17, 2026

Summary

Related Issue

Root Cause

Fix

Test Plan

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist bot Jan 17, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Jan 17, 2026

Choose a reason for hiding this comment

Uh oh!

karanb192 commented Jan 21, 2026

Uh oh!

DarkLight1337 commented Jan 21, 2026

Uh oh!

ehfd commented Mar 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants