[Bugfix][Tool Parser] Fix Kimi-K2 streaming regex to handle leading newline before tool call ID by saifmb0 · Pull Request #38443 · vllm-project/vllm

saifmb0 · 2026-03-28T23:32:28Z

Summary

The model (Kimi K2 / K2.5) occasionally emits a stray \n between <|tool_call_begin|> and the function name during streaming (observed on long-context inference with tool_choice: auto, without constrained decoding):

# What the model sometimes produces:
<|tool_call_begin|>
functions.edit:15<|tool_call_argument_begin|>{"path": "..."}

# Instead of:
<|tool_call_begin|>functions.edit:15<|tool_call_argument_begin|>{"path": "..."}

Because Python regex's . does not match \n by default, both stream_tool_call_portion_regex and stream_tool_call_name_regex silently failed to match, causing the tool call to be entirely dropped during streaming.

Root cause

# Before (broken on leading \n)
self.stream_tool_call_portion_regex = re.compile(
    r"(?P<tool_call_id>.+:\d+)\s*<\|tool_call_argument_begin\|>\s*(?P<function_arguments>.*)"
)
self.stream_tool_call_name_regex = re.compile(r"(?P<tool_call_id>.+:\d+)\s*")
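
A minimal reproduction of the failure, as a sketch. It assumes the parser applies these patterns with `.match()`, i.e. anchored at the start of the buffered text that follows `<|tool_call_begin|>`; the sample strings are illustrative and not taken from the parser.

```python
import re

# Pre-fix pattern, copied from the "Before" snippet above.
old_portion = re.compile(
    r"(?P<tool_call_id>.+:\d+)\s*<\|tool_call_argument_begin\|>\s*(?P<function_arguments>.*)"
)

clean = 'functions.edit:15<|tool_call_argument_begin|>{"path": "a.py"}'
stray = "\n" + clean  # the model emitted a newline right after <|tool_call_begin|>

# With an anchored match, `.+` must begin at index 0, and `.` cannot consume "\n".
print(old_portion.match(clean) is not None)  # True
print(old_portion.match(stray) is not None)  # False
```

Under that assumption the second input yields no match object at all, which is consistent with the tool call being dropped rather than raising an error.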

Fix

# After
self.stream_tool_call_portion_regex = re.compile(
    r"\s*(?P<tool_call_id>.+:\d+)\s*"
    r"<\|tool_call_argument_begin\|>\s*(?P<function_arguments>.*)",
    re.DOTALL,
)
self.stream_tool_call_name_regex = re.compile(
    r"\s*(?P<tool_call_id>.+:\d+)\s*", re.DOTALL
)

Two changes per regex:

Leading \s* — consumes any leading whitespace/newlines before the function name.
re.DOTALL — makes . match \n so the tool_call_id capture group spans newlines.
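
To illustrate how the two changes interact, the sketch below runs the post-fix portion regex on an input with a stray newline and compares it with a hypothetical variant that has `re.DOTALL` but no leading `\s*`. The variant name `dotall_only` and the sample input are assumptions made for illustration.

```python
import re

ARG_BEGIN = r"<\|tool_call_argument_begin\|>"

# Post-fix pattern from the "After" snippet.
new_portion = re.compile(
    r"\s*(?P<tool_call_id>.+:\d+)\s*"
    + ARG_BEGIN
    + r"\s*(?P<function_arguments>.*)",
    re.DOTALL,
)
# Comparison only: re.DOTALL without the leading \s*.
dotall_only = re.compile(r"(?P<tool_call_id>.+:\d+)\s*", re.DOTALL)

stray = '\nfunctions.edit:15<|tool_call_argument_begin|>{"path":\n"a.py"}'

m = new_portion.match(stray)
print(repr(m.group("tool_call_id")))        # 'functions.edit:15'
print(repr(m.group("function_arguments")))  # '{"path":\n"a.py"}'

# Without the leading \s*, the newline ends up inside the captured ID.
print(repr(dotall_only.match(stray).group("tool_call_id")))  # '\nfunctions.edit:15'
```

The leading `\s*` keeps the newline out of the ID, while `re.DOTALL` lets `function_arguments` continue across the newline inside the JSON.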

Why this is not a duplicate

Checked open PRs: #37384, #37445, #32504, #24847, #26918, #36891.

PR #37384 ([Bugfix][Tool Parser] Fix Kimi-K2.5 parser accuracy, buffer limits, and token leaks) adds re.DOTALL only to stream_tool_call_portion_regex (for multi-line arguments), but does not add the leading \s* that handles a newline before the tool_call_id, and does not fix stream_tool_call_name_regex at all. The \s* prefix is the critical fix for this issue.
No other open PR addresses stream_tool_call_name_regex.

Test plan

Two new tests added to tests/tool_parsers/test_kimi_k2_tool_parser.py:

test_stream_tool_call_portion_regex_handles_leading_newline — unit test: both regexes must match with/without a leading \n, and correctly extract tool_call_id and function_arguments.
test_streaming_tool_call_with_newline_after_begin_token — end-to-end streaming simulation of the exact failure scenario from the issue; asserts at least one tool-call delta is emitted (i.e., the tool call is not silently dropped).
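
The unit-level check described above could look roughly like the following. This is a hypothetical stand-in, not the test code added in the PR: it rebuilds the two patterns from the "After" snippet instead of importing the parser, and the function name and sample strings are made up for illustration.

```python
import re

# Patterns as given in the "After" snippet of this description.
portion_regex = re.compile(
    r"\s*(?P<tool_call_id>.+:\d+)\s*"
    r"<\|tool_call_argument_begin\|>\s*(?P<function_arguments>.*)",
    re.DOTALL,
)
name_regex = re.compile(r"\s*(?P<tool_call_id>.+:\d+)\s*", re.DOTALL)


def check_leading_newline_handling() -> None:
    """Hypothetical stand-in for the unit test; not the PR's test code."""
    body = 'functions.edit:15<|tool_call_argument_begin|>{"path": "a.py"}'
    for prefix in ("", "\n"):
        portion = portion_regex.match(prefix + body)
        assert portion is not None
        assert portion.group("tool_call_id") == "functions.edit:15"
        assert portion.group("function_arguments") == '{"path": "a.py"}'

        # The name regex is exercised on the text before the argument token.
        name = name_regex.match(prefix + "functions.edit:15")
        assert name is not None
        assert name.group("tool_call_id") == "functions.edit:15"


check_leading_newline_handling()
```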

Existing tests: pre-commit run ruff-format and pre-commit run ruff-check pass clean.

AI assistance disclosure

This fix was investigated, reproduced, and implemented with AI assistance (GitHub Copilot / Claude Sonnet 4.6). Every changed line has been reviewed. The reviewer (submitter) understands the change end-to-end.

claude

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

gemini-code-assist

Code Review

This pull request fixes an issue where the Kimi K2 tool parser fails to process tool calls when a leading newline is emitted by the model. The fix involves updating the regex patterns for tool call portions and names to handle leading whitespace and using the re.DOTALL flag. Comprehensive regression tests have been added to verify these changes. Review feedback suggests further refining the regex patterns to use \S.*:\d+ instead of .+:\d+ to avoid capturing surrounding whitespace within the tool_call_id group, which would simplify the code by removing the need for manual .strip() calls.

vllm/tool_parsers/kimi_k2_tool_parser.py

tests/tool_parsers/test_kimi_k2_tool_parser.py

saifmb0 · 2026-03-28T23:38:37Z

Applied all four suggestions:

Both streaming regexes now use \S.*:\d+ instead of .+:\d+. The capture starts at the first non-whitespace character and ends at the final digit of :\d+, so the group has no leading or trailing whitespace.
Removed both .strip() calls in the parser logic (lines 456 and 466).
Removed both .strip() calls in the new tests.
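
A short sketch of why the `.strip()` calls become redundant. The exact revised pattern is not shown in this thread, so the shape below is an assumption: `.+:\d+` replaced by `\S.*:\d+`, with the surrounding `\s*` and `re.DOTALL` carried over from the description.

```python
import re

# Assumed shape of the revised name regex (see the note above).
revised_name = re.compile(r"\s*(?P<tool_call_id>\S.*:\d+)\s*", re.DOTALL)

for text in ("functions.edit:15", "\nfunctions.edit:15", " \n functions.edit:15 \n"):
    tool_call_id = revised_name.match(text).group("tool_call_id")
    # \S pins the first captured character and \d+ the last, so the
    # group never starts or ends with whitespace.
    assert tool_call_id == tool_call_id.strip() == "functions.edit:15"

# Whitespace inside the ID (here, before the colon) is still captured.
print(repr(revised_name.match("functions.edit :15").group("tool_call_id")))
# 'functions.edit :15'
```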

Amended and force-pushed to the same branch.

gemini-code-assist · 2026-03-28T23:38:48Z

Thanks for the update, @saifmb0. The changes look correct and address the identified issues. I've verified that the regex updates and the removal of .strip() calls are consistent with the requirements. The added tests also correctly cover the edge cases for leading newlines.

…ewline in tool call ID (vllm-project#38441)

The model occasionally emits a stray \n between <|tool_call_begin|> and the function name, e.g.:

<|tool_call_begin|>
functions.edit:15<|tool_call_argument_begin|>{...}

Because Python regex does not match \n with . by default, both stream_tool_call_portion_regex and stream_tool_call_name_regex silently failed to match, causing the entire tool call to be dropped during streaming.

Fix:
- Add a leading \s* to both streaming regexes so any leading whitespace/newlines before the tool_call_id are consumed.
- Compile both regexes with re.DOTALL so . inside the capture group spans newlines.

This is distinct from PR vllm-project#37384 which only adds re.DOTALL (without leading \s*) to the portion regex and does not fix stream_tool_call_name_regex.

Tests added:
- test_stream_tool_call_portion_regex_handles_leading_newline: unit test that both regexes match inputs with a leading \n.
- test_streaming_tool_call_with_newline_after_begin_token: end-to-end streaming simulation reproducing the exact scenario in the issue.

Why this is not a duplicate: checked open PRs vllm-project#37384, vllm-project#37445, vllm-project#32504, whitespace/newlines preceding the tool_call_id capture group, and none fix stream_tool_call_name_regex with re.DOTALL.

Co-authored-by: GitHub Copilot
Signed-off-by: saif <contact@saifmb.com>

saifmb0 requested review from aarnphm and chaunceyjiang as code owners March 28, 2026 23:32

claude bot reviewed Mar 28, 2026

mergify bot added the tool-calling and bug labels Mar 28, 2026

github-project-automation bot added this to Tool Calling Mar 28, 2026

gemini-code-assist bot reviewed Mar 28, 2026

saifmb0 force-pushed the fix/kimi-k2-streaming-regex-leading-newline-38441 branch 2 times, most recently from 41038d5 to eb0ab41, March 28, 2026 23:38

saifmb0 force-pushed the fix/kimi-k2-streaming-regex-leading-newline-38441 branch from eb0ab41 to 54a8914, March 28, 2026 23:42

alexandrnikitin added a commit to alexandrnikitin/vllm that referenced this pull request Mar 29, 2026

Fix kimi tool parser

9fbc122

Based on vllm-project#38443 and vllm-project#37445

alexandrnikitin added a commit to alexandrnikitin/vllm that referenced this pull request Mar 29, 2026

Fix kimi tool parser

a6d6699

Based on vllm-project#38443 and vllm-project#37445

JosephAhn23 mentioned this pull request Mar 30, 2026

[Bugfix] Opt-in INFO prompt summaries for request logging (--enable-log-request-prompts) #38583

Open

saifmb0 wants to merge 1 commit into vllm-project:main from saifmb0:fix/kimi-k2-streaming-regex-leading-newline-38441