[Bugfix][Tool Parser] Fix Kimi-K2 streaming regex to handle leading newline before tool call ID #38443
Code Review
This pull request fixes an issue where the Kimi K2 tool parser fails to process tool calls when a leading newline is emitted by the model. The fix involves updating the regex patterns for tool call portions and names to handle leading whitespace and using the re.DOTALL flag. Comprehensive regression tests have been added to verify these changes. Review feedback suggests further refining the regex patterns to use \S.*:\d+ instead of .+:\d+ to avoid capturing surrounding whitespace within the tool_call_id group, which would simplify the code by removing the need for manual .strip() calls.
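The difference between the two capture-group shapes mentioned in the review can be sketched in isolation. The patterns below are simplified, hypothetical stand-ins (not the parser's actual expressions); they only show that a group starting with `\S` cannot begin with whitespace, whereas `.+` under `re.DOTALL` can.

```python
import re

# Hypothetical, simplified patterns for illustration only.
# LOOSE: DOTALL lets "." match "\n", so the group can swallow a leading newline.
LOOSE = re.compile(r"(?P<tool_call_id>.+:\d+)", re.DOTALL)
# TIGHT: "\s*" consumes leading whitespace and "\S" forces the group
# to start on a non-whitespace character.
TIGHT = re.compile(r"\s*(?P<tool_call_id>\S.*:\d+)", re.DOTALL)

text = "\nfunctions.edit:15"

loose_id = LOOSE.match(text).group("tool_call_id")
tight_id = TIGHT.match(text).group("tool_call_id")

print(repr(loose_id))  # captured ID still carries the leading newline
print(repr(tight_id))  # captured ID is already clean, no .strip() needed
```

With the tighter group, the caller does not have to post-process the captured ID, which is the simplification the review points at.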
41038d5 to eb0ab41
Applied all four suggestions:
Amended and force-pushed to the same branch.
Thanks for the update, @saifmb0. The changes look correct and address the identified issues. I've verified that the regex updates and the removal of
…ewline in tool call ID (vllm-project#38441)

The model occasionally emits a stray \n between <|tool_call_begin|> and the function name, e.g.:

    <|tool_call_begin|>
    functions.edit:15<|tool_call_argument_begin|>{...}

Because Python regex does not match \n with . by default, both stream_tool_call_portion_regex and stream_tool_call_name_regex silently failed to match, causing the entire tool call to be dropped during streaming.

Fix:
- Add a leading \s* to both streaming regexes so any leading whitespace/newlines before the tool_call_id are consumed.
- Compile both regexes with re.DOTALL so . inside the capture group spans newlines.

This is distinct from PR vllm-project#37384, which only adds re.DOTALL (without leading \s*) to the portion regex and does not fix stream_tool_call_name_regex.

Tests added:
- test_stream_tool_call_portion_regex_handles_leading_newline: unit test that both regexes match inputs with a leading \n.
- test_streaming_tool_call_with_newline_after_begin_token: end-to-end streaming simulation reproducing the exact scenario in the issue.

Why this is not a duplicate: checked open PRs vllm-project#37384, vllm-project#37445, vllm-project#32504; none handle whitespace/newlines preceding the tool_call_id capture group, and none fix stream_tool_call_name_regex with re.DOTALL.

Co-authored-by: GitHub Copilot
Signed-off-by: saif <contact@saifmb.com>
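The two changes listed in the commit message can be sketched as follows. The pattern bodies are assumed, simplified versions of the streaming regexes (only the group names `tool_call_id` and `function_arguments` and the two regex names come from the PR text); the variable names `new_portion` and `new_name` are hypothetical.

```python
import re

# Assumed shape of the fixed patterns: leading \s* plus re.DOTALL.
new_portion = re.compile(
    r"\s*(?P<tool_call_id>.+:\d+)\s*"
    r"<\|tool_call_argument_begin\|>(?P<function_arguments>.*)",
    re.DOTALL,
)
new_name = re.compile(r"\s*(?P<tool_call_id>.+:\d+)\s*", re.DOTALL)

# Streamed text as seen after <|tool_call_begin|>, with the stray newline
# and with arguments that themselves span two lines.
chunk = (
    "\nfunctions.edit:15<|tool_call_argument_begin|>"
    '{"path": "a.py",\n "line": 3}'
)

portion_match = new_portion.match(chunk)
name_match = new_name.match("\nfunctions.edit:15")

print(portion_match.group("tool_call_id"))
print(portion_match.group("function_arguments"))
print(name_match.group("tool_call_id"))
```

In this sketch the leading `\s*` absorbs the newline before the ID, and `re.DOTALL` lets the arguments group continue past the line break inside the JSON.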
eb0ab41 to 54a8914
Based on vllm-project#38443 and vllm-project#37445
Summary
Fixes #38441.
The model (Kimi K2 / K2.5) occasionally emits a stray \n between <|tool_call_begin|> and the function name during streaming (observed on long-context inference with tool_choice: auto, without constrained decoding).
Because . in a Python regex does not match \n by default, both stream_tool_call_portion_regex and stream_tool_call_name_regex silently failed to match, so the tool call was dropped entirely during streaming.
Root cause
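The failure can be reproduced in isolation with a minimal sketch. The pattern below is an assumed, simplified stand-in for the pre-fix portion regex (no leading `\s*`, no `re.DOTALL`); the name `old_portion` and the sample arguments are hypothetical.

```python
import re

# Assumed pre-fix shape: "." cannot match "\n" and nothing consumes
# whitespace before the tool call ID.
old_portion = re.compile(
    r"(?P<tool_call_id>.+:\d+)\s*"
    r"<\|tool_call_argument_begin\|>(?P<function_arguments>.*)"
)

clean = 'functions.edit:15<|tool_call_argument_begin|>{"path": "a.py"}'
stray = "\n" + clean  # same text with the stray newline in front

clean_match = old_portion.match(clean)
stray_match = old_portion.match(stray)

print(clean_match.group("tool_call_id"))  # matches as expected
print(stray_match)                        # no match: the call is dropped
```

Because `match()` is anchored at the start of the string and `.+` needs at least one non-newline character there, a single leading `\n` is enough to make the whole pattern fail.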
Fix
Two changes per regex:
- Leading \s*: consumes any leading whitespace/newlines before the function name.
- re.DOTALL: makes . match \n so the tool_call_id capture group spans newlines.
Why this is not a duplicate
Checked open PRs: #37384, #37445, #32504, #24847, #26918, #36891.
- #37384 adds re.DOTALL only to stream_tool_call_portion_regex (for multi-line arguments), but does not add the leading \s* that handles a newline before the tool_call_id, and does not fix stream_tool_call_name_regex at all. The \s* prefix is the critical fix for this issue.
- None of the others fix stream_tool_call_name_regex with re.DOTALL.
Test plan
Two new tests added to tests/tool_parsers/test_kimi_k2_tool_parser.py:
- test_stream_tool_call_portion_regex_handles_leading_newline: unit test; both regexes must match with and without a leading \n and correctly extract tool_call_id and function_arguments.
- test_streaming_tool_call_with_newline_after_begin_token: end-to-end streaming simulation of the exact failure scenario from the issue; asserts that at least one tool-call delta is emitted (i.e., the tool call is not silently dropped).
Existing tests:
pre-commit run ruff-format and pre-commit run ruff-check pass clean.
AI assistance disclosure
This fix was investigated, reproduced, and implemented with AI assistance (GitHub Copilot / Claude Sonnet 4.6). Every changed line has been reviewed. The reviewer (submitter) understands the change end-to-end.