fix(glm47): improve tool call parsing and content normalization#37386
chaunceyjiang merged 2 commits into vllm-project:main
Fix GLM-4.7 tool call parser regex and normalize empty content to None for OpenAI API compatibility.

- Improve func_detail_regex to cleanly separate function name from args using \S+? instead of .*? to avoid trailing whitespace
- Simplify func_arg_regex by replacing redundant (?:\\n|\s)* with \s*
- Normalize empty/whitespace-only content to None in extract_tool_calls
- Update GLM-4.5 parser tests for the content normalization
- Add GLM-4.7 parser tests for zero-arg calls, inline args, streaming

Fixes: vllm-project#37277
Related: vllm-project#32436, vllm-project#33877

Signed-off-by: karanb192 <karan@example.com>
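The effect of capturing the function name with \S+? rather than .*? can be sketched with a small standalone script. The two patterns below are illustrative assumptions written for this example, not the exact expressions in the vLLM parser, and parse_name_and_args is a hypothetical helper.

```python
import re

# Illustrative patterns only (assumptions); the parser's real regexes may differ.
# Lazy "any character" name group: the name absorbs whitespace before <arg_key>.
LAZY_ANY_NAME = re.compile(
    r"<tool_call>(.*?)(<arg_key>.*?)?</tool_call>", re.DOTALL
)
# Non-whitespace name group plus a greedy argument group.
NON_SPACE_NAME = re.compile(
    r"<tool_call>\s*(\S+?)\s*(<arg_key>.*)?</tool_call>", re.DOTALL
)


def parse_name_and_args(pattern, block):
    """Hypothetical helper: return (function_name, raw_args) or None."""
    match = pattern.search(block)
    if match is None:
        return None
    return match.group(1), match.group(2)


multiline_call = (
    "<tool_call>get_weather\n"
    "<arg_key>city</arg_key>\n<arg_value>Beijing</arg_value>\n"
    "</tool_call>"
)
zero_arg_call = "<tool_call>get_time</tool_call>"

old_name, _ = parse_name_and_args(LAZY_ANY_NAME, multiline_call)
new_name, new_args = parse_name_and_args(NON_SPACE_NAME, multiline_call)

print(repr(old_name))  # name still carries the trailing newline
print(repr(new_name))  # name without surrounding whitespace
print(parse_name_and_args(NON_SPACE_NAME, zero_arg_call))
```

With the lazy .*? group the captured name has to be stripped afterwards, whereas the \S+? group cannot contain whitespace in the first place. A zero-argument call still matches because the argument group is optional.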
Code Review
This pull request improves tool call parsing for GLM-4.7 models by refining the regular expressions to handle different formatting and capture function names and arguments more robustly. It also introduces content normalization to align with OpenAI API conventions by treating empty or whitespace-only content as None. The changes are well-supported by new, specific tests for GLM-4.7 and updates to existing tests, ensuring the modifications are correct and don't introduce regressions. The overall changes enhance correctness and maintainability.
Fix test_with_args streaming test that failed because city is declared as a string type, triggering incremental string streaming. Split the arg value chunk so the parser processes <arg_value>, the value content, and </arg_value> in separate calls, allowing </tool_call> to be processed in the final call. Also remove unused FunctionCall/ToolCall import and apply ruff formatting. Signed-off-by: karanb192 <karan@example.com>
Did you write this using CC?
Yes @chaunceyjiang
…-project#37386) Signed-off-by: karanb192 <karan@example.com> Co-authored-by: karanb192 <karan@example.com>
Prompt 3: 'Chris Lee. 42. Interests include jazz music and wo...' Why is this happening?
@xi1212 Is your
Sorry, it's my problem.
…-project#37386) Signed-off-by: karanb192 <karan@example.com> Co-authored-by: karanb192 <karan@example.com> Signed-off-by: Monishver Chandrasekaran <monishverchandrasekaran@gmail.com>
…-project#37386) Signed-off-by: karanb192 <karan@example.com> Co-authored-by: karanb192 <karan@example.com> Signed-off-by: Vinay Damodaran <vrdn@hey.com>
…-project#37386) Signed-off-by: karanb192 <karan@example.com> Co-authored-by: karanb192 <karan@example.com> Signed-off-by: EricccYang <yangyang4991@gmail.com>
Summary
- func_detail_regex: Use \S+? instead of .*? for the function name capture group, and make the arg group greedy (.* vs .*?) so all argument pairs are captured correctly. This produces cleaner function names without trailing whitespace/newlines.
- func_arg_regex: Replace redundant (?:\\n|\s)* with \s* between </arg_key> and <arg_value> tags.
- Content normalization to None: In Glm4MoeModelToolParser.extract_tool_calls, return content=None instead of content="" when there is no meaningful text before the tool call. This aligns with the OpenAI API convention where content is null when the assistant only produces tool calls.
- GLM-4.5 parser tests: Update expected_content values from "" to None to match the content normalization change.

Test plan
- pre-commit run --all-files passes

Fixes #37277
Related: #32436, #33877
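The content normalization and the simplified argument pattern described in the summary can be sketched as follows. This is a minimal standalone illustration, not the code in Glm4MoeModelToolParser: normalize_content and extract_arg_pairs are hypothetical helpers, and the only detail taken from the PR is the use of \s* between the </arg_key> and <arg_value> tags.

```python
import re

# \s* between the closing key tag and the opening value tag, as in the PR text.
ARG_PAIR = re.compile(
    r"<arg_key>(.*?)</arg_key>\s*<arg_value>(.*?)</arg_value>", re.DOTALL
)
TOOL_CALL_START = "<tool_call>"


def normalize_content(model_output):
    """Hypothetical helper: text before the first tool call, or None if blank."""
    prefix, _, _ = model_output.partition(TOOL_CALL_START)
    # Empty or whitespace-only text is reported as None rather than "".
    return prefix if prefix.strip() else None


def extract_arg_pairs(args_text):
    """Hypothetical helper: map each argument key to its raw string value."""
    return {
        key.strip(): value.strip()
        for key, value in ARG_PAIR.findall(args_text)
    }


only_tool_call = "\n<tool_call>get_time</tool_call>"
text_then_call = "Let me check.<tool_call>get_time</tool_call>"
args_text = (
    "<arg_key>city</arg_key>\n<arg_value>Beijing</arg_value>\n"
    "<arg_key>unit</arg_key><arg_value>celsius</arg_value>"
)

print(normalize_content(only_tool_call))
print(normalize_content(text_then_call))
print(extract_arg_pairs(args_text))
```

Returning None for a blank prefix lets a client distinguish "the assistant said nothing" from "the assistant said an empty string", which is the distinction the updated expected_content test values check for.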