reasoning: fix tool-call routing bug in GLM-5.1-FP8 at long context (AIVLLM-229) by jatseng-ai · Pull Request #40782 · vllm-project/vllm

jatseng-ai · 2026-04-24T07:17:10Z

Summary

GLM-5.1-FP8 places <tool_call> XML inside <think> blocks when prompt context exceeds ~8k tokens. vLLM's glm45 reasoning parser captures this in reasoning_content instead of routing it to tool_calls, producing finish_reason=stop instead of finish_reason=tool_calls.

Three-phase fix in DeepSeekV3ReasoningWithThinkingParser:

Phase 1 — complete <tool_call>…</tool_call> blocks inside reasoning promoted verbatim
Phase 2 — body-without-open-tag (token id 154829 stripped by vLLM) reconstructed by prepending <tool_call>
Phase 3a/b — closing fragment appended to unclosed content call, or promoted standalone

19/19 unit tests pass. Live test enforces 8k-token gate. Confirmed PASS at 10,431 prompt tokens.

Analogous to #39055 (Qwen3) but requires additional phases because GLM's <tool_call> open token is a special token stripped from text output.

Ref: AIVLLM-229

…AIVLLM-229)

claude

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

gemini-code-assist

Code Review

This pull request introduces a three-phase fix for the GLM-4.5 model to handle tool-call XML blocks that are incorrectly emitted inside reasoning tags. It includes a new _split_embedded_tool_calls method to promote these blocks to the content field and comprehensive unit tests for various edge cases. Feedback highlights a regression where essential properties were removed from the parser and a potential issue with a regular expression being too greedy, which could lead to the accidental removal of reasoning prose.

gemini-code-assist · 2026-04-24T07:18:49Z

@@ -40,14 +76,6 @@ def __init__(self, tokenizer: PreTrainedTokenizerBase, *args, **kwargs):
        else:
            self._parser = IdentityReasoningParser(tokenizer, *args, **kwargs)

-    @property
-    def reasoning_start_str(self) -> str | None:
-        return self._parser.reasoning_start_str
-
-    @property
-    def reasoning_end_str(self) -> str | None:
-        return self._parser.reasoning_end_str


The removal of the reasoning_start_str and reasoning_end_str properties from DeepSeekV3ReasoningParser is a regression. These properties are essential for delegating to the underlying parser (e.g., DeepSeekR1ReasoningParser), which defines the actual delimiters like <think> and </think>. Without these, the parser will return None for these delimiters, which can break reasoning block detection in the engine or API layers that rely on them.

gemini-code-assist · 2026-04-24T07:18:49Z

+_BODY_WITHOUT_OPEN_RE = re.compile(
+    r"^(?:[\w.-]+\n?)?"           # optional function name (no spaces, optional newline)
+    r"(?:<arg_key>.*?</arg_key>"  # at least one complete key-value pair
+    r"<arg_value>.*?</arg_value>)*"
+    r"<arg_key>.*?</arg_key>"     # required: at least one <arg_key> must be present
+    r"<arg_value>.*?"             # value (may be truncated)
+    r"(?:</arg_value>)?"
+    r"</tool_call>"               # required closing tag
+    r".*$",
+    re.DOTALL,
+)


The regex _BODY_WITHOUT_OPEN_RE uses .*$ at the end, which greedily captures any text following the </tool_call> tag. If the model produces additional reasoning prose after the tool call body within the same block, that prose will be incorrectly moved to the content field and removed from reasoning_content. Changing this to \s*$ ensures that Phase 2 only triggers when the entire remaining string is a tool call body (optionally followed by whitespace), preventing accidental loss of reasoning prose.

Suggested change

_BODY_WITHOUT_OPEN_RE = re.compile(

r"^(?:[\w.-]+\n?)?" # optional function name (no spaces, optional newline)

r"(?:<arg_key>.*?</arg_key>" # at least one complete key-value pair

r"<arg_value>.*?</arg_value>)*"

r"<arg_key>.*?</arg_key>" # required: at least one <arg_key> must be present

r"<arg_value>.*?" # value (may be truncated)

r"(?:</arg_value>)?"

r"</tool_call>" # required closing tag

r".*$",

re.DOTALL,

)

_BODY_WITHOUT_OPEN_RE = re.compile(

r"^(?:[\w.-]+\n?)?" # optional function name (no spaces, optional newline)

r"(?:<arg_key>.*?</arg_key>"

r"<arg_value>.*?</arg_value>)*"

r"<arg_key>.*?</arg_key>" # required: at least one <arg_key> must be present

r"<arg_value>.*?" # value (may be truncated)

r"(?:</arg_value>)?"

r"</tool_call>"

r"\s*$",

re.DOTALL,

)

reasoning: fix tool-call routing bug in GLM-5.1-FP8 at long context (…

5c3d82e

…AIVLLM-229)

jatseng-ai requested review from aarnphm, bbrowning, chaunceyjiang and sfeng33 as code owners April 24, 2026 07:17

claude Bot reviewed Apr 24, 2026

View reviewed changes

mergify Bot added deepseek Related to DeepSeek models bug Something isn't working labels Apr 24, 2026

gemini-code-assist Bot reviewed Apr 24, 2026

View reviewed changes

jatseng-ai closed this Apr 24, 2026

jatseng-ai deleted the fix/glm45-tool-call-routing-aivllm-229 branch April 24, 2026 15:27

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

reasoning: fix tool-call routing bug in GLM-5.1-FP8 at long context (AIVLLM-229)#40782

reasoning: fix tool-call routing bug in GLM-5.1-FP8 at long context (AIVLLM-229)#40782
jatseng-ai wants to merge 1 commit into
vllm-project:mainfrom
jatseng-ai:fix/glm45-tool-call-routing-aivllm-229

jatseng-ai commented Apr 24, 2026

Uh oh!

claude Bot left a comment

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot Apr 24, 2026

Uh oh!

gemini-code-assist Bot Apr 24, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

jatseng-ai commented Apr 24, 2026

Summary

Uh oh!

claude Bot left a comment

Choose a reason for hiding this comment

Claude Code Review

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant