[Bugfix] Fix streaming boundary delta losing content and role fields in DelegatingParser #43649
When a delta spans the reasoning/tool-call boundary (e.g., contains both '</think>' and '<tool_call>' in the same chunk), the reasoning parser may return a DeltaMessage with both 'reasoning' and 'content' fields set. Previously, only 'reasoning' was saved before _extract_tool_calls_streaming overwrote delta_message. The 'content' (e.g., the '<tool_call>' prefix text) and 'role' fields were silently discarded.

This fix saves all three fields (reasoning, content, role) before the tool parser overwrites delta_message, and merges them back afterwards. If the tool parser also returns content, the two content strings are concatenated in order.

Fixes: vllm-project#43221
Signed-off-by: QingZhou-YangHY <3868850350@qq.com>
Code Review
This pull request updates parse_delta in vllm/parser/abstract_parser.py to save and restore content and role attributes in addition to reasoning before they are potentially overwritten by the tool parser. Feedback focuses on removing unnecessary getattr calls in favor of direct attribute access, as delta_message is a statically typed Pydantic model where these attributes are guaranteed to exist.
    saved_reasoning = (
        getattr(delta_message, "reasoning", None) if delta_message else None
    )
    saved_content = (
        getattr(delta_message, "content", None) if delta_message else None
    )
    saved_role = getattr(delta_message, "role", None) if delta_message else None
Since delta_message is statically typed as DeltaMessage | None (which is a Pydantic model), its attributes reasoning, content, and role are guaranteed to exist when it is not None. Using getattr is unnecessary, bypasses static type checking (e.g., mypy/pyright), and is inconsistent with how attributes are accessed elsewhere in this file (e.g., delta_message.tool_calls). Direct attribute access is cleaner and more idiomatic.
Suggested change:

    - saved_reasoning = (
    -     getattr(delta_message, "reasoning", None) if delta_message else None
    - )
    - saved_content = (
    -     getattr(delta_message, "content", None) if delta_message else None
    - )
    - saved_role = getattr(delta_message, "role", None) if delta_message else None
    + saved_reasoning = delta_message.reasoning if delta_message else None
    + saved_content = delta_message.content if delta_message else None
    + saved_role = delta_message.role if delta_message else None
    current_content = getattr(delta_message, "content", None)
    if current_content:
        delta_message.content = saved_content + current_content
    else:
        delta_message.content = saved_content
Since delta_message is guaranteed to be an instance of DeltaMessage at this point, we can access delta_message.content directly instead of using getattr.
Suggested change:

    - current_content = getattr(delta_message, "content", None)
    - if current_content:
    -     delta_message.content = saved_content + current_content
    - else:
    -     delta_message.content = saved_content
    + if delta_message.content:
    +     delta_message.content = saved_content + delta_message.content
    + else:
    +     delta_message.content = saved_content
Hi @QingZhou-YangHY, the pre-commit checks have failed. Please run:

    uv pip install "pre-commit>=4.5.1"
    pre-commit install
    pre-commit run --all-files

Then, commit the changes and push to your branch.
Closing this PR as it is a duplicate of #42691, which was merged on May 23, 2026 and already fixes the same root cause (reasoning/content/role fields being discarded when the tool parser overwrites |
Purpose
Fixes #43221
When a streaming delta spans the reasoning/tool-call boundary (e.g., contains both `</think>` and `<tool_call>` in the same chunk), `Qwen3ReasoningParser.extract_reasoning_streaming` may return a `DeltaMessage` with both `reasoning` and `content` fields set.
Previously in `DelegatingParser.parse_delta`, only `reasoning` was saved before `_extract_tool_calls_streaming` overwrote `delta_message`. The `content` field (e.g., the `<tool_call>` prefix text) and `role` field were silently discarded, causing streaming reasoning tokens to be truncated.
Changes
- `vllm/parser/abstract_parser.py`, `DelegatingParser.parse_delta`: save all three fields (`reasoning`, `content`, `role`) from the boundary `delta_message` before the tool parser overwrites it.
- If the tool parser also returns `content`, the two strings are concatenated in order (`saved_content + tool_content`).
- `reasoning` and `role` are restored unconditionally if they were set.
Test Plan
Manually verified with a mock test that reproduces the bug:
- Before the fix: `result.content` is `None` (the `<tool_call>` prefix is lost).
- After the fix: `result.content == "<tool_call>"` and `result.reasoning == "think about this"` are both preserved.
The existing `tests/parser/test_streaming.py` suite continues to pass.
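The mock test itself is not shown in this description. As a rough illustration only, the save-and-merge behavior described above could be sketched as follows. The `DeltaMessage` dataclass here is a simplified stand-in for vLLM's Pydantic model, and `merge_boundary_delta` is a hypothetical helper name, not a function in `vllm/parser/abstract_parser.py`. Creating a fresh message when the tool parser returns `None` is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DeltaMessage:
    """Simplified stand-in for vLLM's DeltaMessage (assumption: three fields only)."""

    role: Optional[str] = None
    content: Optional[str] = None
    reasoning: Optional[str] = None


def merge_boundary_delta(
    boundary: Optional[DeltaMessage],
    tool_delta: Optional[DeltaMessage],
) -> Optional[DeltaMessage]:
    """Hypothetical helper: fold the fields saved from the reasoning parser's
    boundary delta back into the delta returned by the tool parser."""
    if boundary is None:
        return tool_delta
    # Nothing was saved, so there is nothing to restore.
    if not (boundary.reasoning or boundary.content or boundary.role):
        return tool_delta
    # Assumption: if the tool parser produced no delta, start from an empty one.
    if tool_delta is None:
        tool_delta = DeltaMessage()
    if boundary.reasoning:
        tool_delta.reasoning = boundary.reasoning
    if boundary.content:
        # Saved content comes first, then whatever the tool parser emitted.
        tool_delta.content = boundary.content + (tool_delta.content or "")
    if boundary.role:
        tool_delta.role = boundary.role
    return tool_delta


# Reproduces the scenario from the test plan: a boundary delta carrying both
# reasoning and the "<tool_call>" prefix, and a tool delta with no content.
boundary = DeltaMessage(
    role="assistant", reasoning="think about this", content="<tool_call>"
)
result = merge_boundary_delta(boundary, DeltaMessage())
assert result.content == "<tool_call>"
assert result.reasoning == "think about this"
```

Keeping the merge in one small function makes the concatenation order (`saved_content` first) easy to check in isolation from the streaming state.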