[Bugfix] Fix streaming boundary delta losing content and role fields in DelegatingParser #43649

Closed
QingZhou-YangHY wants to merge 1 commit into vllm-project:main from QingZhou-YangHY:fix/streaming-reasoning-tokens-truncated

Conversation

@QingZhou-YangHY

Purpose

Fixes #43221

When a streaming delta spans the reasoning/tool-call boundary (e.g., contains both </think> and <tool_call> in the same chunk), Qwen3ReasoningParser.extract_reasoning_streaming may return a DeltaMessage with both reasoning and content fields set.

Previously in DelegatingParser.parse_delta, only reasoning was saved before _extract_tool_calls_streaming overwrote delta_message. The content field (e.g., the <tool_call> prefix text) and role field were silently discarded, causing streaming reasoning tokens to be truncated.
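The failure mode described above can be sketched without vLLM. This is a minimal illustration, not the project's code: `Delta`, `fake_reasoning_parser`, `fake_tool_parser`, and `parse_delta_before_fix` are hypothetical stand-ins, and the assumption that the tool parser returns an empty delta on the boundary chunk is made only for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Delta:
    """Hypothetical stand-in for DeltaMessage (only the fields discussed here)."""
    role: Optional[str] = None
    reasoning: Optional[str] = None
    content: Optional[str] = None


def fake_reasoning_parser(chunk: str) -> Delta:
    # Split one chunk at '</think>': text before it is reasoning,
    # text after it is ordinary content.
    before, _, after = chunk.partition("</think>")
    return Delta(role="assistant", reasoning=before or None, content=after or None)


def fake_tool_parser(chunk: str) -> Delta:
    # Assumption for this sketch: the tool parser is still buffering
    # and returns a delta with no fields set.
    return Delta()


def parse_delta_before_fix(chunk: str) -> Delta:
    delta = fake_reasoning_parser(chunk)
    saved_reasoning = delta.reasoning   # only reasoning is saved
    delta = fake_tool_parser(chunk)     # overwrites content and role
    if saved_reasoning:
        delta.reasoning = saved_reasoning
    return delta


result = parse_delta_before_fix("think about this</think><tool_call>")
print(result.reasoning, result.content, result.role)
```

In this sketch the reasoning text survives, while the `<tool_call>` prefix and the role set by the reasoning parser are dropped.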

Changes

  • In vllm/parser/abstract_parser.py, DelegatingParser.parse_delta: save all three fields (reasoning, content, role) from the boundary delta_message before the tool parser overwrites it.
  • After the tool parser returns, merge the saved fields back:
    • If the tool parser also returns content, the two strings are concatenated in order (saved_content + tool_content).
    • reasoning and role are restored unconditionally if they were set.
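The merge rules listed above could look roughly like the following. It is a sketch under stated assumptions: `Delta` is a hypothetical stand-in for `DeltaMessage`, and `merge_saved_fields` is an illustrative helper name that does not appear in the PR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Delta:
    """Hypothetical stand-in for DeltaMessage (only the fields discussed here)."""
    role: Optional[str] = None
    reasoning: Optional[str] = None
    content: Optional[str] = None


def merge_saved_fields(saved: Optional[Delta], tool_delta: Delta) -> Delta:
    """Merge fields saved from the boundary delta back into the tool parser's delta."""
    if saved is None:
        return tool_delta
    if saved.content:
        # Saved content comes first so the text stays in stream order.
        tool_delta.content = saved.content + (tool_delta.content or "")
    if saved.reasoning:
        tool_delta.reasoning = saved.reasoning
    if saved.role:
        tool_delta.role = saved.role
    return tool_delta


boundary = Delta(role="assistant", reasoning="think about this", content="<tool_call>")
merged = merge_saved_fields(boundary, Delta(content="{"))
print(merged.content)
```

Putting the saved content first matters because the boundary text was produced earlier in the stream than anything the tool parser emits for the same chunk.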

Test Plan

Manually verified with a mock test that reproduces the bug:

  • Before fix: result.content is None (the <tool_call> prefix is lost)
  • After fix: result.content == "<tool_call>" and result.reasoning == "think about this" are both preserved

The existing tests/parser/test_streaming.py suite continues to pass.

When a delta spans the reasoning/tool-call boundary (e.g., contains
both '</think>' and '<tool_call>' in the same chunk), the reasoning
parser may return a DeltaMessage with both 'reasoning' and 'content'
fields set.

Previously, only 'reasoning' was saved before _extract_tool_calls_streaming
overwrote delta_message. The 'content' (e.g., the '<tool_call>' prefix
text) and 'role' fields were silently discarded.

This fix saves all three fields (reasoning, content, role) before
the tool parser overwrites delta_message, and merges them back
afterwards. If the tool parser also returns content, the two content
strings are concatenated in order.

Fixes: vllm-project#43221
Signed-off-by: QingZhou-YangHY <3868850350@qq.com>

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request updates parse_delta in vllm/parser/abstract_parser.py to save and restore content and role attributes in addition to reasoning before they are potentially overwritten by the tool parser. Feedback focuses on removing unnecessary getattr calls in favor of direct attribute access, as delta_message is a statically typed Pydantic model where these attributes are guaranteed to exist.

Comment on lines +711 to +717
    saved_reasoning = (
        getattr(delta_message, "reasoning", None) if delta_message else None
    )
    saved_content = (
        getattr(delta_message, "content", None) if delta_message else None
    )
    saved_role = getattr(delta_message, "role", None) if delta_message else None
high

Since delta_message is statically typed as DeltaMessage | None (which is a Pydantic model), its attributes reasoning, content, and role are guaranteed to exist when it is not None. Using getattr is unnecessary, bypasses static type checking (e.g., mypy/pyright), and is inconsistent with how attributes are accessed elsewhere in this file (e.g., delta_message.tool_calls). Direct attribute access is cleaner and more idiomatic.

Suggested change
    - saved_reasoning = (
    -     getattr(delta_message, "reasoning", None) if delta_message else None
    - )
    - saved_content = (
    -     getattr(delta_message, "content", None) if delta_message else None
    - )
    - saved_role = getattr(delta_message, "role", None) if delta_message else None
    + saved_reasoning = delta_message.reasoning if delta_message else None
    + saved_content = delta_message.content if delta_message else None
    + saved_role = delta_message.role if delta_message else None
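The reviewer's point about `getattr` can be illustrated with a small sketch. `Delta` here is a hypothetical dataclass standing in for the Pydantic `DeltaMessage`, and the misspelled field name is deliberate.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Delta:
    """Hypothetical stand-in for DeltaMessage (only the fields discussed here)."""
    role: Optional[str] = None
    reasoning: Optional[str] = None
    content: Optional[str] = None


delta: Optional[Delta] = Delta(reasoning="think about this")

# With a default, getattr hides a misspelled field name: it returns None.
masked = getattr(delta, "reasonin", None)

# Direct access with the same typo fails loudly at runtime, and a static
# type checker would normally flag it before the code runs.
try:
    delta.reasonin  # type: ignore[attr-defined]
    typo_detected = False
except AttributeError:
    typo_detected = True

# The suggested form: direct access guarded by the None check.
saved_reasoning = delta.reasoning if delta else None
print(masked, typo_detected, saved_reasoning)
```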

Comment on lines +739 to +743
    current_content = getattr(delta_message, "content", None)
    if current_content:
        delta_message.content = saved_content + current_content
    else:
        delta_message.content = saved_content
high

Since delta_message is guaranteed to be an instance of DeltaMessage at this point, we can access delta_message.content directly instead of using getattr.

Suggested change
    - current_content = getattr(delta_message, "content", None)
    - if current_content:
    -     delta_message.content = saved_content + current_content
    - else:
    -     delta_message.content = saved_content
    + if delta_message.content:
    +     delta_message.content = saved_content + delta_message.content
    + else:
    +     delta_message.content = saved_content

@mergify mergify Bot added the bug Something isn't working label May 26, 2026
@mergify

mergify Bot commented May 26, 2026

Hi @QingZhou-YangHY, the pre-commit checks have failed. Please run:

uv pip install "pre-commit>=4.5.1"
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy failing?
mypy is run differently in CI. If the failure is related to this check, please use the following command to run it locally:
# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10

@QingZhou-YangHY

Closing this PR as it is a duplicate of #42691, which was merged on May 23, 2026 and already fixes the same root cause (reasoning/content/role fields being discarded when the tool parser overwrites delta_message on a boundary delta). Apologies for the noise!

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Streaming reasoning tokens truncated when </think> and <tool_call> appear in the same delta

1 participant