[Bugfix]: Prevent reasoning_content leak by RohanDisa · Pull Request #32997 · vllm-project/vllm

RohanDisa · 2026-01-24T07:09:30Z

Purpose

Fix a bug where reasoning_content was incorrectly flushed into content in the final streamed chunk when finish_reason='tool_calls'.

Root Cause & Solution

Stream finalization did not clear content/reasoning buffers when finish_reason='tool_calls', allowing leftover reasoning_content (especially from speculative decoding) to leak.
Added guards to clear content/reasoning fields before final chunk creation and after tool call extraction.

Test Plan

Added regression test test_no_content_leak_when_finish_reason_tool_calls in tests/entrypoints/openai/test_chat_with_tool_reasoning.py.
The test simulates streaming with reasoning + tool_choice="auto" and asserts that the final chunk has no content/reasoning leaks while earlier reasoning is preserved.

Test Result

Regression test passes ✅
Final chunk with finish_reason="tool_calls" has content=null and no reasoning/reasoning_content

Fixes: #32921

…ool_calls This fixes a bug where reasoning_content was incorrectly flushed into the content field in the final streamed chunk when finish_reason='tool_calls'. The issue occurred when: - stream=true - OpenAI tool-call parser enabled - tool_choice='auto' - reasoning fields enabled (reasoning, reasoning_content) - speculative decoding enabled Per OpenAI's schema contract, when finish_reason='tool_calls', the response must only contain tool_calls and finish_reason, never content. Changes: 1. Add guard before final chunk creation to clear content/reasoning when finish_reason='tool_calls' 2. Add guards after tool call extraction in all paths (auto, required, named, harmony) to prevent content leakage during streaming 3. Ensure reasoning_content is never flushed into content when tool calls are present 4. Add test to verify no content leak when finish_reason=tool_calls Signed-off-by: RohanDisa <105740583+RohanDisa@users.noreply.github.com>

gemini-code-assist

Code Review

This pull request effectively addresses a bug where reasoning_content could leak into the content field during streaming with tool calls. The solution, which involves clearing content and reasoning fields at various points, is sound and is well-supported by a new regression test. My review includes a suggestion to ensure consistency in the fix across different code paths and a recommendation to refactor duplicated code to improve long-term maintainability.

vllm/entrypoints/openai/chat_completion/serving.py

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

^{Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.}

Comment @cursor review or bugbot run to trigger another review on this PR

vllm/entrypoints/openai/chat_completion/serving.py

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Rohan Disa <105740583+RohanDisa@users.noreply.github.com>

Signed-off-by: RohanDisa <105740583+RohanDisa@users.noreply.github.com>

chaunceyjiang

Per OpenAI spec, tool call deltas must not contain content or reasoning

Could you share the link? I couldn’t find any similar documentation or explanation.

chaunceyjiang · 2026-02-02T05:56:02Z

tests/entrypoints/openai/test_chat_with_tool_reasoning.py

+        temperature=0.0,
+        stream=True,
+        tool_choice="auto",
+        include_reasoning=True,


got an unexpected keyword argument 'include_reasoning'

chaunceyjiang · 2026-02-02T05:57:27Z

tests/entrypoints/openai/test_chat_with_tool_reasoning.py

+    )
+
+    # Verify tool_calls are present (the expected behavior)
+    assert delta.tool_calls is not None and len(delta.tool_calls) > 0, (


AssertionError: Final chunk with finish_reason='tool_calls' must have tool_calls

I ran this test on your branch and got the error shown above.

chaunceyjiang · 2026-02-02T05:58:27Z

vllm/entrypoints/openai/chat_completion/serving.py

+        """
+        Clear content and reasoning fields from a delta message.
+
+        Per OpenAI spec, tool call deltas must not contain content or reasoning


Per OpenAI spec, tool call deltas must not contain content or reasoning

Could you share the link? I couldn’t find any similar documentation or explanation.

bbrowning · 2026-02-11T19:20:30Z

I also haven't run across any place that states that reasoning or content should not be in the Chat Completion chunks when tool calls are involved. The spec at https://github.com/openai/openai-openapi/blob/498c71ddf6f1c45b983f972ccabca795da211a3e/openapi.yaml#L18416 doesn't show anything like this, for example. Do you have an example where this was causing problems?

think-in-universe · 2026-03-06T10:23:57Z

Hey guys, any ideas when this PR and can be merged and fix this issue: #32921

Suffering a lot from this issue recently.

RohanDisa requested review from DarkLight1337, NickLucche, aarnphm, chaunceyjiang and robertgshaw2-redhat as code owners January 24, 2026 07:09

mergify bot added frontend tool-calling bug Something isn't working labels Jan 24, 2026

github-project-automation bot added this to Tool Calling Jan 24, 2026

gemini-code-assist bot reviewed Jan 24, 2026

View reviewed changes

vllm/entrypoints/openai/chat_completion/serving.py Show resolved Hide resolved

vllm/entrypoints/openai/chat_completion/serving.py Outdated Show resolved Hide resolved

cursor bot reviewed Jan 24, 2026

View reviewed changes

vllm/entrypoints/openai/chat_completion/serving.py Show resolved Hide resolved

RohanDisa and others added 2 commits January 24, 2026 16:48

Update vllm/entrypoints/openai/chat_completion/serving.py

9d70b96

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Rohan Disa <105740583+RohanDisa@users.noreply.github.com>

Refactor: Extract _clear_non_tool_call_fields helper method

963fb08

Signed-off-by: RohanDisa <105740583+RohanDisa@users.noreply.github.com>

RohanDisa changed the title ~~Bugfix: Prevent reasoning_content leak~~ [Bugfix]: Prevent reasoning_content leak Jan 25, 2026

chaunceyjiang self-assigned this Jan 25, 2026

chaunceyjiang reviewed Feb 2, 2026

View reviewed changes

will-deines mentioned this pull request Mar 4, 2026

[Bugfix] Fix Harmony streaming cross-channel delta accumulation #36011

Open

5 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Bugfix]: Prevent reasoning_content leak #32997

[Bugfix]: Prevent reasoning_content leak #32997
RohanDisa wants to merge 3 commits intovllm-project:mainfrom
RohanDisa:fix/reasoning-content-leak-in-tool-calls-streaming

RohanDisa commented Jan 24, 2026 •

edited by github-actions bot

Loading

Uh oh!

gemini-code-assist bot left a comment

Uh oh!

Uh oh!

Uh oh!

cursor bot left a comment

Uh oh!

Uh oh!

chaunceyjiang left a comment

Uh oh!

chaunceyjiang Feb 2, 2026

Uh oh!

chaunceyjiang Feb 2, 2026

Uh oh!

chaunceyjiang Feb 2, 2026

Uh oh!

bbrowning commented Feb 11, 2026

Uh oh!

think-in-universe commented Mar 6, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Uh oh!

Conversation

RohanDisa commented Jan 24, 2026 • edited by github-actions bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Purpose

Root Cause & Solution

Test Plan

Test Result

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

Uh oh!

cursor bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

chaunceyjiang left a comment

Choose a reason for hiding this comment

Uh oh!

chaunceyjiang Feb 2, 2026

Choose a reason for hiding this comment

Uh oh!

chaunceyjiang Feb 2, 2026

Choose a reason for hiding this comment

Uh oh!

chaunceyjiang Feb 2, 2026

Choose a reason for hiding this comment

Uh oh!

bbrowning commented Feb 11, 2026

Uh oh!

think-in-universe commented Mar 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

RohanDisa commented Jan 24, 2026 •

edited by github-actions bot

Loading

think-in-universe commented Mar 6, 2026 •

edited

Loading