[BUGFIX] Parse or convert thinking chunks given content format by juliendenize · Pull Request #41822 · vllm-project/vllm

juliendenize · 2026-05-06T13:24:47Z

Purpose

This PR is similar to #41718 and is an alternative.

The previous handling of {"type": "thinking"} content blocks in parse_chat_messages unconditionally flattened them to {"type": "text"} dicts, losing the semantic distinction. This broke Jinja chat templates (e.g. Mistral models) that rely on rendering thinking blocks natively as {"type": "thinking", ...} dicts.

This PR fixes thinking block handling with a dual-path strategy based on content_format:

"openai" (the HF renderer, using Jinja templates that support the OpenAI format): Thinking blocks are preserved inline as {"type": "thinking", "thinking": ...} dicts so chat templates can render them natively.
"string" (all custom renderers, if I understand correctly): Thinking blocks are extracted and concatenated into the reasoning / reasoning_content fields on ConversationMessage, which is what the custom tokenizers expect.
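
The two paths can be sketched roughly as follows. This is a minimal illustration under assumptions, not the code in this PR: `split_thinking` is a hypothetical helper, the block shapes are taken from the description above, and the separator used when concatenating thinking text is a guess.

```python
from typing import Any, Optional


def split_thinking(
    content: list[dict[str, Any]], content_format: str
) -> tuple[list[dict[str, Any]], Optional[str]]:
    """Hypothetical helper: return (content blocks, extracted reasoning)."""
    if content_format == "openai":
        # Keep thinking blocks inline so a Jinja template can render them.
        return list(content), None

    # "string" format: pull thinking text out of the content list.
    kept: list[dict[str, Any]] = []
    thoughts: list[str] = []
    for block in content:
        if block.get("type") == "thinking":
            text = block.get("thinking")
            if text:  # skip None or empty thinking payloads
                thoughts.append(text)
        else:
            kept.append(block)

    # Assumption: plain concatenation with no separator.
    reasoning = "".join(thoughts) if thoughts else None
    return kept, reasoning
```

In this sketch a thinking block is always removed from the content list in the "string" path, even when its payload is empty or None, so downstream code only ever sees non-thinking blocks.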

Note:
The Mistral renderer was not affected by the bug because its tokenizer receives the raw messages sent by the user rather than the formatted content.

Additionally:

Adds a conflict guard that raises ValueError when a message contains both a top-level reasoning field and inline thinking content blocks.
Extracts a flatten_content_to_text() helper to fix the echo feature, which crashed when ConversationMessage.content was a list of dicts (as it now can be with preserved thinking blocks). The helper is applied consistently across the streaming, non-streaming, and batch serving paths. For the echo feature, thinking chunks from Mistral models are not kept, which I believe is expected given the feature's intent.
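
A rough sketch of both additions, for illustration only. `check_reasoning_conflict` is a hypothetical name, and the signature and body of `flatten_content_to_text` shown here are assumptions; only the helper's name and its purpose come from the description above.

```python
from typing import Any, Union

Content = Union[str, list[dict[str, Any]], None]


def check_reasoning_conflict(message: dict[str, Any]) -> None:
    """Hypothetical guard: reject a message that supplies reasoning twice."""
    content = message.get("content")
    has_inline = isinstance(content, list) and any(
        isinstance(block, dict) and block.get("type") == "thinking"
        for block in content
    )
    has_top_level = message.get("reasoning") is not None
    if has_inline and has_top_level:
        raise ValueError(
            "Message has both a top-level reasoning field and "
            "inline thinking content blocks."
        )


def flatten_content_to_text(content: Content) -> str:
    """Assumed behavior: keep only text blocks, drop thinking blocks."""
    if content is None:
        return ""
    if isinstance(content, str):
        return content
    # Assumption: text parts are joined with no separator.
    return "".join(
        block.get("text") or ""
        for block in content
        if isinstance(block, dict) and block.get("type") == "text"
    )
```

Returning a plain string for every input shape is what lets the echo path concatenate the previous message with the generated text without caring whether the content was a string or a list of dicts.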

Test Plan

Unit tests added

pytest tests/entrypoints/test_chat_utils.py::TestParseChatMessagesThinking -v

Tested Ministral with the HF format.

Test Result

Tests pass, and Ministral with the HF format now correctly receives thinking chunks, whereas previously it received text chunks.

Essential Elements of an Effective PR Description Checklist

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
The test plan, such as providing test command.
The test results, such as pasting the results comparison before and after, or e2e results
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.

Signed-off-by: juliendenize <julien.denize@mistral.ai>

claude

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

gemini-code-assist

Code Review

This pull request introduces specialized handling for 'thinking' content blocks in chat messages, allowing them to be preserved for Jinja templates or extracted into a reasoning field based on the content format. It also adds a flatten_content_to_text utility to standardize text extraction for echo functionality. Review feedback identified a potential TypeError when processing null thinking content and a logic error where empty thinking blocks were not correctly removed in 'string' format, which could lead to assertion failures.

[BUGFIX] Parse or convert thinking chunks given content format

65db272

Signed-off-by: juliendenize <julien.denize@mistral.ai>

juliendenize requested review from DarkLight1337, NickLucche, aarnphm, chaunceyjiang, robertgshaw2-redhat and russellb as code owners May 6, 2026 13:24

claude Bot reviewed May 6, 2026

mergify Bot added the frontend and bug labels May 6, 2026

gemini-code-assist Bot reviewed May 6, 2026

Comment thread vllm/entrypoints/chat_utils.py

Comment thread vllm/entrypoints/chat_utils.py

Apply comments

a6ea18c

Signed-off-by: juliendenize <julien.denize@mistral.ai>

juliendenize wants to merge 2 commits into vllm-project:main from juliendenize:juliendenize/fix_thinking_chunk
