[BUGFIX] Parse or convert thinking chunks given content format #41822

Open
juliendenize wants to merge 2 commits into vllm-project:main from juliendenize:juliendenize/fix_thinking_chunk


@juliendenize juliendenize commented May 6, 2026

Purpose

This PR is similar to #41718 and is offered as an alternative to it.

The previous handling of {"type": "thinking"} content blocks in parse_chat_messages unconditionally flattened them to {"type": "text"} dicts, losing the semantic distinction. This broke Jinja chat templates (e.g. Mistral models) that rely on rendering thinking blocks natively as {"type": "thinking", ...} dicts.

This PR fixes thinking block handling with a dual-path strategy based on content_format:

  • "openai" (hf renderer using Jinja templates supporting openai format): Thinking blocks are preserved inline as {"type": "thinking", "thinking": ...} dicts so chat templates can render them natively.
  • "string" (all custom renderers IIUC): Thinking blocks are extracted and concatenated into the reasoning / reasoning_content fields on ConversationMessage, which is what the custom tokenizers expect.
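
The two paths above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function name `handle_thinking_sketch`, its return shape, and the separators used when joining blocks are all assumptions made for the example.

```python
from typing import Any, Optional, Union


def handle_thinking_sketch(
    content: list[dict[str, Any]], content_format: str
) -> tuple[Union[str, list[dict[str, Any]]], Optional[str]]:
    """Return (content, reasoning) for one message.

    Hypothetical helper for illustration; not the function used in vLLM.
    """
    if content_format == "openai":
        # Keep thinking blocks inline so a Jinja chat template can
        # render {"type": "thinking", ...} dicts natively.
        return content, None

    if content_format != "string":
        raise ValueError(f"unknown content_format: {content_format!r}")

    # "string": pull thinking blocks out and concatenate them into a
    # separate reasoning value; only the text blocks stay in the content.
    thinking = [
        b.get("thinking") or "" for b in content if b.get("type") == "thinking"
    ]
    texts = [b.get("text") or "" for b in content if b.get("type") == "text"]

    # The separators below are assumptions, not taken from the PR.
    reasoning = "".join(thinking) if thinking else None
    return "\n".join(texts), reasoning
```

With `content_format="openai"` the input list is returned untouched; with `"string"` the caller gets a plain string plus a separate reasoning value that could be stored in the `reasoning` / `reasoning_content` fields.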

Note:
The Mistral renderer was not affected by the bug because its tokenizer receives the raw messages sent by the user rather than the formatted content.

Additionally:

  • Adds a conflict guard that raises ValueError when a message contains both a top-level reasoning field and inline thinking content blocks.
  • Extracts a flatten_content_to_text() helper to fix the echo feature, which crashed when ConversationMessage.content was a list of dicts (as it now can be with preserved thinking blocks). The helper is applied consistently across the streaming, non-streaming, and batch serving paths. For the echo feature, thinking chunks from Mistral models are not kept, which IMO is expected if I understand the feature's intent correctly.
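
For illustration, the two additions above might look something like the sketch below. The `_sketch` function names, the error message, and the choice to join text blocks with an empty string are assumptions for this example; the actual `flatten_content_to_text()` in the PR may differ.

```python
from typing import Any, Union

Content = Union[str, list[dict[str, Any]], None]


def check_reasoning_conflict_sketch(message: dict[str, Any]) -> None:
    """Reject messages that carry reasoning in two places at once."""
    content = message.get("content")
    has_inline_thinking = isinstance(content, list) and any(
        isinstance(block, dict) and block.get("type") == "thinking"
        for block in content
    )
    if message.get("reasoning") is not None and has_inline_thinking:
        # Hypothetical wording; the PR only states that ValueError is raised.
        raise ValueError(
            "Message has both a top-level 'reasoning' field and inline "
            "thinking content blocks."
        )


def flatten_content_to_text_sketch(content: Content) -> str:
    """Reduce message content to plain text for the echo feature."""
    if content is None:
        return ""
    if isinstance(content, str):
        return content
    # Content is a list of dicts: keep text blocks only, so preserved
    # thinking blocks are dropped from the echoed text.
    return "".join(
        block.get("text") or "" for block in content if block.get("type") == "text"
    )
```

Dropping thinking blocks during flattening matches the behaviour described above, where thinking chunks are not kept in the echoed output.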

Test Plan

Unit tests added

pytest tests/entrypoints/test_chat_utils.py::TestParseChatMessagesThinking -v
  • tested ministral with hf format

Test Result

Tests pass, and ministral with hf format now correctly receives thinking chunks, whereas previously it received text chunks.


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.

Signed-off-by: juliendenize <julien.denize@mistral.ai>

@claude claude Bot left a comment

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@mergify mergify Bot added the frontend and bug (Something isn't working) labels on May 6, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces specialized handling for 'thinking' content blocks in chat messages, allowing them to be preserved for Jinja templates or extracted into a reasoning field based on the content format. It also adds a flatten_content_to_text utility to standardize text extraction for echo functionality. Review feedback identified a potential TypeError when processing null thinking content and a logic error where empty thinking blocks were not correctly removed in 'string' format, which could lead to assertion failures.
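
As a rough sketch of the two issues raised in this review, the extraction for the 'string' format could guard against a null `thinking` value and always drop thinking blocks from the remaining content, even when they are empty. The function name `extract_reasoning_sketch` and its return shape are hypothetical and not taken from the PR or the review threads.

```python
from typing import Any, Optional


def extract_reasoning_sketch(
    content: list[dict[str, Any]],
) -> tuple[list[dict[str, Any]], Optional[str]]:
    """Split content into (non-thinking blocks, concatenated reasoning)."""
    remaining: list[dict[str, Any]] = []
    parts: list[str] = []
    for block in content:
        if block.get("type") == "thinking":
            thinking = block.get("thinking")
            # Skip None and "" so that joining never sees a non-string.
            if thinking:
                parts.append(thinking)
            # Always drop the thinking block, even when it is empty,
            # so later code only sees non-thinking blocks.
            continue
        remaining.append(block)
    return remaining, ("".join(parts) if parts else None)
```

Returning `None` when no non-empty thinking text was found keeps "no reasoning" distinguishable from an empty string, which is a design choice of this sketch rather than something the review specifies.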

Comment thread vllm/entrypoints/chat_utils.py
Comment thread vllm/entrypoints/chat_utils.py
Signed-off-by: juliendenize <julien.denize@mistral.ai>