[BUGFIX] Parse or convert thinking chunks given content format#41822
Open
juliendenize wants to merge 2 commits into vllm-project:main from
Signed-off-by: juliendenize <julien.denize@mistral.ai>
Code Review
This pull request introduces specialized handling for 'thinking' content blocks in chat messages, allowing them to be preserved for Jinja templates or extracted into a reasoning field based on the content format. It also adds a flatten_content_to_text utility to standardize text extraction for echo functionality. Review feedback identified a potential TypeError when processing null thinking content and a logic error where empty thinking blocks were not correctly removed in 'string' format, which could lead to assertion failures.
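As an illustration of the text-extraction idea in the summary above, here is a minimal sketch of a flattening helper. It is not the PR's implementation: the body, the newline separator, and the choice to skip non-text parts are all assumptions made for this example, and only the name `flatten_content_to_text` comes from the review summary.

```python
from typing import Any, Optional, Union

# Assumed shape: content is a plain string, a list of typed part dicts, or None.
Content = Optional[Union[str, list[dict[str, Any]]]]


def flatten_content_to_text(content: Content) -> str:
    """Hypothetical sketch: collapse message content into one string."""
    if content is None:
        return ""
    if isinstance(content, str):
        return content

    pieces: list[str] = []
    for part in content:
        # Assumption: only {"type": "text"} parts contribute to the echoed text.
        if part.get("type") != "text":
            continue
        text = part.get("text")
        # Guard against None or empty values so the join below cannot raise TypeError.
        if text:
            pieces.append(text)
    return "\n".join(pieces)
```

The explicit `None` guard mirrors the review concern about null content raising a `TypeError` when values are joined.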
Purpose
This PR is similar to #41718 and is an alternative.
The previous handling of {"type": "thinking"} content blocks in parse_chat_messages unconditionally flattened them to {"type": "text"} dicts, losing the semantic distinction. This broke Jinja chat templates (e.g. Mistral models) that rely on rendering thinking blocks natively as {"type": "thinking", ...} dicts.
This PR fixes thinking block handling with a dual-path strategy based on content_format:
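A rough sketch of how such a dual-path split could look, based only on the behavior described in the review summary (preserve thinking blocks for the template, or extract them into a reasoning value). The function name `handle_thinking`, the format values `"openai"` and `"string"`, the `"thinking"` key holding a plain string, and the newline separator are assumptions for illustration, not the code in this PR.

```python
from typing import Any, Optional

Part = dict[str, Any]


def handle_thinking(
    parts: list[Part], content_format: str
) -> tuple[Any, Optional[str]]:
    """Hypothetical sketch: return (content, reasoning) for one message."""
    if content_format == "openai":
        # Structured path: keep every part, including thinking blocks, as a dict
        # so a Jinja chat template can render {"type": "thinking", ...} natively.
        return list(parts), None

    if content_format != "string":
        raise ValueError(f"unsupported content_format: {content_format!r}")

    # String path: separate thinking text from regular text.
    texts: list[str] = []
    reasoning: list[str] = []
    for part in parts:
        kind = part.get("type")
        if kind == "thinking":
            value = part.get("thinking")
            # Drop null or empty thinking blocks instead of joining None.
            if value:
                reasoning.append(value)
        elif kind == "text":
            texts.append(part.get("text") or "")

    return "\n".join(texts), ("\n".join(reasoning) or None)
```

With `"string"`, a message containing one thinking block and one text block yields the text as content and the thinking text as a separate reasoning value; with `"openai"`, the parts are passed through unchanged.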
Note:
The Mistral renderer was not affected by the bug because its tokenizer receives the raw messages sent by the user rather than the formatted content.
Additionally:
Test Plan
Unit tests added
Test Result
Tests pass, and Ministral with the hf format now correctly receives thinking chunks, whereas previously it received text chunks.