[Mistral Tokenizer] allow more leniency in apply_chat_template by juliendenize · Pull Request #41658 · vllm-project/vllm

juliendenize · 2026-05-04T20:10:33Z

Purpose

This PR upgrade mistral-common to 1.11.2. This allows to:

use 'reasoning' inside assistant message to support user sending back think trace to the model which improves performance for some of them (including Mistral Medium 3.5)
move Tool special handling inside mistral-common to reduce LOC inside vLLM

Test Plan

Tested on manually made requests + updated some unit tests in the lib

Test Result

All pass

Essential Elements of an Effective PR Description Checklist

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
The test plan, such as providing test command.
The test results, such as pasting the results comparison before and after, or e2e results
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.

Signed-off-by: juliendenize <julien.denize@mistral.ai>

claude

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

gemini-code-assist

Code Review

This pull request upgrades mistral_common to version 1.11.2 and refactors the Mistral tokenizer and tool parser to utilize native library methods for tool adaptation. The internal helper for preparing chat templates has been converted into a validation function, removing manual message and tool modification logic. Feedback identifies a high-severity issue where the add_generation_prompt parameter was omitted in the call to the underlying transformers tokenizer, which could lead to incorrect prompt formatting.

gemini-code-assist · 2026-05-04T20:15:57Z

+            messages, continue_final_message, add_generation_prompt
        )

        return self.transformers_tokenizer.apply_chat_template(


The add_generation_prompt argument is missing from the call to self.transformers_tokenizer.apply_chat_template. While mistral-common often infers the generation prompt from the conversation state, the transformers tokenizer wrapper (MistralCommonBackend) explicitly supports this parameter. Omitting it causes it to default to False, which may conflict with the user's intent and vLLM's default behavior (where add_generation_prompt defaults to True).

Suggested change

return self.transformers_tokenizer.apply_chat_template(

return self.transformers_tokenizer.apply_chat_template(

conversation=messages,

tools=tools,

add_generation_prompt=add_generation_prompt,

continue_final_message=continue_final_message,

tokenize=tokenize,

padding=padding,

truncation=truncation,

max_length=max_length,

return_tensors=None,

return_dict=False,

**version_kwargs,

)

it was already not the case. While I agree ultimately that we'd want to do that, we need to wait a bit that last Transformers version is released as I recently updated MistralCommonBackend.

Signed-off-by: juliendenize <julien.denize@mistral.ai>

joa-stdn

thanks a lot!

juliendenize · 2026-05-05T22:25:43Z

@DarkLight1337 i don't think the CI errors are related ! If you agree could you consider merging 🙏 ?

…project#41658) Signed-off-by: juliendenize <julien.denize@mistral.ai> Co-authored-by: Roger Wang <hey@rogerw.io>

…project#41658) Signed-off-by: juliendenize <julien.denize@mistral.ai> Co-authored-by: Roger Wang <hey@rogerw.io> Signed-off-by: Mehdi Ghanimifard <mehdi.ghanimifard@amd.com>

…project#41658) Signed-off-by: juliendenize <julien.denize@mistral.ai> Co-authored-by: Roger Wang <hey@rogerw.io> Co-authored-by: hongbolv <33214277+hongbolv@users.noreply.github.com>

…project#41658) Signed-off-by: juliendenize <julien.denize@mistral.ai> Co-authored-by: Roger Wang <hey@rogerw.io> Signed-off-by: Ifta Khairul Alam Adil <ikaadil007@gmail.com>

…project#41658) Signed-off-by: juliendenize <julien.denize@mistral.ai> Co-authored-by: Roger Wang <hey@rogerw.io> Signed-off-by: Libin Tang <libin.tang@intel.com>

[Mistral Tokenizer] Support 'reasoning' field in message.

0e55cc1

Signed-off-by: juliendenize <julien.denize@mistral.ai>

juliendenize requested review from aarnphm, bbrowning, chaunceyjiang, patrickvonplaten and sfeng33 as code owners May 4, 2026 20:10

claude Bot reviewed May 4, 2026

View reviewed changes

mergify Bot added ci/build mistral Related to Mistral models nvidia tool-calling labels May 4, 2026

github-project-automation Bot added this to NVIDIA and Tool Calling May 4, 2026

gemini-code-assist Bot reviewed May 4, 2026

View reviewed changes

joa-stdn reviewed May 4, 2026

View reviewed changes

Comment thread tests/tokenizers_/test_mistral.py

juliendenize and others added 3 commits May 4, 2026 22:38

Merge branch 'main' into juliendenize/patch_mistral_tokenizer

d9e0336

Add tests

6a8b73b

Signed-off-by: juliendenize <julien.denize@mistral.ai>

merge asserts

63d76c7

Signed-off-by: juliendenize <julien.denize@mistral.ai>

joa-stdn approved these changes May 4, 2026

View reviewed changes

mgoin added the ready ONLY add when PR is ready to merge/full CI is needed label May 4, 2026

DarkLight1337 approved these changes May 5, 2026

View reviewed changes

github-project-automation Bot moved this to Ready in NVIDIA May 5, 2026

DarkLight1337 enabled auto-merge (squash) May 5, 2026 02:36

Merge branch 'main' into juliendenize/patch_mistral_tokenizer

3af6dbd

juliendenize mentioned this pull request May 5, 2026

Fix assistant thinking block normalization #41718

Open

juliendenize added 2 commits May 5, 2026 14:45

Merge branch 'main' into juliendenize/patch_mistral_tokenizer

5de267c

Merge branch 'main' into juliendenize/patch_mistral_tokenizer

019ae48

vllm-bot merged commit 16e3364 into vllm-project:main May 6, 2026
148 of 151 checks passed

github-project-automation Bot moved this to Done in Tool Calling May 6, 2026

github-project-automation Bot moved this from Ready to Done in NVIDIA May 6, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Mistral Tokenizer] allow more leniency in apply_chat_template#41658

[Mistral Tokenizer] allow more leniency in apply_chat_template#41658
vllm-bot merged 7 commits intovllm-project:mainfrom
juliendenize:juliendenize/patch_mistral_tokenizer

juliendenize commented May 4, 2026 •

edited by github-actions Bot

Loading

Uh oh!

claude Bot left a comment

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot May 4, 2026

Uh oh!

juliendenize May 4, 2026

Uh oh!

Uh oh!

joa-stdn left a comment

Uh oh!

juliendenize commented May 5, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

-        return self.transformers_tokenizer.apply_chat_template(
+        return self.transformers_tokenizer.apply_chat_template(
+            conversation=messages,
+            tools=tools,
+            add_generation_prompt=add_generation_prompt,
+            continue_final_message=continue_final_message,
+            tokenize=tokenize,
+            padding=padding,
+            truncation=truncation,
+            max_length=max_length,
+            return_tensors=None,
+            return_dict=False,
+            **version_kwargs,
+        )

Uh oh!

Conversation

juliendenize commented May 4, 2026 • edited by github-actions Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Purpose

Test Plan

Test Result

Uh oh!

claude Bot left a comment

Choose a reason for hiding this comment

Claude Code Review

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot May 4, 2026

Choose a reason for hiding this comment

Uh oh!

juliendenize May 4, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

joa-stdn left a comment

Choose a reason for hiding this comment

Uh oh!

juliendenize commented May 5, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

juliendenize commented May 4, 2026 •

edited by github-actions Bot

Loading