Skip to content

[mistral_common] Add v11 tokenizer#19193

Merged
njhill merged 9 commits intovllm-project:mainfrom
patrickvonplaten:add_v11_mistral_common
Jun 5, 2025
Merged

[mistral_common] Add v11 tokenizer#19193
njhill merged 9 commits intovllm-project:mainfrom
patrickvonplaten:add_v11_mistral_common

Conversation

@patrickvonplaten
Copy link
Collaborator

@patrickvonplaten patrickvonplaten commented Jun 5, 2025

Support of new mistral_common v11 tokenizer

@patrickvonplaten patrickvonplaten marked this pull request as draft June 5, 2025 08:37
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @patrickvonplaten, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

Summary of Changes

Hello team, Gemini here to provide a summary of this pull request. Based on the title and the code changes, this PR aims to add support for the Mistral v11 tokenizer, specifically addressing how tool calls are handled and parsed when using this new version. The PR description is currently empty, so the intent has been derived solely from the code changes and title.

Highlights

  • Mistral v11 Tokenizer Support: Adds the necessary logic to recognize and handle the Mistral v11 tokenizer.
  • Updated Tool Call Parsing: Modifies the tool call parsing logic in the OpenAI entrypoint to accommodate a potentially different format used by the v11 tokenizer, which seems to separate the function name from the arguments JSON.
  • Tokenizer Version Extraction: Introduces logic to extract the numerical version from the Mistral tokenizer to enable version-specific handling.

Changelog

  • vllm/entrypoints/openai/tool_parsers/mistral_tool_parser.py
    • Added conditional logic in __init__ to compile a specific regex (fn_name_regex) for parsing tool calls if the tokenizer is Mistral v11 or newer.
    • Updated extract_tool_calls to use the new fn_name_regex for v11 tokenizers, extracting the function name and parsing the arguments JSON separately, while retaining the old parsing method for older versions.
  • vllm/transformers_utils/tokenizers/mistral.py
    • Added code in __init__ to parse the tokenizer version string and store the numerical version in self.version.
    • Added a print(request) statement in apply_chat_template (likely for debugging).
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in issue comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.


A new tokenizer,
A different tool call format,
Parse it just right.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

The pull request adds support for the v11 tokenizer in the mistral_common library. The changes involve updating the MistralToolParser to handle the new tokenizer version and modifying the MistralTokenizer class to extract the version number. The code seems well-structured and addresses the intended functionality. However, there are a few areas that could be improved for clarity and robustness.

Summary of Findings

  • Version Extraction Robustness: The version extraction logic in MistralTokenizer could be made more robust by adding error handling for unexpected version string formats.
  • Clarity of Tool Parser Logic: The logic for handling function names in MistralToolParser could benefit from additional comments explaining the encoding approach.
  • Remove print statement: The print statement in apply_chat_template should be removed before merging.

Merge Readiness

The pull request introduces a valuable feature by adding support for the v11 tokenizer. While the code is generally well-structured, addressing the comments regarding error handling and code clarity would further improve its quality. I recommend addressing the comments before merging. I am unable to directly approve this pull request, and users should have others review and approve this code before merging.

Comment on lines 73 to 83
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Consider extracting this version check into a separate helper function for better readability and maintainability. This would also allow for easier testing of this specific logic.

def _is_tokenizer_version_supported(model_tokenizer: Any) -> bool:
    return isinstance(model_tokenizer, MistralTokenizer) and model_tokenizer.version >= 11

if _is_tokenizer_version_supported(self.model_tokenizer):
    self.fn_name_regex = re.compile(r'([a-zA-Z0-9_-]+)(\{.*?\})', re.DOTALL)
else:
    self.fn_name_regex = None

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually pretty good idea

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Consider adding a try-except block to handle potential errors during version extraction. This would prevent the program from crashing if the version string is not in the expected format.

try:
    self.version: int = int(self.instruct.tokenizer.version.value.split("v")[-1].split("m")[0])
except Exception as e:
    logger.warning(f"Failed to extract tokenizer version: {e}")
    self.version = 0  # Or some default value

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm no we have tests in mistral_common that ensure that this always works

@github-actions
Copy link

github-actions bot commented Jun 5, 2025

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run fastcheck CI which starts running only a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on Buildkite UI (linked in the PR checks section) and unblock them. If you do not have permission to unblock, ping simon-mo or khluu to add you in our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

🚀

@patrickvonplaten patrickvonplaten marked this pull request as ready for review June 5, 2025 09:32
@patrickvonplaten patrickvonplaten force-pushed the add_v11_mistral_common branch from 90e0555 to 55ae025 Compare June 5, 2025 09:47
Copy link
Member

@mgoin mgoin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM thanks!

@mgoin mgoin added the ready ONLY add when PR is ready to merge/full CI is needed label Jun 5, 2025
jmswen and others added 8 commits June 5, 2025 13:03
…ect#19102)

Signed-off-by: Jon Swenson <jmswen@gmail.com>
Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com>
Signed-off-by: Tyler Michael Smith <tysmith@redhat.com>
Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com>
Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com>
Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com>
Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com>
Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com>
Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com>
@patrickvonplaten patrickvonplaten force-pushed the add_v11_mistral_common branch from 4f70646 to 2941b28 Compare June 5, 2025 11:03
@mergify mergify bot added documentation Improvements or additions to documentation v1 labels Jun 5, 2025
@simon-mo simon-mo added this to the v0.9.1 milestone Jun 5, 2025
@njhill njhill merged commit f20f9f0 into vllm-project:main Jun 5, 2025
66 checks passed
self.bot_token_id = self.vocab.get(self.bot_token)
self.tool_call_regex = re.compile(r"\[{.*}\]", re.DOTALL)
if _is_fn_name_regex_support(self.model_tokenizer):
self.fn_name_regex = re.compile(r'([a-zA-Z0-9_-]+)(\{.*?\})',
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just noticed that this regex sadly doesn't match the outer most {...} but instead only the inner-most. This means that for inputs such as:

'{"sub_dict": {....}}' it incorrectly parse it to '{"sub_dict": {....}' => and hence miss the final {. The corrected regex should be:

fn_name_regex = re.compile(r'([a-zA-Z0-9_-]+)(\{[\s\S]*?\})(?=\s*$|,|\s)', re.DOTALL)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the find. I can post a fix for this

@gaby
Copy link

gaby commented Jul 2, 2025

@patrickvonplaten Any idea when a new vLLM is getting released?

Unable to use Magistral and Small 3.2 because the latest Docker release of vLLM is over a month old.

@mgoin
Copy link
Member

mgoin commented Jul 2, 2025

@gaby current target is end of this week, milestone here https://github.com/vllm-project/vllm/milestone/6

@gaby
Copy link

gaby commented Jul 2, 2025

@gaby current target is end of this week, milestone here https://github.com/vllm-project/vllm/milestone/6

Thank you! 💪

leoli1208 pushed a commit to leoli1208/vllm that referenced this pull request Jul 22, 2025
Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Improvements or additions to documentation frontend ready ONLY add when PR is ready to merge/full CI is needed tool-calling v1

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

8 participants