
Patch Mistral config#37104

Merged
hmellor merged 2 commits into vllm-project:main from juliendenize:silence_warnings_hf
Mar 16, 2026

Conversation

@juliendenize
Contributor

@juliendenize juliendenize commented Mar 15, 2026

Purpose

This PR does the following:

  • RoPE parameters are now cast to the type expected by Transformers v5. I believe this has no effect on vLLM computations, but please correct me if I'm wrong. This silences warnings raised by Transformers.
  • Ignore the Transformers warnings about the apply_yarn_scaling parameter not being found; this argument is stored in the Mistral config but unknown to Transformers.
  • Infer the dtype directly in the Mistral config instead of later in the code. This way, the errors saying the model is not a safetensors repo, which were raised when inferring the dtype in multiple places, are fixed by a single change instead of several.

Test Plan

Checked by serving a Mistral model

Test Result

Serving worked.


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces several patches for Mistral model configuration handling. It silences some warnings from the Transformers library, improves data type casting for RoPE parameters, and refactors the data type inference to be more robust. The changes are generally good, but I've found a critical thread-safety issue with how global constants are being modified. My review includes a suggestion to fix this potential race condition.

Comment on lines +142 to +152
@contextmanager
def _mistral_patch_hf_hub_constants() -> Iterator[None]:
    hf_safetensors_single_file = constants.SAFETENSORS_SINGLE_FILE
    hf_safetensors_index_file = constants.SAFETENSORS_INDEX_FILE
    constants.SAFETENSORS_SINGLE_FILE = "consolidated.safetensors"
    constants.SAFETENSORS_INDEX_FILE = "consolidated.safetensors.index.json"
    try:
        yield
    finally:
        constants.SAFETENSORS_SINGLE_FILE = hf_safetensors_single_file
        constants.SAFETENSORS_INDEX_FILE = hf_safetensors_index_file
Contributor


critical

The modification of global constants in huggingface_hub.constants is not thread-safe. In a scenario where multiple models are loaded concurrently in different threads (e.g., one Mistral model and one standard Hugging Face model), this monkey-patching can create a race condition. One thread might be expecting the default constant values while another has temporarily changed them, potentially leading to FileNotFoundError or other unpredictable behavior during model loading. To prevent this, the critical section where constants are modified should be protected by a lock.

Please also add import threading at the top of the file.

_mistral_patch_lock = threading.Lock()


@contextmanager
def _mistral_patch_hf_hub_constants() -> Iterator[None]:
    with _mistral_patch_lock:
        hf_safetensors_single_file = constants.SAFETENSORS_SINGLE_FILE
        hf_safetensors_index_file = constants.SAFETENSORS_INDEX_FILE
        constants.SAFETENSORS_SINGLE_FILE = "consolidated.safetensors"
        constants.SAFETENSORS_INDEX_FILE = "consolidated.safetensors.index.json"
        try:
            yield
        finally:
            constants.SAFETENSORS_SINGLE_FILE = hf_safetensors_single_file
            constants.SAFETENSORS_INDEX_FILE = hf_safetensors_index_file
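The lock-protected patch pattern above can be exercised standalone. The sketch below uses a stand-in namespace instead of `huggingface_hub.constants`, to show that the patched values are visible inside the context and restored afterwards:

```python
# Self-contained sketch of the lock-protected constant-patching pattern,
# with a SimpleNamespace standing in for huggingface_hub.constants.
import threading
from collections.abc import Iterator
from contextlib import contextmanager
from types import SimpleNamespace

constants = SimpleNamespace(
    SAFETENSORS_SINGLE_FILE="model.safetensors",
    SAFETENSORS_INDEX_FILE="model.safetensors.index.json",
)

_patch_lock = threading.Lock()

@contextmanager
def patch_constants() -> Iterator[None]:
    # The lock ensures only one thread at a time observes patched values.
    with _patch_lock:
        saved = (constants.SAFETENSORS_SINGLE_FILE,
                 constants.SAFETENSORS_INDEX_FILE)
        constants.SAFETENSORS_SINGLE_FILE = "consolidated.safetensors"
        constants.SAFETENSORS_INDEX_FILE = "consolidated.safetensors.index.json"
        try:
            yield
        finally:
            (constants.SAFETENSORS_SINGLE_FILE,
             constants.SAFETENSORS_INDEX_FILE) = saved

with patch_constants():
    inside = constants.SAFETENSORS_SINGLE_FILE  # patched value
after = constants.SAFETENSORS_SINGLE_FILE       # original value restored
```

Note the lock only serializes callers of this context manager; code that reads the constants without taking the lock can still observe the patched values, which is why keeping the patched window short matters.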

@DarkLight1337 DarkLight1337 requested a review from hmellor March 16, 2026 08:03
@hmellor
Member

hmellor commented Mar 16, 2026

To clarify, are these warnings that are appearing in v4 or v5?

Also, would it not be better to add apply_yarn_scaling to the list of expected keys in Transformers so that this issue is fixed for you everywhere, not just in vLLM?

@juliendenize
Contributor Author

juliendenize commented Mar 16, 2026

@hmellor it only occurs for v5.

Also, would it not be better to add apply_yarn_scaling to the list of expected keys in Transformers so that this issue is fixed for you everywhere, not just in vLLM?

vLLM uses arguments slightly differently than HF, which requires adding this key that isn't needed in HF, so I'm not sure it would make sense. Either way, it's not in HF for now, so silencing the warnings would be a good temporary fix if that's OK with you.

@hmellor
Member

hmellor commented Mar 16, 2026

@juliendenize
Contributor Author

Yeah, we can do it that way if you prefer. So for this current PR, do you want me to discard only this filter warning but keep the casting?

@hmellor
Member

hmellor commented Mar 16, 2026

Yeah I think that's best. It hardens the conversion in vLLM and ensures that you don't get the unexpected key warning anywhere that Mistral configs are loaded with Transformers.

Signed-off-by: juliendenize <julien.denize@mistral.ai>
Signed-off-by: juliendenize <julien.denize@mistral.ai>
@juliendenize
Contributor Author

Made the Transformers PR here :)
huggingface/transformers#44747

@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Mar 16, 2026
@hmellor
Member

hmellor commented Mar 16, 2026

Could we instead pass apply_yarn_scaling to validate_rope in

config.validate_rope()
(I wasn't aware that this was an option before).

Something like:

ignore_keys = set()
if config_format == "mistral" and config.rope_parameters.type == "yarn":
    ignore_keys.add("apply_yarn_scaling")
config.validate_rope(ignore_keys=ignore_keys)
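The idea behind the suggestion can be sketched generically. The `validate_rope` and `KNOWN_ROPE_KEYS` names below are hypothetical stand-ins (vLLM's actual signature and key set may differ); the point is that an `ignore_keys` set excludes specific keys from the unknown-key check:

```python
# Hypothetical sketch of key validation with an ignore set; vLLM's real
# validate_rope signature and accepted keys may differ.
KNOWN_ROPE_KEYS = {"rope_type", "rope_theta", "factor"}

def validate_rope(rope_parameters: dict, ignore_keys: set = frozenset()) -> None:
    # Keys that are neither known nor explicitly ignored are rejected.
    unknown = set(rope_parameters) - KNOWN_ROPE_KEYS - set(ignore_keys)
    if unknown:
        raise ValueError(f"Unrecognized rope parameters: {sorted(unknown)}")

params = {"rope_type": "yarn", "rope_theta": 1e6, "apply_yarn_scaling": True}
validate_rope(params, ignore_keys={"apply_yarn_scaling"})  # passes
```

Without the ignore set, the same call would raise on `apply_yarn_scaling`, which is exactly the warning/error this PR works around.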

@hmellor hmellor merged commit ffbc2e5 into vllm-project:main Mar 16, 2026
45 checks passed
@juliendenize
Contributor Author

That would have been better indeed, I'll do a follow-up PR once I have time.

@hmellor
Member

hmellor commented Mar 17, 2026

I've implemented this in #37292

Lucaskabela pushed a commit to Lucaskabela/vllm that referenced this pull request Mar 17, 2026
Signed-off-by: juliendenize <julien.denize@mistral.ai>
wendyliu235 pushed a commit to wendyliu235/vllm-public that referenced this pull request Mar 18, 2026
Signed-off-by: juliendenize <julien.denize@mistral.ai>
fxdawnn pushed a commit to fxdawnn/vllm that referenced this pull request Mar 19, 2026
Signed-off-by: juliendenize <julien.denize@mistral.ai>
khairulkabir1661 pushed a commit to khairulkabir1661/vllm that referenced this pull request Mar 27, 2026
Signed-off-by: juliendenize <julien.denize@mistral.ai>
Monishver11 pushed a commit to Monishver11/vllm that referenced this pull request Mar 27, 2026
Signed-off-by: juliendenize <julien.denize@mistral.ai>
Signed-off-by: Monishver Chandrasekaran <monishverchandrasekaran@gmail.com>
JiantaoXu pushed a commit to JiantaoXu/vllm that referenced this pull request Mar 28, 2026
Signed-off-by: juliendenize <julien.denize@mistral.ai>
vrdn-23 pushed a commit to vrdn-23/vllm that referenced this pull request Mar 30, 2026
Signed-off-by: juliendenize <julien.denize@mistral.ai>
Signed-off-by: Vinay Damodaran <vrdn@hey.com>
EricccYang pushed a commit to EricccYang/vllm that referenced this pull request Apr 1, 2026
Signed-off-by: juliendenize <julien.denize@mistral.ai>
Signed-off-by: EricccYang <yangyang4991@gmail.com>
liuchenbing2026 pushed a commit to liuchenbing2026/vllm that referenced this pull request Apr 4, 2026
Signed-off-by: juliendenize <julien.denize@mistral.ai>

Labels

ready ONLY add when PR is ready to merge/full CI is needed
