
Add Mistral Large 3 and Ministral 3 #29757

Merged
khluu merged 24 commits into vllm-project:main from juliendenize:add_mistral_large_3
Dec 2, 2025
Conversation

@juliendenize
Contributor

@juliendenize juliendenize commented Nov 30, 2025

Purpose

This PR adds support for Mistral-Large-3 and Ministral-3.

Test Plan

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: Julien Denize <julien.denize@mistral.ai>

@mergify mergify bot added the deepseek, new-model, speculative-decoding, and v1 labels Nov 30, 2025
Contributor
@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request adds support for Mistral Large 3 and its Eagle variant by reusing the DeepseekV2 architecture. The changes are generally well-structured, including new model files, registry updates, and configuration adaptations. However, I've identified a few potential issues concerning robustness and possible regressions that should be addressed to ensure the stability and correctness of the implementation.


@staticmethod
def hf_config_override(hf_config: PretrainedConfig) -> PretrainedConfig:
initial_architecture = hf_config.architectures[0]
Contributor

critical

The code initial_architecture = hf_config.architectures[0] assumes that hf_config.architectures is a non-empty list. However, the architectures attribute in PretrainedConfig can be None or an empty list, which would cause a TypeError or IndexError respectively. This could lead to a crash when loading a model with a malformed or missing architectures field in its config. It's safer to check for the presence of architectures before accessing its elements.

Suggested change
initial_architecture = hf_config.architectures[0]
initial_architecture = hf_config.architectures[0] if hf_config.architectures else None
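The suggested guard is easy to exercise in isolation: `architectures` on a HF config can be `None`, an empty list, or populated. A minimal sketch (the helper name here is ours, for illustration):

```python
def first_architecture(architectures):
    """Return the first declared architecture, or None when the config
    carries no `architectures` list (None) or an empty one."""
    return architectures[0] if architectures else None

# All three shapes are handled without TypeError/IndexError:
print(first_architecture(None))                       # None
print(first_architecture([]))                         # None
print(first_architecture(["DeepseekV2ForCausalLM"]))  # DeepseekV2ForCausalLM
```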

del config.rope_theta
else:
# When Transformers v4 is installed, legacy rope_scaling may be present
if Version(version("transformers")) < Version("5.0.0.dev0"):
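The quoted guard compares the installed transformers version against the 5.0 pre-release line. A minimal sketch of that comparison using `packaging` (which orders `.dev` pre-releases correctly, unlike naive string comparison); the function name is ours:

```python
from packaging.version import Version

def legacy_rope_attrs_possible(installed: str) -> bool:
    """True when the installed transformers predates the 5.0 line, i.e.
    legacy attributes like rope_theta / old-style rope_scaling may exist."""
    return Version(installed) < Version("5.0.0.dev0")

print(legacy_rope_attrs_possible("4.57.1"))  # True: take the legacy path
print(legacy_rope_attrs_possible("5.0.0"))   # False
```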
Contributor

critical

The logic for patching RoPE parameters for transformers>=5.0.0 has been removed. This logic handled backward compatibility for models that use the legacy rope_theta attribute. Removing it could cause a regression, leading to incorrect RoPE configurations for certain models when used with transformers version 5 or higher. This might result in silent correctness issues. Was the removal of this block intentional? If so, the reasoning should be documented. Otherwise, it should be restored to prevent potential regressions.

@mgoin mgoin added the ready ONLY add when PR is ready to merge/full CI is needed label Nov 30, 2025
Member
@mgoin mgoin left a comment

Comment on lines +163 to +164
if llama_4_scaling is not None:
q *= llama_4_scaling
Member

Why not just put this in a rotary embedding layer?

Contributor Author

Just a choice; no strong opinion about this that I can think of right now. We also did it this way in llama.py. Would it be necessary to refactor now, or could it be done in a later PR?

@TheLocalDrummer

Bro...

# LlamaForCausalLM -> Eagle3LlamaForCausalLM
# LlamaForCausalLMEagle3 -> LlamaForCausalLMEagle3
if method == "eagle":
if method is None:
Member
@DarkLight1337 DarkLight1337 Dec 1, 2025

Could you elaborate on why this change is needed?

Collaborator

Also curious why "method": None is being observed here.

Contributor

I ran into a bug when disabling --enforce-eager.

vLLM tries to compute a hash of the config, and this PretrainedConfig is constructed by transformers with no arguments. From what I understand, it was trying to compute the diff between the actual EAGLEConfig and an uninitialized one, which triggers the assert.

I believe the root cause is #26468.

Let me know if you need more investigation on my side.


Collaborator

I think we can put the use_diff=False back. Do you want to do this in this PR, or should I do it in a separate one?

Collaborator

Should we just revert this to make sure the PR gets in?

Collaborator

Might be too late to revert all of #26468, but I can put the use_diff=False back

Member

@zou3519 can you open a separate PR for this? Then we can update this PR accordingly.

Collaborator

We reverted the change here, should be good


return 0.1 * mscale * math.log(scale) + 1.0


def _get_llama_4_scaling(
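The first quoted line is the YaRN-style mscale correction used by the DeepseekV2-family attention. As a standalone helper (the `scale <= 1` guard follows the standard yarn_get_mscale shape; treat this as an illustrative sketch):

```python
import math

def yarn_get_mscale(scale: float = 1.0, mscale: float = 1.0) -> float:
    """Attention-magnitude correction for YaRN RoPE scaling: identity for
    scale <= 1, then growing logarithmically with the context-scale factor."""
    if scale <= 1.0:
        return 1.0
    return 0.1 * mscale * math.log(scale) + 1.0

print(yarn_get_mscale(1.0))  # 1.0: no correction without context extension
print(yarn_get_mscale(8.0))  # ~1.208, i.e. 0.1 * ln(8) + 1
```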
Collaborator

Do we have a plan to move these helper functions into a utility file?

Contributor Author

I think this could be done, yes; we also have it in llama.py. The question would be: do we do it now? It is not exactly the same as the one in llama4.py (it uses a different offset).

Member

Let's address this in a follow-up PR

Signed-off-by: Julien Denize <julien.denize@mistral.ai>
@juliendenize juliendenize requested a review from ywang96 as a code owner December 1, 2025 16:36
Signed-off-by: Julien Denize <julien.denize@mistral.ai>
Signed-off-by: Julien Denize <julien.denize@mistral.ai>
Comment on lines +778 to +779
# TODO: revert once Mistral-Large-3 and Ministral-3 are publicly available.
is_available_online=False,
Member

We should flip this before 0.12 goes out @khluu

Member

I think it's okay to leave this as is since we're cutting a branch today

continue_final_message: bool = False,
add_generation_prompt: bool = False,
) -> tuple[list["ChatCompletionMessageParam"], list[dict[str, Any]] | None]:
from mistral_common.protocol.instruct.tool_calls import Function, Tool
Collaborator

why dynamic import here?

Contributor Author

It's the same as what we do for all functions in this file: we import at the point of use because some people don't want to install mistral-common.
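The deferred-import pattern described here keeps mistral-common an optional dependency: the module is only resolved when a Mistral code path actually runs, and a clear error is raised otherwise. A generic sketch (helper name and message are illustrative, not vLLM's actual API):

```python
import importlib

def import_optional(module_name: str):
    """Resolve an optional dependency at call time instead of module load."""
    try:
        return importlib.import_module(module_name)
    except ImportError as e:
        raise ImportError(
            f"{module_name} is required for this code path; "
            f"please install it (e.g. `pip install {module_name}`)."
        ) from e

# Users who never hit the Mistral path never pay for the import;
# those who do but lack the package get an actionable message.
```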

juliendenize and others added 3 commits December 1, 2025 21:44
Signed-off-by: Julien Denize <julien.denize@mistral.ai>
Hashing the config can crash because the constructor is called with default arguments only.

Pass use_diff=False to avoid this behavior

Signed-off-by: Mickael Seznec <mickael@mistral.ai>
Signed-off-by: Mickael Seznec <mickael@mistral.ai>
return x.to_json_string()
# using `use_diff=False` to avoid initializing object with
# default arguments only
return x.to_json_string(use_diff=False)
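Why `use_diff=False` matters for hashing: with `use_diff=True` (the transformers default), `to_json_string` serializes only the fields that differ from a freshly default-constructed config, so it must be able to build that default instance, which is exactly what crashed in the thread above. A dependency-free sketch of the difference (class and field names are ours; the real logic lives in transformers' `PretrainedConfig.to_json_string`):

```python
import json

class TinyConfig:
    """Dependency-free stand-in for a HF-style config that serializes
    either a diff against defaults or the full attribute dict."""

    def __init__(self, hidden_size=1024, num_layers=8):
        self.hidden_size = hidden_size
        self.num_layers = num_layers

    def to_json_string(self, use_diff=True):
        full = {"hidden_size": self.hidden_size, "num_layers": self.num_layers}
        if not use_diff:
            # Full serialization: stable, never needs a default instance.
            return json.dumps(full, sort_keys=True)
        # Diff mode must default-construct an instance to compare against;
        # a config whose __init__ cannot run with zero arguments crashes here.
        default = TinyConfig().__dict__
        return json.dumps(
            {k: v for k, v in full.items() if v != default[k]}, sort_keys=True
        )

cfg = TinyConfig(hidden_size=2048)
print(cfg.to_json_string())                # {"hidden_size": 2048}
print(cfg.to_json_string(use_diff=False))  # full dict: stable hash input
```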
Collaborator

`use_diff=False` to make sure we don't have trouble with speculative decoding

Member
@ywang96 ywang96 left a comment

Amazing! Thanks for all the work, and looking forward to the release!

Approving as the rest of the comments can be addressed in follow-up PRs.

mickaelseznec and others added 8 commits December 1, 2025 23:07
Signed-off-by: Mickael Seznec <mickael@mistral.ai>
Signed-off-by: Mickael Seznec <mickael@mistral.ai>
only fix for eagle config, avoid larger impact on the codebase

Signed-off-by: Mickael Seznec <mickael@mistral.ai>
Signed-off-by: Roger Wang <hey@rogerw.io>
Signed-off-by: Mickael Seznec <mickael@mistral.ai>
Signed-off-by: Mickael Seznec <mickael@mistral.ai>
@DarkLight1337 DarkLight1337 added this to the v0.12.0 milestone Dec 2, 2025
@khluu khluu enabled auto-merge (squash) December 2, 2025 10:03
@khluu khluu merged commit d8c6210 into vllm-project:main Dec 2, 2025
61 checks passed
dtrifiro pushed a commit to dtrifiro/vllm that referenced this pull request Dec 11, 2025
d8c6210

Signed-off-by: Julien Denize <julien.denize@mistral.ai>
Signed-off-by: Julien Denize <40604584+juliendenize@users.noreply.github.com>
Signed-off-by: Mickael Seznec <mickael@mistral.ai>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Mickael Seznec <mickael@mistral.ai>
dtrifiro pushed a commit to dtrifiro/vllm that referenced this pull request Dec 11, 2025
…vllm-project#318)

d8c6210 plus enablement `ministral-large-3` and `ministral-large`
download in the registry


vllm-project#29757
dsuhinin pushed a commit to dsuhinin/vllm that referenced this pull request Jan 21, 2026
Signed-off-by: Julien Denize <julien.denize@mistral.ai>
Signed-off-by: Julien Denize <40604584+juliendenize@users.noreply.github.com>
Signed-off-by: Mickael Seznec <mickael@mistral.ai>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Mickael Seznec <mickael@mistral.ai>
Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>

Labels

deepseek, documentation, frontend, new-model, ready, speculative-decoding, tool-calling, v1

Projects

Status: Done
