Add Mistral Large 3 and Ministral 3 #29757
Conversation
Signed-off-by: Julien Denize <julien.denize@mistral.ai>
Code Review
This pull request adds support for Mistral Large 3 and its Eagle variant by reusing the DeepseekV2 architecture. The changes are generally well-structured, including new model files, registry updates, and configuration adaptations. However, I've identified a few potential issues concerning robustness and possible regressions that should be addressed to ensure the stability and correctness of the implementation.
```python
@staticmethod
def hf_config_override(hf_config: PretrainedConfig) -> PretrainedConfig:
    initial_architecture = hf_config.architectures[0]
```
The code initial_architecture = hf_config.architectures[0] assumes that hf_config.architectures is a non-empty list. However, the architectures attribute in PretrainedConfig can be None or an empty list, which would cause a TypeError or IndexError respectively. This could lead to a crash when loading a model with a malformed or missing architectures field in its config. It's safer to check for the presence of architectures before accessing its elements.
```diff
- initial_architecture = hf_config.architectures[0]
+ initial_architecture = hf_config.architectures[0] if hf_config.architectures else None
```
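A minimal, self-contained sketch of the guarded access (using a toy stand-in class, not the real `PretrainedConfig`):

```python
class ToyConfig:
    """Toy stand-in for PretrainedConfig: `architectures` may be a list, None, or empty."""
    def __init__(self, architectures=None):
        self.architectures = architectures

def initial_architecture(cfg):
    # Guard against None or [] before indexing to avoid TypeError/IndexError.
    return cfg.architectures[0] if cfg.architectures else None

print(initial_architecture(ToyConfig(["DeepseekV2ForCausalLM"])))  # DeepseekV2ForCausalLM
print(initial_architecture(ToyConfig(None)))                       # None
print(initial_architecture(ToyConfig([])))                         # None
```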
```python
    del config.rope_theta
else:
    # When Transformers v4 is installed, legacy rope_scaling may be present
    if Version(version("transformers")) < Version("5.0.0.dev0"):
```
The logic for patching RoPE parameters for transformers>=5.0.0 has been removed. This logic handled backward compatibility for models that use the legacy rope_theta attribute. Removing it could cause a regression, leading to incorrect RoPE configurations for certain models when used with transformers version 5 or higher. This might result in silent correctness issues. Was the removal of this block intentional? If so, the reasoning should be documented. Otherwise, it should be restored to prevent potential regressions.
🥳 Make sure to update the supported model page https://github.com/vllm-project/vllm/blob/main/docs/models/supported_models.md and the testing registry https://github.com/vllm-project/vllm/blob/main/tests/models/registry.py with `is_available_online=False` for now.
```python
if llama_4_scaling is not None:
    q *= llama_4_scaling
```
Why not just put this in a rotary embedding layer?
Just a choice, but no strong opinion about this that I can think of right now. We also did it in llama.py; would it be necessary to refactor now, or could it be done in a later PR?
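For context, folding the scaling into a rotary-embedding-style layer might look like the sketch below. The class and constant names are hypothetical, and the exact formula/offset is an assumption (the thread notes the PR's offset differs from llama4.py):

```python
import math

def get_llama_4_scaling(positions, attn_scale=0.1, floor_scale=8192.0):
    """Per-position query scaling in the llama-4 style (constants are assumptions)."""
    return [math.log(math.floor((p + 1.0) / floor_scale) + 1.0) * attn_scale + 1.0
            for p in positions]

class RotaryWithLlama4Scaling:
    """Hypothetical wrapper that folds the per-position scaling into the rotary layer."""
    def __init__(self, attn_scale=0.1, floor_scale=8192.0):
        self.attn_scale = attn_scale
        self.floor_scale = floor_scale

    def __call__(self, positions, q):
        scales = get_llama_4_scaling(positions, self.attn_scale, self.floor_scale)
        # ... apply the rotary embedding to q here, then scale ...
        return [qi * s for qi, s in zip(q, scales)]

rope = RotaryWithLlama4Scaling()
out = rope([0, 100000], [1.0, 1.0])
print(out)  # early positions keep scale 1.0; late positions grow slowly
```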
Bro...
```diff
  # LlamaForCausalLM -> Eagle3LlamaForCausalLM
  # LlamaForCausalLMEagle3 -> LlamaForCausalLMEagle3
- if method == "eagle":
+ if method is None:
```
Could you elaborate on why this change is needed?
Also curious why "method": None is being observed here.
I ran into a bug when disabling --enforce-eager.
vLLM tries to compute a hash of the config, and this PretrainedConfig is called from transformers with no arguments. From what I understand, it was trying to check the diff between the actual EAGLEConfig and an uninitialized one, which triggers the assert.
I believe the root cause is #26468, which changed
`hf_config_json = self.hf_config.to_json_string(use_diff=False)` (https://github.com/vllm-project/vllm/pull/26468/files#diff-998c640befaf137b9af825f29f4e6e47d273caab1fd04093c97df24b18f5c417L345) to
`x.to_json_string()` (removing the `use_diff=False`; the default is `True`): https://github.com/vllm-project/vllm/pull/26468/files#diff-f0679c95660e953a0b43d241ba6332d28cefd86ea565fc10c0d0cc57dc158cfcR257
Let me know if you need more investigation on my side.
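A toy reproduction of the failure mode described above. The class and its assert are stand-ins for the real EAGLEConfig and `PretrainedConfig.to_json_string`, not their actual code:

```python
import json

class EagleLikeConfig:
    """Toy config whose constructor rejects missing arguments, mimicking the
    assert described above (a stand-in, not the real EAGLEConfig)."""
    def __init__(self, model=None):
        assert model is not None, "model config required"
        self.model = model

    def to_json_string(self, use_diff=True):
        if use_diff:
            # Diffing against defaults instantiates an empty config -> crash.
            default = EagleLikeConfig()
            return json.dumps({k: v for k, v in vars(self).items()
                               if vars(default).get(k) != v})
        return json.dumps(vars(self))

cfg = EagleLikeConfig(model="some-model")
print(cfg.to_json_string(use_diff=False))  # serializes fine
try:
    cfg.to_json_string()  # default use_diff=True hits the assert
except AssertionError as exc:
    print("crash:", exc)
```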
I think we can put the use_diff=False back, do you want to do this in this PR or should I do it in a separate one?
Should we just revert this to make sure the PR gets in?
Might be too late to revert all of #26468, but I can put the use_diff=False back
@zou3519 can you open a separate PR for this? then we can update this PR accordingly
We reverted the change here, should be good
`use_diff=False` was added here: https://github.com/vllm-project/vllm/pull/29757/files#r2578969697
```python
    return 0.1 * mscale * math.log(scale) + 1.0

def _get_llama_4_scaling(
```
Do we have a plan to move these helper functions into a utility file?
I think this could be done, yes; we also have it in llama.py. The question would be: do we do it now? It is not exactly the same as the one in llama4.py (a different offset).
Let's address this in a follow-up PR
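For reference, the `0.1 * mscale * math.log(scale) + 1.0` line above is the YaRN mscale formula. A standalone sketch (the `scale <= 1` guard follows the DeepseekV2-style helper, but treat names as assumptions):

```python
import math

def yarn_get_mscale(scale: float = 1.0, mscale: float = 1.0) -> float:
    """YaRN attention magnitude scaling: identity for scale <= 1,
    logarithmic growth beyond that."""
    if scale <= 1.0:
        return 1.0
    return 0.1 * mscale * math.log(scale) + 1.0

print(yarn_get_mscale(1.0))               # 1.0 (no extension, no scaling)
print(round(yarn_get_mscale(math.e), 2))  # 1.1 (0.1 * log(e) + 1.0)
```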
Signed-off-by: Julien Denize <julien.denize@mistral.ai>
```python
# TODO: revert once Mistral-Large-3 and Ministral-3 are publicly available.
is_available_online=False,
```
I think it's okay to leave this as is since we're cutting a branch today
```python
    continue_final_message: bool = False,
    add_generation_prompt: bool = False,
) -> tuple[list["ChatCompletionMessageParam"], list[dict[str, Any]] | None]:
    from mistral_common.protocol.instruct.tool_calls import Function, Tool
```
why dynamic import here?
It's the same as what we do for all functions in this file: we import at use time because some people don't want to install mistral-common.
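The pattern generalizes; a sketch of deferred imports for an optional dependency (a generic helper, not the PR's code):

```python
import importlib

def load_optional(module_name: str):
    """Import at call time so users who never hit this code path
    don't need the optional dependency installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"install `{module_name}` to use this feature") from exc

print(load_optional("json").dumps({"ok": True}))  # stdlib module imports fine
try:
    load_optional("definitely_not_installed_pkg")
except ImportError as exc:
    print("deferred failure:", exc)
```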
Hashing the config can crash because the constructor is called with default arguments only. Pass `use_diff=False` to avoid this behavior. Signed-off-by: Mickael Seznec <mickael@mistral.ai>
Signed-off-by: Mickael Seznec <mickael@mistral.ai>
vllm/config/utils.py
Outdated
```diff
- return x.to_json_string()
+ # using `use_diff=False` to avoid initializing object with
+ # default arguments only
+ return x.to_json_string(use_diff=False)
```
`use_diff=False` to make sure we don't have trouble with speculative decoding.
ywang96
left a comment
Amazing! Thanks for all the work and look forward to the release!
Approving as the rest of the comments can be addressed in follow-up PRs.
Signed-off-by: Mickael Seznec <mickael@mistral.ai>
Signed-off-by: Mickael Seznec <mickael@mistral.ai>
Only fix for the eagle config, to avoid a larger impact on the codebase. Signed-off-by: Mickael Seznec <mickael@mistral.ai>
Signed-off-by: Roger Wang <hey@rogerw.io>
Signed-off-by: Mickael Seznec <mickael@mistral.ai>
Signed-off-by: Mickael Seznec <mickael@mistral.ai>
d8c6210 Signed-off-by: Julien Denize <julien.denize@mistral.ai> Signed-off-by: Julien Denize <40604584+juliendenize@users.noreply.github.com> Signed-off-by: Mickael Seznec <mickael@mistral.ai> Signed-off-by: Roger Wang <hey@rogerw.io> Co-authored-by: Roger Wang <hey@rogerw.io> Co-authored-by: Mickael Seznec <mickael@mistral.ai>
…vllm-project#318) d8c6210 plus enablement `ministral-large-3` and `ministral-large` download in the registry vllm-project#29757
Signed-off-by: Julien Denize <julien.denize@mistral.ai> Signed-off-by: Julien Denize <40604584+juliendenize@users.noreply.github.com> Signed-off-by: Mickael Seznec <mickael@mistral.ai> Signed-off-by: Roger Wang <hey@rogerw.io> Co-authored-by: Roger Wang <hey@rogerw.io> Co-authored-by: Mickael Seznec <mickael@mistral.ai> Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>
Purpose
This PR adds support for Mistral-Large-3 and Ministral-3.
Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
`supported_models.md` and `examples` for a new model.