[Transformers v5] Add SarvamMLAConfig to fix SarvamMLAForCausalLM (#38734)#38767
Closed
Zelys-DFKH wants to merge 1 commit into vllm-project:main from
Conversation
…lm-project#38734)

The remote configuration at sarvamai/sarvam-105b calls `validate_rope(ignore_keys=...)`, which broke in transformers v5 when upstream PR #41250 removed the `ignore_keys` parameter.

Register `SarvamMLAConfig` in vLLM's `_CONFIG_REGISTRY` and `_CLASS_TO_MODULE` so that `HFConfigParser.parse()` selects this local class instead of the remote one, bypassing `trust_remote_code` and the broken API call entirely.

`SarvamMLAConfig` exposes all fields accessed by `vllm/model_executor/models/sarvam.py`, handles the `rope_scaling → rope_parameters` conversion (including the `"type" → "rope_type"` key rename used in transformers v5), and maps the single `intermediate_size` field from `config.json` to both `intermediate_size` and `moe_intermediate_size` as required by the model.

Fixes: vllm-project#38734

Co-authored-by: Claude
Contributor
Code Review
This pull request introduces a local configuration class, `SarvamMLAConfig`, to support the SarvamMLA model. By registering this class in vLLM's configuration registry, the implementation bypasses the need for `trust_remote_code` and avoids a breaking API change in transformers v5. I have no feedback to provide as there are no review comments.
Member
Closing in favour of #38804
What's the problem?
Fixes #38734
`SarvamMLAForCausalLM` (`sarvamai/sarvam-105b`) fails to load under transformers v5 because the remote `configuration_sarvam_mla.py` calls `validate_rope(ignore_keys=...)`. Upstream PR huggingface/transformers#41250 (transformers v5) removed the `ignore_keys` parameter, causing a `TypeError` during config loading.
Fix
Add `SarvamMLAConfig` to vLLM's local config registry. Once registered, `HFConfigParser.parse()` matches `model_type="sarvam_mla"` and loads this local class instead of the remote code, bypassing `trust_remote_code` and the broken API call entirely.
`SarvamMLAConfig` exposes all fields accessed by `vllm/model_executor/models/sarvam.py` and handles the `rope_scaling → rope_parameters` conversion (including the `"type" → "rope_type"` key rename required by transformers v5). The single `intermediate_size` field from the model's `config.json` is exposed as both `intermediate_size` and `moe_intermediate_size` to satisfy direct accesses in the model code.
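A minimal sketch of the two conversions described above. Field names follow the PR text; the class here is a hypothetical stand-in, and the real `SarvamMLAConfig` in vLLM may differ:

```python
def convert_rope_scaling(rope_scaling):
    """Rename a v4-style rope_scaling dict into v5-style rope_parameters,
    applying the "type" -> "rope_type" key rename the PR describes."""
    if rope_scaling is None:
        return None
    params = dict(rope_scaling)  # copy so the caller's dict is untouched
    if "type" in params:
        params["rope_type"] = params.pop("type")
    return params


class SarvamMLAConfigSketch:
    """Illustrative stand-in for the PR's SarvamMLAConfig (not vLLM code)."""

    def __init__(self, intermediate_size=None, rope_scaling=None, **kwargs):
        # config.json carries a single intermediate_size; the model code
        # reads both names, so expose the same value under both.
        self.intermediate_size = intermediate_size
        self.moe_intermediate_size = intermediate_size
        self.rope_parameters = convert_rope_scaling(rope_scaling)
```

The key-rename step is the part that replaces the remote `validate_rope(ignore_keys=...)` call that no longer exists in transformers v5.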
Why this is not a duplicate
There are no open PRs addressing this issue. A user asked to be assigned
in the issue comments but has not submitted a PR.
Tests
This change is pure configuration — no runtime logic is added or removed.
The fix can be verified by loading `sarvamai/sarvam-105b` with `trust_remote_code=True` under transformers v5; before this PR a `TypeError` is raised during `get_config()`, and after it the config loads cleanly.
CI note: The model is gated and cannot be tested in CI without credentials.
The pattern follows the established approach used by `NemotronConfig`, `IsaacConfig`, and every other entry in `_CONFIG_REGISTRY`.

AI assistance was used in developing this fix.