
[Transformers v5] Add SarvamMLAConfig to fix SarvamMLAForCausalLM (#38734)#38767

Closed
Zelys-DFKH wants to merge 1 commit into vllm-project:main from Zelys-DFKH:fix/sarvam-mla-transformers-v5

Conversation


@Zelys-DFKH Zelys-DFKH commented Apr 2, 2026

What's the problem?

Fixes #38734

SarvamMLAForCausalLM (sarvamai/sarvam-105b) fails to load under
transformers v5 because the remote configuration_sarvam_mla.py calls
validate_rope(ignore_keys=...). Upstream PR huggingface/transformers#41250
(transformers v5) removed the ignore_keys parameter, causing:

```
TypeError: RotaryEmbeddingConfigMixin.validate_rope() got an unexpected
keyword argument 'ignore_keys'
```
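The failure mode can be reproduced in isolation with a minimal stand-in for the v5 method. This is a hypothetical simplification, not the real `transformers` code; the key set passed below is illustrative, not the model's actual arguments:

```python
# Hypothetical stand-in for the transformers v5 method: the `ignore_keys`
# keyword was removed in huggingface/transformers#41250, so any caller that
# still passes it now fails at call time.
def validate_rope(rope_parameters=None):
    pass  # v5-style signature: no ignore_keys

try:
    # The remote configuration_sarvam_mla.py still uses the v4-era keyword
    # (the key set here is illustrative):
    validate_rope(rope_parameters={}, ignore_keys={"extra_key"})
except TypeError as e:
    print(f"TypeError: {e}")
```

Because the error is raised while the remote config file is being executed, it occurs before any model weights are touched.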

Fix

Add SarvamMLAConfig to vLLM's local config registry. Once registered,
HFConfigParser.parse() matches model_type="sarvam_mla" and loads this
local class instead of the remote code — bypassing trust_remote_code and
the broken API call entirely.
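The registry-shadowing pattern can be sketched as follows. Names here (`LocalSarvamMLAConfig`, `resolve_config_class`) are stand-ins for illustration; the real lookup lives in vLLM's `HFConfigParser.parse()` and `_CONFIG_REGISTRY`:

```python
# Sketch: a local registry keyed by model_type shadows remote code.
class LocalSarvamMLAConfig:
    model_type = "sarvam_mla"

_CONFIG_REGISTRY = {LocalSarvamMLAConfig.model_type: LocalSarvamMLAConfig}

def resolve_config_class(model_type, trust_remote_code=False):
    # A registered local class wins, so trust_remote_code is never consulted
    # and the remote file's broken validate_rope() call is never executed.
    if model_type in _CONFIG_REGISTRY:
        return _CONFIG_REGISTRY[model_type]
    if not trust_remote_code:
        raise ValueError(f"unknown model_type {model_type!r}")
    raise NotImplementedError("remote loading elided in this sketch")

print(resolve_config_class("sarvam_mla").__name__)
```

The design choice is the same one every existing `_CONFIG_REGISTRY` entry makes: a local class keyed on `model_type` takes precedence over remote code.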

SarvamMLAConfig exposes all fields accessed by
vllm/model_executor/models/sarvam.py and handles the
rope_scaling → rope_parameters conversion (including the
"type" → "rope_type" key rename required by transformers v5).
The single intermediate_size field from the model's config.json is
exposed as both intermediate_size and moe_intermediate_size to satisfy
direct accesses in the model code.
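The two mappings described above can be sketched in plain Python. This is not the actual vLLM implementation; the field values below are illustrative:

```python
def to_rope_parameters(rope_scaling):
    """Convert a legacy `rope_scaling` dict to v5-style `rope_parameters`,
    renaming "type" -> "rope_type". Sketch of the conversion, not the
    actual SarvamMLAConfig code."""
    if rope_scaling is None:
        return None
    params = dict(rope_scaling)
    if "type" in params:
        params["rope_type"] = params.pop("type")
    return params

# The single intermediate_size from config.json serves both the dense and
# MoE accessors in the model code (values are illustrative):
config_json = {"intermediate_size": 18432,
               "rope_scaling": {"type": "yarn", "factor": 4.0}}
intermediate_size = moe_intermediate_size = config_json["intermediate_size"]
rope_parameters = to_rope_parameters(config_json["rope_scaling"])
print(rope_parameters, intermediate_size, moe_intermediate_size)
```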

Why this is not a duplicate

There are no open PRs addressing this issue. A user asked to be assigned
in the issue comments but has not submitted a PR.

Tests

This change is pure configuration: no runtime logic is added or removed.
The fix can be verified by loading sarvamai/sarvam-105b with
trust_remote_code=True under transformers v5: before this PR, the
TypeError is raised during get_config(); after it, the config loads
cleanly.

CI note: The model is gated and cannot be tested in CI without credentials.
The pattern follows the established approach used by NemotronConfig,
IsaacConfig, and every other entry in _CONFIG_REGISTRY.

AI assistance was used in developing this fix.

…lm-project#38734)

The remote configuration at sarvamai/sarvam-105b calls
`validate_rope(ignore_keys=...)`, which broke in transformers v5 when
upstream PR #41250 removed the `ignore_keys` parameter.

Register `SarvamMLAConfig` in vLLM's `_CONFIG_REGISTRY` and
`_CLASS_TO_MODULE` so that `HFConfigParser.parse()` selects this local
class instead of the remote one, bypassing `trust_remote_code` and the
broken API call entirely.

`SarvamMLAConfig` exposes all fields accessed by
`vllm/model_executor/models/sarvam.py`, handles the
`rope_scaling → rope_parameters` conversion (including the
`"type" → "rope_type"` key rename used in transformers v5), and maps
the single `intermediate_size` field from `config.json` to both
`intermediate_size` and `moe_intermediate_size` as required by the model.

Fixes: vllm-project#38734

Co-authored-by: Claude
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a local configuration class, SarvamMLAConfig, to support the SarvamMLA model. By registering this class in vLLM's configuration registry, the implementation bypasses the need for 'trust_remote_code' and avoids a breaking API change in transformers v5. I have no feedback to provide as there are no review comments.

Member

hmellor commented Apr 2, 2026

Closing in favour of #38804

@hmellor hmellor closed this Apr 2, 2026