
Revert "Various Transformers v5 fixes" (#38127) #38229

Closed

zhewenl wants to merge 1 commit into vllm-project:main from zhewenl:auto-revert/pr-38127

Conversation

@zhewenl
Collaborator

@zhewenl zhewenl commented Mar 26, 2026

Revert of #38127

This reverts commit 3c3c084.

Reason

PR #38127 changed deepseek_vl2.py to remove kv_lora_rank when None before passing to DeepseekV2Config. This caused KV cache initialization failures in NixlConnector PD tests using deepseek-ai/deepseek-vl2-tiny:

  • CrossLayer KV layout Distributed NixlConnector PD accuracy tests (4 GPUs): AssertionError: KV cache sizes must match between P and D when replicated
  • DP EP Distributed NixlConnector PD accuracy tests (4 GPUs): RuntimeError: All kv cache tensors must have the same number of blocks
  • Distributed NixlConnector PD accuracy (4 GPUs): AssertionError: All kv cache tensors must have the same number of blocks

All 3 failures are new in build #58181.


Auto-generated by CI failure analyzer.

@mergify mergify bot added the deepseek Related to DeepSeek models label Mar 26, 2026
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request primarily introduces support for Olmo3Config by moving its definition to vllm.transformers_utils.configs and integrating it into the vLLM configuration system. However, two high-severity issues were identified in the review. The first issue is a re-introduced bug in vllm/config/model.py where sliding_window=0 is not correctly handled as disabled, potentially leading to incorrect behavior. The second issue is a re-introduced bug in vllm/transformers_utils/configs/deepseek_vl2.py where passing kv_lora_rank: None in language_config can cause a TypeError.

Comment thread vllm/config/model.py
Comment on lines 589 to 590
self.original_max_model_len = self.max_model_len
self.max_model_len = self.get_and_verify_max_len(self.max_model_len)
Contributor

Severity: high

This revert re-introduces an issue where sliding_window=0 is not correctly handled as 'disabled'. Some models use 0 to indicate that sliding window attention is disabled, but vLLM expects None. Without this conversion, a sliding_window of 0 might be passed to attention layers, which could lead to incorrect behavior or errors.

The original change seems correct and it's likely this was reverted as part of a larger revert. It should be re-introduced to prevent issues with models that use sliding_window: 0.

        # Some checkpoints set sliding_window to 0 to indicate that sliding window is
        # disabled, but vLLM uses None for that. Convert 0 to None to avoid errors.
        # Set before get_and_verify_max_len to ensure that max_model_len does not get
        # capped to 0.
        if self.get_sliding_window() == 0:
            self.disable_sliding_window = True
            self.hf_text_config.sliding_window = None

        self.original_max_model_len = self.max_model_len
        self.max_model_len = self.get_and_verify_max_len(self.max_model_len)
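As a standalone sketch of the normalization described above (a simplified stand-in class, not vLLM's actual ModelConfig):

```python
# Minimal sketch of the 0-to-None sliding-window normalization described
# above. The class and method names are simplified stand-ins, not vLLM's
# actual ModelConfig API.
class ModelConfigSketch:
    def __init__(self, sliding_window):
        self.sliding_window = sliding_window
        self.disable_sliding_window = False

    def get_sliding_window(self):
        return self.sliding_window

    def normalize_sliding_window(self):
        # Some checkpoints use 0 to mean "sliding window disabled", but
        # downstream code expects None; convert so max_model_len is not
        # capped to 0.
        if self.get_sliding_window() == 0:
            self.disable_sliding_window = True
            self.sliding_window = None


cfg = ModelConfigSketch(sliding_window=0)
cfg.normalize_sliding_window()
print(cfg.sliding_window, cfg.disable_sliding_window)  # None True
```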

Comment thread vllm/transformers_utils/configs/deepseek_vl2.py
Comment on lines 116 to 117
        language_config = kwargs.get("language_config", {})
        # remove kv_lora_rank if not specified, passing None is prohibited
        if language_config.get("kv_lora_rank") is None:
            language_config.pop("kv_lora_rank", None)
        self.text_config = DeepseekV2Config(**language_config)
Contributor

Severity: high

This revert re-introduces a potential bug. If language_config contains kv_lora_rank: None, it will be passed to DeepseekV2Config, which can cause a TypeError later on when kv_lora_rank is used in arithmetic operations (e.g., in DeepseekV2Attention).
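A minimal illustration of the failure mode (hypothetical field values, not the actual DeepseekV2Attention code):

```python
# Hypothetical illustration: if kv_lora_rank is left as None, arithmetic
# on it (as in MLA head-dimension computations) raises TypeError.
kv_lora_rank = None
qk_rope_head_dim = 64  # assumed value for illustration only

try:
    head_dim = kv_lora_rank + qk_rope_head_dim
except TypeError as exc:
    print(f"TypeError: {exc}")
    # unsupported operand type(s) for +: 'NoneType' and 'int'
```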

The original change attempted to fix this by removing the key, but that seems to have caused other issues, likely due to in-place modification of the language_config dictionary. A safer fix that avoids this side effect is to create a copy before modification.

Consider this alternative which should be safer and prevent both the TypeError and the CI failures seen in the original PR:

Suggested change
-        language_config = kwargs.get("language_config", {})
-        # remove kv_lora_rank if not specified, passing None is prohibited
-        if language_config.get("kv_lora_rank") is None:
-            language_config.pop("kv_lora_rank", None)
-        self.text_config = DeepseekV2Config(**language_config)
+        language_config = kwargs.get("language_config", {}).copy()
+        if language_config.get("kv_lora_rank") is None:
+            language_config.pop("kv_lora_rank", None)
+        self.text_config = DeepseekV2Config(**language_config)
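The in-place vs. copy difference can be sketched as follows (illustrative dict contents, not the real model config):

```python
# Sketch of the side effect: dict.get returns a reference to the nested
# dict, so popping from it mutates the caller's kwargs; .copy() avoids
# that. Keys and values here are illustrative only.
kwargs = {"language_config": {"kv_lora_rank": None, "hidden_size": 1024}}

# In-place variant (the reverted fix): the caller's dict loses the key.
lc = kwargs.get("language_config", {})
if lc.get("kv_lora_rank") is None:
    lc.pop("kv_lora_rank", None)
print("kv_lora_rank" in kwargs["language_config"])  # False -- mutated!

# Copy variant (the suggested fix): the caller's dict is untouched.
kwargs = {"language_config": {"kv_lora_rank": None, "hidden_size": 1024}}
lc = kwargs.get("language_config", {}).copy()
if lc.get("kv_lora_rank") is None:
    lc.pop("kv_lora_rank", None)
print("kv_lora_rank" in kwargs["language_config"])  # True -- intact
```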

@hmellor
Member

hmellor commented Mar 26, 2026

Reverting this will block vLLM from upgrading to Transformers v5, because the reverted code misuses DeepseekV2Config.

@hmellor
Member

hmellor commented Mar 26, 2026

This also reverts a bunch of unrelated things. I'm going to fix this properly.

@hmellor hmellor closed this Mar 26, 2026
@hmellor
Member

hmellor commented Mar 26, 2026

The fix for the specific issue is applied in #38247
