
Remove padding_index from models that don't use it for better Transformers v5 compatibility#35189

Merged
vllm-bot merged 3 commits into vllm-project:main from hmellor:v5-intern-s1-pro
Feb 24, 2026

Conversation

@hmellor
Member

@hmellor hmellor commented Feb 24, 2026

Currently, some model implementations expect the config to have pad_token_id, which they assign to self.padding_index. In Transformers v4 this attribute was always present on PretrainedConfig, but in Transformers v5 it has been moved to the individual configs that actually use it.

Since it is not actually used in any of these models, we remove it so that they may more easily run in Transformers v5.

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request addresses a compatibility issue for the InternS1-Pro model that arises from changes in Transformers v5. The parent class Qwen3MoeLLMModel expects a pad_token_id attribute in the configuration, which is no longer universally present in PretrainedConfig and is missing in InternS1ProTextConfig. The proposed change correctly patches the configuration by adding pad_token_id = None if it's absent, ensuring the model initializes without errors. This fix is concise, correctly placed within the model's __init__, and consistent with existing patterns in the codebase for handling such configuration discrepancies. The change is approved.

@DarkLight1337
Member

DarkLight1337 commented Feb 24, 2026

Actually I think we can simply remove padding_idx from Qwen3-MoE (and a bunch of other models) since it is unused there.

@hmellor
Member Author

hmellor commented Feb 24, 2026

Ok, which models?

@DarkLight1337
Member

You can search for .padding_idx

@hmellor hmellor requested a review from sighingnow as a code owner February 24, 2026 12:30
@hmellor hmellor changed the title Fix InternS1-Pro for Transformers v5 Remove padding_index from models that don't use it for better Transformers v5 copatibility Feb 24, 2026
@mergify mergify bot added the qwen Related to Qwen models label Feb 24, 2026
@DarkLight1337 DarkLight1337 added the ready ONLY add when PR is ready to merge/full CI is needed label Feb 24, 2026
@DarkLight1337 DarkLight1337 changed the title Remove padding_index from models that don't use it for better Transformers v5 copatibility Remove padding_index from models that don't use it for better Transformers v5 compatibility Feb 24, 2026
@vllm-bot vllm-bot merged commit c38b8d5 into vllm-project:main Feb 24, 2026
59 of 64 checks passed
@hmellor hmellor deleted the v5-intern-s1-pro branch February 24, 2026 16:06
tom-zju pushed a commit to tom-zju/vllm that referenced this pull request Feb 26, 2026
…formers v5 compatibility (vllm-project#35189)

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
llsj14 pushed a commit to llsj14/vllm that referenced this pull request Mar 1, 2026
…formers v5 compatibility (vllm-project#35189)

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
tunglinwood pushed a commit to tunglinwood/vllm that referenced this pull request Mar 4, 2026
…formers v5 compatibility (vllm-project#35189)

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>