Remove padding_index from models that don't use it for better Transformers v5 compatibility#35189
Conversation
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Code Review
This pull request addresses a compatibility issue for the InternS1-Pro model that arises from changes in Transformers v5. The parent class Qwen3MoeLLMModel expects a pad_token_id attribute in the configuration, which is no longer universally present in PreTrainedConfig and is missing in InternS1ProTextConfig. The proposed change correctly patches the configuration by adding pad_token_id = None if it's absent, ensuring the model initializes without errors. This fix is concise, correctly placed within the model's __init__, and consistent with existing patterns in the codebase for handling such configuration discrepancies. The change is approved.
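The patch pattern the review describes can be sketched as follows. This is a hypothetical illustration (the class and config here are stand-ins, not the actual vLLM code): the config is given a `pad_token_id = None` attribute before the parent class reads it.

```python
from types import SimpleNamespace

# Hypothetical sketch of the patch pattern described in the review: ensure the
# pad_token_id attribute exists on the config before code that expects it runs.
class ModelWithConfigPatch:
    def __init__(self, config):
        if not hasattr(config, "pad_token_id"):
            config.pad_token_id = None  # attribute Transformers v5 no longer guarantees
        self.padding_index = config.pad_token_id  # what the parent class expects

# A config missing pad_token_id, like InternS1ProTextConfig under Transformers v5.
cfg = SimpleNamespace(hidden_size=64)
model = ModelWithConfigPatch(cfg)
```

With the patch in place, initialization succeeds and `padding_index` defaults to `None` instead of raising an `AttributeError`.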
Actually I think we can simply remove

Ok, which models?

You can search for
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
padding_index from models that don't use it for better Transformers v5 compatibility
…formers v5 compatibility (vllm-project#35189) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Currently, some model implementations expect the config to have pad_token_id, which is assigned to self.padding_index. In Transformers v4 this attribute was always present in PreTrainedConfig, but in Transformers v5 it has been moved to the individual configs that use it. Since padding_index is not actually used in any of these models, we remove it so that they can run under Transformers v5 without changes to their configs.
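The fix can be sketched as a before/after, assuming a model whose only use of pad_token_id was the unused padding_index assignment (illustrative names, not the actual diff):

```python
from types import SimpleNamespace

# Hypothetical before/after sketch of the fix: since padding_index is never
# read anywhere, the assignment is simply removed.
class ModelBefore:
    def __init__(self, config):
        # Raises AttributeError under Transformers v5 when the config class
        # no longer defines pad_token_id.
        self.padding_index = config.pad_token_id

class ModelAfter:
    def __init__(self, config):
        # padding_index removed; nothing else in the model depended on it.
        pass

cfg = SimpleNamespace(hidden_size=64)  # a v5-style config without pad_token_id
ModelAfter(cfg)  # initializes fine
```

The removal is preferable to patching the config because it deletes dead state rather than papering over the missing attribute.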