
Can't load models with a gamma or beta parameter #29554

@malik-ali

Description

It seems that you cannot create parameters whose names contain the string gamma or beta in any module you write, if you intend to save/load it with the transformers library. There is a small function called _fix_keys implemented in the model loading code (link). It renames every occurrence of beta or gamma in the state_dict keys to bias and weight, respectively. This means that if your modules actually have a parameter with one of these substrings in its name, it won't be loaded when using a pretrained model.

As far as I can tell, it is completely undocumented that parameter names must not contain the strings gamma or beta.
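Based on the description above, the renaming behaves roughly like the sketch below. fix_key here is a stand-in written from the observed behavior, not the actual transformers helper, whose exact matching rules live in the library's modeling utilities:

```python
def fix_key(key: str) -> str:
    """Sketch of the loader's renaming: every occurrence of the
    substring "beta" becomes "bias", and "gamma" becomes "weight"."""
    key = key.replace("beta", "bias")
    key = key.replace("gamma", "weight")
    return key

# Any user parameter whose name merely contains these substrings is hit:
print(fix_key("encoder.layer.0.gamma"))   # → encoder.layer.0.weight
print(fix_key("my_beta_scale"))           # → my_bias_scale
```

Because the match is on substrings, decorated names like my_beta_scale are rewritten too, not just bare gamma/beta parameters.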

Here is a minimal reproducible example:

import torch
import torch.nn as nn
from transformers import PreTrainedModel, PretrainedConfig

class Model(PreTrainedModel):
    def __init__(self, config):
        super().__init__(config)
        self.gamma = nn.Parameter(torch.zeros(4))

    def forward(self):
        return self.gamma.sum()


if __name__ == '__main__':
    config = PretrainedConfig()

    # 1) First run this
    #model = Model(config)
    #print(model())

    #model.save_pretrained('test_out')

    # 2) Then try this
    model = Model.from_pretrained('test_out', config=config)
    print(model())

When you run step 2, from_pretrained emits the following warning, and gamma is randomly re-initialized instead of being loaded from the checkpoint:

Some weights of Model were not initialized from the model checkpoint at test_out and are newly initialized: ['gamma']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
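Until this restriction is documented, one defensive option is to scan your own state_dict keys for the affected substrings before saving. keys_at_risk below is a hypothetical helper based on the behavior described above, not part of transformers:

```python
def keys_at_risk(state_dict_keys):
    # Hypothetical check: flag keys the loader would silently rename
    # because they contain the substring "gamma" or "beta".
    return [k for k in state_dict_keys if "gamma" in k or "beta" in k]

print(keys_at_risk(["layer.gamma", "layer.weight", "beta_param"]))
# → ['layer.gamma', 'beta_param']
```

Running this on model.state_dict().keys() before save_pretrained would surface any parameter names that will not survive a save/load round trip.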
