Labels: Feature request, Good Difficult Issue, Should Fix
Description
It seems that you cannot create parameters containing the string gamma or beta in any modules you write if you intend to save/load them with the transformers library. There is a small function called _fix_key in the model-loading code (link). It renames every occurrence of beta or gamma in the state_dict keys to bias and weight, respectively. This means that if your modules actually have a parameter with one of these names, it won't be loaded when using a pretrained model.
As far as I can tell, it is completely undocumented that parameter names must not contain the strings gamma or beta.
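For reference, the renaming behavior described above looks roughly like this (a sketch based on the behavior I observed; the exact implementation in modeling_utils.py may differ):

```python
# Sketch of the key-renaming behavior (not the exact transformers source):
# every occurrence of "gamma" or "beta" in a state_dict key is rewritten
# before checkpoint keys are matched against the model's parameters.
def _fix_key(key: str) -> str:
    if "gamma" in key:
        return key.replace("gamma", "weight")
    if "beta" in key:
        return key.replace("beta", "bias")
    return key

# A checkpoint key "gamma" is therefore looked up as "weight", which no
# longer matches the module's actual parameter name.
print(_fix_key("gamma"))    # -> "weight"
print(_fix_key("ln.beta"))  # -> "ln.bias"
```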
Here is a minimal reproducible example:
```python
import torch
import torch.nn as nn

from transformers import PreTrainedModel, PretrainedConfig


class Model(PreTrainedModel):
    def __init__(self, config):
        super().__init__(config)
        self.gamma = nn.Parameter(torch.zeros(4))

    def forward(self):
        return self.gamma.sum()


if __name__ == '__main__':
    config = PretrainedConfig()

    # 1) First run this
    # model = Model(config)
    # print(model())
    # model.save_pretrained('test_out')

    # 2) Then try this
    model = Model.from_pretrained('test_out', config=config)
    print(model())
```
When you run step 2, you get the following warning, and the saved value of gamma is silently discarded:
Some weights of Model were not initialized from the model checkpoint at test_out and are newly initialized: ['gamma']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
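Until this is documented or fixed, a possible workaround (my suggestion, not an official API) is to avoid the substrings gamma and beta in parameter names entirely, for example:

```python
import torch
import torch.nn as nn

from transformers import PreTrainedModel, PretrainedConfig


class Model(PreTrainedModel):
    def __init__(self, config):
        super().__init__(config)
        # "scale" avoids the gamma/beta substrings, so the state_dict key
        # survives the save/load round trip unchanged.
        self.scale = nn.Parameter(torch.zeros(4))

    def forward(self):
        return self.scale.sum()
```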