I used pytorch_transformers to convert bert-large-uncased from the TensorFlow version to the PyTorch version, and then ran the first script. However, the model can't load the weights from the pre-trained model. Here is the log:
07/31/2019 10:14:14 - INFO - main - output_dir: out/extract/01
07/31/2019 10:14:14 - INFO - main - ***** Preparing model *****
07/31/2019 10:14:18 - INFO - absa.run_base - Weights of BertForSpanAspectExtraction not initialized from pretrained model: ['bert.embeddings.LayerNorm.gamma', 'bert.embeddings.LayerNorm.beta', 'bert.encoder.layer.0.attention.output.LayerNorm.gamma', 'bert.encoder.layer.0.attention.output.LayerNorm.beta', 'bert.encoder.layer.0.output.LayerNorm.gamma', 'bert.encoder.layer.0.output.LayerNorm.beta', 'bert.encoder.layer.1.attention.output.LayerNorm.gamma', 'bert.encoder.layer.1.attention.output.LayerNorm.beta', 'bert.encoder.layer.1.output.LayerNorm.gamma', 'bert.encoder.layer.1.output.LayerNorm.beta', 'bert.encoder.layer.2.attention.output.LayerNorm.gamma', 'bert.encoder.layer.2.attention.output.LayerNorm.beta', 'bert.encoder.layer.2.output.LayerNorm.gamma', 'bert.encoder.layer.2.output.LayerNorm.beta', 'bert.encoder.layer.3.attention.output.LayerNorm.gamma', 'bert.encoder.layer.3.attention.output.LayerNorm.beta', 'bert.encoder.layer.3.output.LayerNorm.gamma', 'bert.encoder.layer.3.output.LayerNorm.beta', 'bert.encoder.layer.4.attention.output.LayerNorm.gamma', 'bert.encoder.layer.4.attention.output.LayerNorm.beta', 'bert.encoder.layer.4.output.LayerNorm.gamma', 'bert.encoder.layer.4.output.LayerNorm.beta', 'bert.encoder.layer.5.attention.output.LayerNorm.gamma', 'bert.encoder.layer.5.attention.output.LayerNorm.beta', 'bert.encoder.layer.5.output.LayerNorm.gamma', 'bert.encoder.layer.5.output.LayerNorm.beta', 'bert.encoder.layer.6.attention.output.LayerNorm.gamma', 'bert.encoder.layer.6.attention.output.LayerNorm.beta', 'bert.encoder.layer.6.output.LayerNorm.gamma', 'bert.encoder.layer.6.output.LayerNorm.beta', 'bert.encoder.layer.7.attention.output.LayerNorm.gamma', 'bert.encoder.layer.7.attention.output.LayerNorm.beta', 'bert.encoder.layer.7.output.LayerNorm.gamma', 'bert.encoder.layer.7.output.LayerNorm.beta', 'bert.encoder.layer.8.attention.output.LayerNorm.gamma', 'bert.encoder.layer.8.attention.output.LayerNorm.beta', 'bert.encoder.layer.8.output.LayerNorm.gamma', 'bert.encoder.layer.8.output.LayerNorm.beta', 'bert.encoder.layer.9.attention.output.LayerNorm.gamma', 'bert.encoder.layer.9.attention.output.LayerNorm.beta', 'bert.encoder.layer.9.output.LayerNorm.gamma', 'bert.encoder.layer.9.output.LayerNorm.beta', 'bert.encoder.layer.10.attention.output.LayerNorm.gamma', 'bert.encoder.layer.10.attention.output.LayerNorm.beta', 'bert.encoder.layer.10.output.LayerNorm.gamma', 'bert.encoder.layer.10.output.LayerNorm.beta', 'bert.encoder.layer.11.attention.output.LayerNorm.gamma', 'bert.encoder.layer.11.attention.output.LayerNorm.beta', 'bert.encoder.layer.11.output.LayerNorm.gamma', 'bert.encoder.layer.11.output.LayerNorm.beta', 'bert.encoder.layer.12.attention.output.LayerNorm.gamma', 'bert.encoder.layer.12.attention.output.LayerNorm.beta', 'bert.encoder.layer.12.output.LayerNorm.gamma', 'bert.encoder.layer.12.output.LayerNorm.beta', 'bert.encoder.layer.13.attention.output.LayerNorm.gamma', 'bert.encoder.layer.13.attention.output.LayerNorm.beta', 'bert.encoder.layer.13.output.LayerNorm.gamma', 'bert.encoder.layer.13.output.LayerNorm.beta', 'bert.encoder.layer.14.attention.output.LayerNorm.gamma', 'bert.encoder.layer.14.attention.output.LayerNorm.beta', 'bert.encoder.layer.14.output.LayerNorm.gamma', 'bert.encoder.layer.14.output.LayerNorm.beta', 'bert.encoder.layer.15.attention.output.LayerNorm.gamma', 'bert.encoder.layer.15.attention.output.LayerNorm.beta', 'bert.encoder.layer.15.output.LayerNorm.gamma', 'bert.encoder.layer.15.output.LayerNorm.beta', 
'bert.encoder.layer.16.attention.output.LayerNorm.gamma', 'bert.encoder.layer.16.attention.output.LayerNorm.beta', 'bert.encoder.layer.16.output.LayerNorm.gamma', 'bert.encoder.layer.16.output.LayerNorm.beta', 'bert.encoder.layer.17.attention.output.LayerNorm.gamma', 'bert.encoder.layer.17.attention.output.LayerNorm.beta', 'bert.encoder.layer.17.output.LayerNorm.gamma', 'bert.encoder.layer.17.output.LayerNorm.beta', 'bert.encoder.layer.18.attention.output.LayerNorm.gamma', 'bert.encoder.layer.18.attention.output.LayerNorm.beta', 'bert.encoder.layer.18.output.LayerNorm.gamma', 'bert.encoder.layer.18.output.LayerNorm.beta', 'bert.encoder.layer.19.attention.output.LayerNorm.gamma', 'bert.encoder.layer.19.attention.output.LayerNorm.beta', 'bert.encoder.layer.19.output.LayerNorm.gamma', 'bert.encoder.layer.19.output.LayerNorm.beta', 'bert.encoder.layer.20.attention.output.LayerNorm.gamma', 'bert.encoder.layer.20.attention.output.LayerNorm.beta', 'bert.encoder.layer.20.output.LayerNorm.gamma', 'bert.encoder.layer.20.output.LayerNorm.beta', 'bert.encoder.layer.21.attention.output.LayerNorm.gamma', 'bert.encoder.layer.21.attention.output.LayerNorm.beta', 'bert.encoder.layer.21.output.LayerNorm.gamma', 'bert.encoder.layer.21.output.LayerNorm.beta', 'bert.encoder.layer.22.attention.output.LayerNorm.gamma', 'bert.encoder.layer.22.attention.output.LayerNorm.beta', 'bert.encoder.layer.22.output.LayerNorm.gamma', 'bert.encoder.layer.22.output.LayerNorm.beta', 'bert.encoder.layer.23.attention.output.LayerNorm.gamma', 'bert.encoder.layer.23.attention.output.LayerNorm.beta', 'bert.encoder.layer.23.output.LayerNorm.gamma', 'bert.encoder.layer.23.output.LayerNorm.beta', 'qa_outputs.weight', 'qa_outputs.bias']
07/31/2019 10:14:18 - INFO - absa.run_base - Weights from pretrained model not used in BertForSpanAspectExtraction: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'bert.embeddings.LayerNorm.weight', 'bert.embeddings.LayerNorm.bias', 'bert.encoder.layer.0.attention.output.LayerNorm.weight', 'bert.encoder.layer.0.attention.output.LayerNorm.bias', 'bert.encoder.layer.0.output.LayerNorm.weight', 'bert.encoder.layer.0.output.LayerNorm.bias', 'bert.encoder.layer.1.attention.output.LayerNorm.weight', 'bert.encoder.layer.1.attention.output.LayerNorm.bias', 'bert.encoder.layer.1.output.LayerNorm.weight', 'bert.encoder.layer.1.output.LayerNorm.bias', 'bert.encoder.layer.2.attention.output.LayerNorm.weight', 'bert.encoder.layer.2.attention.output.LayerNorm.bias', 'bert.encoder.layer.2.output.LayerNorm.weight', 'bert.encoder.layer.2.output.LayerNorm.bias', 'bert.encoder.layer.3.attention.output.LayerNorm.weight', 'bert.encoder.layer.3.attention.output.LayerNorm.bias', 'bert.encoder.layer.3.output.LayerNorm.weight', 'bert.encoder.layer.3.output.LayerNorm.bias', 'bert.encoder.layer.4.attention.output.LayerNorm.weight', 'bert.encoder.layer.4.attention.output.LayerNorm.bias', 'bert.encoder.layer.4.output.LayerNorm.weight', 'bert.encoder.layer.4.output.LayerNorm.bias', 'bert.encoder.layer.5.attention.output.LayerNorm.weight', 'bert.encoder.layer.5.attention.output.LayerNorm.bias', 'bert.encoder.layer.5.output.LayerNorm.weight', 'bert.encoder.layer.5.output.LayerNorm.bias', 'bert.encoder.layer.6.attention.output.LayerNorm.weight', 'bert.encoder.layer.6.attention.output.LayerNorm.bias', 'bert.encoder.layer.6.output.LayerNorm.weight', 'bert.encoder.layer.6.output.LayerNorm.bias', 'bert.encoder.layer.7.attention.output.LayerNorm.weight', 'bert.encoder.layer.7.attention.output.LayerNorm.bias', 'bert.encoder.layer.7.output.LayerNorm.weight', 'bert.encoder.layer.7.output.LayerNorm.bias', 'bert.encoder.layer.8.attention.output.LayerNorm.weight', 'bert.encoder.layer.8.attention.output.LayerNorm.bias', 'bert.encoder.layer.8.output.LayerNorm.weight', 'bert.encoder.layer.8.output.LayerNorm.bias', 'bert.encoder.layer.9.attention.output.LayerNorm.weight', 'bert.encoder.layer.9.attention.output.LayerNorm.bias', 'bert.encoder.layer.9.output.LayerNorm.weight', 'bert.encoder.layer.9.output.LayerNorm.bias', 'bert.encoder.layer.10.attention.output.LayerNorm.weight', 'bert.encoder.layer.10.attention.output.LayerNorm.bias', 'bert.encoder.layer.10.output.LayerNorm.weight', 'bert.encoder.layer.10.output.LayerNorm.bias', 'bert.encoder.layer.11.attention.output.LayerNorm.weight', 'bert.encoder.layer.11.attention.output.LayerNorm.bias', 'bert.encoder.layer.11.output.LayerNorm.weight', 'bert.encoder.layer.11.output.LayerNorm.bias', 'bert.encoder.layer.12.attention.output.LayerNorm.weight', 'bert.encoder.layer.12.attention.output.LayerNorm.bias', 'bert.encoder.layer.12.output.LayerNorm.weight', 'bert.encoder.layer.12.output.LayerNorm.bias', 'bert.encoder.layer.13.attention.output.LayerNorm.weight', 'bert.encoder.layer.13.attention.output.LayerNorm.bias', 'bert.encoder.layer.13.output.LayerNorm.weight', 'bert.encoder.layer.13.output.LayerNorm.bias', 'bert.encoder.layer.14.attention.output.LayerNorm.weight', 'bert.encoder.layer.14.attention.output.LayerNorm.bias', 
'bert.encoder.layer.14.output.LayerNorm.weight', 'bert.encoder.layer.14.output.LayerNorm.bias', 'bert.encoder.layer.15.attention.output.LayerNorm.weight', 'bert.encoder.layer.15.attention.output.LayerNorm.bias', 'bert.encoder.layer.15.output.LayerNorm.weight', 'bert.encoder.layer.15.output.LayerNorm.bias', 'bert.encoder.layer.16.attention.output.LayerNorm.weight', 'bert.encoder.layer.16.attention.output.LayerNorm.bias', 'bert.encoder.layer.16.output.LayerNorm.weight', 'bert.encoder.layer.16.output.LayerNorm.bias', 'bert.encoder.layer.17.attention.output.LayerNorm.weight', 'bert.encoder.layer.17.attention.output.LayerNorm.bias', 'bert.encoder.layer.17.output.LayerNorm.weight', 'bert.encoder.layer.17.output.LayerNorm.bias', 'bert.encoder.layer.18.attention.output.LayerNorm.weight', 'bert.encoder.layer.18.attention.output.LayerNorm.bias', 'bert.encoder.layer.18.output.LayerNorm.weight', 'bert.encoder.layer.18.output.LayerNorm.bias', 'bert.encoder.layer.19.attention.output.LayerNorm.weight', 'bert.encoder.layer.19.attention.output.LayerNorm.bias', 'bert.encoder.layer.19.output.LayerNorm.weight', 'bert.encoder.layer.19.output.LayerNorm.bias', 'bert.encoder.layer.20.attention.output.LayerNorm.weight', 'bert.encoder.layer.20.attention.output.LayerNorm.bias', 'bert.encoder.layer.20.output.LayerNorm.weight', 'bert.encoder.layer.20.output.LayerNorm.bias', 'bert.encoder.layer.21.attention.output.LayerNorm.weight', 'bert.encoder.layer.21.attention.output.LayerNorm.bias', 'bert.encoder.layer.21.output.LayerNorm.weight', 'bert.encoder.layer.21.output.LayerNorm.bias', 'bert.encoder.layer.22.attention.output.LayerNorm.weight', 'bert.encoder.layer.22.attention.output.LayerNorm.bias', 'bert.encoder.layer.22.output.LayerNorm.weight', 'bert.encoder.layer.22.output.LayerNorm.bias', 'bert.encoder.layer.23.attention.output.LayerNorm.weight', 'bert.encoder.layer.23.attention.output.LayerNorm.bias', 'bert.encoder.layer.23.output.LayerNorm.weight', 'bert.encoder.layer.23.output.LayerNorm.bias']
I guess the problem may be caused by the versions of pytorch-transformers, tensorflow, and pytorch. My environment is: pytorch-transformers==1.1.0, tensorflow-gpu==1.10.0, pytorch==1.1.0.
What is your environment? Or could you release the PyTorch version of bert-large-uncased?
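(For anyone debugging the same thing: the log above suggests a key-naming mismatch, and you can confirm it by inspecting the converted checkpoint directly. A minimal sketch; the path is hypothetical and should point at your own converted pytorch_model.bin:)

```python
import torch

# Hypothetical path: point this at your converted checkpoint.
state_dict = torch.load("bert-large-uncased/pytorch_model.bin", map_location="cpu")

# List a few LayerNorm parameter names to see which naming convention was used.
layernorm_keys = sorted(k for k in state_dict if "LayerNorm" in k)
print(layernorm_keys[:4])
# Keys ending in .weight/.bias -> new naming (pytorch-transformers converter)
# Keys ending in .gamma/.beta  -> old naming (pytorch-pretrained-bert <= 0.3.0),
#                                 which is what BertForSpanAspectExtraction expects here.
```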
I have found the solution to this problem:
pytorch-pretrained-bert was later renamed to pytorch-transformers, and the newer conversion code saves the LayerNorm parameters as weight/bias instead of the old gamma/beta names that this repo's model code expects.
So you should install pytorch-pretrained-bert<=0.3.0 and use it to convert the TensorFlow checkpoint. (The missing qa_outputs.weight/qa_outputs.bias entries are expected: that is the task-specific head, which is initialized from scratch.)
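Alternatively, if you'd rather keep your current environment, you can rename the keys in the checkpoint you already converted so the old-style model code can load it. A minimal sketch, assuming the hypothetical paths below; keep a backup of the original file:

```python
import torch

# Hypothetical paths: adjust to where your converted checkpoint lives.
SRC = "bert-large-uncased/pytorch_model.bin"
DST = "bert-large-uncased/pytorch_model.gamma_beta.bin"

state_dict = torch.load(SRC, map_location="cpu")

# pytorch-transformers saves LayerNorm parameters as .weight/.bias;
# the model code here expects the old .gamma/.beta names, so rename them.
renamed = {}
for key, value in state_dict.items():
    if "LayerNorm" in key:
        key = key.replace("LayerNorm.weight", "LayerNorm.gamma")
        key = key.replace("LayerNorm.bias", "LayerNorm.beta")
    renamed[key] = value

torch.save(renamed, DST)
```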