[megatron-bert-uncased-345m] fix conversion #16639

stas00 · 2022-04-06T23:05:08Z

The original conversion script assumed that all released megatron-bert-*-345m checkpoints share the same vocab, but https://huggingface.co/nvidia/megatron-bert-cased-345m/blob/main/vocab.txt and https://huggingface.co/nvidia/megatron-bert-uncased-345m/blob/main/vocab.txt are quite different.

This PR sets config.vocab_size from the actual size of the vocab dimension of one of the checkpoint's params, instead of assuming the same value for every checkpoint.
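
To illustrate the idea (this is a minimal sketch, not the code in the conversion script): the vocab size can be read off the first dimension of any parameter that has a vocab dimension. The helper name `infer_vocab_size` and the `FakeTensor` stand-in are hypothetical and exist only for this example; the parameter name `cls.predictions.bias` and the two sizes are taken from the error message quoted in this PR.

```python
from collections import namedtuple

# Stand-in for a tensor: only the .shape attribute matters for this sketch.
FakeTensor = namedtuple("FakeTensor", ["shape"])


def infer_vocab_size(state_dict, key="cls.predictions.bias"):
    """Return the size of dim 0 of `key`, which is the vocab dimension."""
    return state_dict[key].shape[0]


# Hypothetical converted state dicts; the shapes come from the error in this PR.
cased = {"cls.predictions.bias": FakeTensor(shape=(29056,))}
uncased = {"cls.predictions.bias": FakeTensor(shape=(30592,))}

# The value that was previously assumed for every checkpoint.
config = {"vocab_size": 29056}

# Derive the value from the checkpoint being converted instead.
config["vocab_size"] = infer_vocab_size(uncased)
print(config["vocab_size"])  # 30592
```

With the size derived per checkpoint, the config written for the uncased model matches the shapes stored in its weights, so loading no longer hits a size mismatch.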

I tested that both checkpoints mentioned above convert and load correctly:

python src/transformers/models/megatron_bert/convert_megatron_bert_checkpoint.py megatron-bert-cased-345m/checkpoint.zip
python -c 'from transformers import MegatronBertForMaskedLM; MegatronBertForMaskedLM.from_pretrained("megatron-bert-cased-345m")'

python src/transformers/models/megatron_bert/convert_megatron_bert_checkpoint.py megatron-bert-uncased-345m/checkpoint.zip
python -c 'from transformers import MegatronBertForMaskedLM; MegatronBertForMaskedLM.from_pretrained("megatron-bert-uncased-345m")'

Both succeed.

Before this PR only the former worked; the second failed with:

RuntimeError: Error(s) in loading state_dict for MegatronBertForMaskedLM:
	size mismatch for cls.predictions.bias: copying a param with shape torch.Size([30592]) from checkpoint, the shape in current model is torch.Size([29056])

29056 is the vocab size of megatron-bert-cased-345m, while the uncased checkpoint's params have a vocab dimension of 30592.

@LysandreJik, @sgugger

HuggingFaceDocBuilderDev · 2022-04-06T23:20:35Z

The documentation is not available anymore as the PR was closed or merged.

sgugger

Thanks for fixing!

[megatron-bert-uncased-345m] fix conversion

85b79f8

stas00 mentioned this pull request Apr 6, 2022

MegatronBertForMaskedLM #16638

Closed

sgugger approved these changes Apr 7, 2022

stas00 merged commit 080e42d into main Apr 7, 2022

stas00 deleted the meg-bert-conversion-uncased branch April 7, 2022 14:56
