@stas00 commented Sep 20, 2021

This PR extends @tjruwase's work in deepspeedai/Megatron-DeepSpeed#14 to support direct conversion from a Meg-DS checkpoint to an HF Transformers checkpoint, bypassing the intermediate Meg-DS to Meg conversion stage.

I'm cheating by combining the 2 stages into one script that imports code from both stages, so the script needs the latest transformers master to work.
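For readers unfamiliar with the approach, here is a rough sketch of what "combining the 2 stages into one script" could look like. It is not the code from this PR: `compose_stages`, `meg_ds_to_meg`, `meg_to_hf`, and the `module`/`model`/`transformer.` key names are all hypothetical stand-ins chosen for illustration, and the real checkpoint layouts differ.

```python
from typing import Any, Callable, Dict

Checkpoint = Dict[str, Any]
Stage = Callable[[Checkpoint], Checkpoint]


def compose_stages(first: Stage, second: Stage) -> Stage:
    """Chain two conversion stages so no intermediate checkpoint is written to disk."""
    def direct(ckpt: Checkpoint) -> Checkpoint:
        return second(first(ckpt))
    return direct


# Hypothetical stage 1: Meg-DS -> Meg (toy logic, not the real layout).
def meg_ds_to_meg(ckpt: Checkpoint) -> Checkpoint:
    return {"model": ckpt["module"]}


# Hypothetical stage 2: Meg -> HF (toy logic, not the real key mapping).
def meg_to_hf(ckpt: Checkpoint) -> Checkpoint:
    return {f"transformer.{name}": value for name, value in ckpt["model"].items()}


# The "direct" converter simply imports both stages and runs them back to back.
meg_ds_to_hf = compose_stages(meg_ds_to_meg, meg_to_hf)
```

In a real script, the two stage functions would be imported from their respective modules rather than defined inline, which is why both code bases have to be installed at compatible versions.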

@stas00 merged commit bea5ded into main Sep 20, 2021
@stas00 deleted the convert-meg-ds-to-hf branch September 20, 2021 19:03
ofirpress pushed a commit to ofirpress/Megatron-DeepSpeed that referenced this pull request Sep 23, 2021
SaulLu added a commit to SaulLu/Megatron-DeepSpeed that referenced this pull request Sep 24, 2021
adammoody pushed a commit to adammoody/Megatron-DeepSpeed that referenced this pull request Dec 20, 2021
* add direct meg-ds to hf format script (bigscience-workshop#110)

* add direct meg-ds to hf format script (part2) (bigscience-workshop#111)

* add direct meg-ds to hf format script

* split into 2 function

* update the usage doc

* make scripts executable

* add shebang

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>