add direct meg-ds to hf format script #110
Merged
This PR extends @tjruwase's work here: deepspeedai/Megatron-DeepSpeed#14 to support direct conversion from a Meg-DS checkpoint to an HF Transformers checkpoint, bypassing the Meg-DS to Meg conversion stage.
I'm cheating a bit by combining the 2 stages into one script that imports code from both stages. As a result, the script needs the latest transformers master to work.
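As a rough illustration of the idea only (this is not the script in this PR), chaining two conversion stages in memory might look like the toy sketch below. Every function name and key layout in it is hypothetical and stands in for the real Meg-DS, Megatron, and transformers code.

```python
# Toy sketch only: every name and key layout below is hypothetical and is
# not taken from the real Meg-DS, Megatron, or transformers code.

def meg_ds_to_meg(ds_shards):
    """Stage 1 stand-in: merge per-layer shard dicts into one flat dict."""
    merged = {}
    for shard in ds_shards:
        merged.update(shard)
    return merged


def meg_to_hf(meg_state):
    """Stage 2 stand-in: rename keys to a made-up target naming scheme."""
    return {"transformer." + name: value for name, value in meg_state.items()}


def meg_ds_to_hf(ds_shards):
    """Direct path: run both stages in memory, with no intermediate file."""
    return meg_to_hf(meg_ds_to_meg(ds_shards))


# Two fake "shards" with plain lists standing in for tensors.
shards = [{"layer0.weight": [1.0, 2.0]}, {"layer1.weight": [3.0]}]
hf_state = meg_ds_to_hf(shards)
print(sorted(hf_state))
```

The point of the sketch is that the intermediate Meg checkpoint never has to be written to disk; the second stage consumes the first stage's output directly.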