@jlamypoirier (Collaborator) commented Feb 7, 2023

Creates a new model that is identical to GPT2. The actual model changes will be added in later PRs so that they are easier to track against this baseline. Based on huggingface/transformers#21253, without the doc and model changes. This PR also uses the name GPTBigCode instead, because the model does a lot more than MQA (multi-query attention); the name can be changed later.

Tested with bigcode-project/bigcode-inference-benchmark#14.
Many things probably won't work until they are updated (tests, ONNX, pretrained models, etc.).

@jlamypoirier jlamypoirier marked this pull request as ready for review February 7, 2023 16:16
@jlamypoirier jlamypoirier merged commit 3ba6b3d into main Feb 8, 2023
@jlamypoirier jlamypoirier deleted the gpt2_bigcode branch February 8, 2023 19:14
@jlamypoirier (Collaborator, Author)

Merged to speed things up; it doesn't affect the rest of transformers anyway.
