Generate README.md with GPT-3 few-shot learning
alreadyme-ai-research is the core project for generating a README.md from the source code of any repository. The AI model reads parts of the source code and writes a corresponding README.md document. The ALREADYME.md team currently provides this feature as a service, and you can find our results on this page.
This repository contains several subprojects. Detailed descriptions are available in each directory.
- data-preparation: Source code for preparing the training dataset.
- model-finetuning: How to fine-tune large-scale language models efficiently.
- sentence-generation: An efficient and scalable way to generate sentences for model serving.
As large-scale models like GPT-3 have shown, few-shot learning is a key ingredient for building generalized language models. Such models can infer what they should write from the preceding prompt and a few examples. Using this capability, they can handle many tasks without fine-tuning: they can summarize news, answer questions, and even hold a conversation!
OpenAI Codex introduced a new large-scale language model for programming languages by fine-tuning GPT-3. We can now expect the same generalized (few-shot) performance on programming languages: for instance, creating a docstring from source code, writing new code from a description (this is how Copilot works), and translating from Python to Java.
We use BLOOM, an open-science, open-access large-scale language model. BLOOM is multilingual, covering not only natural languages but programming languages as well. We designed several prompt templates and selected the best one:
&&&&&&
$ head -n 30 model-finetuning/src/data.py
from __future__ import annotations
from dataclasses import dataclass
import torch
[...]
&&&&&&
$ head -n 37 model-finetuning/src/train.py
from __future__ import annotations
import argparse
import os
[...]
&&&&&&
$ git config --get remote.origin.url
https://github.com/readme-generator/alreadyme-ai-research.git
&&&&&&
$ cat README.md
[...]
All examples are separated by &&&&&&. We designed the prompt so that BLOOM performs (or simulates) Linux bash commands. BLOOM reads parts of the source code from the given prompt and generates a proper README.md file.
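As a rough illustration of the template above, the following minimal sketch assembles a prompt in the same shape and trims a generated continuation. The function names (`build_prompt`, `extract_readme`), the single `max_lines` value shared by all files, and the idea of cutting the output at the next separator are assumptions made for illustration; they are not taken from the project's actual code.

```python
from __future__ import annotations

# Separator used between examples in the prompt template.
SEPARATOR = "&&&&&&"


def head(text: str, n: int) -> str:
    """Return the first n lines of text, mimicking `head -n`."""
    return "\n".join(text.splitlines()[:n])


def build_prompt(files: dict[str, str], remote_url: str, max_lines: int = 30) -> str:
    """Assemble a bash-style prompt that ends with `$ cat README.md`.

    `files` maps a repository-relative path to its file content.
    Hypothetical helper: the real template uses per-file line counts.
    """
    blocks = [
        f"$ head -n {max_lines} {path}\n{head(content, max_lines)}"
        for path, content in files.items()
    ]
    blocks.append(f"$ git config --get remote.origin.url\n{remote_url}")
    # The model is expected to continue after this simulated command.
    blocks.append("$ cat README.md\n")
    return "\n".join(f"{SEPARATOR}\n{block}" for block in blocks)


def extract_readme(generated: str) -> str:
    """Keep only the text before the next separator (assumed stop condition)."""
    return generated.split(SEPARATOR, 1)[0].strip()


if __name__ == "__main__":
    prompt = build_prompt(
        {"src/train.py": "import argparse\nimport os\n\nprint('training')"},
        "https://github.com/readme-generator/alreadyme-ai-research.git",
        max_lines=2,
    )
    print(prompt)
```

Ending the prompt with the simulated `cat README.md` command leaves the model to continue with the file's content, and truncating at the next separator is one simple way to discard anything generated after the README.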
For more details, check out our model-finetuning subproject.
alreadyme-ai-research is released under the Apache License 2.0. The license can be found here.
@misc{https://doi.org/10.48550/arxiv.2005.14165,
title = {Language Models are Few-Shot Learners},
author = {Brown, Tom B. and Mann, Benjamin and Ryder, Nick and Subbiah, Melanie and Kaplan, Jared and Dhariwal, Prafulla and Neelakantan, Arvind and Shyam, Pranav and Sastry, Girish and Askell, Amanda and Agarwal, Sandhini and Herbert-Voss, Ariel and Krueger, Gretchen and Henighan, Tom and Child, Rewon and Ramesh, Aditya and Ziegler, Daniel M. and Wu, Jeffrey and Winter, Clemens and Hesse, Christopher and Chen, Mark and Sigler, Eric and Litwin, Mateusz and Gray, Scott and Chess, Benjamin and Clark, Jack and Berner, Christopher and McCandlish, Sam and Radford, Alec and Sutskever, Ilya and Amodei, Dario},
year = 2020,
publisher = {arXiv},
doi = {10.48550/ARXIV.2005.14165},
url = {https://arxiv.org/abs/2005.14165},
copyright = {arXiv.org perpetual, non-exclusive license},
keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences}
}
@misc{https://doi.org/10.48550/arxiv.2107.03374,
title = {Evaluating Large Language Models Trained on Code},
author = {Chen, Mark and Tworek, Jerry and Jun, Heewoo and Yuan, Qiming and Pinto, Henrique Ponde de Oliveira and Kaplan, Jared and Edwards, Harri and Burda, Yuri and Joseph, Nicholas and Brockman, Greg and Ray, Alex and Puri, Raul and Krueger, Gretchen and Petrov, Michael and Khlaaf, Heidy and Sastry, Girish and Mishkin, Pamela and Chan, Brooke and Gray, Scott and Ryder, Nick and Pavlov, Mikhail and Power, Alethea and Kaiser, Lukasz and Bavarian, Mohammad and Winter, Clemens and Tillet, Philippe and Such, Felipe Petroski and Cummings, Dave and Plappert, Matthias and Chantzis, Fotios and Barnes, Elizabeth and Herbert-Voss, Ariel and Guss, William Hebgen and Nichol, Alex and Paino, Alex and Tezak, Nikolas and Tang, Jie and Babuschkin, Igor and Balaji, Suchir and Jain, Shantanu and Saunders, William and Hesse, Christopher and Carr, Andrew N. and Leike, Jan and Achiam, Josh and Misra, Vedant and Morikawa, Evan and Radford, Alec and Knight, Matthew and Brundage, Miles and Murati, Mira and Mayer, Katie and Welinder, Peter and McGrew, Bob and Amodei, Dario and McCandlish, Sam and Sutskever, Ilya and Zaremba, Wojciech},
year = 2021,
publisher = {arXiv},
doi = {10.48550/ARXIV.2107.03374},
url = {https://arxiv.org/abs/2107.03374},
copyright = {arXiv.org perpetual, non-exclusive license},
keywords = {Machine Learning (cs.LG), FOS: Computer and information sciences}
}
@misc{https://doi.org/10.48550/arxiv.2106.09685,
title = {LoRA: Low-Rank Adaptation of Large Language Models},
author = {Hu, Edward J. and Shen, Yelong and Wallis, Phillip and Allen-Zhu, Zeyuan and Li, Yuanzhi and Wang, Shean and Wang, Lu and Chen, Weizhu},
year = 2021,
publisher = {arXiv},
doi = {10.48550/ARXIV.2106.09685},
url = {https://arxiv.org/abs/2106.09685},
copyright = {arXiv.org perpetual, non-exclusive license},
keywords = {Computation and Language (cs.CL), Artificial Intelligence (cs.AI), Machine Learning (cs.LG), FOS: Computer and information sciences}
}
@misc{bigscience_2022,
title = {BigScience Large Open-science Open-access Multilingual Language Model},
author = {BigScience},
year = 2022,
journal = {bigscience/bloom · Hugging Face},
url = {https://huggingface.co/bigscience/bloom}
}