This project aims to reduce errors in NLP datasets using LLMs.
The datasets currently used are SQuAD, an extractive question answering dataset, and RACE, a multiple-choice reading comprehension dataset.
Fork and clone this repository.
- If you are using an OpenAI API key, create a .env file in the same directory and add your OpenAI API key:
OPENAI_API_KEY = YOUR_API_KEY
- If you are running an LLM locally through a local server, use the files inside the LocalLLM folder instead of the default files. The /ollama folder contains my notebooks that use Ollama to run LLMs locally.
PS - I use LM Studio and Ollama to run the LLMs locally.
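The two setup options above can be sketched in Python roughly as follows. This is a minimal illustration, not the repository's actual code: `load_env_file` and `client_settings` are hypothetical helpers, and the local URLs assume that Ollama and LM Studio are running on their default ports with their OpenAI-compatible `/v1` routes enabled. In practice, the `python-dotenv` package's `load_dotenv()` is the usual way to read a .env file; the stdlib-only parser here just keeps the example free of extra dependencies.

```python
import os
from pathlib import Path

# Assumed defaults: stock ports and OpenAI-compatible /v1 routes.
# Adjust these if your local server is configured differently.
LOCAL_BASE_URLS = {
    "ollama": "http://localhost:11434/v1",
    "lmstudio": "http://localhost:1234/v1",
}


def load_env_file(path=".env"):
    """Read KEY = VALUE lines from a .env file into os.environ.

    Hypothetical helper. Variables already set in the environment are
    left untouched. Returns a dict of the pairs found in the file.
    """
    env_path = Path(path)
    if not env_path.exists():
        return {}
    loaded = {}
    for raw_line in env_path.read_text().splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        key = key.strip()
        # Tolerate spaces around "=" and optional surrounding quotes.
        value = value.strip().strip('"').strip("'")
        loaded[key] = value
        os.environ.setdefault(key, value)
    return loaded


def client_settings(backend="openai"):
    """Return the base URL and API key for the chosen backend.

    Hypothetical helper: backend is "openai", "ollama", or "lmstudio".
    """
    if backend == "openai":
        api_key = os.environ.get("OPENAI_API_KEY")
        if not api_key:
            raise RuntimeError("OPENAI_API_KEY is not set; add it to .env")
        return {"base_url": "https://api.openai.com/v1", "api_key": api_key}
    if backend in LOCAL_BASE_URLS:
        # Local servers generally ignore the key, but most clients
        # still expect a non-empty string.
        return {"base_url": LOCAL_BASE_URLS[backend], "api_key": "not-needed"}
    raise ValueError(f"Unknown backend: {backend!r}")


if __name__ == "__main__":
    load_env_file()
    # Print only the endpoint so the key never ends up in logs.
    print(client_settings("ollama")["base_url"])
```

The returned dictionary can be passed to any OpenAI-compatible client that accepts a base URL and an API key, so switching between the hosted API and a local server only changes the `backend` argument.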
- SQuAD: 100,000+ Questions for Machine Comprehension of Text (Rajpurkar et al., EMNLP 2016)
- RACE: Large-scale ReAding Comprehension Dataset From Examinations (Lai et al., EMNLP 2017)