---
title: LLMEval Dataset Parser
emoji: ⚡
colorFrom: green
colorTo: gray
sdk: docker
pinned: false
license: mit
short_description: A collection of parsers for LLM benchmark datasets
---
LLMDataParser is a Python library that provides parsers for benchmark datasets used in evaluating Large Language Models (LLMs). It offers a unified interface for loading and parsing datasets like MMLU, GSM8k, and others, streamlining dataset preparation for LLM evaluation. The library aims to simplify the process of working with common LLM benchmark datasets through a consistent API.
You can also try out the online demo on Hugging Face Spaces: LLMEval Dataset Parser Demo.
- Unified Interface: A consistent `DatasetParser` interface for all datasets.
- Easy to Use: Simple methods and built-in Python types.
- Extensible: Easily add support for new datasets.
- Gradio Demo: Built-in Gradio interface for interactive dataset exploration and testing.
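The unified interface is built around a parser registry that maps dataset names to parser classes. The sketch below illustrates the general shape of that design; it is a minimal standalone example, not the library's actual implementation, and the decorator-based registration shown here is an assumption:

```python
# Minimal sketch of a parser-registry design (hypothetical,
# not llmdataparser's actual implementation).


class DatasetParser:
    """Base interface every dataset parser implements."""

    def load(self):
        raise NotImplementedError

    def parse(self):
        raise NotImplementedError


class ParserRegistry:
    """Maps lowercase dataset names to parser classes."""

    _parsers = {}

    @classmethod
    def register(cls, name):
        def decorator(parser_cls):
            cls._parsers[name.lower()] = parser_cls
            return parser_cls
        return decorator

    @classmethod
    def list_parsers(cls):
        return sorted(cls._parsers)

    @classmethod
    def get_parser(cls, name):
        return cls._parsers[name.lower()]()


@ParserRegistry.register("mmlu")
class MMLUDatasetParser(DatasetParser):
    def load(self):
        # A real parser would fetch the dataset here.
        self.rows = [{"question": "2+2?", "answer": "4"}]

    def parse(self):
        return self.rows


parser = ParserRegistry.get_parser("mmlu")
parser.load()
print(parser.parse())  # → [{'question': '2+2?', 'answer': '4'}]
```

Registering parsers by name keeps callers decoupled from concrete classes: adding a dataset means adding one class, with no changes to calling code.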
You can install the package directly using `pip`; even with only a `pyproject.toml` file, this method works for standard installations.

1. Clone the repository:

   ```bash
   git clone https://github.com/jeff52415/LLMDataParser.git
   cd LLMDataParser
   ```

2. Install with pip:

   ```bash
   pip install .
   ```

Alternatively, you can use Poetry, which manages the virtual environment and dependencies automatically, so you don't need to create a conda environment first:

1. Install dependencies:

   ```bash
   poetry install
   ```

2. Activate the virtual environment:

   ```bash
   poetry shell
   ```
- MMLUDatasetParser
- MMLUProDatasetParser
- MMLUReduxDatasetParser
- TMMLUPlusDatasetParser
- GSM8KDatasetParser
- MATHDatasetParser
- MGSMDatasetParser
- HumanEvalDatasetParser
- HumanEvalDatasetPlusParser
- BBHDatasetParser
- MBPPDatasetParser
- IFEvalDatasetParser
- TWLegalDatasetParser
- TMLUDatasetParser
Here's a simple example demonstrating how to use the library:
```python
from llmdataparser import ParserRegistry

# List all available parsers
ParserRegistry.list_parsers()

# Get a parser
parser = ParserRegistry.get_parser("mmlu")

# Load the dataset
parser.load()  # optional: task_name, split

# Parse the loaded data
parser.parse()  # optional: split_names

# Inspect metadata
print(parser.task_names)
print(parser.split_names)
print(parser.get_dataset_description)
print(parser.get_huggingface_link)
print(parser.total_tasks)

# Access the parsed data
data = parser.get_parsed_data
```
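Once parsed, the entries can be fed into an evaluation loop. The sketch below shows one way to turn parsed entries into prompts; the `question`/`answer` field names and the sample data are assumptions for illustration, not the library's documented schema:

```python
# Hypothetical parsed entries; the real schema may differ.
parsed_data = [
    {"question": "What is 2 + 2?", "answer": "4"},
    {"question": "What is the capital of France?", "answer": "Paris"},
]


def build_prompt(entry):
    """Format one parsed entry as an evaluation prompt."""
    return f"Question: {entry['question']}\nAnswer:"


prompts = [build_prompt(e) for e in parsed_data]
print(prompts[0])  # prints the first formatted prompt
```

Because every parser emits the same shape of data, helpers like `build_prompt` can be written once and reused across benchmarks.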
We also provide a Gradio demo for interactive testing:
```bash
python app.py
```
To add support for a new dataset, please refer to our detailed guide in docs/adding_new_parser.md. The guide includes:
- Step-by-step instructions for creating a new parser
- Code examples and templates
- Best practices and common patterns
- Testing guidelines
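As a rough illustration of what a new parser might look like, the standalone sketch below follows the load-then-parse pattern from the quick-start example. The base class, parser name, and raw-data schema here are all hypothetical stand-ins, not the template from the guide:

```python
# Standalone sketch of adding a new parser; the base class and
# field names are stand-ins, not llmdataparser's actual API.


class DatasetParser:
    """Stand-in base class defining the parser interface."""

    def load(self):
        raise NotImplementedError

    def parse(self):
        raise NotImplementedError


class MyBenchmarkDatasetParser(DatasetParser):
    """Hypothetical parser for a new benchmark dataset."""

    def load(self):
        # A real parser would fetch the dataset here,
        # e.g. from the Hugging Face Hub.
        self._raw = [{"q": "1+1?", "a": "2"}]

    def parse(self):
        # Normalize raw rows into a consistent schema so all
        # parsers expose the same shape of data.
        self.parsed = [
            {"question": r["q"], "answer": r["a"]} for r in self._raw
        ]
        return self.parsed


parser = MyBenchmarkDatasetParser()
parser.load()
print(parser.parse())  # → [{'question': '1+1?', 'answer': '2'}]
```

The key step is the normalization in `parse()`: whatever the upstream dataset looks like, the parser maps it into the shared schema that downstream evaluation code expects.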
This project is licensed under the MIT License. See the LICENSE file for details.
For questions or support, please open an issue on GitHub or contact [email protected].