Holistic Evaluation of Language Models

Welcome! The crfm-helm Python package contains code used in the Holistic Evaluation of Language Models project (paper, website) by Stanford CRFM. This package includes the following features:

Collection of datasets in a standard format (e.g., NaturalQuestions)
Collection of models accessible via a unified API (e.g., GPT-3, MT-NLG, OPT, BLOOM)
Collection of metrics beyond accuracy (efficiency, bias, toxicity, etc.)
Collection of perturbations for evaluating robustness and fairness (e.g., typos, dialect)
Modular framework for constructing prompts from datasets
Proxy server for managing accounts and providing a unified interface to access models
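
To make two of the features above concrete (perturbations for robustness, and constructing prompts from dataset instances), here is a minimal, self-contained sketch. It does not use the crfm-helm API: `Instance`, `typo_perturbation`, and `build_prompt` are hypothetical names invented for illustration, and the adjacent-letter swap is only one simple way a typo perturbation could work.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Instance:
    """Hypothetical stand-in for one dataset example (not a crfm-helm class)."""
    question: str
    answer: str


def typo_perturbation(text: str, rate: float, seed: int = 0) -> str:
    """Swap adjacent letters with probability `rate` at each eligible position."""
    rng = random.Random(seed)  # seeded so the perturbation is reproducible
    chars = list(text)
    i = 0
    while i < len(chars) - 1:
        if chars[i].isalpha() and chars[i + 1].isalpha() and rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip the swapped pair so each letter moves at most once
        else:
            i += 1
    return "".join(chars)


def build_prompt(train: List[Instance], test: Instance) -> str:
    """Format few-shot examples followed by the unanswered test question."""
    blocks = [f"Question: {ex.question}\nAnswer: {ex.answer}" for ex in train]
    blocks.append(f"Question: {test.question}\nAnswer:")
    return "\n\n".join(blocks)


train = [Instance("What is the capital of France?", "Paris")]
test = Instance("What is the capital of Japan?", "Tokyo")

# Clean prompt versus a prompt whose test question has simulated typos.
clean_prompt = build_prompt(train, test)
perturbed_test = Instance(typo_perturbation(test.question, rate=0.2, seed=1), test.answer)
perturbed_prompt = build_prompt(train, perturbed_test)
```

Sending `clean_prompt` and `perturbed_prompt` to the same model and comparing the answers gives a rough robustness signal. For the package's actual datasets, perturbations, and prompt construction, refer to the documentation on Read the Docs.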

To get started, refer to the documentation on Read the Docs for how to install and run the package.

Latest commit: 62f817e by timothylimyl, "prevent error with default localhost setting (stanford-crfm#1842)", Sep 15, 2023 (4,431 commits)

Name	Last commit message	Last commit date
.github/workflows	Speed up pre-commit by moving `pip install` to a separate script (sta…	Aug 22, 2023
docs	Speed up pre-commit by moving `pip install` to a separate script (sta…	Aug 22, 2023
scripts	Annotate tqdm type in compute_request_limits (stanford-crfm#1836)	Sep 14, 2023
src/helm	prevent error with default localhost setting (stanford-crfm#1842)	Sep 15, 2023
.gitignore	Update .gitignore for MechanicalTurkCritiqueClient and SlurmRunner (s…	Jun 22, 2023
.pre-commit-config.yaml	add back auto pre-commit	Dec 29, 2021
.readthedocs.yaml	Fix ReadTheDocs YAML configuration	Nov 22, 2022
CHANGELOG.md	Release v0.2.3 (stanford-crfm#1739)	Jul 25, 2023
LICENSE	Fill in license template	Oct 12, 2022
MANIFEST.in	Group dependencies and remove requirements.txt (stanford-crfm#1681)	Jun 22, 2023
README.md	Update HELM URL (stanford-crfm#1428)	Mar 22, 2023
demo.py	Rename modules and commands	Nov 16, 2022
install-dev.sh	Optional dependencies (stanford-crfm#1798)	Aug 24, 2023
json-urls-root.js	Storage Cost Reduction (stanford-crfm#1657)	Sep 6, 2023
json-urls.js	Storage Cost Reduction (stanford-crfm#1657)	Sep 6, 2023
mkdocs.yml	Add documentation for adding new models (stanford-crfm#1325)	May 9, 2023
pre-commit.sh	Optional dependencies (stanford-crfm#1798)	Aug 24, 2023
pyproject.toml	Install using the setuptools.build_meta backend (stanford-crfm#1535)	May 13, 2023
requirements-dev.txt	Clean up requirements (stanford-crfm#1392)	Mar 9, 2023
requirements-freeze.txt	Deprecate openai/chat-gpt in favor of openai/gpt-3.5-turbo-0613 (stan…	Aug 24, 2023
setup.cfg	Deprecate openai/chat-gpt in favor of openai/gpt-3.5-turbo-0613 (stan…	Aug 24, 2023
