Merged
Changes from all commits
44 commits
2833138
[MIEB] "capability measured"-Abstask 1-1 matching refactor [1/3]: rei…
gowitheflow-1998 Mar 22, 2025
065159d
Update tasks table
github-actions[bot] Mar 22, 2025
e8faf3f
fix: Add option to remove benchmark from leaderboard (#2417)
KennethEnevoldsen Mar 23, 2025
a25dadb
1.36.31
invalid-email-address Mar 23, 2025
9d9b0b4
fix: Add VDR Multilingual Dataset (#2408)
ayush1298 Mar 23, 2025
34edcd5
Update tasks table
github-actions[bot] Mar 23, 2025
0cdf2e0
1.36.32
invalid-email-address Mar 23, 2025
071741d
HOTFIX: pin setuptools (#2423)
Samoed Mar 24, 2025
39cee62
add __init__.py Clustering > kor folder, And edit __init__.py in C…
OnAnd0n Mar 25, 2025
55c542b
Update tasks table
github-actions[bot] Mar 25, 2025
731c4fc
Update speed dependencies with new setuptools release (#2429)
Samoed Mar 25, 2025
98ab0ef
add richinfoai models (#2427)
richinfo-ai Mar 25, 2025
d3eab6f
Added Memory Usage column on leaderboard (#2428)
ayush1298 Mar 25, 2025
0db0a20
docs: typos; Standardize spacing; Chronological order (#2436)
Muennighoff Mar 26, 2025
8a024be
fix: Add model specific dependencies in pyproject.toml (#2424)
ayush1298 Mar 26, 2025
6ae420d
1.36.33
invalid-email-address Mar 26, 2025
65446e5
[MIEB] "capability measured"-Abstask 1-1 matching refactor [2/3]: rei…
gowitheflow-1998 Mar 26, 2025
19dc625
Update tasks table
github-actions[bot] Mar 26, 2025
dadafbe
Error while evaluating MIRACLRetrievalHardNegatives: 'trust_remote_co…
KennethEnevoldsen Mar 27, 2025
43adb0c
Feat/searchmap preview (#2420)
Free-tek Mar 28, 2025
5af5547
Add Background Gradients in Summary and Task Table (#2392)
ayush1298 Mar 29, 2025
61d3c6c
add ops_moa_models (#2439)
ahxgw Mar 29, 2025
35a8a5b
leaderboard fix (#2456)
ayush1298 Mar 29, 2025
d11934f
ci: cache `~/.cache/huggingface` (#2464)
sam-hey Mar 31, 2025
8799126
[MIEB] "capability measured"-Abstask 1-1 matching refactor [3/3]: rei…
gowitheflow-1998 Apr 1, 2025
5b567bf
Update tasks table
github-actions[bot] Apr 1, 2025
f293d8b
fix: Adds family of NeuML/pubmedbert-base-embedding models (#2443)
nadshe Apr 1, 2025
c617598
fix: add nb_sbert model (#2339)
theatollersrud Apr 1, 2025
42068c6
1.36.34
invalid-email-address Apr 1, 2025
e837b09
suppress logging warnings on leaderboard (#2406)
Samoed Apr 2, 2025
6c8c8d2
fix: E5 instruct now listed as sbert compatible (#2475)
KennethEnevoldsen Apr 2, 2025
eef52be
1.36.35
invalid-email-address Apr 2, 2025
295ad0a
[MIEB] rename VisionCentric to VisionCentricQA (#2479)
isaac-chung Apr 2, 2025
17b53b4
ci: Run dataset loading only when pushing to main (#2480)
isaac-chung Apr 2, 2025
f5881b0
fix table in tasks.md (#2483)
ayush1298 Apr 3, 2025
9117c2f
Update tasks table
github-actions[bot] Apr 3, 2025
ed5f7d2
merge
Samoed Apr 4, 2025
aff9a3c
fix imports
Samoed Apr 4, 2025
a763df7
update model loader
Samoed Apr 4, 2025
a5a6cdb
remove unused imports
Samoed Apr 4, 2025
573614f
fix clip name
Samoed Apr 4, 2025
a458ca3
fix moco models
Samoed Apr 4, 2025
1d808d0
fix tests
Samoed Apr 4, 2025
1011b62
fix tests
Samoed Apr 4, 2025
4 changes: 2 additions & 2 deletions .github/workflows/dataset_loading.yml
@@ -1,7 +1,6 @@
name: Datasets available on HuggingFace

on:
-pull_request:
push:
branches: [main]

@@ -21,7 +20,8 @@ jobs:

- name: Install dependencies
run: |
make install-for-tests
make install-for-tests

- name: Run dataset loading tests
run: |
make dataset-load-test
7 changes: 7 additions & 0 deletions .github/workflows/test.yml
@@ -24,6 +24,13 @@ jobs:
steps:
- uses: actions/checkout@v3

- name: Cache Hugging Face
id: cache-hf
uses: actions/cache@v4
with:
path: ~/.cache/huggingface
key: ${{ runner.os }}-hf

- name: Setup Python ${{ matrix.python-version }}
uses: actions/setup-python@v4
with:
2 changes: 1 addition & 1 deletion Makefile
@@ -46,7 +46,7 @@ serve-docs:

model-load-test:
@echo "--- 🚀 Running model load test ---"
-pip install ".[dev, speedtask, pylate,gritlm,xformers,model2vec]"
+pip install ".[dev, pylate,gritlm,xformers,model2vec]"
python scripts/extract_model_names.py $(BASE_BRANCH) --return_one_model_name_per_file
python tests/test_models/model_loading.py --model_name_file scripts/model_names.txt

41 changes: 20 additions & 21 deletions README.md
@@ -70,7 +70,6 @@ mteb run -m sentence-transformers/all-MiniLM-L6-v2 \
Note that using multiple GPUs in parallel can be done by just having a custom encode function that distributes the inputs to multiple GPUs like e.g. [here](https://github.com/microsoft/unilm/blob/b60c741f746877293bb85eed6806736fc8fa0ffd/e5/mteb_eval.py#L60) or [here](https://github.com/ContextualAI/gritlm/blob/09d8630f0c95ac6a456354bcb6f964d7b9b6a609/gritlm/gritlm.py#L75). See [custom models](docs/usage/usage.md#using-a-custom-model) for more information.
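The distribution logic such a custom encode function needs — shard the inputs, encode each shard on its own device, and re-gather the embeddings in input order — can be sketched as follows. This is a minimal, CPU-only illustration with a toy encoder; `DistributedEncoder` and `toy_encode` are hypothetical names, and the linked implementations additionally run each shard in a separate process on its own GPU:

```python
import numpy as np

class DistributedEncoder:
    """Hypothetical sketch: shard inputs across devices and re-gather in order."""

    def __init__(self, base_encode, devices):
        self.base_encode = base_encode  # callable: (sentences, device) -> (n, d) array
        self.devices = devices

    def encode(self, sentences, **kwargs):
        # one contiguous chunk per device, preserving input order
        chunks = np.array_split(np.asarray(sentences, dtype=object), len(self.devices))
        # in a real setup each call would run concurrently on its own GPU
        parts = [
            self.base_encode(list(chunk), device=dev)
            for chunk, dev in zip(chunks, self.devices)
            if len(chunk) > 0
        ]
        return np.concatenate(parts, axis=0)

def toy_encode(sentences, device=None):
    # stand-in for a real model: one 4-dimensional vector per sentence
    return np.full((len(sentences), 4), fill_value=len(sentences), dtype=float)

encoder = DistributedEncoder(toy_encode, devices=["cuda:0", "cuda:1"])
embeddings = encoder.encode(["a", "b", "c"])
print(embeddings.shape)  # (3, 4)
```

In a real setup, `base_encode` would wrap the model's forward pass, and the per-device calls would run concurrently (e.g. via a process pool) rather than in a sequential loop.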



## Usage Documentation
The following links to the main sections in the usage documentation.

@@ -102,16 +101,16 @@ The following links to the main sections in the usage documentation.

## Overview

| Overview | |
| Overview | |
|--------------------------------|-------------------------------------------------------------------------------------|
| 📈 [Leaderboard] | The interactive leaderboard of the benchmark |
| 📋 [Tasks] | Overview of available tasks |
| 📐 [Benchmarks] | Overview of available benchmarks |
| **Contributing** | |
| 🤖 [Adding a model] | Information related to how to submit a model to MTEB and to the leaderboard |
| 👩‍🔬 [Reproducible workflows] | Information related to how to create reproducible workflows with MTEB |
| 👩‍💻 [Adding a dataset] | How to add a new task/dataset to MTEB |
| 👩‍💻 [Adding a benchmark] | How to add a new benchmark to MTEB and to the leaderboard |
| **Contributing** | |
| 🤖 [Adding a model] | Information related to how to submit a model to MTEB and to the leaderboard |
| 👩‍🔬 [Reproducible workflows] | Information related to how to create reproducible workflows with MTEB |
| 👩‍💻 [Adding a dataset] | How to add a new task/dataset to MTEB |
| 👩‍💻 [Adding a benchmark] | How to add a new benchmark to MTEB and to the leaderboard |
| 🤝 [Contributing] | How to contribute to MTEB and set it up for development |

[Tasks]: docs/tasks.md
@@ -125,23 +124,13 @@ The following links to the main sections in the usage documentation.

## Citing

-MTEB was introduced in "[MTEB: Massive Text Embedding Benchmark](https://arxiv.org/abs/2210.07316)", and heavily expanded in "[MMTEB: Massive Multilingual Text Embedding Benchmark](https://arxiv.org/abs/2502.13595)". When using `mteb` we recommend that you cite both articles.
+MTEB was introduced in "[MTEB: Massive Text Embedding Benchmark](https://arxiv.org/abs/2210.07316)", and heavily expanded in "[MMTEB: Massive Multilingual Text Embedding Benchmark](https://arxiv.org/abs/2502.13595)". When using `mteb`, we recommend that you cite both articles.

<details>
<summary> Bibtex Citation (click to unfold) </summary>


```bibtex
@article{enevoldsen2025mmtebmassivemultilingualtext,
title={MMTEB: Massive Multilingual Text Embedding Benchmark},
author={Kenneth Enevoldsen and Isaac Chung and Imene Kerboua and Márton Kardos and Ashwin Mathur and David Stap and Jay Gala and Wissam Siblini and Dominik Krzemiński and Genta Indra Winata and Saba Sturua and Saiteja Utpala and Mathieu Ciancone and Marion Schaeffer and Gabriel Sequeira and Diganta Misra and Shreeya Dhakal and Jonathan Rystrøm and Roman Solomatin and Ömer Çağatan and Akash Kundu and Martin Bernstorff and Shitao Xiao and Akshita Sukhlecha and Bhavish Pahwa and Rafał Poświata and Kranthi Kiran GV and Shawon Ashraf and Daniel Auras and Björn Plüster and Jan Philipp Harries and Loïc Magne and Isabelle Mohr and Mariya Hendriksen and Dawei Zhu and Hippolyte Gisserot-Boukhlef and Tom Aarsen and Jan Kostkan and Konrad Wojtasik and Taemin Lee and Marek Šuppa and Crystina Zhang and Roberta Rocca and Mohammed Hamdy and Andrianos Michail and John Yang and Manuel Faysse and Aleksei Vatolin and Nandan Thakur and Manan Dey and Dipam Vasani and Pranjal Chitale and Simone Tedeschi and Nguyen Tai and Artem Snegirev and Michael Günther and Mengzhou Xia and Weijia Shi and Xing Han Lù and Jordan Clive and Gayatri Krishnakumar and Anna Maksimova and Silvan Wehrli and Maria Tikhonova and Henil Panchal and Aleksandr Abramov and Malte Ostendorff and Zheng Liu and Simon Clematide and Lester James Miranda and Alena Fenogenova and Guangyu Song and Ruqiya Bin Safi and Wen-Ding Li and Alessia Borghini and Federico Cassano and Hongjin Su and Jimmy Lin and Howard Yen and Lasse Hansen and Sara Hooker and Chenghao Xiao and Vaibhav Adlakha and Orion Weller and Siva Reddy and Niklas Muennighoff},
publisher = {arXiv},
journal={arXiv preprint arXiv:2502.13595},
year={2025},
url={https://arxiv.org/abs/2502.13595},
doi = {10.48550/arXiv.2502.13595},
}

@article{muennighoff2022mteb,
author = {Muennighoff, Niklas and Tazi, Nouamane and Magne, Lo{\"\i}c and Reimers, Nils},
title = {MTEB: Massive Text Embedding Benchmark},
@@ -151,21 +140,31 @@ MTEB was introduced in "[MTEB: Massive Text Embedding Benchmark](https://arxiv.o
url = {https://arxiv.org/abs/2210.07316},
doi = {10.48550/ARXIV.2210.07316},
}

@article{enevoldsen2025mmtebmassivemultilingualtext,
title={MMTEB: Massive Multilingual Text Embedding Benchmark},
author={Kenneth Enevoldsen and Isaac Chung and Imene Kerboua and Márton Kardos and Ashwin Mathur and David Stap and Jay Gala and Wissam Siblini and Dominik Krzemiński and Genta Indra Winata and Saba Sturua and Saiteja Utpala and Mathieu Ciancone and Marion Schaeffer and Gabriel Sequeira and Diganta Misra and Shreeya Dhakal and Jonathan Rystrøm and Roman Solomatin and Ömer Çağatan and Akash Kundu and Martin Bernstorff and Shitao Xiao and Akshita Sukhlecha and Bhavish Pahwa and Rafał Poświata and Kranthi Kiran GV and Shawon Ashraf and Daniel Auras and Björn Plüster and Jan Philipp Harries and Loïc Magne and Isabelle Mohr and Mariya Hendriksen and Dawei Zhu and Hippolyte Gisserot-Boukhlef and Tom Aarsen and Jan Kostkan and Konrad Wojtasik and Taemin Lee and Marek Šuppa and Crystina Zhang and Roberta Rocca and Mohammed Hamdy and Andrianos Michail and John Yang and Manuel Faysse and Aleksei Vatolin and Nandan Thakur and Manan Dey and Dipam Vasani and Pranjal Chitale and Simone Tedeschi and Nguyen Tai and Artem Snegirev and Michael Günther and Mengzhou Xia and Weijia Shi and Xing Han Lù and Jordan Clive and Gayatri Krishnakumar and Anna Maksimova and Silvan Wehrli and Maria Tikhonova and Henil Panchal and Aleksandr Abramov and Malte Ostendorff and Zheng Liu and Simon Clematide and Lester James Miranda and Alena Fenogenova and Guangyu Song and Ruqiya Bin Safi and Wen-Ding Li and Alessia Borghini and Federico Cassano and Hongjin Su and Jimmy Lin and Howard Yen and Lasse Hansen and Sara Hooker and Chenghao Xiao and Vaibhav Adlakha and Orion Weller and Siva Reddy and Niklas Muennighoff},
publisher = {arXiv},
journal={arXiv preprint arXiv:2502.13595},
year={2025},
url={https://arxiv.org/abs/2502.13595},
doi = {10.48550/arXiv.2502.13595},
}
```
</details>


-If you use any of the specific benchmark we also recommend that you cite the authors.
+If you use any of the specific benchmarks, we also recommend that you cite the authors.

```py
benchmark = mteb.get_benchmark("MTEB(eng, v2)")
-benchmark.citation # get citation for a specific benchmarks
+benchmark.citation # get citation for a specific benchmark

# you can also create a table of the tasks for the appendix using:
benchmark.tasks.to_latex()
```

-Some of these amazing publications include:
+Some of these amazing publications include (ordered chronologically):
- Shitao Xiao, Zheng Liu, Peitian Zhang, Niklas Muennighoff. "[C-Pack: Packaged Resources To Advance General Chinese Embedding](https://arxiv.org/abs/2309.07597)" arXiv 2023
- Michael Günther, Jackmin Ong, Isabelle Mohr, Alaeddine Abdessalem, Tanguy Abel, Mohammad Kalim Akram, Susana Guzman, Georgios Mastrapas, Saba Sturua, Bo Wang, Maximilian Werk, Nan Wang, Han Xiao. "[Jina Embeddings 2: 8192-Token General-Purpose Text Embeddings for Long Documents](https://arxiv.org/abs/2310.19923)" arXiv 2023
- Silvan Wehrli, Bert Arnrich, Christopher Irrgang. "[German Text Embedding Clustering Benchmark](https://arxiv.org/abs/2401.02709)" arXiv 2024
2 changes: 1 addition & 1 deletion docs/adding_a_dataset.md
@@ -252,7 +252,7 @@ model = SentenceTransformer(model_name)
evaluation = MTEB(tasks=[YourNewTask()])
```

-- [ ] I have run the following models on the task (adding the results to the pr). These can be run using the `mteb -m {model_name} -t {task_name}` command.
+- [ ] I have run the following models on the task (adding the results to the pr). These can be run using the `mteb run -m {model_name} -t {task_name}` command.
- [ ] `sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2`
- [ ] `intfloat/multilingual-e5-small`
- [ ] I have checked that the performance is neither trivial (both models gain close to perfect scores) nor random (both models gain close to random scores).
14 changes: 14 additions & 0 deletions docs/adding_a_model.md
@@ -132,3 +132,17 @@ model = ModelMeta(
...
)
```

##### Adding model dependencies in pyproject.toml
If you are adding a model that requires additional dependencies, add them as an optional extra in the `pyproject.toml` file and, instead of hand-rolling checks for whether the dependencies are installed, use `requires_package` from [requires_package.py](../mteb/requires_packages.py). For example:

In the [voyage_models.py](../mteb/models/voyage_models.py) file, we have added the following code:
```python
requires_package(self, "voyageai", model_name, "pip install 'mteb[voyageai]'")
```
and updated the [pyproject.toml](../pyproject.toml) file with the following entry:
```toml
voyageai = ["voyageai>=1.0.0,<2.0.0"]
```
With this in place, `mteb` checks whether `voyageai` is installed and, if it is not, raises an error that tells the user exactly how to install it. This gives a clear installation instruction instead of an opaque import failure.
If you want to emit a suggestion instead of an error, use `suggest_package` from [requires_package.py](../mteb/requires_packages.py).
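The guard pattern above can be sketched in a few lines. This is a hypothetical approximation, not mteb's actual implementation (the real `requires_package` lives in `mteb/requires_packages.py`, and `VoyageWrapper` is an invented name for illustration):

```python
import importlib.util

def requires_package(calling_obj, package_name, model_name, install_instruction):
    """Raise a clear, actionable error when an optional model dependency is missing."""
    if importlib.util.find_spec(package_name) is None:
        raise ImportError(
            f"{type(calling_obj).__name__} for model '{model_name}' requires the "
            f"'{package_name}' package, which is not installed. "
            f"Install it with: {install_instruction}"
        )

class VoyageWrapper:
    """Hypothetical model wrapper that fails fast if the SDK is absent."""

    def __init__(self, model_name):
        # check before any import of the optional dependency is attempted
        requires_package(self, "voyageai", model_name, "pip install 'mteb[voyageai]'")
        self.model_name = model_name
```

The key design choice is to fail at model-construction time with the exact `pip install` command, rather than letting a bare `ModuleNotFoundError` surface later inside the encode loop.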