Merged
Changes from all commits
44 commits
2833138
[MIEB] "capability measured"-Abstask 1-1 matching refactor [1/3]: rei…
gowitheflow-1998 Mar 22, 2025
065159d
Update tasks table
github-actions[bot] Mar 22, 2025
e8faf3f
fix: Add option to remove benchmark from leaderboard (#2417)
KennethEnevoldsen Mar 23, 2025
a25dadb
1.36.31
invalid-email-address Mar 23, 2025
9d9b0b4
fix: Add VDR Multilingual Dataset (#2408)
ayush1298 Mar 23, 2025
34edcd5
Update tasks table
github-actions[bot] Mar 23, 2025
0cdf2e0
1.36.32
invalid-email-address Mar 23, 2025
071741d
HOTFIX: pin setuptools (#2423)
Samoed Mar 24, 2025
39cee62
add __init__.py Clustering > kor folder, And edit __init__.py in C…
OnAnd0n Mar 25, 2025
55c542b
Update tasks table
github-actions[bot] Mar 25, 2025
731c4fc
Update speed dependencies with new setuptools release (#2429)
Samoed Mar 25, 2025
98ab0ef
add richinfoai models (#2427)
richinfo-ai Mar 25, 2025
d3eab6f
Added Memory Usage column on leaderboard (#2428)
ayush1298 Mar 25, 2025
0db0a20
docs: typos; Standardize spacing; Chronological order (#2436)
Muennighoff Mar 26, 2025
8a024be
fix: Add model specific dependencies in pyproject.toml (#2424)
ayush1298 Mar 26, 2025
6ae420d
1.36.33
invalid-email-address Mar 26, 2025
65446e5
[MIEB] "capability measured"-Abstask 1-1 matching refactor [2/3]: rei…
gowitheflow-1998 Mar 26, 2025
19dc625
Update tasks table
github-actions[bot] Mar 26, 2025
dadafbe
Error while evaluating MIRACLRetrievalHardNegatives: 'trust_remote_co…
KennethEnevoldsen Mar 27, 2025
43adb0c
Feat/searchmap preview (#2420)
Free-tek Mar 28, 2025
5af5547
Add Background Gradients in Summary and Task Table (#2392)
ayush1298 Mar 29, 2025
61d3c6c
add ops_moa_models (#2439)
ahxgw Mar 29, 2025
35a8a5b
leaderboard fix (#2456)
ayush1298 Mar 29, 2025
d11934f
ci: cache `~/.cache/huggingface` (#2464)
sam-hey Mar 31, 2025
8799126
[MIEB] "capability measured"-Abstask 1-1 matching refactor [3/3]: rei…
gowitheflow-1998 Apr 1, 2025
5b567bf
Update tasks table
github-actions[bot] Apr 1, 2025
f293d8b
fix: Adds family of NeuML/pubmedbert-base-embedding models (#2443)
nadshe Apr 1, 2025
c617598
fix: add nb_sbert model (#2339)
theatollersrud Apr 1, 2025
42068c6
1.36.34
invalid-email-address Apr 1, 2025
e837b09
suppress logging warnings on leaderboard (#2406)
Samoed Apr 2, 2025
6c8c8d2
fix: E5 instruct now listed as sbert compatible (#2475)
KennethEnevoldsen Apr 2, 2025
eef52be
1.36.35
invalid-email-address Apr 2, 2025
295ad0a
[MIEB] rename VisionCentric to VisionCentricQA (#2479)
isaac-chung Apr 2, 2025
17b53b4
ci: Run dataset loading only when pushing to main (#2480)
isaac-chung Apr 2, 2025
f5881b0
fix table in tasks.md (#2483)
ayush1298 Apr 3, 2025
9117c2f
Update tasks table
github-actions[bot] Apr 3, 2025
ed5f7d2
merge
Samoed Apr 4, 2025
aff9a3c
fix imports
Samoed Apr 4, 2025
a763df7
update model loader
Samoed Apr 4, 2025
a5a6cdb
remove unused imports
Samoed Apr 4, 2025
573614f
fix clip name
Samoed Apr 4, 2025
a458ca3
fix moco models
Samoed Apr 4, 2025
1d808d0
fix tests
Samoed Apr 4, 2025
1011b62
fix tests
Samoed Apr 4, 2025
4 changes: 2 additions & 2 deletions .github/workflows/dataset_loading.yml
@@ -1,7 +1,6 @@
name: Datasets available on HuggingFace

on:
-pull_request:
push:
branches: [main]

@@ -21,7 +20,8 @@ jobs:

- name: Install dependencies
run: |
make install-for-tests
make install-for-tests

- name: Run dataset loading tests
run: |
make dataset-load-test
7 changes: 7 additions & 0 deletions .github/workflows/test.yml
@@ -24,6 +24,13 @@ jobs:
steps:
- uses: actions/checkout@v3

- name: Cache Hugging Face
id: cache-hf
uses: actions/cache@v4
with:
path: ~/.cache/huggingface
key: ${{ runner.os }}-hf

- name: Setup Python ${{ matrix.python-version }}
uses: actions/setup-python@v4
with:
2 changes: 1 addition & 1 deletion Makefile
@@ -46,7 +46,7 @@ serve-docs:

model-load-test:
@echo "--- 🚀 Running model load test ---"
-pip install ".[dev, speedtask, pylate,gritlm,xformers,model2vec]"
+pip install ".[dev, pylate,gritlm,xformers,model2vec]"
python scripts/extract_model_names.py $(BASE_BRANCH) --return_one_model_name_per_file
python tests/test_models/model_loading.py --model_name_file scripts/model_names.txt

41 changes: 20 additions & 21 deletions README.md
@@ -70,7 +70,6 @@ mteb run -m sentence-transformers/all-MiniLM-L6-v2 \
Note that using multiple GPUs in parallel can be done by just having a custom encode function that distributes the inputs to multiple GPUs like e.g. [here](https://github.com/microsoft/unilm/blob/b60c741f746877293bb85eed6806736fc8fa0ffd/e5/mteb_eval.py#L60) or [here](https://github.com/ContextualAI/gritlm/blob/09d8630f0c95ac6a456354bcb6f964d7b9b6a609/gritlm/gritlm.py#L75). See [custom models](docs/usage/usage.md#using-a-custom-model) for more information.
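The distribution logic such a custom encode function needs — shard the inputs, encode each shard on its own device, and re-gather the embeddings in input order — can be sketched as follows. This is a minimal, CPU-only illustration with a toy encoder; `DistributedEncoder` and `toy_encode` are hypothetical names, and the linked implementations additionally run each shard in a separate process on its own GPU:

```python
import numpy as np

class DistributedEncoder:
    """Hypothetical sketch: shard inputs across devices and re-gather in order."""

    def __init__(self, base_encode, devices):
        self.base_encode = base_encode  # callable: (sentences, device) -> (n, d) array
        self.devices = devices

    def encode(self, sentences, **kwargs):
        # one contiguous chunk per device, preserving input order
        chunks = np.array_split(np.asarray(sentences, dtype=object), len(self.devices))
        # in a real setup each call would run concurrently on its own GPU
        parts = [
            self.base_encode(list(chunk), device=dev)
            for chunk, dev in zip(chunks, self.devices)
            if len(chunk) > 0
        ]
        return np.concatenate(parts, axis=0)

def toy_encode(sentences, device=None):
    # stand-in for a real model: one 4-dimensional vector per sentence
    return np.full((len(sentences), 4), fill_value=len(sentences), dtype=float)

encoder = DistributedEncoder(toy_encode, devices=["cuda:0", "cuda:1"])
embeddings = encoder.encode(["a", "b", "c"])
print(embeddings.shape)  # (3, 4)
```

In a real setup, `base_encode` would wrap the model's forward pass, and the per-device calls would run concurrently (e.g. via a process pool) rather than in a sequential loop.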



## Usage Documentation
The following links to the main sections in the usage documentation.

@@ -102,16 +101,16 @@ The following links to the main sections in the usage documentation.

## Overview

| Overview | |
| Overview | |
|--------------------------------|-------------------------------------------------------------------------------------|
| 📈 [Leaderboard] | The interactive leaderboard of the benchmark |
| 📋 [Tasks] | Overview of available tasks |
| 📐 [Benchmarks] | Overview of available benchmarks |
| **Contributing** | |
| 🤖 [Adding a model] | Information related to how to submit a model to MTEB and to the leaderboard |
| 👩‍🔬 [Reproducible workflows] | Information related to how to create reproducible workflows with MTEB |
| 👩‍💻 [Adding a dataset] | How to add a new task/dataset to MTEB |
| 👩‍💻 [Adding a benchmark] | How to add a new benchmark to MTEB and to the leaderboard |
| **Contributing** | |
| 🤖 [Adding a model] | Information related to how to submit a model to MTEB and to the leaderboard |
| 👩‍🔬 [Reproducible workflows] | Information related to how to create reproducible workflows with MTEB |
| 👩‍💻 [Adding a dataset] | How to add a new task/dataset to MTEB |
| 👩‍💻 [Adding a benchmark] | How to add a new benchmark to MTEB and to the leaderboard |
| 🤝 [Contributing] | How to contribute to MTEB and set it up for development |

[Tasks]: docs/tasks.md
@@ -125,23 +124,13 @@ The following links to the main sections in the usage documentation.

## Citing

-MTEB was introduced in "[MTEB: Massive Text Embedding Benchmark](https://arxiv.org/abs/2210.07316)", and heavily expanded in "[MMTEB: Massive Multilingual Text Embedding Benchmark](https://arxiv.org/abs/2502.13595)". When using `mteb` we recommend that you cite both articles.
+MTEB was introduced in "[MTEB: Massive Text Embedding Benchmark](https://arxiv.org/abs/2210.07316)", and heavily expanded in "[MMTEB: Massive Multilingual Text Embedding Benchmark](https://arxiv.org/abs/2502.13595)". When using `mteb`, we recommend that you cite both articles.

<details>
<summary> Bibtex Citation (click to unfold) </summary>


```bibtex
@article{enevoldsen2025mmtebmassivemultilingualtext,
title={MMTEB: Massive Multilingual Text Embedding Benchmark},
author={Kenneth Enevoldsen and Isaac Chung and Imene Kerboua and Márton Kardos and Ashwin Mathur and David Stap and Jay Gala and Wissam Siblini and Dominik Krzemiński and Genta Indra Winata and Saba Sturua and Saiteja Utpala and Mathieu Ciancone and Marion Schaeffer and Gabriel Sequeira and Diganta Misra and Shreeya Dhakal and Jonathan Rystrøm and Roman Solomatin and Ömer Çağatan and Akash Kundu and Martin Bernstorff and Shitao Xiao and Akshita Sukhlecha and Bhavish Pahwa and Rafał Poświata and Kranthi Kiran GV and Shawon Ashraf and Daniel Auras and Björn Plüster and Jan Philipp Harries and Loïc Magne and Isabelle Mohr and Mariya Hendriksen and Dawei Zhu and Hippolyte Gisserot-Boukhlef and Tom Aarsen and Jan Kostkan and Konrad Wojtasik and Taemin Lee and Marek Šuppa and Crystina Zhang and Roberta Rocca and Mohammed Hamdy and Andrianos Michail and John Yang and Manuel Faysse and Aleksei Vatolin and Nandan Thakur and Manan Dey and Dipam Vasani and Pranjal Chitale and Simone Tedeschi and Nguyen Tai and Artem Snegirev and Michael Günther and Mengzhou Xia and Weijia Shi and Xing Han Lù and Jordan Clive and Gayatri Krishnakumar and Anna Maksimova and Silvan Wehrli and Maria Tikhonova and Henil Panchal and Aleksandr Abramov and Malte Ostendorff and Zheng Liu and Simon Clematide and Lester James Miranda and Alena Fenogenova and Guangyu Song and Ruqiya Bin Safi and Wen-Ding Li and Alessia Borghini and Federico Cassano and Hongjin Su and Jimmy Lin and Howard Yen and Lasse Hansen and Sara Hooker and Chenghao Xiao and Vaibhav Adlakha and Orion Weller and Siva Reddy and Niklas Muennighoff},
publisher = {arXiv},
journal={arXiv preprint arXiv:2502.13595},
year={2025},
url={https://arxiv.org/abs/2502.13595},
doi = {10.48550/arXiv.2502.13595},
}

@article{muennighoff2022mteb,
author = {Muennighoff, Niklas and Tazi, Nouamane and Magne, Lo{\"\i}c and Reimers, Nils},
title = {MTEB: Massive Text Embedding Benchmark},
@@ -151,21 +140,31 @@ MTEB was introduced in "[MTEB: Massive Text Embedding Benchmark](https://arxiv.o
url = {https://arxiv.org/abs/2210.07316},
doi = {10.48550/ARXIV.2210.07316},
}

@article{enevoldsen2025mmtebmassivemultilingualtext,
title={MMTEB: Massive Multilingual Text Embedding Benchmark},
author={Kenneth Enevoldsen and Isaac Chung and Imene Kerboua and Márton Kardos and Ashwin Mathur and David Stap and Jay Gala and Wissam Siblini and Dominik Krzemiński and Genta Indra Winata and Saba Sturua and Saiteja Utpala and Mathieu Ciancone and Marion Schaeffer and Gabriel Sequeira and Diganta Misra and Shreeya Dhakal and Jonathan Rystrøm and Roman Solomatin and Ömer Çağatan and Akash Kundu and Martin Bernstorff and Shitao Xiao and Akshita Sukhlecha and Bhavish Pahwa and Rafał Poświata and Kranthi Kiran GV and Shawon Ashraf and Daniel Auras and Björn Plüster and Jan Philipp Harries and Loïc Magne and Isabelle Mohr and Mariya Hendriksen and Dawei Zhu and Hippolyte Gisserot-Boukhlef and Tom Aarsen and Jan Kostkan and Konrad Wojtasik and Taemin Lee and Marek Šuppa and Crystina Zhang and Roberta Rocca and Mohammed Hamdy and Andrianos Michail and John Yang and Manuel Faysse and Aleksei Vatolin and Nandan Thakur and Manan Dey and Dipam Vasani and Pranjal Chitale and Simone Tedeschi and Nguyen Tai and Artem Snegirev and Michael Günther and Mengzhou Xia and Weijia Shi and Xing Han Lù and Jordan Clive and Gayatri Krishnakumar and Anna Maksimova and Silvan Wehrli and Maria Tikhonova and Henil Panchal and Aleksandr Abramov and Malte Ostendorff and Zheng Liu and Simon Clematide and Lester James Miranda and Alena Fenogenova and Guangyu Song and Ruqiya Bin Safi and Wen-Ding Li and Alessia Borghini and Federico Cassano and Hongjin Su and Jimmy Lin and Howard Yen and Lasse Hansen and Sara Hooker and Chenghao Xiao and Vaibhav Adlakha and Orion Weller and Siva Reddy and Niklas Muennighoff},
publisher = {arXiv},
journal={arXiv preprint arXiv:2502.13595},
year={2025},
url={https://arxiv.org/abs/2502.13595},
doi = {10.48550/arXiv.2502.13595},
}
```
</details>


-If you use any of the specific benchmark we also recommend that you cite the authors.
+If you use any of the specific benchmarks, we also recommend that you cite the authors.

```py
benchmark = mteb.get_benchmark("MTEB(eng, v2)")
-benchmark.citation # get citation for a specific benchmarks
+benchmark.citation # get citation for a specific benchmark

# you can also create a table of the tasks for the appendix using:
benchmark.tasks.to_latex()
```

-Some of these amazing publications include:
+Some of these amazing publications include (ordered chronologically):
- Shitao Xiao, Zheng Liu, Peitian Zhang, Niklas Muennighoff. "[C-Pack: Packaged Resources To Advance General Chinese Embedding](https://arxiv.org/abs/2309.07597)" arXiv 2023
- Michael Günther, Jackmin Ong, Isabelle Mohr, Alaeddine Abdessalem, Tanguy Abel, Mohammad Kalim Akram, Susana Guzman, Georgios Mastrapas, Saba Sturua, Bo Wang, Maximilian Werk, Nan Wang, Han Xiao. "[Jina Embeddings 2: 8192-Token General-Purpose Text Embeddings for Long Documents](https://arxiv.org/abs/2310.19923)" arXiv 2023
- Silvan Wehrli, Bert Arnrich, Christopher Irrgang. "[German Text Embedding Clustering Benchmark](https://arxiv.org/abs/2401.02709)" arXiv 2024
2 changes: 1 addition & 1 deletion docs/adding_a_dataset.md
@@ -252,7 +252,7 @@ model = SentenceTransformer(model_name)
evaluation = MTEB(tasks=[YourNewTask()])
```

-- [ ] I have run the following models on the task (adding the results to the pr). These can be run using the `mteb -m {model_name} -t {task_name}` command.
+- [ ] I have run the following models on the task (adding the results to the pr). These can be run using the `mteb run -m {model_name} -t {task_name}` command.
- [ ] `sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2`
- [ ] `intfloat/multilingual-e5-small`
- [ ] I have checked that the performance is neither trivial (both models gain close to perfect scores) nor random (both models gain close to random scores).
14 changes: 14 additions & 0 deletions docs/adding_a_model.md
@@ -132,3 +132,17 @@ model = ModelMeta(
...
)
```

##### Adding model dependencies in pyproject.toml
If you are adding a model that requires additional dependencies, add them as an optional extra in the `pyproject.toml` file and, instead of hand-rolling checks for whether the dependencies are installed, use `requires_package` from [requires_package.py](../mteb/requires_packages.py). For example:

In the [voyage_models.py](../mteb/models/voyage_models.py) file, we have added the following code:
```python
requires_package(self, "voyageai", model_name, "pip install 'mteb[voyageai]'")
```
and updated the [pyproject.toml](../pyproject.toml) file with the following entry:
```toml
voyageai = ["voyageai>=1.0.0,<2.0.0"]
```
With this in place, `mteb` checks whether `voyageai` is installed and, if it is not, raises an error that tells the user exactly how to install it. This gives a clear installation instruction instead of an opaque import failure.
If you want to emit a suggestion instead of an error, use `suggest_package` from [requires_package.py](../mteb/requires_packages.py).
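The guard pattern above can be sketched in a few lines. This is a hypothetical approximation, not mteb's actual implementation (the real `requires_package` lives in `mteb/requires_packages.py`, and `VoyageWrapper` is an invented name for illustration):

```python
import importlib.util

def requires_package(calling_obj, package_name, model_name, install_instruction):
    """Raise a clear, actionable error when an optional model dependency is missing."""
    if importlib.util.find_spec(package_name) is None:
        raise ImportError(
            f"{type(calling_obj).__name__} for model '{model_name}' requires the "
            f"'{package_name}' package, which is not installed. "
            f"Install it with: {install_instruction}"
        )

class VoyageWrapper:
    """Hypothetical model wrapper that fails fast if the SDK is absent."""

    def __init__(self, model_name):
        # check before any import of the optional dependency is attempted
        requires_package(self, "voyageai", model_name, "pip install 'mteb[voyageai]'")
        self.model_name = model_name
```

The key design choice is to fail at model-construction time with the exact `pip install` command, rather than letting a bare `ModuleNotFoundError` surface later inside the encode loop.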