Merged
22 changes: 11 additions & 11 deletions README.md
@@ -472,24 +472,24 @@ evaluation.run(model, ...)

## Documentation

| Documentation | |
|--------------------------------|-------------------------------------------------------------------------------------|
| 📋 [Tasks] | Overview of available tasks |
| 📐 [Benchmarks] | Overview of available benchmarks |
| 📈 [Leaderboard] | The interactive leaderboard of the benchmark |
| 🤖 [Adding a model] | How to submit a model to MTEB and the leaderboard |
| 👩‍🔬 [Reproducible workflows] | How to create reproducible workflows with MTEB |
| 👩‍💻 [Adding a dataset] | How to add a new task/dataset to MTEB |
| 👩‍💻 [Adding a benchmark] | How to add a new benchmark to MTEB and to the leaderboard |
| 🤝 [Contributing] | How to contribute to MTEB and set it up for development |
| 🌐 [MMTEB] | An open-source effort to extend MTEB to cover a broad set of languages |

[Tasks]: docs/tasks.md
[Benchmarks]: docs/benchmarks.md
[Contributing]: CONTRIBUTING.md
[Adding a model]: docs/adding_a_model.md
[Adding a dataset]: docs/adding_a_dataset.md
[Adding a benchmark]: docs/adding_a_benchmark.md
[Leaderboard]: https://huggingface.co/spaces/mteb/leaderboard
[MMTEB]: docs/mmteb/readme.md
[Reproducible workflows]: docs/reproducible_workflow.md
7 changes: 7 additions & 0 deletions docs/adding_a_benchmark.md
@@ -0,0 +1,7 @@
## Adding a benchmark

The MTEB Leaderboard is available [here](https://huggingface.co/spaces/mteb/leaderboard) and we encourage additions of new benchmarks. To add a new benchmark:

1. Add your benchmark to [benchmarks.py](../mteb/benchmarks/benchmarks.py) as a `Benchmark` object, and select the MTEB tasks that will be in the benchmark. If some of the tasks do not exist in MTEB, follow the [adding a dataset](adding_a_dataset.md) instructions to add them.
2. Open a PR at https://github.com/embeddings-benchmark/results with the results of models on your benchmark.
3. Once the PRs are merged, your benchmark will be added to the leaderboard automatically after the next workflow trigger.
15 changes: 0 additions & 15 deletions docs/adding_a_leaderboard_tab.md

This file was deleted.

112 changes: 79 additions & 33 deletions docs/adding_a_model.md
@@ -2,7 +2,63 @@

The MTEB Leaderboard is available [here](https://huggingface.co/spaces/mteb/leaderboard). To submit to it:

1. **Add meta information about your model to the [models directory](../mteb/models/)**.
```python
from mteb.model_meta import ModelMeta

your_model = ModelMeta(
    name="model_name",
    languages=["eng-Latn"],  # ISO 639-3 language code plus ISO 15924 script
    open_weights=True,
    revision="5617a9f61b028005a4858fdac845db406aefb181",
    release_date="2024-06-28",
    n_parameters=568_000_000,
    embed_dim=4096,
    license="mit",
    max_tokens=8194,
    reference="https://huggingface.co/BAAI/bge-m3",
    similarity_fn_name="cosine",
    framework=["Sentence Transformers", "PyTorch"],
    use_instructions=False,
    public_training_code=None,
    public_training_data="https://huggingface.co/datasets/cfli/bge-full-data",
    training_datasets={"your_dataset": ["train"]},
)
```
By default, the model will run using the [`sentence_transformers_loader`](../mteb/models/sentence_transformer_wrapper.py) loader function. If you need to use a custom implementation, you can specify the `loader` parameter in the `ModelMeta` class. For example:
```python
from mteb.models.wrapper import Wrapper
from mteb.encoder_interface import PromptType
import numpy as np

class CustomWrapper(Wrapper):
    def __init__(self, model_name, model_revision):
        super().__init__(model_name, model_revision)
        self.embed_dim = 768  # your custom implementation here

    def encode(
        self,
        sentences: list[str],
        *,
        task_name: str,
        prompt_type: PromptType | None = None,
        **kwargs,
    ) -> np.ndarray:
        # your custom implementation here
        return np.zeros((len(sentences), self.embed_dim))
```
Then you can specify the `loader` parameter in the `ModelMeta` class:
```python
from functools import partial

your_model = ModelMeta(
    loader=partial(
        CustomWrapper,
        model_name="model_name",
        model_revision="5617a9f61b028005a4858fdac845db406aefb181",
    ),
    ...
)
```
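Note that `partial` turns the wrapper class plus its fixed arguments into a zero-argument callable, which is all a `loader` needs to be. A standalone illustration of the pattern (plain classes and dummy embeddings, no `mteb` imports):

```python
from functools import partial


class ToyWrapper:
    def __init__(self, model_name: str, model_revision: str):
        self.model_name = model_name
        self.model_revision = model_revision

    def encode(self, sentences: list[str]) -> list[list[float]]:
        # Dummy embeddings: one zero-vector per sentence.
        return [[0.0, 0.0, 0.0] for _ in sentences]


# Bind the constructor arguments up front...
loader = partial(ToyWrapper, model_name="org/my-model", model_revision="abc123")

# ...so the framework can instantiate the model with a bare call.
model = loader()
embeddings = model.encode(["hello", "world"])
```

The framework only ever calls `loader()`; everything model-specific is captured in the partial.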
2. **Run the desired model on MTEB:**

Either use the Python API:

@@ -32,45 +88,35 @@ These will save the results in a folder called `results/{model_name}/{model_revi

To add results to the public leaderboard you can push your results to the [results repository](https://github.com/embeddings-benchmark/results) via a PR. Once merged they will appear on the leaderboard after a day.
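If you want to locate a run's output programmatically, a small helper can rebuild the expected path. The exact on-disk layout should be checked against your local `results/` folder; in particular, flattening the slash in Hub model names to `__` below is an assumption:

```python
from pathlib import Path


def results_dir(model_name: str, model_revision: str, base: str = "results") -> Path:
    # Hub names like "org/model" contain a slash; assume it is flattened
    # to "__" so the name becomes a single directory component.
    safe_name = model_name.replace("/", "__")
    return Path(base) / safe_name / model_revision


path = results_dir(
    "intfloat/multilingual-e5-small",
    "fd1525a9fd15316a2d503bf26ab031a61d056e98",
)
```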


3. (Optional) **Add results to the model card:**

`mteb` implements a CLI for adding results to the model card:

```bash
mteb create_meta --results_folder results/{model_name}/{model_revision} --output_path model_card.md
```

To add the content to the public model card, simply copy the content of the `model_card.md` file to the top of the `README.md` file of your model on the Hub. See [here](https://huggingface.co/Muennighoff/SGPT-5.8B-weightedmean-msmarco-specb-bitfit/blob/main/README.md) for an example.

If the README already exists:

```bash
mteb create_meta --results_folder results/{model_name}/{model_revision} --output_path model_card.md --from_existing your_existing_readme.md
```

Note that running the model on many tasks can result in very large README front matter.

4. **Wait for the leaderboard to refresh:**

The leaderboard [automatically refreshes daily](https://github.com/embeddings-benchmark/leaderboard/commits/main/) so once submitted you only need to wait for the automatic refresh. You can find the workflows for the leaderboard refresh [here](https://github.com/embeddings-benchmark/leaderboard/tree/main/.github/workflows). If you experience issues with the leaderboard please create an [issue](https://github.com/embeddings-benchmark/mteb/issues).

**Notes:**
- We remove models with scores that cannot be reproduced, so please ensure that your model is accessible and scores can be reproduced.

##### Using Prompts with Sentence Transformers

If your model uses Sentence Transformers and requires different prompts for encoding the queries and corpus, you can take advantage of the `prompts` [parameter](https://sbert.net/docs/package_reference/sentence_transformer/SentenceTransformer.html#sentence_transformers.SentenceTransformer).

###### Adding the prompts in the model configuration (Preferred)
Internally, `mteb` uses `query` as the prompt name for encoding the queries and `passage` for encoding the corpus. This is aligned with the default prompt names used by Sentence Transformers.

###### Adding the prompts in the model configuration (Preferred)

You can directly add the prompts when saving and uploading your model to the Hub. For an example, refer to this [configuration file](https://huggingface.co/Snowflake/snowflake-arctic-embed-m-v1.5/blob/3b5a16eaf17e47bd997da998988dce5877a57092/config_sentence_transformers.json). These prompts can then be specified in the ModelMeta object.


```python
from functools import partial

model = ModelMeta(
    loader=partial(  # type: ignore
        sentence_transformers_loader,
        model_name="intfloat/multilingual-e5-small",
        revision="fd1525a9fd15316a2d503bf26ab031a61d056e98",
        model_prompts={
            "query": "query: ",
            "passage": "passage: ",
        },
    ),
    ...
)
```
If you are unable to directly add the prompts in the model configuration, you can instantiate the model using the `sentence_transformers_loader` and pass `prompts` as an argument. For more details, see the `mteb/models/bge_models.py` file.

##### Adding instruction models

@@ -85,4 +131,4 @@ model = ModelMeta(
),
...
)
```