Merged
22 changes: 11 additions & 11 deletions README.md
@@ -472,24 +472,24 @@ evaluation.run(model, ...)

## Documentation

| Documentation | |
|--------------------------------|-------------------------------------------------------------------------------------|
| 📋 [Tasks] | Overview of available tasks |
| 📐 [Benchmarks] | Overview of available benchmarks |
| 📈 [Leaderboard] | The interactive leaderboard of the benchmark |
| 🤖 [Adding a model] | How to submit a model to MTEB and the leaderboard |
| 👩‍🔬 [Reproducible workflows] | How to create reproducible workflows with MTEB |
| 👩‍💻 [Adding a dataset] | How to add a new task/dataset to MTEB |
| 👩‍💻 [Adding a benchmark] | How to add a new benchmark to MTEB and to the leaderboard |
| 🤝 [Contributing] | How to contribute to MTEB and set it up for development |
| 🌐 [MMTEB] | An open-source effort to extend MTEB to cover a broad set of languages |

[Tasks]: docs/tasks.md
[Benchmarks]: docs/benchmarks.md
[Contributing]: CONTRIBUTING.md
[Adding a model]: docs/adding_a_model.md
[Adding a dataset]: docs/adding_a_dataset.md
[Adding a benchmark]: docs/adding_a_benchmark.md
[Leaderboard]: https://huggingface.co/spaces/mteb/leaderboard
[MMTEB]: docs/mmteb/readme.md
[Reproducible workflows]: docs/reproducible_workflow.md
7 changes: 7 additions & 0 deletions docs/adding_a_benchmark.md
@@ -0,0 +1,7 @@
## Adding a benchmark

The MTEB Leaderboard is available [here](https://huggingface.co/spaces/mteb/leaderboard) and we encourage additions of new benchmarks. To add a new benchmark:

1. Add your benchmark to [benchmarks.py](../mteb/benchmarks/benchmarks.py) as a `Benchmark` object, and select the MTEB tasks that will be in the benchmark. If some of the tasks do not exist in MTEB, follow the [adding a dataset](adding_a_dataset.md) instructions to add them.
2. Open a PR at https://github.com/embeddings-benchmark/results with the results of models on your benchmark.
3. Once the PRs are merged, your benchmark will be added to the leaderboard automatically after the next workflow trigger.
15 changes: 0 additions & 15 deletions docs/adding_a_leaderboard_tab.md

This file was deleted.

112 changes: 79 additions & 33 deletions docs/adding_a_model.md
@@ -2,7 +2,63 @@

The MTEB Leaderboard is available [here](https://huggingface.co/spaces/mteb/leaderboard). To submit to it:

1. **Add meta information about your model to the [models directory](../mteb/models/)**.
```python
from mteb.model_meta import ModelMeta

your_model = ModelMeta(
    name="model_name",
    languages=["eng-Latn"],  # ISO 639-3 language code plus ISO 15924 script
    open_weights=True,
    revision="5617a9f61b028005a4858fdac845db406aefb181",
    release_date="2024-06-28",
    n_parameters=568_000_000,
    embed_dim=4096,
    license="mit",
    max_tokens=8194,
    reference="https://huggingface.co/BAAI/bge-m3",
    similarity_fn_name="cosine",
    framework=["Sentence Transformers", "PyTorch"],
    use_instructions=False,
    public_training_code=None,
    public_training_data="https://huggingface.co/datasets/cfli/bge-full-data",
    training_datasets={"your_dataset": ["train"]},
)
```
By default, the model will run using the [`sentence_transformers_loader`](../mteb/models/sentence_transformer_wrapper.py) loader function. If you need to use a custom implementation, you can specify the `loader` parameter in the `ModelMeta` class. For example:
```python
from mteb.models.wrapper import Wrapper
from mteb.encoder_interface import PromptType
import numpy as np

class CustomWrapper(Wrapper):
    def __init__(self, model_name, model_revision):
        super().__init__(model_name, model_revision)
        self.embed_dim = 768  # your custom implementation here

    def encode(
        self,
        sentences: list[str],
        *,
        task_name: str,
        prompt_type: PromptType | None = None,
        **kwargs,
    ) -> np.ndarray:
        # your custom implementation here
        return np.zeros((len(sentences), self.embed_dim))
```
Then you can specify the `loader` parameter in the `ModelMeta` class:
```python
from functools import partial

your_model = ModelMeta(
    loader=partial(
        CustomWrapper,
        model_name="model_name",
        model_revision="5617a9f61b028005a4858fdac845db406aefb181",
    ),
    ...
)
```
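Note that `partial` turns the wrapper class plus its fixed arguments into a zero-argument callable, which is all a `loader` needs to be. A standalone illustration of the pattern (plain classes and dummy embeddings, no `mteb` imports):

```python
from functools import partial


class ToyWrapper:
    def __init__(self, model_name: str, model_revision: str):
        self.model_name = model_name
        self.model_revision = model_revision

    def encode(self, sentences: list[str]) -> list[list[float]]:
        # Dummy embeddings: one zero-vector per sentence.
        return [[0.0, 0.0, 0.0] for _ in sentences]


# Bind the constructor arguments up front...
loader = partial(ToyWrapper, model_name="org/my-model", model_revision="abc123")

# ...so the framework can instantiate the model with a bare call.
model = loader()
embeddings = model.encode(["hello", "world"])
```

The framework only ever calls `loader()`; everything model-specific is captured in the partial.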
2. **Run the desired model on MTEB:**

Either use the Python API:

@@ -32,45 +88,35 @@ These will save the results in a folder called `results/{model_name}/{model_revi

To add results to the public leaderboard you can push your results to the [results repository](https://github.com/embeddings-benchmark/results) via a PR. Once merged they will appear on the leaderboard after a day.
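If you want to locate a run's output programmatically, a small helper can rebuild the expected path. The exact on-disk layout should be checked against your local `results/` folder; in particular, flattening the slash in Hub model names to `__` below is an assumption:

```python
from pathlib import Path


def results_dir(model_name: str, model_revision: str, base: str = "results") -> Path:
    # Hub names like "org/model" contain a slash; assume it is flattened
    # to "__" so the name becomes a single directory component.
    safe_name = model_name.replace("/", "__")
    return Path(base) / safe_name / model_revision


path = results_dir(
    "intfloat/multilingual-e5-small",
    "fd1525a9fd15316a2d503bf26ab031a61d056e98",
)
```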


3. (Optional) **Add results to the model card:**

`mteb` implements a CLI for adding results to the model card:

```bash
mteb create_meta --results_folder results/{model_name}/{model_revision} --output_path model_card.md
```

To add the content to the public model card, simply copy the content of the `model_card.md` file to the top of the `README.md` file of your model on the Hub. See [here](https://huggingface.co/Muennighoff/SGPT-5.8B-weightedmean-msmarco-specb-bitfit/blob/main/README.md) for an example.

If the README already exists:

```bash
mteb create_meta --results_folder results/{model_name}/{model_revision} --output_path model_card.md --from_existing your_existing_readme.md
```

Note that running the model on many tasks can result in very large README front matter.

4. **Wait for the leaderboard to refresh:**

The leaderboard [automatically refreshes daily](https://github.com/embeddings-benchmark/leaderboard/commits/main/) so once submitted you only need to wait for the automatic refresh. You can find the workflows for the leaderboard refresh [here](https://github.com/embeddings-benchmark/leaderboard/tree/main/.github/workflows). If you experience issues with the leaderboard please create an [issue](https://github.com/embeddings-benchmark/mteb/issues).

**Notes:**
- We remove models with scores that cannot be reproduced, so please ensure that your model is accessible and scores can be reproduced.

##### Using Prompts with Sentence Transformers

If your model uses Sentence Transformers and requires different prompts for encoding the queries and corpus, you can take advantage of the `prompts` [parameter](https://sbert.net/docs/package_reference/sentence_transformer/SentenceTransformer.html#sentence_transformers.SentenceTransformer).

###### Adding the prompts in the model configuration (Preferred)
Internally, `mteb` uses `query` as the prompt name for encoding the queries and `passage` for encoding the corpus. This is aligned with the default prompt names used by Sentence Transformers.

###### Adding the prompts in the model configuration (Preferred)

You can directly add the prompts when saving and uploading your model to the Hub. For an example, refer to this [configuration file](https://huggingface.co/Snowflake/snowflake-arctic-embed-m-v1.5/blob/3b5a16eaf17e47bd997da998988dce5877a57092/config_sentence_transformers.json). These prompts can then be specified in the ModelMeta object.


```python
from functools import partial

model = ModelMeta(
    loader=partial(  # type: ignore
        sentence_transformers_loader,
        model_name="intfloat/multilingual-e5-small",
        revision="fd1525a9fd15316a2d503bf26ab031a61d056e98",
        model_prompts={
            "query": "query: ",
            "passage": "passage: ",
        },
    ),
    ...
)
```
If you are unable to directly add the prompts in the model configuration, you can instantiate the model using the `sentence_transformers_loader` and pass `prompts` as an argument. For more details, see the `mteb/models/bge_models.py` file.

##### Adding instruction models

@@ -85,4 +131,4 @@ model = ModelMeta(
),
...
)
```