Merged
171 commits
9556f99
misc: Add image classification descriptive stats implementation (#2045)
isaac-chung Feb 13, 2025
fadba48
Update tasks table
github-actions[bot] Feb 13, 2025
01fd6fb
fix: Add column descriptions to leaderboard (#2039)
KennethEnevoldsen Feb 13, 2025
3537223
fix: Add BRIGHT (long) and fix bug in TaskResult.filter_and_validate(…
KennethEnevoldsen Feb 13, 2025
68ff565
1.34.12
invalid-email-address Feb 13, 2025
eb32719
misc: Add image clustering descriptive stats implementation (#2057)
isaac-chung Feb 13, 2025
50b8e7b
fix: Update embed_dim for jina models (#2058)
KennethEnevoldsen Feb 13, 2025
48ef6f4
Update tasks table
github-actions[bot] Feb 13, 2025
8b7f2f8
1.34.13
invalid-email-address Feb 13, 2025
02d2583
Add giga embeddings (#1741)
Samoed Feb 13, 2025
20df284
misc: Add ZS and multilabel image classification descriptive stats im…
isaac-chung Feb 14, 2025
e090330
Update tasks table
github-actions[bot] Feb 14, 2025
bef4046
Rename MIEB task classes with duplicated names (#2061)
Samoed Feb 14, 2025
3cf7b15
misc: Add VisualSTS descriptive stats (#2062)
isaac-chung Feb 14, 2025
479fa20
Update tasks table
github-actions[bot] Feb 14, 2025
76e05dd
fix: Added gte models (#1539)
KennethEnevoldsen Feb 14, 2025
8604e07
fix: Add climate fever v2 (#1873)
mina-parham Feb 14, 2025
11ced79
Update tasks table
github-actions[bot] Feb 14, 2025
c6829d3
fix: Updating paper scripts (#1958)
KennethEnevoldsen Feb 14, 2025
26708c5
1.34.14
invalid-email-address Feb 14, 2025
5f4b593
Add datasets for a benchmark newly introduced for "Engineering" domai…
mehrzadshm Feb 15, 2025
dbda3c5
Update tasks table
github-actions[bot] Feb 15, 2025
50cc1c9
misc: update model names to adjust for adding to results repo (#2074)
isaac-chung Feb 16, 2025
04c9993
misc: Add all image classification descriptive stats (#2073)
isaac-chung Feb 17, 2025
3dbdeb1
Update tasks table
github-actions[bot] Feb 17, 2025
efaa990
ci: Rerun tests that fail due to networking issues. (#2029)
sam-hey Feb 17, 2025
26360a0
fix: generate metadata (#2063)
sam-hey Feb 17, 2025
8d4adbf
1.34.15
invalid-email-address Feb 17, 2025
efe2578
fix: add missing `e5` training datasets (#2065)
Samoed Feb 17, 2025
8ef26d0
1.34.16
invalid-email-address Feb 17, 2025
b14963f
fix: Ensure voyage model uses different naming scheme (#2083)
KennethEnevoldsen Feb 17, 2025
2d1f10d
1.34.17
invalid-email-address Feb 17, 2025
07562f4
fix: Freeze model/rank columns in leaderboard (#2044)
shikhar1729 Feb 17, 2025
879b243
1.34.18
invalid-email-address Feb 17, 2025
12d9b96
fix: Fixed previous incorrect specification of splits for CMTEB ( MTE…
KennethEnevoldsen Feb 17, 2025
72d454f
1.34.19
invalid-email-address Feb 17, 2025
c6e5123
Remove duplicated string in docstring of TaskMetadata class (#2087)
dantetemplar Feb 17, 2025
1006770
fix: Smarter leaderboard caching with cachetools (#2085)
x-tabdeveloping Feb 17, 2025
6637ff9
fix: Missing fixes for #2086 - change MultilingualSentiment split fro…
KennethEnevoldsen Feb 17, 2025
1f9cfc8
1.34.20
invalid-email-address Feb 17, 2025
1b1d327
merge gme models (#2089)
Samoed Feb 17, 2025
3deb7ea
fix: Add back task filtering by modalities (#2080)
isaac-chung Feb 18, 2025
544bcd1
1.34.21
invalid-email-address Feb 18, 2025
bbfbc45
Added gtr-t5-base/large/xl/xxl metadata to mteb (#2092)
sufen-f Feb 18, 2025
0371102
misc: Add Any2TextMutipleChoice Descriptive Statistics (#2095)
isaac-chung Feb 18, 2025
9ca55f0
Update tasks table
github-actions[bot] Feb 18, 2025
e0b364b
fix: Updated model annotations for GTE, e5, gritlm, and SFR models (#…
KennethEnevoldsen Feb 19, 2025
6b9f945
fix: Update links (#2098)
Muennighoff Feb 19, 2025
06489ab
1.34.22
invalid-email-address Feb 19, 2025
c69b8c3
Add model inf-retriever-v1-1.5b (#2106)
SamuelYang1 Feb 20, 2025
caa0b77
docs: Fix typos & refine text (#2102)
Muennighoff Feb 20, 2025
56a7b1a
misc: Run Zeroshot Classification Descriptive Stats (#2105)
isaac-chung Feb 20, 2025
6e0c87a
Update tasks table
github-actions[bot] Feb 20, 2025
6a71485
fix: add warning about task category conversion (#2108)
isaac-chung Feb 20, 2025
c91fbd0
1.34.23
invalid-email-address Feb 20, 2025
c052bbb
fix: Add codesage-large-v2 (#2090)
Aradhye2002 Feb 20, 2025
226b652
1.34.24
invalid-email-address Feb 20, 2025
cb42f4a
fix: add training data to BGE-m3-custom-fr (#2110)
KennethEnevoldsen Feb 20, 2025
dbe7559
1.34.25
invalid-email-address Feb 20, 2025
fb14e0c
fix: Upgrade ruff to be gradio compatible (#2111)
KennethEnevoldsen Feb 20, 2025
7538a2d
1.34.26
invalid-email-address Feb 20, 2025
276840f
docs: Follow google docstring format (#2115)
KennethEnevoldsen Feb 20, 2025
f3e4a9a
Update leaderboard_refresh.yaml (#2121)
Samoed Feb 21, 2025
463ca54
fix InstructSentenceTransformer Model name (#2125)
Samoed Feb 21, 2025
b032f98
fix voyage (#2127)
Samoed Feb 21, 2025
44cfa9b
fix: update e5 instruct training data (#2129)
Samoed Feb 21, 2025
d5a40e6
1.34.27
invalid-email-address Feb 21, 2025
950e3ab
format
KennethEnevoldsen Feb 21, 2025
de2e3e3
Update tasks table
github-actions[bot] Feb 21, 2025
e7735b2
fix: Add 2 new Static Sentence Transformer models (#2112)
tomaarsen Feb 21, 2025
2874e0c
1.34.28
invalid-email-address Feb 21, 2025
e6eb473
add is_cross_encoder (#1869)
Samoed Feb 21, 2025
17a120a
Qodo embed 1 1.5 b (#2137)
talshef Feb 23, 2025
4389501
misc: merge summary retrieval into bitext mining (#2140)
isaac-chung Feb 24, 2025
0163342
test: fix dataset availability test (#2141)
KennethEnevoldsen Feb 24, 2025
760fcaf
fix: Update NVIDIA-Embed training data (#2143)
KennethEnevoldsen Feb 24, 2025
9f6cc4e
1.34.29
invalid-email-address Feb 24, 2025
8538e93
fix: Add annotations for Voyage exp (#2144)
KennethEnevoldsen Feb 24, 2025
25cd62d
1.34.30
invalid-email-address Feb 24, 2025
8e97d36
Fix tokens num in cde models (#2148)
Samoed Feb 24, 2025
0e624b2
feat: Add Qodo-Embed-1-7B model metadata and rename existing model (#…
talshef Feb 24, 2025
4d23c6c
1.35.0
invalid-email-address Feb 24, 2025
bd2a67c
misc: add Any2AnyRetrievalDescriptiveStatistics (#2139)
isaac-chung Feb 24, 2025
ef3f4f0
Update tasks table
github-actions[bot] Feb 24, 2025
a7dc95a
Added zero-shot percentages and different filtering scheme (#2153)
x-tabdeveloping Feb 25, 2025
565e29c
fix: Incorrect annotations for Mistral-based embedding models (#2157)
KennethEnevoldsen Feb 25, 2025
90ec21c
1.35.1
invalid-email-address Feb 25, 2025
8afb78a
Update FaMTEBRetrieval.py (#2171)
garciasces Feb 26, 2025
331cded
Update tasks table
github-actions[bot] Feb 26, 2025
6cc1822
fix: Add Training data annotations (#2173)
KennethEnevoldsen Feb 26, 2025
ed0cb31
1.35.2
invalid-email-address Feb 26, 2025
dea231b
feat: Add MIEB and MIEB-lite as benchmarks (#2035)
isaac-chung Feb 27, 2025
dbcbf54
Update tasks table
github-actions[bot] Feb 27, 2025
afe1739
1.36.0
invalid-email-address Feb 27, 2025
62b33f2
fix: update training datasets and revision for jina models (#2179)
Feb 27, 2025
1959c73
fix: Add more training data annotations (#2178)
KennethEnevoldsen Feb 27, 2025
4a0bb5c
1.36.1
invalid-email-address Feb 27, 2025
43d15f1
Added training data annotation for e5-base-4k (#2186)
x-tabdeveloping Feb 28, 2025
1b23d4e
fix: Added training data annotations to MXBAI (#2185)
x-tabdeveloping Feb 28, 2025
7daf893
fix: Update MTEB(Scandinavian) to use new DanFEVER (#2180)
KennethEnevoldsen Feb 28, 2025
0307102
fix: Added training data annotation for MMLW models (#2188)
x-tabdeveloping Feb 28, 2025
7642c07
1.36.2
invalid-email-address Feb 28, 2025
0901cf6
fix: Added training data for sentence-croissant (#2189)
x-tabdeveloping Feb 28, 2025
d4b691f
1.36.3
invalid-email-address Feb 28, 2025
3325f7e
fix: update ru models annotation (#2181)
Samoed Feb 28, 2025
c04d158
1.36.4
invalid-email-address Feb 28, 2025
fee6fc0
fix: Alphabetical ordering of tasks in dropdowns (#2191)
ayush1298 Feb 28, 2025
0631089
1.36.5
invalid-email-address Feb 28, 2025
7345235
misc: Speed up qrel creation in any2anyretrieval (#2196)
isaac-chung Feb 28, 2025
29464ac
use 'mteb.MTEB' instead of 'MTEB' for custom model (#2199)
yaya-sy Feb 28, 2025
1c8d715
add base models for e5 (#2183)
Samoed Mar 2, 2025
7af37d4
add similar datasets (#2205)
Samoed Mar 2, 2025
587892d
add labse annotation (#2182)
Samoed Mar 2, 2025
761a174
fix: Fixed leaderboard crash (#2221)
x-tabdeveloping Mar 3, 2025
e57cd50
1.36.6
invalid-email-address Mar 3, 2025
2dd1391
fix: More training data annotations (#2220)
x-tabdeveloping Mar 3, 2025
546e0c4
1.36.7
invalid-email-address Mar 3, 2025
4ee4e7c
Add LLM2CLIP (OpenAI variants) (#2222)
isaac-chung Mar 3, 2025
c5fded2
Change `dataset on HF` test to use official api (#2213)
Samoed Mar 3, 2025
3e991bd
Descriptive stats functions for Any2AnyMC and ImageTextPC (#2197)
imenelydiaker Mar 3, 2025
cc47225
Update tasks table
github-actions[bot] Mar 3, 2025
ee514cb
fix: Add training data annotations to uderver-bloom models (#2210)
KennethEnevoldsen Mar 3, 2025
4de58c3
1.36.8
invalid-email-address Mar 3, 2025
a87927b
Add comment to `voyage-3-m-exp` model (#2229)
Samoed Mar 3, 2025
3a9d271
docs: Update description of EURLex (#2231)
KennethEnevoldsen Mar 4, 2025
7f7d3e8
Automatically add similar tasks to training_tasks (#2228)
Samoed Mar 4, 2025
6129282
Remove overlapping legends from radar chart (#2195)
ayush1298 Mar 5, 2025
40b89db
misc: Run Any2AnyRetrieval descriptive stats (#2223)
isaac-chung Mar 6, 2025
e81d109
Update tasks table
github-actions[bot] Mar 6, 2025
43cb205
misc: Add rest of the vision centric and compositionality descriptive…
isaac-chung Mar 6, 2025
d8e73e7
Update tasks table
github-actions[bot] Mar 6, 2025
a4456ec
Fix `calculate_memory_usage_mb` in adding_a_model.md (#2271)
Samoed Mar 6, 2025
f964829
Add Arabic-Triplet-Matryoshka-V2 model metadata to MTEB (#2270)
omarnj-lab Mar 7, 2025
9d6e1a9
fix: Add WebFAQ Retrieval dataset (#2236)
michaeldinzinger Mar 7, 2025
a67c4d0
Update tasks table
github-actions[bot] Mar 7, 2025
1841aca
1.36.9
invalid-email-address Mar 7, 2025
c456111
fix: Formatting issue in Performance Plot (#2237)
ayush1298 Mar 7, 2025
1d41474
1.36.10
invalid-email-address Mar 7, 2025
55b9a0e
ci: run test_dataset_on_hf separately (#2201)
sam-hey Mar 7, 2025
fb1b04c
add gemini-embedding-exp-03-07 (#2279)
jhyuklee Mar 7, 2025
9513f15
update link (#2281)
jhyuklee Mar 7, 2025
e628bce
fix: Run remaining MIEB desc stats (#2288)
isaac-chung Mar 8, 2025
dd7008d
Update tasks table
github-actions[bot] Mar 8, 2025
18ed1bb
1.36.11
invalid-email-address Mar 8, 2025
f840f7d
fix: Added Filter Modality (#2262)
ayush1298 Mar 9, 2025
6284f25
1.36.12
invalid-email-address Mar 9, 2025
5dce601
fix: Add `ModelMeta` license & custom validations (#2293)
Samoed Mar 9, 2025
02003b1
1.36.13
invalid-email-address Mar 9, 2025
5b30d84
ci: Add pre-commit hook (#2194)
sam-hey Mar 10, 2025
5e3915e
Update tasks table
github-actions[bot] Mar 10, 2025
6193db1
fix: bug in voyage implementation (#2304)
KennethEnevoldsen Mar 10, 2025
c4d2888
1.36.14
invalid-email-address Mar 10, 2025
746b411
fix: Update voyage name to include Org. (#2322)
KennethEnevoldsen Mar 11, 2025
5f6872e
1.36.15
invalid-email-address Mar 11, 2025
7965aad
Added VDR Model (#2290)
ayush1298 Mar 11, 2025
8f6bf45
fix: Resolve conflicting dependencies (#2323)
KennethEnevoldsen Mar 11, 2025
122eaa1
1.36.16
invalid-email-address Mar 11, 2025
fc176ad
fix: remove SyntaxWarnings in py312 (#2325)
KennethEnevoldsen Mar 11, 2025
8b14281
1.36.17
invalid-email-address Mar 11, 2025
034da4d
fix: add annotation models for stella zh (#2277)
KennethEnevoldsen Mar 11, 2025
d58f229
1.36.18
invalid-email-address Mar 11, 2025
ae83b5f
fix: Add ModelMeta rubert-mini-frida, BERTA (#2330)
sergeyz-zh Mar 11, 2025
849efbb
docs: fix typos
Muennighoff Mar 11, 2025
f16b3f9
1.36.19
invalid-email-address Mar 11, 2025
04cfe4d
fix: Add WebFAQ bitext mining tasks (#2326)
michaeldinzinger Mar 12, 2025
d716408
Update tasks table
github-actions[bot] Mar 12, 2025
c40747f
1.36.20
invalid-email-address Mar 12, 2025
c5bb156
Merge remote-tracking branch 'origin/main' into maeb-main-merge-20250312
isaac-chung Mar 12, 2025
5597c34
make lint
isaac-chung Mar 12, 2025
096d499
fix validation for license
isaac-chung Mar 12, 2025
6305df7
fix remaining validation errors
isaac-chung Mar 12, 2025
8 changes: 4 additions & 4 deletions .github/pull_request_template.md
@@ -29,12 +29,12 @@
  - [ ] I have checked that the performance is neither trivial (both models gain close to perfect scores) nor random (both models gain close to random scores).
  - [ ] If the dataset is too big (e.g. >2048 examples), considering using `self.stratified_subsampling() under dataset_transform()`
  - [ ] I have filled out the metadata object in the dataset file (find documentation on it [here](https://github.com/embeddings-benchmark/mteb/blob/main/docs/adding_a_dataset.md#2-creating-the-metadata-object)).
- - [ ] Run tests locally to make sure nothing is broken using `make test`.
- - [ ] Run the formatter to format the code using `make lint`.
+ - [ ] Run tests locally to make sure nothing is broken using `make test`.
+ - [ ] Run the formatter to format the code using `make lint`.


  ### Adding a model checklist
- <!--
+ <!--
  When adding a model to the model registry
  see also https://github.com/embeddings-benchmark/mteb/blob/main/docs/reproducible_workflow.md
  -->
@@ -43,4 +43,4 @@ see also https://github.com/embeddings-benchmark/mteb/blob/main/docs/reproducibl
  - [ ] I have ensured that my model can be loaded using
  - [ ] `mteb.get_model(model_name, revision)` and
  - [ ] `mteb.get_model_meta(model_name, revision)`
- - [ ] I have tested the implementation works on a representative set of tasks.
+ - [ ] I have tested the implementation works on a representative set of tasks.
27 changes: 27 additions & 0 deletions .github/workflows/dataset_loading.yml
@@ -0,0 +1,27 @@
+ name: Datasets available on HuggingFace
+
+ on:
+ pull_request:
+ push:
+ branches: [main]
+
+ jobs:
+ extract-and-run:
+ runs-on: ubuntu-latest
+
+ steps:
+ - name: Checkout repository
+ uses: actions/checkout@v3
+
+ - name: Set up Python
+ uses: actions/setup-python@v4
+ with:
+ python-version: '3.10'
+ cache: 'pip'
+
+ - name: Install dependencies
+ run: |
+ make install-for-tests
+ - name: Run dataset loading tests
+ run: |
+ make dataset-load-test
3 changes: 1 addition & 2 deletions .github/workflows/docs.yml
@@ -47,7 +47,7 @@ jobs:
  - name: Create table
  run: |
  make build-docs
-
+
  - name: Push table
  run: |
  git config --global user.email "github-actions[bot]@users.noreply.github.com"
@@ -60,4 +60,3 @@ jobs:
  git commit -m "Update tasks table"
  git push
  fi
-
6 changes: 3 additions & 3 deletions .github/workflows/leaderboard_refresh.yaml
@@ -2,8 +2,8 @@ name: Daily Space Rebuild
  on:
  schedule:
  # Runs at midnight Pacific Time (8 AM UTC)
- - cron: '0 8 * * *'
- workflow_dispatch: # Allows manual triggering
+ - cron: "0 8 * * *"
+ workflow_dispatch: # Allows manual triggering

  jobs:
  rebuild:
@@ -12,5 +12,5 @@ jobs:
  - name: Trigger Factory Rebuild
  run: |
  curl -X POST \
- "https://huggingface.co/api/spaces/mteb/leaderboard_2_demo/restart?factory=true" \
+ "https://huggingface.co/api/spaces/mteb/leaderboard/restart?factory=true" \
  -H "Authorization: Bearer ${{ secrets.HF_TOKEN }}"
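
The schedule comment in this workflow equates 08:00 UTC with midnight Pacific Time. GitHub Actions evaluates `schedule` cron expressions in UTC, so this is exact only while Pacific Standard Time (UTC-8) is in effect. During daylight saving time (UTC-7) the same trigger fires at 01:00 local time. The following is a minimal sketch of that arithmetic; it uses fixed offsets rather than a time-zone database, and `local_run_time` is a hypothetical helper, not part of the repository.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets are an assumption made for illustration; a real check
# would use a time-zone database entry such as America/Los_Angeles.
PST = timezone(timedelta(hours=-8), "PST")
PDT = timezone(timedelta(hours=-7), "PDT")


def local_run_time(utc_hour, tz):
    """Convert a cron hour (interpreted as UTC) to a local wall-clock time."""
    # The calendar date is arbitrary; only the hour conversion matters here.
    run_utc = datetime(2025, 1, 15, utc_hour, 0, tzinfo=timezone.utc)
    return run_utc.astimezone(tz).strftime("%H:%M %Z")


print(local_run_time(8, PST))  # 00:00 PST
print(local_run_time(8, PDT))  # 01:00 PDT
```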
1 change: 0 additions & 1 deletion .github/workflows/lint.yml
@@ -25,4 +25,3 @@ jobs:
  id: lint
  run: |
  make lint-check
-
22 changes: 11 additions & 11 deletions .github/workflows/model_loading.yml
@@ -3,22 +3,22 @@ name: Model Loading
  on:
  pull_request:
  paths:
- - 'mteb/models/**.py'
+ - "mteb/models/**.py"

  jobs:
  extract-and-run:
  runs-on: ubuntu-latest

  steps:
- - name: Checkout repository
- uses: actions/checkout@v3
+ - name: Checkout repository
+ uses: actions/checkout@v3

- - name: Set up Python
- uses: actions/setup-python@v4
- with:
- python-version: '3.10'
- cache: 'pip'
+ - name: Set up Python
+ uses: actions/setup-python@v4
+ with:
+ python-version: "3.10"
+ cache: "pip"

- - name: Install dependencies and run tests
- run: |
- make model-load-test BASE_BRANCH=${{ github.event.pull_request.base.ref }}
+ - name: Install dependencies and run tests
+ run: |
+ make model-load-test BASE_BRANCH=${{ github.event.pull_request.base.ref }}
7 changes: 3 additions & 4 deletions .github/workflows/release.yml
@@ -20,8 +20,7 @@ jobs:
  runs-on: ubuntu-latest
  concurrency: release
  permissions:
- id-token: write # IMPORTANT: this permission is mandatory for trusted publishing using PyPI
-
+ id-token: write # IMPORTANT: this permission is mandatory for trusted publishing using PyPI

  if: ${{ github.ref == 'refs/heads/main' && github.event.workflow_run.conclusion == 'success'}}
  steps:
@@ -40,8 +39,8 @@ jobs:
  - name: Publish package distributions to PyPI
  uses: pypa/gh-action-pypi-publish@release/v1
  if: steps.release.outputs.released == 'true'
- # This action supports PyPI's trusted publishing implementation, which allows authentication to PyPI without a manually
- # configured API token or username/password combination. To perform trusted publishing with this action, your project's
+ # This action supports PyPI's trusted publishing implementation, which allows authentication to PyPI without a manually
+ # configured API token or username/password combination. To perform trusted publishing with this action, your project's
  # publisher must already be configured on PyPI.

  - name: Publish package distributions to GitHub Releases
4 changes: 1 addition & 3 deletions .github/workflows/test.yml
@@ -2,7 +2,6 @@
  # 1) install Python dependencies
  # 2) run make test

-
  name: Test
  on:
  push:
@@ -30,7 +29,7 @@ jobs:
  with:
  python-version: ${{ matrix.python-version }}
  cache: "pip"
-
+
  - name: Install dependencies
  shell: bash
  run: |
@@ -53,4 +52,3 @@ jobs:
  # if it fails again, the workflow will fail.
  # If it passes the first time the test will not run again
  make test || make test
-
2 changes: 1 addition & 1 deletion .gitignore
@@ -151,4 +151,4 @@ model_names.txt
  mteb/leaderboard/__cached_results.json

  # gradio
- .gradio/
+ .gradio/
31 changes: 31 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,31 @@
+ fail_fast: true
+
+ repos:
+ - repo: https://github.com/abravalheri/validate-pyproject
+ rev: v0.23
+ hooks:
+ - id: validate-pyproject
+
+ - repo: https://github.com/pre-commit/pre-commit-hooks
+ rev: v2.3.0
+ hooks:
+ - id: check-yaml
+ - id: check-json
+ - id: pretty-format-json
+ args:
+ - "--autofix"
+ - "--indent=4"
+ - "--no-sort-keys"
+ - id: end-of-file-fixer # generated a lot of changes
+ - id: trailing-whitespace
+ - id: check-toml
+
+ - repo: local
+ hooks:
+ - id: lint
+ name: lint
+ description: "Run 'make lint'"
+ entry: make lint
+ language: python
+ types_or: [python]
+ minimum_pre_commit_version: "2.9.2"
2 changes: 1 addition & 1 deletion .vscode/extensions.json
@@ -2,4 +2,4 @@
  "recommendations": [
  "charliermarsh.ruff"
  ]
- }
+ }
2 changes: 1 addition & 1 deletion .vscode/settings.json
@@ -4,5 +4,5 @@
  ],
  "python.testing.unittestEnabled": false,
  "python.testing.pytestEnabled": true,
- "editor.defaultFormatter": "charliermarsh.ruff",
+ "editor.defaultFormatter": "charliermarsh.ruff"
  }
16 changes: 7 additions & 9 deletions CONTRIBUTING.md
@@ -1,10 +1,8 @@
  ## Contributing to MTEB
- We welcome contributions such as new datasets to MTEB! Please see detailed see the related [issue](https://github.com/embeddings-benchmark/mteb/issues/360) for more information.
-
- Once you have decided on your contribution, this document describes how to set up the repository for development.
+ We welcome contributions. Please see the current open issues or open an issue yourself. Once you have decided on what you'd like to contribute, this document describes how to set up the repository for development.

  ### Development Installation
- If you want to submit a dataset or on other ways contribute to MTEB, you can install the package in development mode:
+ If you want to submit a dataset or in other ways contribute to MTEB, you can install the package in development mode:

  ```bash
  git clone https://github.com/embeddings-benchmark/mteb
@@ -21,10 +19,10 @@ To run the tests, you can use the following command:
  make test
  ```

- This is also run by the CI pipeline, so you can be sure that your changes do not break the package. We recommend running the tests in the lowest version of python supported by the package (see the pyproject.toml) to ensure compatibility.
+ This is also run by the CI pipeline, so you can be sure that your changes do not break the package. We recommend running the tests in the lowest version of Python supported by the package (see the pyproject.toml) to ensure compatibility.

  ### Running linting
- To run the linting before a PR you can use the following command:
+ To run the linting before a PR, you can use the following command:

  ```bash
  make lint
@@ -33,12 +31,12 @@ make lint
  This command is equivalent to the command run during CI. It will check for code style and formatting issues.

  ## Semantic Versioning and Releases
- MTEB follows [semantic versioning](https://semver.org/). This means that the version number of the package is composed of three numbers: `MAJOR.MINOR.PATCH`. This allow us to use existing tools to automatically manage the versioning of the package. For maintainers (and contributors), this means that commits with the following prefixes will automatically trigger a version bump:
+ MTEB follows [semantic versioning](https://semver.org/). This means that the version number of the package is composed of three numbers: `MAJOR.MINOR.PATCH`. This allows us to use existing tools to manage the versioning of the package automatically. For maintainers (and contributors), this means that commits with the following prefixes will automatically trigger a version bump:

  - `fix:` for patches
  - `feat:` for minor versions
  - `breaking:` for major versions

- Any commit with one of these prefixes will trigger a version bump upon merging to the main branch as long as tests pass. A version bump will then trigger a new release on PyPI as well as a new release on GitHub.
+ Any commit with one of these prefixes will trigger a version bump upon merging to the main branch, as long as the tests pass. A version bump will then trigger a new release on PyPI as well as a new release on GitHub.

- Other prefixes will not trigger a version bump. For example, `docs:`, `chore:`, `refactor:`, etc., however they will structure the commit history and the changelog. You can find more information about this in the [python-semantic-release documentation](https://python-semantic-release.readthedocs.io/en/latest/). If you do not intend to trigger a version bump you're not required to follow this convention when contributing to MTEB.
+ Other prefixes will not trigger a version bump. For example, `docs:`, `chore:`, `refactor:`, etc., however they will structure the commit history and the changelog. You can find more information about this in the [python-semantic-release documentation](https://python-semantic-release.readthedocs.io/en/latest/). If you do not intend to trigger a version bump, you're not required to follow this convention when contributing to MTEB.
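
The prefix rules described in the CONTRIBUTING.md text above can be illustrated with a small sketch. It is not the python-semantic-release implementation or the repository's configuration; `BUMP_BY_PREFIX`, `bump_for_commit` and `apply_bump` are hypothetical names, and the parsing is deliberately simplified to "a word followed by a colon at the start of the commit title".

```python
import re

# Prefix-to-bump mapping as stated in CONTRIBUTING.md (names are illustrative).
BUMP_BY_PREFIX = {"fix": "patch", "feat": "minor", "breaking": "major"}


def bump_for_commit(message):
    """Return 'patch', 'minor', 'major', or None for a commit title."""
    # Simplified assumption: the prefix is a single word directly followed by ':'.
    match = re.match(r"(\w+):", message)
    if match is None:
        return None
    return BUMP_BY_PREFIX.get(match.group(1))


def apply_bump(version, bump):
    """Apply a bump kind to a MAJOR.MINOR.PATCH version string."""
    major, minor, patch = (int(part) for part in version.split("."))
    if bump == "major":
        return f"{major + 1}.0.0"
    if bump == "minor":
        return f"{major}.{minor + 1}.0"
    if bump == "patch":
        return f"{major}.{minor}.{patch + 1}"
    return version  # no recognised prefix: version unchanged


title = "feat: Add MIEB and MIEB-lite as benchmarks (#2035)"
print(apply_bump("1.35.2", bump_for_commit(title)))  # 1.36.0
```

This agrees with the commit list of this PR, where that `feat:` commit is followed by the `1.36.0` release commit, while titles without a recognised prefix (for example `Update tasks table` or `misc: ...`) are not followed by a version commit.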
26 changes: 22 additions & 4 deletions Makefile
@@ -1,6 +1,7 @@
  install:
  @echo "--- 🚀 Installing project dependencies ---"
  pip install -e ".[dev]"
+ pre-commit install

  install-for-tests:
  @echo "--- 🚀 Installing project dependencies for test ---"
@@ -10,7 +11,7 @@ install-for-tests:
  lint:
  @echo "--- 🧹 Running linters ---"
  ruff format . # running ruff formatting
- ruff check . --fix # running ruff linting
+ ruff check . --fix --exit-non-zero-on-fix # running ruff linting # --exit-non-zero-on-fix is used for the pre-commit hook to work

  lint-check:
  @echo "--- 🧹 Check is project is linted ---"
@@ -20,11 +21,12 @@ lint-check:

  test:
  @echo "--- 🧪 Running tests ---"
- pytest -n auto --durations=5
+ pytest -n auto -m "not test_datasets"
+

  test-with-coverage:
  @echo "--- 🧪 Running tests with coverage ---"
- pytest -n auto --durations=5 --cov-report=term-missing --cov-config=pyproject.toml --cov=mteb
+ pytest -n auto --cov-report=term-missing --cov-config=pyproject.toml --cov=mteb

  pr:
  @echo "--- 🚀 Running requirements for a PR ---"
@@ -42,4 +44,20 @@ model-load-test:
  @echo "--- 🚀 Running model load test ---"
  pip install ".[dev, speedtask, pylate,gritlm,xformers,model2vec]"
  python scripts/extract_model_names.py $(BASE_BRANCH) --return_one_model_name_per_file
- python tests/test_models/model_loading.py --model_name_file scripts/model_names.txt
+ python tests/test_models/model_loading.py --model_name_file scripts/model_names.txt
+
+
+ dataset-load-test:
+ @echo "--- 🚀 Running dataset load test ---"
+ pytest -n auto -m test_datasets
+
+
+ run-leaderboard:
+ @echo "--- 🚀 Running leaderboard locally ---"
+ python -m mteb.leaderboard.app
+
+
+ .PHONY: check
+ check: ## Run code quality tools.
+ @echo "--- 🧹 Running code quality tools ---"
+ @pre-commit run -a