[v2] Merge main by Samoed · Pull Request #2204 · embeddings-benchmark/mteb

Samoed · 2025-03-01T12:42:11Z

Code Quality

Code Formatted: Format the code using make lint to maintain consistent style.

Testing

New Tests Added: Write tests to cover new functionality. Validate with make test-with-coverage.
Tests Passed: Run tests locally using make test or make test-with-coverage to ensure no existing functionality is broken.

* redo to voyage to only training data
* Add training data annotation for Kalm embeddings #2168
* Add correct training data annotations to Stella #2164
* removed fiqa PL as it does not exist
* remove ArxivClusteringS2S.v2 as it does not exist
* Add training data annotation for GIST embedding #2166
* fix max tokens for kalm models #2162
* remove eli 5

* add mieb and mieb-lite to benchmarks
* add CompositionalityEvaluation and DocumentUnderstanding types
* add VisionCentric type
* add missing comma
* split STS17MultilingualVisualSTS and STSBenchmarkMultilingualSTS to eng and non-eng
* use aggregate task instead so we can name the subsets
* shorten names
* fix import
* alternative strategy to avoid using get_task
* follow other aggregate tasks and skip metadata test
* run LB without errors when selecting MIEB(-lite)
* add back the capability as task type
* typo
* extend description
* split into mieb(eng) and mieb(multilingual)
* remove unneeded files
* remove aggtask additions for test
* edit descriptions based on screenshots
* shorten
* rename to Compositionality and include ImageCoDeT2IMultiChoice
* re-tag missing VisionCentric tasks
* re-tag rparis and roxford as retrieval and include fixes
* re-tag voc2007 as image cls
* make lint
* correct num task types in descriptions
* add one model to models_to_annotate
* add mieb reference models
* update task types
* relabel to multilingual retrieval task type to align with paper
* fix reference and bibtex
* edit task list to match with final list
* add back agg task to reproduce table column in paper
* fix filtering and import
* update tests
* mieb lite add back missing tasks
* fix metadata test
* multi should have all 4 variants
* fix task counts
* lite has 10 task types
* fix visualSTS-17 lang splits
* Aggregate task can now use subsets & eval langs to filter TaskResults
* fix test and mark VisualSTS17 as multilingual
* fix tests
* add agg task running script
* add voyage meta
* fix citations
* capitalize
* add coarse/fine labels
---------
Co-authored-by: gowitheflow-1998 <jsbs54@durham.ac.uk>

* redo to voyage to only training data
* Add training data annotation for Kalm embeddings #2168
* Add correct training data annotations to Stella #2164
* removed fiqa PL as it does not exist
* remove ArxivClusteringS2S.v2 as it does not exist
* Add training data annotation for GIST embedding #2166
* fix max tokens for kalm models #2162
* remove eli 5
* fix: add training data for Bilingual Embeddings fixes #2167

Samoed · 2025-03-04T09:20:29Z

@isaac-chung can you review this PR? There are mostly changes to MIEB tasks.

isaac-chung

The MIEB parts look good, thanks!

KennethEnevoldsen and others added 30 commits February 24, 2025 15:23

test: fix dataset availability test (#2141)

0163342

This simplified the test considerably. It also removed about 100 test cases, which were all making the same API call.

fix: Update NVIDIA-Embed training data (#2143)

760fcaf

Added a few missing annotations for nvidia-embed

1.34.29

9f6cc4e

Automatically generated by python-semantic-release

fix: Add annotations for Voyage exp (#2144)

8538e93

* fix: Update NVIDIA-Embed training data
  Added a few missing annotations for nvidia-embed
* fix update annotations for voyage exp

1.34.30

25cd62d

Automatically generated by python-semantic-release

Fix tokens num in cde models (#2148)

8e97d36

fix tokens

feat: Add Qodo-Embed-1-7B model metadata and rename existing model (#…

0e624b2

…2146)
* feat: Add Qodo-Embed-1-7B model metadata and rename existing model
* lint
* fix revision
* update license name
---------
Co-authored-by: Tal Sheffer <tal.s@codium.ai>

1.35.0

4d23c6c

Automatically generated by python-semantic-release

misc: add Any2AnyRetrievalDescriptiveStatistics (#2139)

bd2a67c

add Any2AnyRetrievalDescriptiveStatistics

Update tasks table

ef3f4f0

Added zero-shot percentages and different filtering scheme (#2153)

a7dc95a

* Added zero-shot percentages and different filtering scheme
* Update mteb/model_meta.py
Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
---------
Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
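
As a rough illustration of what a zero-shot percentage could mean here, the sketch below computes the share of a benchmark's tasks that do not appear in a model's annotated training data. The function name, the dict-shaped `training_datasets` argument, and the placeholder task names are assumptions made for this example; this is not the code from the commit.

```python
from typing import Iterable, Mapping, Optional, Sequence


def zero_shot_percentage(
    benchmark_tasks: Iterable[str],
    training_datasets: Optional[Mapping[str, Sequence[str]]],
) -> Optional[int]:
    """Share (0-100) of benchmark tasks that the model was not trained on.

    Returns None when the training data is not annotated or the benchmark
    is empty, because no meaningful percentage exists in those cases.
    """
    if training_datasets is None:
        return None
    tasks = set(benchmark_tasks)
    if not tasks:
        return None
    # Tasks that appear both in the benchmark and in the training annotations.
    overlap = tasks & set(training_datasets)
    return round(100 * (1 - len(overlap) / len(tasks)))


# Hypothetical benchmark with four tasks, one of which was used for training.
tasks = ["TaskA", "TaskB", "TaskC", "TaskD"]
print(zero_shot_percentage(tasks, {"TaskA": ["train"]}))  # 75
print(zero_shot_percentage(tasks, None))  # None
```

Returning `None` rather than 0 or 100 for unannotated models keeps "unknown" distinct from "fully zero-shot", which matters when such a value is used for filtering.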

fix: Incorrect annotations for Mistral-based embedding models (#2157)

565e29c

Fixes #2155

1.35.1

90ec21c

Automatically generated by python-semantic-release

Update FaMTEBRetrieval.py (#2171)

8afb78a

The URL pointed to the settings page instead of the main repo URL. Now it is fixed.

Update tasks table

331cded

1.35.2

ed0cb31

Automatically generated by python-semantic-release

Update tasks table

dbcbf54

1.36.0

afe1739

Automatically generated by python-semantic-release

fix: update training datasets and revision for jina models (#2179)

62b33f2

* feat: update training datasets and revision for jina models
* feat: update training datasets and revision for jina models

1.36.1

4a0bb5c

Automatically generated by python-semantic-release

Added training data annotation for e5-base-4k (#2186)

43d15f1

fix: Added training data annotations to MXBAI (#2185)

1b23d4e

fix: Update MTEB(Scandinavian) to use new DanFEVER (#2180)

7daf893

This also resolves the missing data in the leaderboard. Fixes #2172

fix: Added training data annotation for MMLW models (#2188)

0307102

* Added training data annotation for MMLW models
* Added GIST annotations Kenneth missed
* Added Stella en 400m training data

1.36.2

7642c07

Automatically generated by python-semantic-release

fix: Added training data for sentence-croissant (#2189)

0901cf6

1.36.3

d4b691f

Automatically generated by python-semantic-release

Samoed and others added 16 commits March 1, 2025 16:20

fix aggregated

3097740

add base models for e5 (#2183)

1c8d715

add similar datasets (#2205)

7af37d4

* add similar datasets
* add nano
* update is filled
* Update mteb/abstasks/TaskMetadata.py
Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com>
---------
Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com>

add labse annotation (#2182)

587892d

* add labse annotation
* Update mteb/models/sentence_transformers_models.py
Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com>
---------
Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com>

fix: Fixed leaderboard crash (#2221)

761a174

* Fixed leaderboard crash
* Fixed language selection error
* Ran linting

1.36.6

e57cd50

Automatically generated by python-semantic-release

fix: More training data annotations (#2220)

2dd1391

* Added training data annotation for bge-gemma
* Added missing annotations for Voyage models
* Added training data for sts-multilingual-mpnet
* Added all mteb datasets to STS-multilingual training data

1.36.7

546e0c4

Automatically generated by python-semantic-release

Add LLM2CLIP (OpenAI variants) (#2222)

4ee4e7c

* model loading and get_text_embeddings
* add image_emb, fused_emb, and calc probs methods
* add b16 model
* add llm2clip_openai_l_14_224 (not working yet)
* got llm2clip_openai_l_14_224 working
* make lint
* add training sets and allow py files

Change dataset on HF test to use official api (#2213)

c5fded2

* refactor dataset checking
* increase timeout
* increase timeout
* remove timeout

Descriptive stats functions for Any2AnyMC and ImageTextPC (#2197)

3e991bd

* Add Any2AnyMC descriptive stats
* Add descriptive stats function for ImageTextPC
* add descriptive stats examples
* linter
* update multi choice descriptive stats

Update tasks table

cc47225

fix: Add training data annotations to udever-bloom models (#2210)

ee514cb

* fix: Add training data annotations to udever-bloom models fixes #2193
* fix: add mixedbread
---------
Co-authored-by: Márton Kardos <power.up1163@gmail.com>

1.36.8

4de58c3

Automatically generated by python-semantic-release

Add comment to voyage-3-m-exp model (#2229)

a87927b

* remove model size from voyage-3-m-exp model
* Update mteb/models/voyage_models.py
* Update mteb/models/voyage_models.py

Merge branch 'refs/heads/main' into merge_main

fe4c17a

# Conflicts:
#   mteb/abstasks/Image/AbsTaskAny2AnyMultiChoice.py
#   mteb/models/bge_models.py
#   mteb/models/e5_instruct.py
#   mteb/models/e5_models.py

Samoed mentioned this pull request Mar 4, 2025

Automatically add similar tasks to training_tasks #2228

Merged

4 tasks

docs: Update description of EURLex (#2231)

3a9d271

KennethEnevoldsen mentioned this pull request Mar 4, 2025

ci: Add pre-commit hook #2194

Merged

Samoed and others added 3 commits March 4, 2025 09:15

Automatically add similar tasks to training_tasks (#2228)

7f7d3e8

* refactor dataset checking
* increase timeout
* increase timeout
* remove timeout
* start
* automatically find datasets
* update comment
* fix aggregate task metadata
* fixes
* lint
* rename
* update fetch check
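
The idea named in this commit title, automatically adding similar tasks to a model's training tasks, can be sketched roughly as below. The helper name `expand_training_tasks`, the `similar_tasks` mapping, and the task names are all hypothetical; the real implementation in the repository may differ.

```python
from typing import Iterable, Mapping, Sequence, Set


def expand_training_tasks(
    training_tasks: Iterable[str],
    similar_tasks: Mapping[str, Sequence[str]],
) -> Set[str]:
    """Return the training tasks plus every task listed as similar to one of them.

    Only one level of similarity is followed: tasks similar to a similar
    task are not added.
    """
    annotated = list(training_tasks)
    expanded = set(annotated)
    for task in annotated:
        # Tasks with no entry in the mapping contribute nothing extra.
        expanded.update(similar_tasks.get(task, ()))
    return expanded


# Hypothetical mapping: a task and the derived variants that reuse its data.
similar = {"TaskA": ["NanoTaskA", "TaskA-PL"], "TaskB": ["TaskB.v2"]}
print(sorted(expand_training_tasks(["TaskA"], similar)))
# ['NanoTaskA', 'TaskA', 'TaskA-PL']
```

Expanding the annotation this way means a model trained on one dataset is also treated as having seen its derived variants, without each variant having to be listed by hand.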

Merge branch 'refs/heads/main' into merge_main

a857b10

# Conflicts:
#   mteb/models/ru_sentence_models.py

lint

5759f84

Samoed added 4 commits March 4, 2025 12:39

refactor

d786633

update BEIR-PL annotation

4cf714e

fix

6b03f0f

update test

07e6ae5

isaac-chung approved these changes Mar 4, 2025

Samoed merged commit d491800 into v2.0.0 Mar 4, 2025
9 checks passed

Samoed deleted the merge_main branch March 4, 2025 13:07

Samoed merged 63 commits into v2.0.0 from merge_main

9 participants