Merge main 11 10 #3321

Merged

Samoed merged 21 commits into v2.0.0 from merge_main_11_10

Oct 12, 2025

Member

Samoed commented Oct 11, 2025

If you add a model or a dataset, please add the corresponding checklist:

q275343119 and others added 19 commits

October 6, 2025 14:06


          fix: Move zero-shot percentage calculation to the end of summary (#3231)

65829bd

* Refactor: Move zero-shot percentage calculation to the end of summary table creation, which only applies to the RTEB table.

* Update RTEB benchmark name from "RTEB(beta)" to "RTEB" for consistency in display.

* feat - RTEB(beta)

* feat - remove Zero-shot

---------

Co-authored-by: ethan <smiletoye@gmail.com>


          model: Add ReasonIR (#3221)

f2504bd

* model: Add ReasonIR

* Update mteb/models/reasonir_model.py

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

* Update mteb/models/reasonir_model.py

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

* update n_parameters of ReasonIR

Co-authored-by: Niklas <n.muennighoff@gmail.com>

---------

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
Co-authored-by: Niklas <n.muennighoff@gmail.com>


          fix: Only pin model name and rank (#3263)

58a81a9

Currently we pin 3 columns, which makes the table hard or impossible to view on phones. The 3rd column is also no longer guaranteed, as the RTEB leaderboard does not use the zero-shot column.


          1.39.3

bc953bf

Automatically generated by python-semantic-release


          fix: resolve flash-attention dependency issue (#3265)

1e29385

* fix: Only pin model name and rank

Currently we pin 3 columns, which makes the table hard or impossible to view on phones. The 3rd column is also no longer guaranteed, as the RTEB leaderboard does not use the zero-shot column.

* fix: resolve flash-attention dependency issue

This has been tested and works.

Resolves flash-attention dependency issues.
Fixes #3240


          1.39.4

0f61c9f

Automatically generated by python-semantic-release


          fix: Add retry and token counting in Cohere models (#3253)

e81c94f

* Retry and token counting in Cohere models

* Retry and token counting in Cohere models

* Retry and token counting in Cohere models

---------

Co-authored-by: Roman Solomatin <36135455+Samoed@users.noreply.github.com>
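
For readers unfamiliar with the pattern named in this commit, the following is a minimal sketch of retrying a failing call with exponential backoff alongside a simple token tally. It is not the code from #3253: the names `retry_with_backoff` and `count_tokens`, the backoff schedule, and the whitespace-based token count are all assumptions made for illustration, and no Cohere API is called.

```python
import time


def retry_with_backoff(fn, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fn() until it succeeds or max_attempts is exhausted.

    Hypothetical helper: waits base_delay, 2 * base_delay, 4 * base_delay, ...
    between attempts and re-raises the last error if every attempt fails.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))


def count_tokens(texts):
    """Rough token tally; whitespace splitting stands in for a real tokenizer."""
    return sum(len(text.split()) for text in texts)
```

Passing `sleep` as a parameter is a design choice for this sketch only: it lets a test substitute a recorder for `time.sleep` so the backoff schedule can be checked without waiting.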


          1.39.5

479c2a0

Automatically generated by python-semantic-release


          Align MIEB leaderboards with paper (#3272)

30de619

* sort by mean task type and use pure rank for MIEB LBs

* lint

* rename task type column for readability
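
As a rough illustration of the ordering this commit describes, the sketch below averages each model's per-task-type scores, sorts by that mean, and assigns ranks by position. It assumes that "pure rank" means position in the sorted order; the function name `rank_by_mean_task_type` and the input layout are hypothetical and are not taken from the leaderboard code.

```python
def rank_by_mean_task_type(scores):
    """Rank models by the mean of their per-task-type scores.

    scores: dict mapping model name -> dict mapping task type -> score.
    Returns a list of (rank, model, mean) tuples, best model first.
    """
    means = {
        model: sum(by_type.values()) / len(by_type)
        for model, by_type in scores.items()
    }
    # Sort descending by mean; rank is simply the 1-based position.
    ordered = sorted(means.items(), key=lambda item: item[1], reverse=True)
    return [(rank, model, mean) for rank, (model, mean) in enumerate(ordered, start=1)]
```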


          fix: add prompt for MIRACLRetrievalHardNegatives (#3266)

9b6f320

* add prompt for MIRACLRetrievalHardNegatives

* add `MIRACLRetrievalHardNegatives.v2`

* Update mteb/tasks/Retrieval/multilingual/MIRACLRetrieval.py

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>

* move common metadata to dict

---------

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>


          Update tasks & benchmarks tables

5a5bcfd


          Add Regression task mock (#3271)

e176ba6


          1.39.6

4936fe2

Automatically generated by python-semantic-release


          fix: Change language for task SlovakMovieReviewSentimentClassification (#3296)

0a902a3


          Update tasks & benchmarks tables

94aa0d5


          1.39.7

d2c704c

Automatically generated by python-semantic-release


          Add english code retriever model (#3302)

67f7ad9

* Add en code retriever model

* fix model_name

* Update mteb/models/en_code_retriever.py

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

* correct lint

---------

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>


          Merge branch 'main' into merge_main_11_10

5c0757d

# Conflicts:
#	mteb/benchmarks/benchmarks/benchmarks.py
#	mteb/evaluation/evaluators/RegressionEvaluator.py
#	mteb/leaderboard/app.py
#	mteb/models/model_implementations/cohere_models.py
#	mteb/models/model_implementations/cohere_v.py
#	mteb/models/overview.py
#	mteb/tasks/Classification/__init__.py
#	mteb/tasks/Classification/svk/__init__.py
#	mteb/tasks/Retrieval/multilingual/MIRACLRetrieval.py
#	pyproject.toml
#	tests/test_benchmark/mock_tasks.py


          updates after merge

2faad6c

Samoed requested a review from KennethEnevoldsen

October 11, 2025 17:51

Samoed added the v2 label


          fix tests

bdd5a93

KennethEnevoldsen approved these changes


          fix tests

74059ca

Samoed merged commit a08e6a6 into v2.0.0

12 checks passed

Samoed deleted the merge_main_11_10 branch

October 12, 2025 06:57
