
Adding voyage-4-large (2048d) model configs #3970

Merged

KennethEnevoldsen merged 3 commits into embeddings-benchmark:main from fzoll:voyage-4-large-2048d on Jan 20, 2026

Conversation

@fzoll (Contributor) commented Jan 19, 2026

If you add a model or a dataset, please add the corresponding checklist:

@fzoll (Contributor, Author) commented Jan 19, 2026

@KennethEnevoldsen @Samoed, can you please help me? How can I add the same model with a different output dimension? If I remember right, this was an issue previously, and I added different versions (different output type and dimension) for the Cohere models, but I guess those models would fail the new model-name check as well.

@Samoed (Member) commented Jan 19, 2026

We haven't created other options yet. Which check are you talking about?

@fzoll (Contributor, Author) commented Jan 19, 2026

@Samoed The Model Loading fails:
    uv run --no-sync python tests/test_models/model_loading.py --model_name_file scripts/model_names.txt
    INFO:mteb.models.get_model_meta:Model not found in model registry. Attempting to extract metadata by loading the model ({model_name}) using HuggingFace.
    ....
    pydantic_core._pydantic_core.ValidationError: 1 validation error for ModelMeta
    name
      Value error, Model name must be in the format 'organization/model_name' [type=value_error, input_value='2048d', input_type=str]
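
For context, this is ModelMeta's name validation rejecting a bare value ("2048d") because it expects an 'organization/model_name' format. A minimal sketch of that kind of check, illustrative only and not mteb's actual validator code:

    # Illustrative only: mimics the org/model format check that rejects "2048d";
    # this is not the validator implemented in mteb itself.
    from pydantic import BaseModel, ValidationError, field_validator

    class NameOnlyMeta(BaseModel):
        name: str

        @field_validator("name")
        @classmethod
        def name_must_contain_org(cls, v: str) -> str:
            org, _, model = v.partition("/")
            if not org or not model:
                raise ValueError("Model name must be in the format 'organization/model_name'")
            return v

    NameOnlyMeta(name="voyageai/voyage-4-large")  # passes
    try:
        NameOnlyMeta(name="2048d")  # fails, as in the log above
    except ValidationError as err:
        print(err)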

@Samoed (Member) commented Jan 19, 2026

Somewhere in model loading this is handled incorrectly. I think you can rename it to something like voyageai/voyage-4-large_2048d to resolve the issue.
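
The pattern behind this suggestion is to register the same API model twice under suffixed names, one per output dimension, so both entries still satisfy the organization/model_name format. A rough sketch of the idea, with plain dicts standing in for full ModelMeta entries (the dimensions and field layout here are assumptions, not the PR's actual config):

    # Sketch only: two registry entries for one API model, differing in the name
    # suffix and the requested output dimension. In mteb each entry would be a
    # complete ModelMeta copied from the existing voyage-4-large config; only the
    # fields that differ are shown.
    configs = {
        "voyageai/voyage-4-large": {"api_model": "voyage-4-large", "embed_dim": 1024},        # existing entry (dimension illustrative)
        "voyageai/voyage-4-large_2048d": {"api_model": "voyage-4-large", "embed_dim": 2048},  # new 2048d entry
    }

    for registry_name, cfg in configs.items():
        org, model = registry_name.split("/", 1)  # both names pass the org/model check
        print(f"{registry_name}: dim={cfg['embed_dim']}")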

@KennethEnevoldsen (Contributor)

@Samoed I think this is where the error happens:

Attempting to extract metadata by loading the model ({model_name}) using HuggingFace. ....

I suspect we need to add a skip to the model-loading test for API models. There seems to be one, but it sits inside the 'if model_meta.n_parameters is not None' branch.
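
A minimal sketch of the kind of skip being suggested, assuming an API-only model can be recognised from its metadata (for example via the open_weights flag); this is not the actual code of tests/test_models/model_loading.py:

    # Sketch only, not the real loading test: skip API-only models before trying
    # to download them, instead of relying on a skip nested under the
    # n_parameters check. Assumes open_weights=False marks API models.
    import mteb

    def check_model_loads(model_name: str) -> None:
        meta = mteb.get_model_meta(model_name)
        if getattr(meta, "open_weights", True) is False:
            print(f"skipping API-only model {model_name}")
            return
        mteb.get_model(model_name)  # only open-weight models are actually loaded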

@fzoll (Contributor, Author) commented Jan 19, 2026

@KennethEnevoldsen Shall I remove n_parameters from this model's metadata?

Samoed added the 'new model' label (Questions related to adding a new model to the benchmark) on Jan 20, 2026
@KennethEnevoldsen (Contributor)

No @fzoll, I think your PR is correct.

KennethEnevoldsen merged commit 961a43b into embeddings-benchmark:main on Jan 20, 2026 (10 of 11 checks passed).

Samoed added a commit that referenced this pull request on Jan 31, 2026:
* fix: Simplify conflicts (#3875)

* simplify conflicts

* add lock

* remove torch

* 2.6.6

Automatically generated by python-semantic-release

* model: add missing sentence transformers and jina models (#3808)

* add sentence transformers models

* add jina v2

* fix modalities

* Don't sync make lint (#3841)

* don't sync make lint

* don't sync make typecheck

* upd ci

* upd ci

* upd ci

* upd ci

* upd ci

* swap

* fix: nv embed version (#3715)

* fix nv embed wrapper

* try to fix

* fix sbert version

* 2.6.7

Automatically generated by python-semantic-release

* add dataset: KoViDoRe(v2) (#3876)

* add dataset: KoViDoRe v2

* fix citation format

* add direct loading

* lint format

* delete benchmark language view

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

---------

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

* Add typehint for encode kwargs (#3831)

* add typehint for encode kwargs

* remove num_proc

* remove all num proc

* fix import

* fix docstrings

* model: mixedbread-ai/mxbai-rerank-large-v1 (#3905)

* Add model: mixedbread-ai/mxbai-rerank-large-v1

* apply suggestions

* Added xsmall and base version of reranker models

* lintter

* add model: bflhc/Octen-Embedding-0.6B (#3906)

* fix: KoVidore2EnergyRetrieval revision fix (#3913)

* 2.6.8

Automatically generated by python-semantic-release

* Artifacts for llama-embed-nemotron-8b model (#3919)

add artifacts for llama-embed-nemotron-8b model

* fix: model load test (#3914)

* fix model load test

* trigger on dependencies change

* 2.6.9

Automatically generated by python-semantic-release

* model: Adding voyage-4-large, voyage-4 and voyage-4-lite (#3885)

* Adding voyage-4-large and voyage-4-lite

* Adding voyage-4-large and voyage-4-lite

* Adding voyage-4

* Reverting voyage-4 (as the tokenizer is not yet available publicly)

* added superseeded_by

---------

Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com>

* model: Update the nemo retriever reversions to avoid error when loading the model (#3925)

* Update the nemo retriever versions to fix the crash issue with visual_config

* Update mteb/models/model_implementations/nvidia_llama_nemoretriever_colemb.py

* Update mteb/models/model_implementations/nvidia_llama_nemoretriever_colemb.py

---------

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>

* docs: Resolve problems with missing documentation links (#3834)

* resolve problems with missing documentation links

* split into files

* feat: Add vLLM support (#3794)

* init

* init

Signed-off-by: wang.yuqi <noooop@126.com>

* ruff

Signed-off-by: wang.yuqi <noooop@126.com>

* - vllm_loader

Signed-off-by: wang.yuqi <noooop@126.com>

* + TYPE_CHECKING

Signed-off-by: wang.yuqi <noooop@126.com>

* Make vLLM exit properly.

Signed-off-by: wang.yuqi <noooop@126.com>

* rename

Signed-off-by: wang.yuqi <noooop@126.com>

* support rerank

Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>

* refine

Signed-off-by: wang.yuqi <noooop@126.com>

* refine

Signed-off-by: wang.yuqi <noooop@126.com>

* Update mteb/models/vllm_wrapper.py

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>

* refine

Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>

* + docs

Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>

* + benchmark

Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>

* + more benchmark

Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>

* Update docs/advanced_usage/vllm_wrapper.md

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

* Update docs/advanced_usage/vllm_wrapper.md

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

* refine docs

Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>

* refine docs

Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>

* fix typing

* move type ignore

* doc upd

* add test

* Update Makefile

* add support for prompts

* add support for prompts

* - demo

Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>

* make mypy happy

Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>

* fix typehints

* update pyproject

* update pyproject

* update pyproject

* The pooling + dp fails to run.

* fix uv lock

* fix docs

* simplify conflicts

* upd lock

* upd lock

* Update docs/advanced_usage/vllm_wrapper.md

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>

* Update docs/advanced_usage/vllm_wrapper.md

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>

* Update docs/advanced_usage/vllm_wrapper.md

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>

* Update docs/advanced_usage/vllm_wrapper.md

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>

* Apply suggestions from code review

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>

* Update docs/advanced_usage/vllm_wrapper.md

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>

* Apply suggestion from @Samoed

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

* update

Signed-off-by: wang.yuqi <noooop@126.com>

* update

Signed-off-by: wang.yuqi <noooop@126.com>

---------

Signed-off-by: wang.yuqi <noooop@126.com>
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>
Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
Co-authored-by: Roman Solomatin <36135455+Samoed@users.noreply.github.com>

* 2.7.0

Automatically generated by python-semantic-release

* dataset: add ChemRxivRetrieval task to ChemTEB benchmark (#3923)

* dataset: add ChemRxivRetrieval task to ChemTEB benchmark

* fix: add descriptive statistics

* feat: add ChemTEB v1.1 with ChemRxivRetrieval task

* fix: chemteb v1.1 alias

* dataset: Add EuroPIRQRetrieval dataset (#3924)

* dataset: Add EuroPIRQRetrieval dataset

* Removed unnecessary load dataset functions

* model: add nemotron rerank (#3750)

* add nemotron rerank

* move to nvidia models

* removed extra params

* Apply suggestions from code review

Co-authored-by: Isaac Chung <chungisaac1217@gmail.com>

* remove or

* add docstring

* Update mteb/models/model_implementations/nvidia_models.py

Co-authored-by: Yauhen Babakhin <ybabakhin@nvidia.com>

* update

---------

Co-authored-by: Isaac Chung <chungisaac1217@gmail.com>
Co-authored-by: Yauhen Babakhin <ybabakhin@nvidia.com>

* Update references and citations for ViDoRe V3 benchmark (#3930)

* fix: Update references and citations for ViDoRe V3 benchmark

* foramat citation

* format again

---------

Co-authored-by: Roman Solomatin <36135455+Samoed@users.noreply.github.com>

* model: Adding voyage-4 model (#3927)

* Adding voyage-4 model

* Adding voyage-4 model configs

* fix: temporarily remove private column from RTEB

Link is still missing the note as I am waiting for @isaac-chung and @Samoed to confirm the write-up.

fixes #3902

* added issue link

* fix remove mean (Task)

* lint

* fix: Minor logging fixes by activate `LOG` rule (#3820)

activate logger rule

* 2.7.1

Automatically generated by python-semantic-release

* docs: fix vllm broken link (#3936)

fix vllm link

* model: mixedbread-ai/mxbai-edge-colbert-v0-32m and mixedbread-ai/mxbai-edge-colbert-v0-17m (#3931)

* Add model: mixedbread-ai/mxbai-edge-colbert-v0-32m and mixedbread-ai/mxbai-edge-colbert-v0-17m

* Lintter

* Add quotes

* Update dataset name

* Apply suggestions from code review

* Update mixedbread_ai_models.py

---------

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

* model: add pixie_models (#3938)

* model: add pixie_models

* Apply lint formatting

* fix: computation of results with missing scores (#3874)

* fix computation of results with missing scores

* fix test

* change 0 to nan

* change 0 to nan

* remove `fill_missing_scores`

* fix: expose `ResultCache` directly as `mteb.ResultCache` (#3912)

* fix: expose `ResultCache` directly as `mteb.ResultCache`

fixes #3910

* docs: Update docs usage of `ResultCache`

* merge in fixes to remove_private (#3940)

fix: exclude private tasks from Borda rank calculation in RTEB

Co-authored-by: bflhc <kunka.xgw@gmail.com>

* 2.7.2

Automatically generated by python-semantic-release

* fix typo (#3954)

* fix colSmol-256M revision (#3956)

* dedup colnomic_7b and fix loader (#3957)

* dedup colnomic_7b and fix loader

* remove flash_attention_2

* refactor: Activate `TC` (#3800)

* activate tc

* activate `TC`

* small import fix

* fix imports

* fix imports

* fix pil import

* fix benchmark result validation

* full benchmark fix

* update

* fix unpack imports

* upd vllm type

* fix: correct inverted unload_data condition in evaluate (#3929)

Add tests verifying preloaded data is preserved.

Co-authored-by: Daniel Svonava <daniel@superlinked.com>

* fix: temporarily remove private column from RTEB (#3932)

* fix: temporarily remove private column from RTEB

Link is still missing the note as I am waiting for @isaac-chung and @Samoed to confirm the write-up.

fixes #3902

* added issue link

* fix remove mean (Task)

* lint

* merge in fixes to remove_private (#3940)

fix: exclude private tasks from Borda rank calculation in RTEB

Co-authored-by: bflhc <kunka.xgw@gmail.com>

---------

Co-authored-by: bflhc <kunka.xgw@gmail.com>

* 2.7.3

Automatically generated by python-semantic-release

* refactor: split `BRIGHT` benchmark into individual subset tasks (#3285)

* refactor: split BRIGHT benchmark into individual subset tasks

* readd bright

* readd bright subset tasks

* feat: add descriptive stats for BRIGHT subsets retrieval tasks

* feat: add top_ranked for excluded_ids handling

* change main score to recall@1 for long version

* improve BRIGHT task descriptions

* add prompts to BRIGHT retrieval tasks

* refactor: BRIGHT(v1.1)

* calculate descriptive stats for BRIGHTLongRetrieval

* update prompts

* normalize names in prompts

* don't filter tasks

* remove filter_queries_without_positives and update revision

* don't create top ranked if not necessary

* get back naucs

* fix instructions

* add warning

* fix import

---------

Co-authored-by: Roman Solomatin <36135455+Samoed@users.noreply.github.com>

* fix: Update metadata to include active number of parameter to `ModelMeta` (#3837)

* Add active parameter column on LB

* update ModelMeta with parameters

* update ModelMeta of models

* Delete parameter_update_results.csv

* fix test

* fix tests

* delete script

* rename for consistency

* convert active_parameter to property

* rename and fix property

* update embedding parameters for model2vec models

* remove duplicate loading of models

* fix

* lintter

* fix

* remove separate method for embedding parameter calculation

* fix embedding calculation to pass typecheck

* lintter

* fix checking

* rename active parameters

* upd docstring

* fix tests

* remove n_active_parameters_override from ModelMeta of all models

* lintter

* rename file instead of merging main

* fix tests

* correct tests

* Delete model total and active parameters - model_parameters.csv

---------

Co-authored-by: Roman Solomatin <36135455+Samoed@users.noreply.github.com>

* 2.7.4

Automatically generated by python-semantic-release

* fix: use `num_proc` for dataset processing (#3832)

* add typehint for encode kwargs

* remove num_proc

* start adding num_proc

* remove all num proc

* fix import

* add num proc to transform

* add to push to hub

* use num proc in vidore v2

* move num proc to evaluate

* pass num proc everywhere

* fix tests

* fix pylate

* fix image text pair

* fix num workers

* add kwargs to `load_data`

* 2.7.5

Automatically generated by python-semantic-release

* fix: saving aggregated tasks (#3915)

fix saving

* 2.7.6

Automatically generated by python-semantic-release

* model: Adding voyage-4-large (2048d) model configs (#3970)

* Adding voyage-4-large (2048d) model configs

* Adding voyage-4-large 2048d model configs

* Adding voyage-4-large 2048d model configs

* fix: Ensure that retrieval tasks only evaluate on specified subsets instead of all (#3946)

* fix dataset loading

* update logging

* add test

* fix: Add `fill_missing` parameter in `get_model_meta` (#3801)

* Add compute missing parameter in get_model_meta

* fix logs

* fix

* fix from comments

* apply suggestion

* fix method

* add test and fix logic

* address comments

* rename compute_missing to fill_missing

---------

Co-authored-by: Roman Solomatin <36135455+Samoed@users.noreply.github.com>

* fix: leaderboard Nan handling (#3965)

* fix leaderboard

* fix loading aggregated tasks

* Update mteb/results/task_result.py

Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com>

---------

Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com>

* 2.7.7

Automatically generated by python-semantic-release

* fix: Filled active_parameter_overiride for GritLM/GritLM-8x7B nomic-ai/nomic-embed-text-v2-moe (#3967)

* Filled active_parameter_overiride for ritLM/GritLM-8x7B and nomic-ai/nomic-embed-text-v2-moe

* add correct parameters for nomic-ai/nomic-embed-text-v2-moe

* 2.7.8

Automatically generated by python-semantic-release

* fix: add kwargs to pub chem load data (#3990)

add kwargs to pub chem load data

* 2.7.9

Automatically generated by python-semantic-release

* fix: `BAAI/bge-small-en` model revision (#3993)

fix(models): update invalid bge-small-en revision

* fix: NomicWrapper `get_prompt_name` call (#3995)

fix(models): correct get_prompt_name call in NomicWrapper

* 2.7.10

Automatically generated by python-semantic-release

* fix: `BedrockModel` initialization arguments (#3999)

fix: add model_name arg to BedrockModel init to prevent multiple values for model_id

* 2.7.11

Automatically generated by python-semantic-release

* fix: `dataset_transform` signature in PubChemWikiPairClassification (#4001)

fix: add num_proc arg to PubChemWikiPairClassification dataset_transform

* fix: all dataset transform (#4002)

fix dataset transform

* 2.7.12

Automatically generated by python-semantic-release

* model: Adding Ops-Colqwen3 models (#3987)

* Create ops_colqwen3_models.py

* Refactor OpsColQwen3 model and processor classes

* Update model revision in ops_colqwen3_models.py

* Remove calculate_probs method and fix model name

Removed the calculate_probs method and updated model name.

* format

* fix ds name

---------

Co-authored-by: Roman Solomatin <36135455+Samoed@users.noreply.github.com>

* model: added nomic-ai/nomic-embed-code (#4006)

* Add model metadata for nomic-embed-code

Added new model metadata for 'nomic-embed-code'

* fix nomic_embed_code

* lint

---------

Co-authored-by: Roman Solomatin <36135455+Samoed@users.noreply.github.com>

* Adding nvidia/nemotron-colembed models (#3941)

* Adding nvidia/nemotron-colembed models

* add colembed 4b, 8b model meta

* fix colembed-3b-v2 model name

* update revision for colembed 3b

* update revisions

* Update mteb/models/model_implementations/nvidia_llama_nemoretriever_colemb.py

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

* lint

---------

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
Co-authored-by: Roman Solomatin <36135455+Samoed@users.noreply.github.com>

* model: added Querit/Querit (#3996)

* querit_models_add

* Querit_Models_Change

* Update

* format revise

* add future

* format revise

* format revise

* last format revison

* last last revise

* last last last revison

* revise

* revise

* change the instruction

* last revison

* revise

* revise

* revise

---------

Co-authored-by: Roman Solomatin <36135455+Samoed@users.noreply.github.com>

* Build image on leaderboard refresh (#4015)

build image on leaderboard refresh

* fix: simplify dependencies (#4017)

* 2.7.13

Automatically generated by python-semantic-release

* fix: Make `mteb.get_model` compatible with `CrossEncoders` (#3988)

* Made mteg.get_model compatible with CrossEncoders and SparseEncoders

* update loader for sparseEncoder

* fix import

* Simplify structure

* Add model_type to sparseEncoder models

* remove detection logic of sparsencoder

* Add tests and documentation

* simplified tests

* updated docs

* fix docs

* fix

* fix grammar

* Update docs/usage/defining_the_model.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update docs/advanced_usage/two_stage_reranking.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update docs/index.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* address comments

---------

Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* rename bm25s to baseline/bm25s (#4007)

* rename bm25s to baseline/bm25s

* Update mteb/models/get_model_meta.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* remove logger message

* rename Human to baseline/Human

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Fix support for datsets 4.5 with pandas 3 (#3983)

* fix test

* fix: sanitize type for label during array conversion

* lint

* revert typo fix

---------

Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com>

* lint

* fix typing

* fix test import

---------

Signed-off-by: wang.yuqi <noooop@126.com>
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Co-authored-by: semantic-release <semantic-release>
Co-authored-by: Yongbin Choi <whybe.choi@gmail.com>
Co-authored-by: Munot Ayush Sunil <munotayush6@kgpian.iitkgp.ac.in>
Co-authored-by: bflhc <kunka.xgw@gmail.com>
Co-authored-by: Yauhen Babakhin <ybabakhin@nvidia.com>
Co-authored-by: fzoll <5575946+fzoll@users.noreply.github.com>
Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com>
Co-authored-by: Sahel Sharifymoghaddam <sahel.sharifi@gmail.com>
Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>
Co-authored-by: wang.yuqi <noooop@126.com>
Co-authored-by: HSILA <ali.shiraee@partners.basf.com>
Co-authored-by: Elias H <40372306+eherra@users.noreply.github.com>
Co-authored-by: Isaac Chung <chungisaac1217@gmail.com>
Co-authored-by: antoineedy <antoine.edy@illuin.tech>
Co-authored-by: Bong-Min Kim <klbm126@gmail.com>
Co-authored-by: svonava <svonava@gmail.com>
Co-authored-by: Daniel Svonava <daniel@superlinked.com>
Co-authored-by: HSILA <a.shiraee@gmail.com>
Co-authored-by: caoyi <caoyi0905@mail.hfut.edu.cn>
Co-authored-by: Lukas Kleybolte <32893711+Mozartuss@users.noreply.github.com>
Co-authored-by: rnyak <16246900+rnyak@users.noreply.github.com>
Co-authored-by: youngbeauty250 <140679097+youngbeauty250@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Labels

new model (Questions related to adding a new model to the benchmark)


3 participants