model: mixedbread-ai/mxbai-edge-colbert-v0-32m and mixedbread-ai/mxbai-edge-colbert-v0-17m #3931
Conversation
Pull request overview
This PR adds two new late-interaction ColBERT models from mixedbread-ai to the MTEB model registry: mxbai-edge-colbert-v0-17m and mxbai-edge-colbert-v0-32m. These are small, efficient retrieval models with 17M and 32M parameters respectively.
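For readers skimming the review: "late interaction" means the model keeps one embedding per token and scores query-document pairs with MaxSim instead of a single-vector dot product. Below is a minimal, self-contained sketch of that scoring step; the random vectors stand in for real ColBERT output and all shapes are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for per-token embeddings from a ColBERT-style encoder:
# a query of 5 tokens and a document of 12 tokens, 64-dim vectors.
query_vecs = rng.normal(size=(5, 64))
doc_vecs = rng.normal(size=(12, 64))

# L2-normalise so dot products are cosine similarities.
query_vecs /= np.linalg.norm(query_vecs, axis=1, keepdims=True)
doc_vecs /= np.linalg.norm(doc_vecs, axis=1, keepdims=True)

# MaxSim: for each query token, take its best-matching document token, then sum.
sim_matrix = query_vecs @ doc_vecs.T   # (5, 12) token-to-token similarities
score = sim_matrix.max(axis=1).sum()   # late-interaction relevance score
print(score)
```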
Changes:
- Added two new ModelMeta definitions for the mxbai-edge-colbert-v0-17m and mxbai-edge-colbert-v0-32m models with complete metadata including training datasets, citation, and model specifications
- Updated import in mdbr_models.py to use mixedbread_ai_models module for consistency
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| mteb/models/model_implementations/mixedbread_ai_models.py | Adds import for MultiVectorModel and defines two new late-interaction ColBERT model metadata objects with comprehensive details including training datasets, citations, and model specifications |
| mteb/models/model_implementations/mdbr_models.py | Updates import statement to reference mixedbread_ai_models instead of the older module path for better organization |
Comments suppressed due to low confidence (2)
mteb/models/model_implementations/mixedbread_ai_models.py:273
- The commented-out training dataset 'FineWeb' is missing quotes. For consistency with the other commented datasets on lines 272, 274, and 275, it should be written as `# "FineWeb",` if it represents a string value. This would prevent potential errors if the line is uncommented in the future.

mteb/models/model_implementations/mixedbread_ai_models.py:318
- The commented-out training dataset 'FineWeb' is missing quotes. For consistency with the other commented datasets on lines 317, 319, and 320, it should be written as `# "FineWeb",` if it represents a string value. This would prevent potential errors if the line is uncommented in the future.
@Samoed On HF they have given results on:

```python
import mteb

model_name1 = "mixedbread-ai/mxbai-edge-colbert-v0-32m"
revision1 = "2f12870a85dae80680b9babc59992c9a2bc59e4a"
model1 = mteb.get_model_meta(model_name1, revision1)
benchmark = mteb.get_benchmark("LongEmbed")
tasks = benchmark.tasks
evaluation = mteb.MTEB(tasks=tasks)
results = evaluation.run(model1)
```
You should change `mteb.get_model_meta` to `mteb.get_model` here.
Sorry, I was testing and mistakenly used `get_model_meta`.
One more doubt: when I am doing evaluations like this:

```python
import mteb

model_name1 = "mixedbread-ai/mxbai-edge-colbert-v0-32m"
revision1 = "2f12870a85dae80680b9babc59992c9a2bc59e4a"
model1 = mteb.get_model(model_name1)
benchmark2 = mteb.get_benchmark("NanoBEIR")
tasks2 = benchmark2.tasks
results2 = mteb.evaluate(model1, tasks=tasks2)
```

no JSON files are getting created for individual tasks in the benchmark. Is it that I have to see the results in the following way only? Also, do we report some kind of average of each metric across all tasks for a benchmark?
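On the averaging question: each completed task is normally written as its own JSON file containing a main_score per split, so a benchmark-level mean can be computed from those files. A rough sketch under that assumption; the folder name and the exact JSON layout used below are assumptions and may differ in your setup:

```python
import json
from pathlib import Path
from statistics import mean

# Assumed path -- point this at the folder created for your model run
# (the one that contains model_meta.json).
results_dir = Path("results/mixedbread-ai__mxbai-edge-colbert-v0-32m")

task_means = {}
for path in results_dir.rglob("*.json"):
    if path.name == "model_meta.json":
        continue
    data = json.loads(path.read_text())
    # Assumed per-task layout: {"task_name": ..., "scores": {"test": [{"main_score": ...}, ...]}}
    scores = [
        entry["main_score"]
        for entries in data.get("scores", {}).values()
        for entry in entries
        if "main_score" in entry
    ]
    if scores:
        task_means[data.get("task_name", path.stem)] = mean(scores)

for task, score in sorted(task_means.items()):
    print(f"{task}: {score:.4f}")
if task_means:
    print(f"mean over {len(task_means)} tasks: {mean(task_means.values()):.4f}")
```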
Are you sure that the results are not in the `results` folder?
After running, the results folder got created, but under it there was just `model_meta.json`, nothing else.
I ran this with:

```python
import mteb

model_name1 = "baseline/random-encoder-baseline"
revision1 = "2f12870a85dae80680b9babc59992c9a2bc59e4a"
model1 = mteb.get_model(model_name1)
benchmark2 = mteb.get_benchmark("NanoBEIR")
results2 = mteb.evaluate(model1, benchmark2)
```

and I can see all task results there. You can also specify your own folder with:

```python
cache = mteb.cache.ResultCache("/path/to/folder")
mteb.evaluate(..., cache=cache)
```
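To double-check that the per-task result files were actually written, a quick listing helps; the path below is a placeholder for whatever folder you passed to ResultCache, or for the default results folder:

```python
from pathlib import Path

# Placeholder path -- adjust to your cache or results folder.
cache_dir = Path("/path/to/folder")

# Print every result JSON under the folder, relative to its root.
for path in sorted(cache_dir.rglob("*.json")):
    print(path.relative_to(cache_dir))
```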
@Samoed Got it. This works. However, I am unable to run the evaluation on the actual model due to some dependency issues.
I evaluated without problems on NanoBEIR.
Can you check whether the NanoBEIR results match the results given at https://huggingface.co/mixedbread-ai/mxbai-edge-colbert-v0-17m? It's possible that there is some problem on my side, as I was running it on Google Colab.
Could you also add them to the results repo?
Closes #3391
Added both models: https://huggingface.co/mixedbread-ai/mxbai-edge-colbert-v0-17m, https://huggingface.co/mixedbread-ai/mxbai-edge-colbert-v0-32m
If you add a model or a dataset, please add the corresponding checklist:
- The model can be loaded using `mteb.get_model(model_name, revision)` and `mteb.get_model_meta(model_name, revision)`
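A minimal sketch of that load check, using the model name and revision quoted earlier in this thread:

```python
import mteb

model_name = "mixedbread-ai/mxbai-edge-colbert-v0-32m"
revision = "2f12870a85dae80680b9babc59992c9a2bc59e4a"

# Metadata lookup does not need to download weights.
meta = mteb.get_model_meta(model_name, revision)
print(meta.name, meta.revision)

# get_model instantiates the actual implementation.
model = mteb.get_model(model_name, revision)
print(type(model).__name__)
```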