fix: Update metadata to include active number of parameter to ModelMeta #3837

Merged
Samoed merged 32 commits into embeddings-benchmark:main from ayush1298:add_active_parameter on Jan 19, 2026
Conversation

@ayush1298 (Collaborator) commented Jan 3, 2026

closes #3258

Copilot AI review requested due to automatic review settings January 3, 2026 19:40
@ayush1298 (Collaborator, Author) commented Jan 3, 2026

Just added the CSV for now. I will remove the memory-usage column, add the active/embedding parameter counts to each model's ModelMeta, and then add a column to the leaderboard.

Copilot AI (Contributor) left a comment

Pull request overview

This pull request adds a CSV file containing model parameter data for models in the MTEB leaderboard. The file documents total parameters, active parameters, and input embedding parameters for over 500 models. This addresses issues #3258 and #3259.

Key Changes

  • Addition of a CSV file documenting parameter counts (total, active, and input embedding) for 554 models
  • The CSV includes success/error status for each model, tracking whether parameter extraction succeeded or encountered issues


@ayush1298 (Collaborator, Author) commented Jan 4, 2026

@Samoed @KennethEnevoldsen @isaac-chung
One doubt I have: the value of n_parameters (the total number of parameters) that we have in ModelMeta is almost equal to (though not exactly the same as) total_params + embedding_params counted from code.

For the model Alibaba-NLP/gte-Qwen2-7B-instruct:

  ModelMeta: 7,613,000,000
  In CSV: total_params: 7,069,121,024; active_params: 6,525,621,760; embedding_params: 543,499,264

So total_params + embedding_params = 7,612,620,288, which is almost equal to what we have in ModelMeta.
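A quick arithmetic check of that gap (a sketch using the numbers quoted above):

```python
# Numbers quoted above for Alibaba-NLP/gte-Qwen2-7B-instruct
total_params = 7_069_121_024
embedding_params = 543_499_264
modelmeta_n_parameters = 7_613_000_000

# Reconstructed total vs. the value stored in ModelMeta
approx_total = total_params + embedding_params
diff = modelmeta_n_parameters - approx_total
print(approx_total, diff)  # 7612620288 379712
```

The discrepancy is 379,712 parameters, roughly 0.005% of the total, so the two accounting methods agree closely.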

Also, in the CSV these three parameters came from the code below.

import numpy as np
from transformers import AutoModel

model_name = "google/embeddinggemma-300m"
model = AutoModel.from_pretrained(model_name)

# Parameters in the input embedding matrix (vocab_size x hidden_size)
input_params = np.prod(model.get_input_embeddings().weight.shape)
# Total parameter count across all weight tensors
total_params = np.sum([np.prod(x.shape) for x in model.parameters()])
# "Active" parameters = everything except the input embeddings
active_params = total_params - input_params

So what exactly should I add to ModelMeta? Should I add two new fields named active_parameter and embedding_parameter, should we show both of these on the LB, and should we remove memory_usage and the number of parameters?

@Samoed (Member) commented Jan 4, 2026

Let's modify the leaderboard in a separate PR.

  One doubt I have: the value of n_parameters (total number of parameters) that we have in ModelMeta is almost equal to (though not exactly the same as) total_params + embedding_params counted from code.

Yes, it is expected that total parameters = active parameters + embedding parameters.

  Should I add two new fields in ModelMeta named active_parameter and embedding_parameter?

Probably yes.

  Should we show both of these on the LB, and remove memory_usage and the number of parameters?

Let's discuss this in the issue.

@Samoed Samoed changed the title from "Add active parameter column on LB" to "Add active parameter to ModelMeta" Jan 4, 2026
@KennethEnevoldsen KennethEnevoldsen marked this pull request as draft January 4, 2026 11:48
@KennethEnevoldsen (Contributor) commented:

Converting this to a draft given the current state, but great to see it!

  • Should we have a script to compute active parameters (e.g. in the automatic metadata)? Should we consider how we handle MoE? (This is where active != total - embedding.)
  • Leaderboard: let us deal with that in a separate PR.
  • Embedding parameters: if we define active parameters as total - embedding (which might be a good enough assumption), I would probably just add embedding parameters and let the other one be a property. We could potentially allow an overwrite using e.g. self._active_parameters.

@ayush1298 (Collaborator, Author) commented:

  • Should we have a script to compute active parameters (e.g. in the automatic metadata)? Should we consider how we handle MoE? (This is where active != total - embedding.)

Yes, I will be adding the logic to calculate this automatically in the from_hub method. I think MoE models are the case where I was not able to calculate it; hence I think that at least in ModelMeta we can keep n_parameters and add both active and embedding. Also, I have already calculated these parameters for around 315 models, and around 60 of the 498 models we currently have are proprietary.

  • Embedding parameters: if we define active parameters as total - embedding (which might be a good enough assumption), I would probably just add embedding parameters and let the other one be a property. We could potentially allow an overwrite using e.g. self._active_parameters.

For now, we can use the CSV. But I think it is better to keep both in ModelMeta, as then we could also handle the MoE case, where this property would not be useful.

@ayush1298 (Collaborator, Author) commented:

I have updated the files with this metadata and also added it to the .from_hub() method. I have also added the script that I used to update ModelMeta; I will remove the script at the end.

@ayush1298 (Collaborator, Author) commented Jan 4, 2026

Previously, when I ran make lint, it fixed the errors itself (before we added uv everywhere), but now it just reports the errors without fixing them. Why is this happening?

@Samoed (Member) commented Jan 4, 2026

Strange. I don't have this behavior

@KennethEnevoldsen (Contributor) commented:

Hmm, do we want to use the CSV? I think we should rather use the ModelMeta object; if not, then I think we should use a JSON format similar to what we do with the descriptive statistics.

@Samoed (Member) commented Jan 4, 2026

I think the CSV is just for demonstration.

@ayush1298 (Collaborator, Author) commented Jan 5, 2026

Hmm do we want to use the csv? I think we should rather use the ModelMeta object - if not then I think we should use the json format similar to what we do with descriptive statistics

It's just a demonstration of all the results I have calculated; I filled in the ModelMeta of the models using that CSV.
In the future, if someone wants to calculate or add it for models where I was not able to calculate it, they can do it as follows:

model_meta = mteb.get_model_meta("model_name")
active_parameters, embedding_parameters = model_meta.extract_parameter_breakdown_from_hub

@ayush1298 ayush1298 marked this pull request as ready for review January 5, 2026 11:50
@KennethEnevoldsen (Contributor) left a comment:

I would convert active parameters to a property and just add embedding parameters.

@ayush1298 (Collaborator, Author) commented:

Converted active_parameters to a property. Based on the results in the CSV, I will check for which models n_parameters != embedding + active; only for those will I keep the _n_active_parameters field in ModelMeta, and I will remove the field for all the other models.
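That check can be sketched as follows; the row data and column names here are hypothetical stand-ins for the CSV discussed in this PR.

```python
# Hypothetical rows mimicking the CSV's parameter columns (illustrative values).
rows = [
    {"model": "dense-model", "n_parameters": 100, "active_params": 90, "embedding_params": 10},
    {"model": "moe-model", "n_parameters": 120, "active_params": 90, "embedding_params": 10},
]

# Models whose totals don't decompose cleanly keep an explicit override field.
needs_override = [
    row["model"]
    for row in rows
    if row["n_parameters"] != row["active_params"] + row["embedding_params"]
]
print(needs_override)  # ['moe-model']
```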

@ayush1298 ayush1298 force-pushed the add_active_parameter branch from 5eda71b to 3db3369 Compare January 6, 2026 14:38
@Samoed (Member) commented Jan 11, 2026

I don't think we can do anything with it

@KennethEnevoldsen (Contributor) left a comment:

I added a few suggestions; I think that should finalize this PR (thanks for taking the time on it).

@ayush1298 (Collaborator, Author) commented:

I have deleted the CSV and created issues to fill n_embedding_parameters (#3947) and to calculate active_parameters for MoE models (#3948).

@KennethEnevoldsen KennethEnevoldsen changed the title from "Add active parameter to ModelMeta" to "fix: Update metadata to include active number of parameter to ModelMeta" Jan 19, 2026
@KennethEnevoldsen (Contributor) commented:

@ayush1298, will you resolve the merge? I think this is good to go. @Samoed, I marked it as fix since it is only an update to the metadata, but I would be fine with feat as well.

@Samoed Samoed force-pushed the add_active_parameter branch from c76e40c to 9cbabb5 Compare January 19, 2026 10:59
@Samoed Samoed enabled auto-merge (squash) January 19, 2026 11:00
@Samoed Samoed merged commit a45359e into embeddings-benchmark:main Jan 19, 2026
11 checks passed
@ayush1298 ayush1298 deleted the add_active_parameter branch January 19, 2026 11:17
Samoed added a commit that referenced this pull request Jan 31, 2026
* fix: Simplify conflicts (#3875)

* simplify conflicts

* add lock

* remove torch

* 2.6.6

Automatically generated by python-semantic-release

* model: add missing sentence transformers and jina models (#3808)

* add sentence transformers models

* add jina v2

* fix modalities

* Don't sync make lint (#3841)

* don't sync make lint

* don't sync make typecheck

* upd ci

* upd ci

* upd ci

* upd ci

* upd ci

* swap

* fix: nv embed version (#3715)

* fix nv embed wrapper

* try to fix

* fix sbert version

* 2.6.7

Automatically generated by python-semantic-release

* add dataset: KoViDoRe(v2) (#3876)

* add dataset: KoViDoRe v2

* fix citation format

* add direct loading

* lint format

* delete benchmark language view

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

---------

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

* Add typehint for encode kwargs (#3831)

* add typehint for encode kwargs

* remove num_proc

* remove all num proc

* fix import

* fix docstrings

* model: mixedbread-ai/mxbai-rerank-large-v1 (#3905)

* Add model: mixedbread-ai/mxbai-rerank-large-v1

* apply suggestions

* Added xsmall and base version of reranker models

* lintter

* add model: bflhc/Octen-Embedding-0.6B (#3906)

* fix: KoVidore2EnergyRetrieval revision fix (#3913)

* 2.6.8

Automatically generated by python-semantic-release

* Artifacts for llama-embed-nemotron-8b model (#3919)

add artifacts for llama-embed-nemotron-8b model

* fix: model load test (#3914)

* fix model load test

* trigger on dependencies change

* 2.6.9

Automatically generated by python-semantic-release

* model: Adding voyage-4-large, voyage-4 and voyage-4-lite (#3885)

* Adding voyage-4-large and voyage-4-lite

* Adding voyage-4-large and voyage-4-lite

* Adding voyage-4

* Reverting voyage-4 (as the tokenizer is not yet available publicly)

* added superseeded_by

---------

Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com>

* model: Update the nemo retriever reversions to avoid error when loading the model (#3925)

* Update the nemo retriever versions to fix the crash issue with visual_config

* Update mteb/models/model_implementations/nvidia_llama_nemoretriever_colemb.py

* Update mteb/models/model_implementations/nvidia_llama_nemoretriever_colemb.py

---------

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>

* docs: Resolve problems with missing documentation links (#3834)

* resolve problems with missing documentation links

* split into files

* feat: Add vLLM support (#3794)

* init

* init

Signed-off-by: wang.yuqi <noooop@126.com>

* ruff

Signed-off-by: wang.yuqi <noooop@126.com>

* - vllm_loader

Signed-off-by: wang.yuqi <noooop@126.com>

* + TYPE_CHECKING

Signed-off-by: wang.yuqi <noooop@126.com>

* Make vLLM exit properly.

Signed-off-by: wang.yuqi <noooop@126.com>

* rename

Signed-off-by: wang.yuqi <noooop@126.com>

* support rerank

Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>

* refine

Signed-off-by: wang.yuqi <noooop@126.com>

* refine

Signed-off-by: wang.yuqi <noooop@126.com>

* Update mteb/models/vllm_wrapper.py

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>

* refine

Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>

* + docs

Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>

* + benchmark

Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>

* + more benchmark

Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>

* Update docs/advanced_usage/vllm_wrapper.md

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

* Update docs/advanced_usage/vllm_wrapper.md

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

* refine docs

Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>

* refine docs

Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>

* fix typing

* move type ignore

* doc upd

* add test

* Update Makefile

* add support for prompts

* add support for prompts

* - demo

Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>

* make mypy happy

Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>

* fix typehints

* update pyproject

* update pyproject

* update pyproject

* The pooling + dp fails to run.

* fix uv lock

* fix docs

* simplify conflicts

* upd lock

* upd lock

* Update docs/advanced_usage/vllm_wrapper.md

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>

* Update docs/advanced_usage/vllm_wrapper.md

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>

* Update docs/advanced_usage/vllm_wrapper.md

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>

* Update docs/advanced_usage/vllm_wrapper.md

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>

* Apply suggestions from code review

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>

* Update docs/advanced_usage/vllm_wrapper.md

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>

* Apply suggestion from @Samoed

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

* update

Signed-off-by: wang.yuqi <noooop@126.com>

* update

Signed-off-by: wang.yuqi <noooop@126.com>

---------

Signed-off-by: wang.yuqi <noooop@126.com>
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>
Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
Co-authored-by: Roman Solomatin <36135455+Samoed@users.noreply.github.com>

* 2.7.0

Automatically generated by python-semantic-release

* dataset: add ChemRxivRetrieval task to ChemTEB benchmark (#3923)

* dataset: add ChemRxivRetrieval task to ChemTEB benchmark

* fix: add descriptive statistics

* feat: add ChemTEB v1.1 with ChemRxivRetrieval task

* fix: chemteb v1.1 alias

* dataset: Add EuroPIRQRetrieval dataset (#3924)

* dataset: Add EuroPIRQRetrieval dataset

* Removed unnecessary load dataset functions

* model: add nemotron rerank (#3750)

* add nemotron rerank

* move to nvidia models

* removed extra params

* Apply suggestions from code review

Co-authored-by: Isaac Chung <chungisaac1217@gmail.com>

* remove or

* add docstring

* Update mteb/models/model_implementations/nvidia_models.py

Co-authored-by: Yauhen Babakhin <ybabakhin@nvidia.com>

* update

---------

Co-authored-by: Isaac Chung <chungisaac1217@gmail.com>
Co-authored-by: Yauhen Babakhin <ybabakhin@nvidia.com>

* Update references and citations for ViDoRe V3 benchmark (#3930)

* fix: Update references and citations for ViDoRe V3 benchmark

* foramat citation

* format again

---------

Co-authored-by: Roman Solomatin <36135455+Samoed@users.noreply.github.com>

* model: Adding voyage-4 model (#3927)

* Adding voyage-4 model

* Adding voyage-4 model configs

* fix: temporarily remove private column from RTEB

Link is still missing the note as I am waiting for @isaac-chung and @Samoed to confirm the write-up.

fixes #3902

* added issue link

* fix remove mean (Task)

* lint

* fix: Minor logging fixes by activate `LOG` rule (#3820)

activate logger rule

* 2.7.1

Automatically generated by python-semantic-release

* docs: fix vllm broken link (#3936)

fix vllm link

* model: mixedbread-ai/mxbai-edge-colbert-v0-32m and mixedbread-ai/mxbai-edge-colbert-v0-17m (#3931)

* Add model: mixedbread-ai/mxbai-edge-colbert-v0-32m and mixedbread-ai/mxbai-edge-colbert-v0-17m

* Lintter

* Add quotes

* Update dataset name

* Apply suggestions from code review

* Update mixedbread_ai_models.py

---------

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

* model: add pixie_models (#3938)

* model: add pixie_models

* Apply lint formatting

* fix: computation of results with missing scores (#3874)

* fix computation of results with missing scores

* fix test

* change 0 to nan

* change 0 to nan

* remove `fill_missing_scores`

* fix: expose `ResultCache` directly as `mteb.ResultCache` (#3912)

* fix: expose `ResultCache` directly as `mteb.ResultCache`

fixes #3910

* docs: Update docs usage of `ResultCache`

* merge in fixes to remove_private (#3940)

fix: exclude private tasks from Borda rank calculation in RTEB

Co-authored-by: bflhc <kunka.xgw@gmail.com>

* 2.7.2

Automatically generated by python-semantic-release

* fix typo (#3954)

* fix colSmol-256M revision (#3956)

* dedup colnomic_7b and fix loader (#3957)

* dedup colnomic_7b and fix loader

* remove flash_attention_2

* refactor: Activate `TC` (#3800)

* activate tc

* activate `TC`

* small import fix

* fix imports

* fix imports

* fix pil import

* fix benchmark result validation

* full benchmark fix

* update

* fix unpack imports

* upd vllm type

* fix: correct inverted unload_data condition in evaluate (#3929)

Add tests verifying preloaded data is preserved.

Co-authored-by: Daniel Svonava <daniel@superlinked.com>

* fix: temporarily remove private column from RTEB (#3932)

* fix: temporarily remove private column from RTEB

Link is still missing the note as I am waiting for @isaac-chung and @Samoed to confirm the write-up.

fixes #3902

* added issue link

* fix remove mean (Task)

* lint

* merge in fixes to remove_private (#3940)

fix: exclude private tasks from Borda rank calculation in RTEB

Co-authored-by: bflhc <kunka.xgw@gmail.com>

---------

Co-authored-by: bflhc <kunka.xgw@gmail.com>

* 2.7.3

Automatically generated by python-semantic-release

* refactor: split `BRIGHT` benchmark into individual subset tasks (#3285)

* refactor: split BRIGHT benchmark into individual subset tasks

* readd bright

* readd bright subset tasks

* feat: add descriptive stats for BRIGHT subsets retrieval tasks

* feat: add top_ranked for excluded_ids handling

* change main score to recall@1 for long version

* improve BRIGHT task descriptions

* add prompts to BRIGHT retrieval tasks

* refactor: BRIGHT(v1.1)

* calculate descriptive stats for BRIGHTLongRetrieval

* update prompts

* normalize names in prompts

* don't filter tasks

* remove filter_queries_without_positives and update revision

* don't create top ranked if not necessary

* get back naucs

* fix instructions

* add warning

* fix import

---------

Co-authored-by: Roman Solomatin <36135455+Samoed@users.noreply.github.com>

* fix: Update metadata to include active number of parameter to `ModelMeta` (#3837)

* Add active parameter column on LB

* update ModelMeta with parameters

* update ModelMeta of models

* Delete parameter_update_results.csv

* fix test

* fix tests

* delete script

* rename for consistency

* convert active_parameter to property

* rename and fix property

* update embedding parameters for model2vec models

* remove duplicate loading of models

* fix

* lintter

* fix

* remove separate method for embedding parameter calculation

* fix embedding calculation to pass typecheck

* lintter

* fix checking

* rename active parameters

* upd docstring

* fix tests

* remove n_active_parameters_override from ModelMeta of all models

* lintter

* rename file instead of merging main

* fix tests

* correct tests

* Delete model total and active parameters - model_parameters.csv

---------

Co-authored-by: Roman Solomatin <36135455+Samoed@users.noreply.github.com>

* 2.7.4

Automatically generated by python-semantic-release

* fix: use `num_proc` for dataset processing (#3832)

* add typehint for encode kwargs

* remove num_proc

* start adding num_proc

* remove all num proc

* fix import

* add num proc to transform

* add to push to hub

* use num proc in vidore v2

* move num proc to evaluate

* pass num proc everywhere

* fix tests

* fix pylate

* fix image text pair

* fix num workers

* add kwargs to `load_data`

* 2.7.5

Automatically generated by python-semantic-release

* fix: saving aggregated tasks (#3915)

fix saving

* 2.7.6

Automatically generated by python-semantic-release

* model: Adding voyage-4-large (2048d) model configs (#3970)

* Adding voyage-4-large (2048d) model configs

* Adding voyage-4-large 2048d model configs

* Adding voyage-4-large 2048d model configs

* fix: Ensure that retrieval tasks only evaluate on specified subsets instead of all (#3946)

* fix dataset loading

* update logging

* add test

* fix: Add `fill_missing` parameter in `get_model_meta` (#3801)

* Add compute missing parameter in get_model_meta

* fix logs

* fix

* fix from comments

* apply suggestion

* fix method

* add test and fix logic

* address comments

* rename compute_missing to fill_missing

---------

Co-authored-by: Roman Solomatin <36135455+Samoed@users.noreply.github.com>

* fix: leaderboard Nan handling (#3965)

* fix leaderboard

* fix loading aggregated tasks

* Update mteb/results/task_result.py

Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com>

---------

Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com>

* 2.7.7

Automatically generated by python-semantic-release

* fix: Filled active_parameter_overiride for GritLM/GritLM-8x7B nomic-ai/nomic-embed-text-v2-moe (#3967)

* Filled active_parameter_overiride for ritLM/GritLM-8x7B and nomic-ai/nomic-embed-text-v2-moe

* add correct parameters for nomic-ai/nomic-embed-text-v2-moe

* 2.7.8

Automatically generated by python-semantic-release

* fix: add kwargs to pub chem load data (#3990)

add kwargs to pub chem load data

* 2.7.9

Automatically generated by python-semantic-release

* fix: `BAAI/bge-small-en` model revision (#3993)

fix(models): update invalid bge-small-en revision

* fix: NomicWrapper `get_prompt_name` call (#3995)

fix(models): correct get_prompt_name call in NomicWrapper

* 2.7.10

Automatically generated by python-semantic-release

* fix: `BedrockModel` initialization arguments (#3999)

fix: add model_name arg to BedrockModel init to prevent multiple values for model_id

* 2.7.11

Automatically generated by python-semantic-release

* fix: `dataset_transform` signature in PubChemWikiPairClassification (#4001)

fix: add num_proc arg to PubChemWikiPairClassification dataset_transform

* fix: all dataset transform (#4002)

fix dataset transform

* 2.7.12

Automatically generated by python-semantic-release

* model: Adding Ops-Colqwen3 models (#3987)

* Create ops_colqwen3_models.py

* Refactor OpsColQwen3 model and processor classes

* Update model revision in ops_colqwen3_models.py

* Remove calculate_probs method and fix model name

Removed the calculate_probs method and updated model name.

* format

* fix ds name

---------

Co-authored-by: Roman Solomatin <36135455+Samoed@users.noreply.github.com>

* model: added nomic-ai/nomic-embed-code (#4006)

* Add model metadata for nomic-embed-code

Added new model metadata for 'nomic-embed-code'

* fix nomic_embed_code

* lint

---------

Co-authored-by: Roman Solomatin <36135455+Samoed@users.noreply.github.com>

* Adding nvidia/nemotron-colembed models (#3941)

* Adding nvidia/nemotron-colembed models

* add colembed 4b, 8b model meta

* fix colembed-3b-v2 model name

* update revision for colembed 3b

* update revisions

* Update mteb/models/model_implementations/nvidia_llama_nemoretriever_colemb.py

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

* lint

---------

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
Co-authored-by: Roman Solomatin <36135455+Samoed@users.noreply.github.com>

* model: added Querit/Querit (#3996)

* querit_models_add

* Querit_Models_Change

* Update

* format revise

* add future

* format revise

* format revise

* last format revison

* last last revise

* last last last revison

* revise

* revise

* change the instruction

* last revison

* revise

* revise

* revise

---------

Co-authored-by: Roman Solomatin <36135455+Samoed@users.noreply.github.com>

* Build image on leaderboard refresh (#4015)

build image on leaderboard refresh

* fix: simplify dependencies (#4017)

* 2.7.13

Automatically generated by python-semantic-release

* fix: Make `mteb.get_model` compatible with `CrossEncoders` (#3988)

* Made mteg.get_model compatible with CrossEncoders and SparseEncoders

* update loader for sparseEncoder

* fix import

* Simplify structure

* Add model_type to sparseEncoder models

* remove detection logic of sparsencoder

* Add tests and documentation

* simplified tests

* updated docs

* fix docs

* fix

* fix grammar

* Update docs/usage/defining_the_model.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update docs/advanced_usage/two_stage_reranking.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update docs/index.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* address comments

---------

Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* rename bm25s to baseline/bm25s (#4007)

* rename bm25s to baseline/bm25s

* Update mteb/models/get_model_meta.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* remove logger message

* rename Human to baseline/Human

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Fix support for datsets 4.5 with pandas 3 (#3983)

* fix test

* fix: sanitize type for label during array conversion

* lint

* revert typo fix

---------

Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com>

* lint

* fix typing

* fix test import

---------

Signed-off-by: wang.yuqi <noooop@126.com>
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Co-authored-by: semantic-release <semantic-release>
Co-authored-by: Yongbin Choi <whybe.choi@gmail.com>
Co-authored-by: Munot Ayush Sunil <munotayush6@kgpian.iitkgp.ac.in>
Co-authored-by: bflhc <kunka.xgw@gmail.com>
Co-authored-by: Yauhen Babakhin <ybabakhin@nvidia.com>
Co-authored-by: fzoll <5575946+fzoll@users.noreply.github.com>
Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com>
Co-authored-by: Sahel Sharifymoghaddam <sahel.sharifi@gmail.com>
Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>
Co-authored-by: wang.yuqi <noooop@126.com>
Co-authored-by: HSILA <ali.shiraee@partners.basf.com>
Co-authored-by: Elias H <40372306+eherra@users.noreply.github.com>
Co-authored-by: Isaac Chung <chungisaac1217@gmail.com>
Co-authored-by: antoineedy <antoine.edy@illuin.tech>
Co-authored-by: Bong-Min Kim <klbm126@gmail.com>
Co-authored-by: svonava <svonava@gmail.com>
Co-authored-by: Daniel Svonava <daniel@superlinked.com>
Co-authored-by: HSILA <a.shiraee@gmail.com>
Co-authored-by: caoyi <caoyi0905@mail.hfut.edu.cn>
Co-authored-by: Lukas Kleybolte <32893711+Mozartuss@users.noreply.github.com>
Co-authored-by: rnyak <16246900+rnyak@users.noreply.github.com>
Co-authored-by: youngbeauty250 <140679097+youngbeauty250@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Successfully merging this pull request may close these issues: Add active/embedding parameters separately.