Maeb merge main v2#3447

Merged
Samoed merged 217 commits into maeb from maeb_merge_main_v2
Oct 22, 2025

Conversation

Samoed (Member) commented Oct 20, 2025

  1. Made all tasks and files follow PEP 8
  2. Made torchaudio an optional dependency
  3. Removed duplicates from v1
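
As an illustration of item 2, here is a minimal sketch of one common way to make a package optional: import it lazily and fail with an actionable message when it is missing. The helper names `require_optional` and `load_waveform` are hypothetical and are not taken from this PR; the actual change may be structured differently.

```python
import importlib
import importlib.util
from types import ModuleType


def require_optional(module_name: str, install_hint: str) -> ModuleType:
    """Import an optional dependency lazily, or fail with an actionable message."""
    # find_spec returns None when a top-level module is not installed.
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(
            f"'{module_name}' is required for this feature. "
            f"Install it with: {install_hint}"
        )
    return importlib.import_module(module_name)


def load_waveform(path: str):
    """Hypothetical audio helper: torchaudio is imported only when this is called."""
    torchaudio = require_optional("torchaudio", "pip install torchaudio")
    return torchaudio.load(path)  # returns (waveform, sample_rate)
```

With this pattern, importing the module that defines `load_waveform` succeeds without torchaudio installed; only users who actually call the audio helper need the extra package.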

makram93 and others added 30 commits July 11, 2025 22:06
* feat: unify text and image embeddings for all tasks

* fix: uniform batch size

* fix: update error message

* fix: update code task

* fix: update max length

* fix: apply review suggestions
* feat: add KaLM_Embedding_X_0605 in kalm_models

* Update kalm_models.py for lint format

* kalm-emb-v2

* kalm-emb-v2

* kalm-emb-v2

* kalm-emb-v2

* kalm-emb-v2

---------

Co-authored-by: xinshuohu <[email protected]>
Co-authored-by: Xinshuo Hu <[email protected]>
* Adding Classification Evaluator test

* Modifications due to the comments

* Update tests/test_evaluators/test_ClassificationEvaluator.py

Co-authored-by: Kenneth Enevoldsen <[email protected]>

* Update tests/test_evaluators/test_ClassificationEvaluator.py

Co-authored-by: Kenneth Enevoldsen <[email protected]>

* Modifications due to the comments

* Modifications due to the comments

---------

Co-authored-by: Kenneth Enevoldsen <[email protected]>
* adding vidore benchmarks

* fix typo

* clean vidore names + per lang eval

* lint

* vidore names

* bibtex fix

* fix revision

* vidore v2 citation

* update citation format and fix per-language mappings

* lint: citations

* typo citations

* fix revisions

* lint

* fix colnomic3b revision

* fix colqwen2.5 revision + latest repo version

* fix query augmentation tokens

* colsmol revision
Automatically generated by python-semantic-release
* Adding Classification Evaluator test

* Modifications due to the comments

* Update tests/test_evaluators/test_ClassificationEvaluator.py

Co-authored-by: Kenneth Enevoldsen <[email protected]>

* Update tests/test_evaluators/test_ClassificationEvaluator.py

Co-authored-by: Kenneth Enevoldsen <[email protected]>

* Modifications due to the comments

* Modifications due to the comments

* Adding STSEvaluator and SummarizationEvaluator tests

* Correcting due to the comments

* Correcting due to the comments

---------

Co-authored-by: Kenneth Enevoldsen <[email protected]>
* Classification dataset cleaning

* Update pull request number

* Fix metadata test

* fix formatting

* add script for cleaning
Add JapaneseSentimentClassification
* change document to passage

* fix prompt names

* fix kwargs check

* fix default prompt
Automatically generated by python-semantic-release
add opensearch inf-free models

Co-authored-by: Isaac Chung <[email protected]>
* Add BareExamQA retrieval task

* ran linter

* updated details

* updated details

* fixed subtype name

* fixed changes

* ran linter again
specify revision for opensearch
Automatically generated by python-semantic-release
… been checked (#2940)

* fix: Only import SparseEncoder once sentence-transformer version have been checked

fixes #2936

* Update mteb/models/opensearch_neural_sparse_models.py

Co-authored-by: Isaac Chung <[email protected]>

---------

Co-authored-by: Isaac Chung <[email protected]>
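
The SparseEncoder fix above (check the installed version first, import afterwards) can be sketched roughly as follows. This is an illustration under assumptions, not the code from #2940: the helper names and the `"5.0.0"` minimum are hypothetical, and the version parser is deliberately simple (it ignores pre-release suffixes, which `packaging.version` would handle more robustly).

```python
from importlib.metadata import PackageNotFoundError, version


def parse_release(version_string: str) -> tuple[int, ...]:
    """Turn '5.0.0' or '5.1.0.dev0' into a tuple of its leading integer parts."""
    parts = []
    for piece in version_string.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare two versions after zero-padding them to the same length."""
    a, b = parse_release(installed), parse_release(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def load_sparse_encoder_class(minimum: str = "5.0.0"):  # assumed minimum version
    """Check the installed version first; import SparseEncoder only if it passes."""
    try:
        installed = version("sentence-transformers")
    except PackageNotFoundError as err:
        raise ImportError("sentence-transformers is not installed") from err
    if not meets_minimum(installed, minimum):
        raise ImportError(
            f"sentence-transformers>={minimum} is required, found {installed}"
        )
    # Deferred import: only reached after the version check succeeded.
    from sentence_transformers import SparseEncoder

    return SparseEncoder
```

Keeping the import inside the function means an older sentence-transformers install produces a clear version error rather than an import failure at module load time.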
…2939)

The leaderboard would have (silent) errors where `get_benchmark` led to a KeyError because "selector_state" was passed as a default value. Setting `DEFAULT_BENCMARK_NAME` as the value solves this issue.
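
The failure mode described above can be reproduced in miniature. The sketch below is a simplified stand-in, not the leaderboard code: the registry contents, `DEFAULT_NAME`, and `initial_tasks` are hypothetical, and `get_benchmark` here is a plain dictionary lookup.

```python
# Hypothetical registry and default; the real names and values live in the leaderboard code.
DEFAULT_NAME = "demo-benchmark"
BENCHMARKS = {
    "demo-benchmark": ["TaskA", "TaskB"],
    "other-benchmark": ["TaskC"],
}


def get_benchmark(name: str) -> list[str]:
    """Simplified stand-in: unknown names raise KeyError, like a dict lookup."""
    return BENCHMARKS[name]


def initial_tasks(default_value: str) -> list[str]:
    """Resolve the benchmark that a UI component would show on first load."""
    return get_benchmark(default_value)


# A placeholder string that is not a registered benchmark name fails the lookup.
try:
    initial_tasks("selector_state")
    lookup_failed = False
except KeyError:
    lookup_failed = True

# Using a real benchmark name as the default resolves cleanly.
tasks = initial_tasks(DEFAULT_NAME)
```

If a framework swallows the exception during start-up, the error stays silent, which matches the behavior described in the commit message.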
* docs: Update adding_a_dataset.md

* Update docs/adding_a_dataset.md
Automatically generated by python-semantic-release
* BSARD loader fixed

* BSARDv2 metadata fixed

* Update mteb/tasks/Retrieval/fra/BSARDRetrieval.py

---------

Co-authored-by: Kenneth Enevoldsen <[email protected]>
* Added govreport task

* Updated description
* Added BillSum datasets

* fixed billsumca

* Updated BillSumCA description

* Updated BillSumUS description

* Update mteb/tasks/Retrieval/eng/BillSumCA.py

Co-authored-by: Kenneth Enevoldsen <[email protected]>

* Update mteb/tasks/Retrieval/eng/BillSumUS.py

Co-authored-by: Kenneth Enevoldsen <[email protected]>

* lint

* lint

---------

Co-authored-by: Kenneth Enevoldsen <[email protected]>
Co-authored-by: Isaac Chung <[email protected]>
…2716)

* Add RuSciBench

* fix bitext mining lang

* Add regression task

* fix init

* add missing files

* Improve description

* Add superseded_by

* fix lint

* Update regression task to match with v2

* Add stratified_subsampling for regression task

* Add bootstrap for regression task

* Rename task class, add model as evaluator argument

* fix import

* fix import 2

* fixes

* fix

* Rename regression model protocol
KennethEnevoldsen (Contributor) commented

Do you want to do all of this in one go? I would probably just transfer one task type over at a time

Samoed (Member, Author) commented Oct 20, 2025

I just want to do a basic merge first so that the tests are runnable. After that, I will update one task type at a time.

Samoed changed the base branch from maeb_v2 to maeb October 20, 2025 19:14
Samoed marked this pull request as ready for review October 21, 2025 11:05
KennethEnevoldsen (Contributor) left a comment

This is impossible to review. Gotta trust you here

Samoed (Member, Author) commented Oct 21, 2025

You can view the last 11 commits.

KennethEnevoldsen (Contributor) commented

Ahh yeah, that was a good idea. Yeah, the changes look reasonable!

Samoed merged commit 9f1c7a6 into maeb Oct 22, 2025
10 checks passed
Samoed deleted the maeb_merge_main_v2 branch October 22, 2025 11:46