Merge main 05 10 #3246
Merged
* model: Add BMRetriever * Update mteb/models/bmretriever_models.py Co-authored-by: Roman Solomatin <[email protected]> * Update mteb/models/bmretriever_models.py Co-authored-by: Roman Solomatin <[email protected]> * fix: remove trust_remote_code option * feat: implement BMREtrieverWrapper based on InstructSentenceTransformerWrapper * refactor: update training datasets for bmretriever --------- Co-authored-by: Roman Solomatin <[email protected]>
* add codefuse models * add codefuse models * Update codefuse_models.py * lint codefuse.py
* Adding Cohere's output_dimension and embedding_type parameter Cohere's embed-v4 binary and int8 * Correcting due to comments
* feat: add swedish cpc patent classifications to mteb * fix: formatting and init imports * fix: update mteb task according to feedback * fix: perform citation and code formatting * fix: add train and test split for both datasets
* fix: delete kwargs for similarity score in ColPaliEngineWrapper for method behavior * chore: fix colpali_models similarity handle device
* fix(models): prevent EOS token truncation for BMRetriever * refactor(models): refactor tokenizer setup in `InstructSentenceTransformerWrapper` * fix(models): correct eos token handling in `BMRetrieverWrapper`
* update giga embeddings * update giga embeddings * 3b-september-2025 * fixed * lint * Update mteb/models/ru_sentence_models.py Co-authored-by: Roman Solomatin <[email protected]> * change revision due to flash-attn dependency * change apply_instruction_to_passages --------- Co-authored-by: Kolodin Egor <[email protected]> Co-authored-by: Roman Solomatin <[email protected]> Co-authored-by: Roman Solomatin <[email protected]> Co-authored-by: Неизвестный Пользователь722497 <[email protected]>
* feat - Split create_tables into static Benchmark methods * feat - format * Update mteb/leaderboard/table.py Co-authored-by: Kenneth Enevoldsen <[email protected]> * feat - remove search query;take benchmark result as input;addressing the circular import, * feat - format * Update mteb/benchmarks/benchmark.py Co-authored-by: Kenneth Enevoldsen <[email protected]> * Update mteb/benchmarks/benchmark.py Co-authored-by: Kenneth Enevoldsen <[email protected]> * feat - use to_dataframe;clean table.py;move creat_table * feat - fix circular import * feat - clean-up * feat - format --------- Co-authored-by: Kenneth Enevoldsen <[email protected]>
Adding another voyageai model
* Update qzhou_models.py * Update qzhou_models.py * reformat script code * Update configuration * According to our new decision, the model name has been changed to "QZhou-Embedding-Zh". * Fix variable naming issues.
* add youtu models * add a blank line * fix the optional dependencies and lint the code * remove unused dependencies and reformat * revise prompt_type * update youtu_models --------- Co-authored-by: springxchen <[email protected]>
* add software issue localization datasets * add software issue localization datasets * update and add multilingual datasets * fix citation format issues * Update mteb/tasks/Reranking/eng/SWEbenchVerifiedReranking.py * fix linting issues --------- Co-authored-by: Roman Solomatin <[email protected]>
* feat - adjust Rteb's Benchmark * feat - add blank * fix menu names * Update mteb/leaderboard/benchmark_selector.py Co-authored-by: Roman Solomatin <[email protected]> * moving around tasks * fix: Update RTEB summary columns (#3226) * fix(models): ensure prompt_type is passed to format_instruction (#3216) * 1.38.58 Automatically generated by python-semantic-release * Adding Cohere's output_dimension and embedding_type parameter (#3204) * Adding Cohere's output_dimension and embedding_type parameter Cohere's embed-v4 binary and int8 * Correcting due to comments * dataset: add swedish cpc patent classifications to mteb (#3072) * feat: add swedish cpc patent classifications to mteb * fix: formatting and init imports * fix: update mteb task according to feedback * fix: perform citation and code formatting * fix: add train and test split for both datasets * fix: AttributeError in ColPaliEngineWrapper similarity method (#3177) * fix: delete kwargs for similarity score in ColPaliEngineWrapper for method behavior * chore: fix colpali_models similarity handle device * Update tasks & benchmarks tables * 1.38.59 Automatically generated by python-semantic-release * fix: prevent EOS token truncation (#3218) * fix(models): prevent EOS token truncation for BMRetriever * refactor(models): refactor tokenizer setup in `InstructSentenceTransformerWrapper` * fix(models): correct eos token handling in `BMRetrieverWrapper` * 1.38.60 Automatically generated by python-semantic-release * Update giga embeddings (#3210) * update giga embeddings * update giga embeddings * 3b-september-2025 * fixed * lint * Update mteb/models/ru_sentence_models.py Co-authored-by: Roman Solomatin <[email protected]> * change revision due to flash-attn dependency * change apply_instruction_to_passages --------- Co-authored-by: Kolodin Egor <[email protected]> Co-authored-by: Roman Solomatin <[email protected]> Co-authored-by: Roman Solomatin <[email protected]> Co-authored-by: Неизвестный Пользователь722497 <[email protected]> * fix: Refactor split create_tables into static Benchmark methods (#3126) * feat - Split create_tables into static Benchmark methods * feat - format * Update mteb/leaderboard/table.py Co-authored-by: Kenneth Enevoldsen <[email protected]> * feat - remove search query;take benchmark result as input;addressing the circular import, * feat - format * Update mteb/benchmarks/benchmark.py Co-authored-by: Kenneth Enevoldsen <[email protected]> * Update mteb/benchmarks/benchmark.py Co-authored-by: Kenneth Enevoldsen <[email protected]> * feat - use to_dataframe;clean table.py;move creat_table * feat - fix circular import * feat - clean-up * feat - format --------- Co-authored-by: Kenneth Enevoldsen <[email protected]> * 1.38.61 Automatically generated by python-semantic-release * Extending the RTEB benchmark (#3223) Adding another voyageai model * Update tasks & benchmarks tables * feat - filter_by_privacy * feat - add new fields for rteb part * feat - getattr * feat - adjust privacy filter logic * feat - enhance summary table column renaming and add 'is_public' field mapping * fix: remove unused 'is_public' attribute from TaskResult --------- Co-authored-by: Yongbin Choi <[email protected]> Co-authored-by: semantic-release <semantic-release> Co-authored-by: fzoll <[email protected]> Co-authored-by: Atheer <[email protected]> Co-authored-by: Yong woo Song <[email protected]> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Egor <[email protected]> Co-authored-by: Kolodin Egor <[email protected]> Co-authored-by: Roman Solomatin <[email protected]> Co-authored-by: Roman Solomatin <[email protected]> Co-authored-by: Неизвестный Пользователь722497 <[email protected]> Co-authored-by: Kenneth Enevoldsen <[email protected]> Co-authored-by: smile <[email protected]> Co-authored-by: ethan <[email protected]> * removed show_rteb args * avoid defining function where we can just use the metadata * minor fixes * minor fixes * fix: Correct logic for filtering public tasks in ModelResult class (#3230) Co-authored-by: ethan <[email protected]> --------- Co-authored-by: q275343119 <[email protected]> Co-authored-by: Roman Solomatin <[email protected]> Co-authored-by: 笑尿伊人 <[email protected]> Co-authored-by: Yongbin Choi <[email protected]> Co-authored-by: fzoll <[email protected]> Co-authored-by: Atheer <[email protected]> Co-authored-by: Yong woo Song <[email protected]> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Egor <[email protected]> Co-authored-by: Kolodin Egor <[email protected]> Co-authored-by: Roman Solomatin <[email protected]> Co-authored-by: Неизвестный Пользователь722497 <[email protected]> Co-authored-by: smile <[email protected]> Co-authored-by: ethan <[email protected]>
* fix: Add rteb submission references and improve descriptions. * Added evaluation request * added field for tasks
* Human Subsets Tasks * Fixed Multilingual Classification Subset * linting * fix citations format * make lint * fix tests * remove human folder * fix relative imports * add adapted_from for all human subsets * fix pydantic errors * add benchmark object * make benchmark discoverable * bibtex test * Apply suggestion Co-authored-by: Kenneth Enevoldsen <[email protected]> * Apply suggestions from code review Co-authored-by: Kenneth Enevoldsen <[email protected]> * rename & reupload * upd tests * upd tests again * add model * add benchmark to leaderboard * change branch of leaderboard * remove branch of load data * fix model meta path * make mteb importable * update repo * Update mteb/benchmarks/benchmarks/benchmarks.py * Update mteb/leaderboard/benchmark_selector.py * Update mteb/load_results/load_results.py Co-authored-by: Roman Solomatin <[email protected]> --------- Co-authored-by: Adnan El Assadi <[email protected]> Co-authored-by: Isaac Chung <[email protected]> Co-authored-by: Kenneth Enevoldsen <[email protected]> Co-authored-by: AdnanElAssadi56 <[email protected]>
* Remove 'HUME(v1)' from leaderboard benchmark * lint
* update adding_a_benchmark.md documentation * fix numbers
* fix: Further specified macro-language code for Norwegian. "nor" is a macro-language code that covers Bokmål and Nynorsk (both Norwegian), so datasets tagged only with "nor" are missed when filtering by "nob" or "nno". Specifying the individual codes should allow these datasets to be found. * further specified macro language "nor"
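The macro-language problem described in this commit can be illustrated with a minimal sketch. This is not mteb's actual filtering code: `MACRO_LANGUAGES`, `expand_language`, and `task_matches` are hypothetical names introduced only for illustration, and the mapping covers just the Norwegian case mentioned above.

```python
# Hypothetical mapping from a macro-language code to the individual
# language codes it covers (only the case named in the commit message).
MACRO_LANGUAGES = {"nor": {"nob", "nno"}}


def expand_language(code):
    """Return the code itself plus any individual languages it covers."""
    return {code} | MACRO_LANGUAGES.get(code, set())


def task_matches(task_languages, requested):
    """True if a requested code overlaps the task's languages after expansion."""
    declared = set()
    for code in task_languages:
        declared |= expand_language(code)
    return bool(declared & set(requested))


# A naive exact match misses a dataset tagged only with the macro code.
naive_hit = "nob" in {"nor"}

# With the individual codes made explicit, the same request matches.
expanded_hit = task_matches(["nor"], ["nob"])
```

The commit itself appears to address this by specifying the language codes on the affected datasets rather than by expanding codes at query time; the sketch only shows why an exact match on "nor" alone falls short.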
# Conflicts: # docs/benchmarks.md # mteb/benchmarks/benchmark.py # mteb/benchmarks/benchmarks/__init__.py # mteb/benchmarks/benchmarks/benchmarks.py # mteb/evaluation/evaluators/RerankingEvaluator.py # mteb/leaderboard/benchmark_selector.py # mteb/leaderboard/table.py # mteb/load_results.py # mteb/models/abs_encoder.py # mteb/models/instruct_wrapper.py # mteb/models/model_implementations/cohere_models.py # mteb/models/model_implementations/cohere_v.py # mteb/models/model_implementations/ru_sentence_models.py # mteb/models/model_implementations/youtu_models.py # mteb/models/overview.py # mteb/results/benchmark_results.py # mteb/tasks/Classification/__init__.py # mteb/tasks/Clustering/__init__.py # mteb/tasks/MultiLabelClassification/__init__.py # mteb/tasks/Reranking/__init__.py # mteb/tasks/Retrieval/multilingual/MKQARetrieval.py # mteb/tasks/STS/__init__.py # scripts/make_leaderboard.py
* fix python39 transformers * fix
aggregate by subset for HUMEv1
Fix AbsTaskTextRegression
* feat - add Japanese * feat - use mteb.get_benchmark * fix - 3.9 test error * Revert "fix - 3.9 test error" This reverts commit 6bfee53. * fix - 3.9 test error
# Conflicts: # mteb/benchmarks/benchmarks/__init__.py # mteb/benchmarks/benchmarks/benchmarks.py # mteb/models/bm25.py
ec748ef to 6e2766d
If you add a model or a dataset, please add the corresponding checklist: