[ADD] 2 new Vietnamese Retrieval Datasets #2393
Samoed merged 8 commits into embeddings-benchmark:main
Conversation
    annotations_creators="human-annotated",
    dialect=[],
    sample_creation="found",
    bibtex_citation="""""",
This dataset doesn't have a paper?
@isaac-chung What do you think should be done with citations?
Let's put a TODO in and keep an open issue for this, so we can revisit it when the paper is published or available on arXiv.
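A minimal sketch of what that TODO could look like, assuming the citation field is simply left empty for now. The plain dict and the `missing_citation` helper are hypothetical illustrations, not part of mteb's `TaskMetadata` API; only the four field names come from the diff above.

```python
# Hypothetical stand-in for the metadata fields shown in the diff.
# This is a plain dict for illustration, not mteb's TaskMetadata class.
task_metadata_fields = {
    "annotations_creators": "human-annotated",
    "dialect": [],
    "sample_creation": "found",
    # TODO: add the BibTeX entry once the paper is published or on arXiv.
    "bibtex_citation": "",
}


def missing_citation(fields: dict) -> bool:
    """Return True when no BibTeX entry has been filled in yet (hypothetical helper)."""
    return not fields.get("bibtex_citation", "").strip()


print(missing_citation(task_metadata_fields))
```

A check like this could help the tracking issue list which tasks still need a citation, but whether the project wants such a check is an open question.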
    annotations_creators="human-annotated",
    dialect=[],
    sample_creation="found",
    bibtex_citation="""""",
This dataset doesn't have a paper?
This dataset comes from a challenge hosted by Zalo AI in 2021. This is the landing page: https://challenge.zalo.ai
Many people use this dataset to evaluate their models, for example https://huggingface.co/collections/GreenNode/greennode-text-embedding-models-66a75c00889910bc76007de5 and https://huggingface.co/AITeamVN/Vietnamese_Embedding.
panel (#2596) * Update benchmarks.py * make lint * add to left side bar * update Doubao-1.5-Embedding (#2575) * update seed-embedding * update seed models * fix linting and tiktoken problem * fix tiktoken bug * fix lint * update name * Update mteb/models/seed_models.py adopt suggestion Co-authored-by: Roman Solomatin <samoed.roman@gmail.com> * update logging * update lint --------- Co-authored-by: zhangpeitian <zhangpeitian@bytedance.com> Co-authored-by: Roman Solomatin <samoed.roman@gmail.com> * fix: Add WebSSL models (#2604) * add 2 web SSL dino models * add models from collection and revisions * update memory_usage_mb and embed dim * use automodel instead * fix mieb citation (#2606) * 1.38.3 Automatically generated by python-semantic-release * Update Doubao-1.5-Embedding (#2611) * update seed-embedding * update seed models * fix linting and tiktoken problem * fix tiktoken bug * fix lint * update name * Update mteb/models/seed_models.py adopt suggestion Co-authored-by: Roman Solomatin <samoed.roman@gmail.com> * update logging * update lint * update link --------- Co-authored-by: zhangpeitian <zhangpeitian@bytedance.com> Co-authored-by: Roman Solomatin <samoed.roman@gmail.com> * CI: update benchmark table (#2609) * update benchmark table * fix table * Update tasks & benchmarks tables * Update Doubao-1.5-Embedding revision (#2613) * update seed-embedding * update seed models * fix linting and tiktoken problem * fix tiktoken bug * fix lint * update name * Update mteb/models/seed_models.py adopt suggestion Co-authored-by: Roman Solomatin <samoed.roman@gmail.com> * update logging * update lint * update link * update revision --------- Co-authored-by: zhangpeitian <zhangpeitian@bytedance.com> Co-authored-by: Roman Solomatin <samoed.roman@gmail.com> * Update tasks
& benchmarks tables * Update tasks &
benchmarks tables * CI: fix table (#2615) * Update tasks & benchmarks tables * Update gradio version (#2558) * Update gradio version Closes #2557 * bump gradio * fix: Removed missing dataset for MTEB(Multilingual) and bumped version We should probably just have done this earlier to ensure that the multilingual benchmark is runnable.
* CI: fix infinitely committing issue (#2616) * fix token * try to trigger * add token * test ci * Update tasks & benchmarks tables * Update tasks & benchmarks tables * remove test lines --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> * Add ScandiSent dataset (#2620) * add scandisent dataset * add to init * typo * lint * 1.38.4 Automatically generated by python-semantic-release * Format all citations (#2614) * Fix errors in bibtex_citation * Format all bibtex_citation fields * format benchmarks * fix format * Fix tests * add formatting script * fix citations (#2628) * Add Talemaader pair classification task (#2621) Add talemaader pair classification task * fix citations * fix citations --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: github-actions <github-actions@github.com> Co-authored-by: Roman Solomatin <samoed.roman@gmail.com> Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com> Co-authored-by: Uri K <37979288+katzurik@users.noreply.github.com> Co-authored-by: chenghao xiao <85804993+gowitheflow-1998@users.noreply.github.com> Co-authored-by: Munot Ayush Sunil <munotayush6@kgpian.iitkgp.ac.in> Co-authored-by: OnandOn <76710635+OnAnd0n@users.noreply.github.com> Co-authored-by: richinfo-ai <richinfoai@163.com> Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com> Co-authored-by: Adewole Babatunde <40810247+Free-tek@users.noreply.github.com> Co-authored-by: Roman Solomatin <36135455+Samoed@users.noreply.github.com> Co-authored-by: ahxgw <ahxgwOnePiece@gmail.com> Co-authored-by: kunka.xgw <kunka.xgw@taobao.com> Co-authored-by: Sam Heymann <40773225+sam-hey@users.noreply.github.com> Co-authored-by: sam021313 <40773225+sam021313@users.noreply.github.com> Co-authored-by: Nadia Sheikh <144166074+nadshe@users.noreply.github.com> Co-authored-by: theatollersrud <thea.tollersrud@nb.no> Co-authored-by: hongst 
<76415500+seongtaehong@users.noreply.github.com> Co-authored-by: E. Tolga Ayan <33233561+tolgayan@users.noreply.github.com> Co-authored-by: lllsy12138 <50816213+lllsy12138@users.noreply.github.com> Co-authored-by: shyuli <shyuli@tencent.com> Co-authored-by: Siddharth M. Bhatia <siddharth@sidmb.com> Co-authored-by: Bao Loc Pham <67360122+BaoLocPham@users.noreply.github.com> Co-authored-by: Flo <FlorianRottach@aol.com> Co-authored-by: Florian Rottach <florianrottach@boehringer-ingelheim.com> Co-authored-by: Alexey Vatolin <vatolinalex@gmail.com> Co-authored-by: Olesksii Horchynskyi <121444758+panalexeu@users.noreply.github.com> Co-authored-by: Pandaswag <110003154+torchtorchkimtorch@users.noreply.github.com> Co-authored-by: Márton Kardos <power.up1163@gmail.com> Co-authored-by: Mehrzad Shahin-Moghadam <42153677+mehrzadshm@users.noreply.github.com> Co-authored-by: Mehrzad Shahin-Moghadam <mehr@Mehrzads-MacBook-Pro.local> Co-authored-by: Youngjoon Jang <82500463+yjoonjang@users.noreply.github.com> Co-authored-by: 24September <puritysarah@naver.com> Co-authored-by: Jan Karaś <90987511+KTFish@users.noreply.github.com> Co-authored-by: Shuu <136542198+Shun0212@users.noreply.github.com> Co-authored-by: namespace-Pt <61188463+namespace-Pt@users.noreply.github.com> Co-authored-by: zhangpeitian <zhangpeitian@bytedance.com> Co-authored-by: Imene Kerboua <33312980+imenelydiaker@users.noreply.github.com>
mteb package.
mteb -m {model_name} -t {task_name} command.
sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
intfloat/multilingual-e5-small
make test.
make lint.
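
The checklist's command template, mteb -m {model_name} -t {task_name}, can be filled in for both baseline models and the two tasks this PR adds. The following is a minimal sketch only: it builds the command strings and does not run anything. The task names are taken from the commit messages above; the helper name build_commands is hypothetical and not part of mteb. Depending on the installed mteb version, the CLI may expect a subcommand before the flags, so check mteb --help before running the output.

```python
import shlex
from itertools import product

# Baseline models named in the PR checklist.
MODELS = [
    "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2",
    "intfloat/multilingual-e5-small",
]

# Task names as they appear in this PR's commit messages.
TASKS = ["GreenNodeTableMarkdownRetrieval", "ZacLegalTextRetrieval"]


def build_commands(models, tasks, template="mteb -m {model_name} -t {task_name}"):
    """Hypothetical helper: fill the checklist template for every model/task pair."""
    return [
        # shlex.quote keeps each argument shell-safe if a name ever contains spaces.
        template.format(model_name=shlex.quote(m), task_name=shlex.quote(t))
        for m, t in product(models, tasks)
    ]


if __name__ == "__main__":
    for cmd in build_commands(MODELS, TASKS):
        print(cmd)
```

Printing the commands instead of executing them keeps the sketch free of network access and model downloads; each line can be pasted into a shell once the CLI form has been confirmed for the installed version.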