fix: Add Touche2020v3 and JMTEB #1262
Merged
KennethEnevoldsen merged 9 commits into embeddings-benchmark:main from Samoed:add_datasets on Oct 3, 2024 (+351 −9)
Commits
948dd47 add datasets (Samoed)
48a324d Merge branch 'embeddings-benchmark:main' into add_datasets (Samoed)
ce5c746 fix metrics (Samoed)
c752de7 Merge branch 'embeddings-benchmark:main' into add_datasets (Samoed)
456df82 add Touche2020v3 (Samoed)
38c0b56 fix metadata (Samoed)
fbe737e Apply suggestions from code review (Samoed)
1a56a46 upd name and supress (Samoed)
049c914 add benchmark class (Samoed)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,86 @@ | ||
| from __future__ import annotations | ||
|
|
||
| import logging | ||
|
|
||
| from mteb.abstasks.AbsTaskReranking import AbsTaskReranking | ||
| from mteb.abstasks.MultilingualTask import MultilingualTask | ||
| from mteb.abstasks.TaskMetadata import TaskMetadata | ||
|
|
||
| logger = logging.getLogger(__name__) | ||
|
|
||
| _EVAL_SPLIT = "test" | ||
| _LANGUAGES = { | ||
| "us": ["eng-Latn"], | ||
| "es": ["spa-Latn"], | ||
| "jp": ["jpn-Jpan"], | ||
| } | ||
|
|
||
| _CITATION = """@article{reddy2022shopping, | ||
| title={Shopping Queries Dataset: A Large-Scale {ESCI} Benchmark for Improving Product Search}, | ||
| author={Chandan K. Reddy and Lluís Màrquez and Fran Valero and Nikhil Rao and Hugo Zaragoza and Sambaran Bandyopadhyay and Arnab Biswas and Anlu Xing and Karthik Subbian}, | ||
| year={2022}, | ||
| eprint={2206.06588}, | ||
| archivePrefix={arXiv} | ||
| }""" | ||
|
|
||
|
|
||
| class ESCIReranking(MultilingualTask, AbsTaskReranking): | ||
| metadata = TaskMetadata( | ||
| name="ESCIReranking", | ||
| description="", | ||
| reference="https://github.com/amazon-science/esci-data/", | ||
| dataset={ | ||
| "path": "mteb/esci", | ||
| "revision": "237f74be0503482b4e8bc1b83778c7a87ea93fd8", | ||
| }, | ||
| type="Reranking", | ||
| category="s2p", | ||
| modalities=["text"], | ||
| eval_splits=[_EVAL_SPLIT], | ||
| eval_langs=_LANGUAGES, | ||
| main_score="map", | ||
| date=("2022-06-14", "2022-06-14"), | ||
| domains=["Written"], | ||
| task_subtypes=[], | ||
| license="apache-2.0", | ||
| annotations_creators="derived", | ||
| dialect=[], | ||
| sample_creation="created", | ||
| bibtex_citation=_CITATION, | ||
| descriptive_stats={ | ||
| "test": { | ||
| "num_samples": 29285, | ||
| "num_positive": 29285, | ||
| "num_negative": 29285, | ||
| "avg_query_len": 19.691890046098685, | ||
| "avg_positive_len": 9.268089465596722, | ||
| "avg_negative_len": 1.5105002561038074, | ||
| "hf_subset_descriptive_stats": { | ||
| "us": { | ||
| "num_samples": 21296, | ||
| "num_positive": 21296, | ||
| "num_negative": 21296, | ||
| "avg_query_len": 21.440833959429, | ||
| "avg_positive_len": 8.892515026296017, | ||
| "avg_negative_len": 1.1956705484598047, | ||
| }, | ||
| "es": { | ||
| "num_samples": 3703, | ||
| "num_positive": 3703, | ||
| "num_negative": 3703, | ||
| "avg_query_len": 20.681609505806104, | ||
| "avg_positive_len": 10.561706724277613, | ||
| "avg_negative_len": 2.749932487172563, | ||
| }, | ||
| "jp": { | ||
| "num_samples": 4286, | ||
| "num_positive": 4286, | ||
| "num_negative": 4286, | ||
| "avg_query_len": 10.146756882874476, | ||
| "avg_positive_len": 10.016565562295847, | ||
| "avg_negative_len": 2.003966402239851, | ||
| }, | ||
| }, | ||
| } | ||
| }, | ||
| ) |
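
The `descriptive_stats` above report both overall figures and a per-subset breakdown under `hf_subset_descriptive_stats`. As a quick sanity check, the overall `num_samples` and `avg_query_len` should be recoverable from the subsets. The sketch below is a standalone check and not part of the PR; the helper name `aggregate` is hypothetical, and it assumes the overall average is a sample-weighted mean of the subset averages.

```python
# Values copied from the ESCIReranking descriptive_stats in the diff above.
subset_stats = {
    "us": {"num_samples": 21296, "avg_query_len": 21.440833959429},
    "es": {"num_samples": 3703, "avg_query_len": 20.681609505806104},
    "jp": {"num_samples": 4286, "avg_query_len": 10.146756882874476},
}
overall = {"num_samples": 29285, "avg_query_len": 19.691890046098685}


def aggregate(subsets):
    """Hypothetical helper: combine per-subset stats into overall stats.

    Assumes the overall average is the sample-weighted mean of the
    per-subset averages.
    """
    total = sum(s["num_samples"] for s in subsets.values())
    weighted_avg = (
        sum(s["num_samples"] * s["avg_query_len"] for s in subsets.values()) / total
    )
    return total, weighted_avg


total, weighted_avg = aggregate(subset_stats)
print(total == overall["num_samples"])
print(abs(weighted_avg - overall["avg_query_len"]) < 1e-6)
```

Under that assumption both checks pass for the numbers in this diff, which suggests the subset and overall statistics were computed consistently.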
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,46 @@ | ||
| from __future__ import annotations | ||
|
|
||
| from mteb.abstasks.AbsTaskRetrieval import AbsTaskRetrieval | ||
| from mteb.abstasks.TaskMetadata import TaskMetadata | ||
|
|
||
|
|
||
| class JaqketRetrieval(AbsTaskRetrieval): | ||
| metadata = TaskMetadata( | ||
| name="JaqketRetrieval", | ||
| dataset={ | ||
| "path": "mteb/jaqket", | ||
| "revision": "3a5b92dad489a61e664c05ed2175bc9220230199", | ||
| }, | ||
| description="JAQKET (JApanese Questions on Knowledge of EnTities) is a QA dataset that is created based on quiz questions.", | ||
| reference="https://github.com/kumapo/JAQKET-dataset", | ||
| type="Retrieval", | ||
| category="s2p", | ||
| modalities=["text"], | ||
| eval_splits=["test"], | ||
| eval_langs=["jpn-Jpan"], | ||
| main_score="ndcg_at_10", | ||
| date=("2023-10-09", "2023-10-09"), | ||
| domains=["Encyclopaedic", "Non-fiction", "Written"], | ||
| task_subtypes=["Question answering"], | ||
| license="cc-by-sa-4.0", | ||
| annotations_creators="human-annotated", | ||
| dialect=[], | ||
| sample_creation="found", | ||
| bibtex_citation="""@InProceedings{Kurihara_nlp2020, | ||
| author = "鈴木正敏 and 鈴木潤 and 松田耕史 and ⻄田京介 and 井之上直也", | ||
| title = "JAQKET: クイズを題材にした日本語 QA データセットの構築", | ||
| booktitle = "言語処理学会第26回年次大会", | ||
| year = "2020", | ||
| url = "https://www.anlp.jp/proceedings/annual_meeting/2020/pdf_dir/P2-24.pdf", | ||
| note= "in Japanese" | ||
| }""", | ||
| descriptive_stats={ | ||
| "test": { | ||
| "average_document_length": 3747.995228882333, | ||
| "average_query_length": 50.70611835506519, | ||
| "num_documents": 114229, | ||
| "num_queries": 997, | ||
| "average_relevant_docs_per_query": 1.0, | ||
| } | ||
| }, | ||
| ) |
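
`JaqketRetrieval` uses `ndcg_at_10` as its main score, and the stats report exactly one relevant document per query. A minimal sketch of nDCG@10 with binary relevance follows, to illustrate what the score measures in that setting. It is an illustration only, not the evaluation code mteb runs; the function name `ndcg_at_k` and the document ids are hypothetical.

```python
import math


def ndcg_at_k(ranked_ids, relevant_ids, k=10):
    """Binary-relevance nDCG@k (illustrative sketch, not mteb's implementation)."""
    # DCG: each relevant document at 0-based position i contributes 1 / log2(i + 2).
    dcg = sum(
        1.0 / math.log2(i + 2)
        for i, doc_id in enumerate(ranked_ids[:k])
        if doc_id in relevant_ids
    )
    # Ideal DCG: all relevant documents (at most k) placed at the top.
    ideal_hits = min(len(relevant_ids), k)
    idcg = sum(1.0 / math.log2(i + 2) for i in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# With a single relevant document, the score depends only on its rank.
top = ndcg_at_k(["d1", "d2", "d3"], {"d1"})     # relevant document ranked first
second = ndcg_at_k(["d3", "d1", "d2"], {"d1"})  # relevant document ranked second
missed = ndcg_at_k(["d2", "d3"], {"d1"})        # relevant document not retrieved
print(top, second, missed)
```

With one relevant document per query the ideal DCG is 1, so the score reduces to `1 / log2(rank + 1)` when the document appears in the top 10 and 0 otherwise.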