Famteb v2 Results #273
Model Results Comparison

Reference models: intfloat/multilingual-e5-large

Results for Alibaba-NLP/gte-Qwen2-7B-instruct
| task_name | Alibaba-NLP/gte-Qwen2-7B-instruct | intfloat/multilingual-e5-large | Max result |
|---|---|---|---|
| ArguAna-Fa.v2 | 0.4259 | 0.4127 | |
| DeepSentiPers.v2 | 0.5794 | 0.5769 | |
| FEVER-FaHardNegatives | 0.6967 | 0.4615 | |
| FiQA2018-Fa.v2 | 0.3274 | 0.2946 | |
| HotpotQA-FaHardNegatives | 0.5159 | 0.6153 | |
| MSMARCO-FaHardNegatives | 0.6295 | 0.6871 | |
| NLPTwitterAnalysisClassification.v2 | 0.7877 | 0.7659 | |
| NQ-FaHardNegatives | 0.4559 | 0.4983 | |
| NeuCLIR2023RetrievalHardNegatives | 0.5953 | 0.5059 | 0.5950 |
| PerShopDomainClassification | 0.5851 | 0.5517 | |
| PerShopIntentClassification | 0.8809 | 0.9069 | |
| PersianTextEmotion.v2 | 0.4196 | 0.6091 | |
| QuoraRetrieval-Fa.v2 | 0.7500 | 0.7788 | |
| SCIDOCS-Fa.v2 | 0.1622 | 0.1222 | |
| SIDClassification.v2 | 0.6263 | 0.6137 | |
| SciFact-Fa.v2 | 0.6472 | 0.6037 | |
| StyleClassification | 0.5943 | 0.6492 | |
| SynPerTextToneClassification.v3 | 0.6505 | 0.8412 | |
| TRECCOVID-Fa.v2 | 0.7496 | 0.7177 | |
| Touche2020-Fa.v2 | 0.4224 | 0.4978 | |
| WebFAQRetrieval | 0.7127 | 0.7459 | 0.7813 |
| Average | 0.5816 | 0.5931 | 0.6881 |
The model achieves high performance on this task: NeuCLIR2023RetrievalHardNegatives
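For readers interpreting these tables: the "Max result" column appears to hold the best score across the compared models for tasks where a prior maximum is tracked, and the "Average" row is each model's mean over the tasks it was evaluated on. A minimal sketch of that summarization, using a hypothetical `summarize` helper and illustrative scores (not actual benchmark numbers from this PR):

```python
def summarize(results):
    """results: {task_name: {model_name: score_or_None}}.

    Returns (per-model averages ignoring missing scores,
             per-task max across models).
    """
    models = sorted({m for scores in results.values() for m in scores})
    # Per-task maximum over all models that have a score for the task.
    per_task_max = {
        task: max(s for s in scores.values() if s is not None)
        for task, scores in results.items()
    }
    # Per-model mean, skipping tasks the model was not evaluated on.
    averages = {}
    for m in models:
        vals = [scores[m] for scores in results.values()
                if scores.get(m) is not None]
        averages[m] = sum(vals) / len(vals) if vals else float("nan")
    return averages, per_task_max

# Illustrative data only; model names and scores are placeholders.
results = {
    "ArguAna-Fa.v2": {"model-a": 0.42, "model-b": 0.41},
    "WebFAQRetrieval": {"model-a": 0.71, "model-b": 0.74},
}
averages, per_task_max = summarize(results)
```

Note that because models are run on different task subsets, the "Average" rows are not directly comparable across tables with different task coverage.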
Results for BAAI/bge-m3
| task_name | BAAI/bge-m3 | intfloat/multilingual-e5-large | Max result |
|---|---|---|---|
| ArguAna-Fa.v2 | 0.5403 | 0.4127 | |
| DeepSentiPers.v2 | 0.6678 | 0.5769 | |
| FEVER-FaHardNegatives | 0.6421 | 0.4615 | |
| FiQA2018-Fa.v2 | 0.3023 | 0.2946 | |
| HotpotQA-FaHardNegatives | 0.5762 | 0.6153 | |
| MSMARCO-FaHardNegatives | 0.6847 | 0.6871 | |
| NLPTwitterAnalysisClassification.v2 | 0.7726 | 0.7659 | |
| NQ-FaHardNegatives | 0.5021 | 0.4983 | |
| PerShopDomainClassification | 0.6646 | 0.5517 | |
| PerShopIntentClassification | 0.8988 | 0.9069 | |
| PersianTextEmotion.v2 | 0.5981 | 0.6091 | |
| QuoraRetrieval-Fa.v2 | 0.8031 | 0.7788 | |
| SCIDOCS-Fa.v2 | 0.1499 | 0.1222 | |
| SIDClassification.v2 | 0.5962 | 0.6137 | |
| SciFact-Fa.v2 | 0.5858 | 0.6037 | |
| StyleClassification | 0.5586 | 0.6492 | |
| SynPerTextToneClassification.v3 | 0.7263 | 0.8412 | |
| TRECCOVID-Fa.v2 | 0.7338 | 0.7177 | |
| Touche2020-Fa.v2 | 0.4853 | 0.4978 | |
| WebFAQRetrieval | 0.7726 | 0.7459 | 0.7813 |
| Average | 0.6131 | 0.5975 | 0.7813 |
Results for HooshvareLab/bert-base-parsbert-uncased
| task_name | HooshvareLab/bert-base-parsbert-uncased | google/gemini-embedding-001 | intfloat/multilingual-e5-large | Max result |
|---|---|---|---|---|
| ArguAna-Fa.v2 | 0.1969 | nan | 0.4127 | |
| DeepSentiPers.v2 | 0.4971 | nan | 0.5769 | |
| FEVER-FaHardNegatives | 0.0170 | nan | 0.4615 | |
| FiQA2018-Fa.v2 | 0.0153 | nan | 0.2946 | |
| HotpotQA-FaHardNegatives | 0.0630 | nan | 0.6153 | |
| MIRACLRetrievalHardNegatives | 0.0775 | 0.6163 | 0.5923 | 0.6257 |
| MSMARCO-FaHardNegatives | 0.2492 | nan | 0.6871 | |
| NLPTwitterAnalysisClassification.v2 | 0.7153 | nan | 0.7659 | |
| NQ-FaHardNegatives | 0.0456 | nan | 0.4983 | |
| NeuCLIR2023RetrievalHardNegatives | 0.1171 | nan | 0.5059 | 0.5950 |
| PerShopDomainClassification | 0.6962 | nan | 0.5517 | |
| PerShopIntentClassification | 0.9168 | nan | 0.9069 | |
| PersianTextEmotion.v2 | 0.4763 | nan | 0.6091 | |
| QuoraRetrieval-Fa.v2 | 0.4848 | nan | 0.7788 | |
| SCIDOCS-Fa.v2 | 0.0161 | nan | 0.1222 | |
| SIDClassification.v2 | 0.5571 | nan | 0.6137 | |
| SciFact-Fa.v2 | 0.0834 | nan | 0.6037 | |
| StyleClassification | 0.9586 | nan | 0.6492 | |
| SynPerTextToneClassification.v3 | 0.9745 | nan | 0.8412 | |
| TRECCOVID-Fa.v2 | 0.0867 | nan | 0.7177 | |
| Touche2020-Fa.v2 | 0.0443 | nan | 0.4978 | |
| WebFAQRetrieval | 0.1822 | nan | 0.7459 | 0.7813 |
| Average | 0.3396 | 0.6163 | 0.5931 | 0.6673 |
Results for MCINext/Hakim-small
| task_name | MCINext/Hakim-small | google/gemini-embedding-001 | intfloat/multilingual-e5-large | Max result |
|---|---|---|---|---|
| ArguAna-Fa.v2 | 0.4268 | nan | 0.4127 | |
| DeepSentiPers.v2 | 0.6527 | nan | 0.5769 | |
| FEVER-FaHardNegatives | 0.5176 | nan | 0.4615 | |
| FiQA2018-Fa.v2 | 0.1912 | nan | 0.2946 | |
| HotpotQA-FaHardNegatives | 0.4450 | nan | 0.6153 | |
| MIRACLRetrievalHardNegatives | 0.4488 | 0.6163 | 0.5923 | 0.6257 |
| MSMARCO-FaHardNegatives | 0.6399 | nan | 0.6871 | |
| NLPTwitterAnalysisClassification.v2 | 0.7819 | nan | 0.7659 | |
| NQ-FaHardNegatives | 0.3085 | nan | 0.4983 | |
| NeuCLIR2023RetrievalHardNegatives | 0.4937 | nan | 0.5059 | 0.5950 |
| PerShopDomainClassification | 0.6896 | nan | 0.5517 | |
| PerShopIntentClassification | 0.8633 | nan | 0.9069 | |
| PersianTextEmotion.v2 | 0.7719 | nan | 0.6091 | |
| QuoraRetrieval-Fa.v2 | 0.7197 | nan | 0.7788 | |
| SCIDOCS-Fa.v2 | 0.0981 | nan | 0.1222 | |
| SIDClassification.v2 | 0.6463 | nan | 0.6137 | |
| SciFact-Fa.v2 | 0.4977 | nan | 0.6037 | |
| StyleClassification | 0.7969 | nan | 0.6492 | |
| SynPerTextToneClassification.v3 | 0.9184 | nan | 0.8412 | |
| TRECCOVID-Fa.v2 | 0.4367 | nan | 0.7177 | |
| Touche2020-Fa.v2 | 0.3738 | nan | 0.4978 | |
| WebFAQRetrieval | 0.6934 | nan | 0.7459 | 0.7813 |
| Average | 0.5642 | 0.6163 | 0.5931 | 0.6673 |
Results for MCINext/Hakim-unsup
| task_name | MCINext/Hakim-unsup | google/gemini-embedding-001 | intfloat/multilingual-e5-large | Max result |
|---|---|---|---|---|
| ArguAna-Fa.v2 | 0.4020 | nan | 0.4127 | |
| DeepSentiPers.v2 | 0.6494 | nan | 0.5769 | |
| FEVER-FaHardNegatives | 0.3961 | nan | 0.4615 | |
| FiQA2018-Fa.v2 | 0.1705 | nan | 0.2946 | |
| HotpotQA-FaHardNegatives | 0.4381 | nan | 0.6153 | |
| MIRACLRetrievalHardNegatives | 0.5143 | 0.6163 | 0.5923 | 0.6257 |
| MSMARCO-FaHardNegatives | 0.6082 | nan | 0.6871 | |
| NLPTwitterAnalysisClassification.v2 | 0.7669 | nan | 0.7659 | |
| NQ-FaHardNegatives | 0.3514 | nan | 0.4983 | |
| NeuCLIR2023RetrievalHardNegatives | 0.5333 | nan | 0.5059 | 0.5950 |
| PerShopDomainClassification | 0.7181 | nan | 0.5517 | |
| PerShopIntentClassification | 0.8907 | nan | 0.9069 | |
| PersianTextEmotion.v2 | 0.6460 | nan | 0.6091 | |
| QuoraRetrieval-Fa.v2 | 0.7592 | nan | 0.7788 | |
| SCIDOCS-Fa.v2 | 0.1261 | nan | 0.1222 | |
| SIDClassification.v2 | 0.6022 | nan | 0.6137 | |
| SciFact-Fa.v2 | 0.4874 | nan | 0.6037 | |
| StyleClassification | 0.7484 | nan | 0.6492 | |
| SynPerTextToneClassification.v3 | 0.8072 | nan | 0.8412 | |
| TRECCOVID-Fa.v2 | 0.5779 | nan | 0.7177 | |
| WebFAQRetrieval | 0.6611 | nan | 0.7459 | 0.7813 |
| Average | 0.5645 | 0.6163 | 0.5976 | 0.6673 |
Results for MCINext/Hakim
| task_name | MCINext/Hakim | google/gemini-embedding-001 | intfloat/multilingual-e5-large | Max result |
|---|---|---|---|---|
| ArguAna-Fa.v2 | 0.4613 | nan | 0.4127 | |
| DeepSentiPers.v2 | 0.7227 | nan | 0.5769 | |
| FEVER-FaHardNegatives | 0.5014 | nan | 0.4615 | |
| FiQA2018-Fa.v2 | 0.2446 | nan | 0.2946 | |
| HotpotQA-FaHardNegatives | 0.4799 | nan | 0.6153 | |
| MIRACLRetrievalHardNegatives | 0.4725 | 0.6163 | 0.5923 | 0.6257 |
| MSMARCO-FaHardNegatives | 0.6472 | nan | 0.6871 | |
| NLPTwitterAnalysisClassification.v2 | 0.8001 | nan | 0.7659 | |
| NQ-FaHardNegatives | 0.3475 | nan | 0.4983 | |
| NeuCLIR2023RetrievalHardNegatives | 0.4933 | nan | 0.5059 | 0.5950 |
| PerShopDomainClassification | 0.6669 | nan | 0.5517 | |
| PerShopIntentClassification | 0.8792 | nan | 0.9069 | |
| PersianTextEmotion.v2 | 0.8645 | nan | 0.6091 | |
| QuoraRetrieval-Fa.v2 | 0.7457 | nan | 0.7788 | |
| SCIDOCS-Fa.v2 | 0.1050 | nan | 0.1222 | |
| SIDClassification.v2 | 0.6845 | nan | 0.6137 | |
| SciFact-Fa.v2 | 0.5379 | nan | 0.6037 | |
| StyleClassification | 0.6896 | nan | 0.6492 | |
| SynPerTextToneClassification.v3 | 0.9388 | nan | 0.8412 | |
| TRECCOVID-Fa.v2 | 0.5485 | nan | 0.7177 | |
| Touche2020-Fa.v2 | 0.3975 | nan | 0.4978 | |
| WebFAQRetrieval | 0.7388 | nan | 0.7459 | 0.7813 |
| Average | 0.5894 | 0.6163 | 0.5931 | 0.6673 |
Results for PartAI/Tooka-SBERT-V2-Large
| task_name | PartAI/Tooka-SBERT-V2-Large | google/gemini-embedding-001 | intfloat/multilingual-e5-large | Max result |
|---|---|---|---|---|
| ArguAna-Fa.v2 | 0.4369 | nan | 0.4127 | |
| DeepSentiPers.v2 | 0.6564 | nan | 0.5769 | |
| FEVER-FaHardNegatives | 0.2492 | nan | 0.4615 | |
| FiQA2018-Fa.v2 | 0.1921 | nan | 0.2946 | |
| HotpotQA-FaHardNegatives | 0.3533 | nan | 0.6153 | |
| MIRACLRetrievalHardNegatives | 0.4854 | 0.6163 | 0.5923 | 0.6257 |
| MSMARCO-FaHardNegatives | 0.5982 | nan | 0.6871 | |
| NLPTwitterAnalysisClassification.v2 | 0.7780 | nan | 0.7659 | |
| NQ-FaHardNegatives | 0.3190 | nan | 0.4983 | |
| NeuCLIR2023RetrievalHardNegatives | 0.5561 | nan | 0.5059 | 0.5950 |
| PerShopDomainClassification | 0.7607 | nan | 0.5517 | |
| PerShopIntentClassification | 0.8914 | nan | 0.9069 | |
| PersianTextEmotion.v2 | 0.5685 | nan | 0.6091 | |
| QuoraRetrieval-Fa.v2 | 0.7743 | nan | 0.7788 | |
| SCIDOCS-Fa.v2 | 0.1160 | nan | 0.1222 | |
| SIDClassification.v2 | 0.5535 | nan | 0.6137 | |
| SciFact-Fa.v2 | 0.4103 | nan | 0.6037 | |
| StyleClassification | 0.8471 | nan | 0.6492 | |
| SynPerTextToneClassification.v3 | 0.9294 | nan | 0.8412 | |
| TRECCOVID-Fa.v2 | 0.6676 | nan | 0.7177 | |
| Touche2020-Fa.v2 | 0.4374 | nan | 0.4978 | |
| WebFAQRetrieval | 0.6641 | nan | 0.7459 | 0.7813 |
| Average | 0.5566 | 0.6163 | 0.5931 | 0.6673 |
Results for PartAI/Tooka-SBERT-V2-Small
| task_name | PartAI/Tooka-SBERT-V2-Small | google/gemini-embedding-001 | intfloat/multilingual-e5-large | Max result |
|---|---|---|---|---|
| ArguAna-Fa.v2 | 0.4564 | nan | 0.4127 | |
| DeepSentiPers.v2 | 0.6087 | nan | 0.5769 | |
| FEVER-FaHardNegatives | 0.3968 | nan | 0.4615 | |
| FiQA2018-Fa.v2 | 0.1687 | nan | 0.2946 | |
| HotpotQA-FaHardNegatives | 0.3601 | nan | 0.6153 | |
| MIRACLRetrievalHardNegatives | 0.5306 | 0.6163 | 0.5923 | 0.6257 |
| MSMARCO-FaHardNegatives | 0.5802 | nan | 0.6871 | |
| NLPTwitterAnalysisClassification.v2 | 0.7717 | nan | 0.7659 | |
| NQ-FaHardNegatives | 0.3377 | nan | 0.4983 | |
| NeuCLIR2023RetrievalHardNegatives | 0.5524 | nan | 0.5059 | 0.5950 |
| PerShopDomainClassification | 0.7465 | nan | 0.5517 | |
| PerShopIntentClassification | 0.8816 | nan | 0.9069 | |
| PersianTextEmotion.v2 | 0.5251 | nan | 0.6091 | |
| QuoraRetrieval-Fa.v2 | 0.7474 | nan | 0.7788 | |
| SCIDOCS-Fa.v2 | 0.1166 | nan | 0.1222 | |
| SIDClassification.v2 | 0.5459 | nan | 0.6137 | |
| SciFact-Fa.v2 | 0.4138 | nan | 0.6037 | |
| StyleClassification | 0.8768 | nan | 0.6492 | |
| SynPerTextToneClassification.v3 | 0.8429 | nan | 0.8412 | |
| TRECCOVID-Fa.v2 | 0.6560 | nan | 0.7177 | |
| Touche2020-Fa.v2 | 0.4269 | nan | 0.4978 | |
| WebFAQRetrieval | 0.6332 | nan | 0.7459 | 0.7813 |
| Average | 0.5534 | 0.6163 | 0.5931 | 0.6673 |
Results for PartAI/Tooka-SBERT
| task_name | PartAI/Tooka-SBERT | google/gemini-embedding-001 | intfloat/multilingual-e5-large | Max result |
|---|---|---|---|---|
| ArguAna-Fa.v2 | 0.3253 | nan | 0.4127 | |
| DeepSentiPers.v2 | 0.6345 | nan | 0.5769 | |
| FEVER-FaHardNegatives | 0.1515 | nan | 0.4615 | |
| FiQA2018-Fa.v2 | 0.1267 | nan | 0.2946 | |
| HotpotQA-FaHardNegatives | 0.2374 | nan | 0.6153 | |
| MIRACLRetrievalHardNegatives | 0.2643 | 0.6163 | 0.5923 | 0.6257 |
| MSMARCO-FaHardNegatives | 0.4732 | nan | 0.6871 | |
| NLPTwitterAnalysisClassification.v2 | 0.7563 | nan | 0.7659 | |
| NQ-FaHardNegatives | 0.1804 | nan | 0.4983 | |
| NeuCLIR2023RetrievalHardNegatives | 0.4927 | nan | 0.5059 | 0.5950 |
| PerShopDomainClassification | 0.7372 | nan | 0.5517 | |
| PerShopIntentClassification | 0.8810 | nan | 0.9069 | |
| PersianTextEmotion.v2 | 0.5682 | nan | 0.6091 | |
| QuoraRetrieval-Fa.v2 | 0.7588 | nan | 0.7788 | |
| SCIDOCS-Fa.v2 | 0.0973 | nan | 0.1222 | |
| SIDClassification.v2 | 0.5325 | nan | 0.6137 | |
| SciFact-Fa.v2 | 0.3798 | nan | 0.6037 | |
| StyleClassification | 0.7591 | nan | 0.6492 | |
| SynPerTextToneClassification.v3 | 0.7462 | nan | 0.8412 | |
| TRECCOVID-Fa.v2 | 0.5796 | nan | 0.7177 | |
| Touche2020-Fa.v2 | 0.3149 | nan | 0.4978 | |
| WebFAQRetrieval | 0.5454 | nan | 0.7459 | 0.7813 |
| Average | 0.4792 | 0.6163 | 0.5931 | 0.6673 |
Results for PartAI/TookaBERT-Base
| task_name | PartAI/TookaBERT-Base | google/gemini-embedding-001 | intfloat/multilingual-e5-large | Max result |
|---|---|---|---|---|
| ArguAna-Fa.v2 | 0.2671 | nan | 0.4127 | |
| DeepSentiPers.v2 | 0.5406 | nan | 0.5769 | |
| FEVER-FaHardNegatives | 0.0081 | nan | 0.4615 | |
| FiQA2018-Fa.v2 | 0.0229 | nan | 0.2946 | |
| HotpotQA-FaHardNegatives | 0.0494 | nan | 0.6153 | |
| MIRACLRetrievalHardNegatives | 0.0521 | 0.6163 | 0.5923 | 0.6257 |
| MSMARCO-FaHardNegatives | 0.2741 | nan | 0.6871 | |
| NLPTwitterAnalysisClassification.v2 | 0.7149 | nan | 0.7659 | |
| NQ-FaHardNegatives | 0.0343 | nan | 0.4983 | |
| NeuCLIR2023RetrievalHardNegatives | 0.2021 | nan | 0.5059 | 0.5950 |
| PerShopDomainClassification | 0.6516 | nan | 0.5517 | |
| PerShopIntentClassification | 0.9021 | nan | 0.9069 | |
| PersianTextEmotion.v2 | 0.5288 | nan | 0.6091 | |
| QuoraRetrieval-Fa.v2 | 0.5060 | nan | 0.7788 | |
| SCIDOCS-Fa.v2 | 0.0330 | nan | 0.1222 | |
| SIDClassification.v2 | 0.5876 | nan | 0.6137 | |
| SciFact-Fa.v2 | 0.1684 | nan | 0.6037 | |
| StyleClassification | 0.9641 | nan | 0.6492 | |
| SynPerTextToneClassification.v3 | 0.9851 | nan | 0.8412 | |
| TRECCOVID-Fa.v2 | 0.1120 | nan | 0.7177 | |
| Touche2020-Fa.v2 | 0.0330 | nan | 0.4978 | |
| WebFAQRetrieval | 0.2016 | nan | 0.7459 | 0.7813 |
| Average | 0.3563 | 0.6163 | 0.5931 | 0.6673 |
Results for google/embeddinggemma-300m
| task_name | google/embeddinggemma-300m | intfloat/multilingual-e5-large | Max result |
|---|---|---|---|
| ArguAna-Fa.v2 | 0.5849 | 0.4127 | |
| BeytooteClustering | 0.6510 | 0.6150 | 0.6252 |
| DeepSentiPers.v2 | 0.6027 | 0.5769 | |
| DigikalamagClassification | 0.8779 | 0.8705 | 0.8631 |
| DigikalamagClustering | 0.5143 | 0.3989 | 0.4748 |
| FEVER-FaHardNegatives | 0.6769 | 0.4615 | |
| FarsTail | 0.7703 | 0.7255 | 0.7478 |
| FarsiParaphraseDetection | 0.9623 | 0.9757 | 0.9706 |
| Farsick | 0.7108 | 0.7067 | 0.7095 |
| FiQA2018-Fa.v2 | 0.2880 | 0.2946 | |
| HamshahriClustring | 0.7207 | 0.6742 | 0.6983 |
| HotpotQA-FaHardNegatives | 0.5481 | 0.6153 | |
| MIRACLReranking | 0.5250 | 0.5936 | 0.6026 |
| MSMARCO-FaHardNegatives | 0.6236 | 0.6871 | |
| NLPTwitterAnalysisClassification.v2 | 0.7791 | 0.7659 | |
| NLPTwitterAnalysisClustering | 0.8112 | 0.7848 | 0.8082 |
| NQ-FaHardNegatives | 0.4418 | 0.4983 | |
| NeuCLIR2023RetrievalHardNegatives | 0.6225 | 0.5059 | 0.5950 |
| ParsinluEntail | 0.7103 | 0.6546 | 0.6655 |
| ParsinluQueryParaphPC | 0.8841 | 0.8783 | 0.8709 |
| PerShopDomainClassification | 0.6273 | 0.5517 | |
| PerShopIntentClassification | 0.9061 | 0.9069 | |
| PersianFoodSentimentClassification | 0.8254 | 0.8212 | 0.8105 |
| PersianTextEmotion.v2 | 0.5655 | 0.6091 | |
| PersianWebDocumentRetrieval | 0.5282 | 0.4676 | 0.5067 |
| QuoraRetrieval-Fa.v2 | 0.7430 | 0.7788 | |
| SAMSumFa | 0.9812 | 0.9242 | 0.9247 |
| SCIDOCS-Fa.v2 | 0.1445 | 0.1222 | |
| SIDClassification.v2 | 0.6327 | 0.6137 | |
| SIDClustring | 0.4858 | 0.3865 | 0.4102 |
| SciFact-Fa.v2 | 0.6523 | 0.6037 | |
| StyleClassification | 0.6258 | 0.6492 | |
| SynPerChatbotConvSAAnger | 0.8237 | 0.7193 | 0.8661 |
| SynPerChatbotConvSAClassification | 0.6979 | 0.6077 | 0.7472 |
| SynPerChatbotConvSAFear | 0.8376 | 0.7419 | 0.7769 |
| SynPerChatbotConvSAFriendship | 0.5420 | 0.5283 | 0.6268 |
| SynPerChatbotConvSAHappiness | 0.6263 | 0.5246 | 0.7398 |
| SynPerChatbotConvSAJealousy | 0.7517 | 0.7034 | 0.7621 |
| SynPerChatbotConvSALove | 0.5286 | 0.4629 | 0.6086 |
| SynPerChatbotConvSASadness | 0.7598 | 0.6441 | 0.8167 |
| SynPerChatbotConvSASatisfaction | 0.7879 | 0.6058 | 0.8079 |
| SynPerChatbotConvSASurprise | 0.6240 | 0.5388 | 0.7198 |
| SynPerChatbotConvSAToneChatbotClassification | 0.6458 | 0.5807 | 0.8198 |
| SynPerChatbotConvSAToneUserClassification | 0.5450 | 0.5260 | 0.6197 |
| SynPerChatbotRAGFAQPC | 0.6659 | 0.6303 | 0.6677 |
| SynPerChatbotRAGFAQRetrieval | 0.3401 | 0.2348 | 0.4405 |
| SynPerChatbotRAGSumSRetrieval | 0.6593 | 0.4981 | 0.6037 |
| SynPerChatbotSatisfactionLevelClassification | 0.3197 | 0.2523 | 0.3343 |
| SynPerChatbotSumSRetrieval | 0.4457 | 0.2760 | 0.3678 |
| SynPerQAPC | 0.9355 | 0.9516 | 0.9320 |
| SynPerQARetrieval | 0.8628 | 0.8735 | 0.8681 |
| SynPerSTS | 0.8707 | 0.8798 | 0.8691 |
| SynPerTextKeywordsPC | 0.9650 | 0.9479 | 0.9640 |
| SynPerTextToneClassification.v3 | 0.7377 | 0.8412 | |
| TRECCOVID-Fa.v2 | 0.7341 | 0.7177 | |
| WebFAQRetrieval | 0.7806 | 0.7459 | 0.7813 |
| Average | 0.6698 | 0.6279 | 0.7112 |
The model achieves high performance on these tasks: BeytooteClustering, DigikalamagClassification, DigikalamagClustering, FarsTail, Farsick, HamshahriClustring, NLPTwitterAnalysisClustering, NeuCLIR2023RetrievalHardNegatives, ParsinluEntail, ParsinluQueryParaphPC, PersianFoodSentimentClassification, PersianWebDocumentRetrieval, SAMSumFa, SIDClustring, SynPerChatbotConvSAFear, SynPerChatbotRAGSumSRetrieval, SynPerChatbotSumSRetrieval, SynPerQAPC, SynPerSTS, SynPerTextKeywordsPC
Results for intfloat/e5-mistral-7b-instruct
| task_name | intfloat/e5-mistral-7b-instruct | intfloat/multilingual-e5-large | Max result |
|---|---|---|---|
| ArguAna-Fa.v2 | 0.4652 | 0.4127 | |
| DeepSentiPers.v2 | 0.5474 | 0.5769 | |
| FEVER-FaHardNegatives | 0.6563 | 0.4615 | |
| FiQA2018-Fa.v2 | 0.2293 | 0.2946 | |
| HotpotQA-FaHardNegatives | 0.4643 | 0.6153 | |
| MSMARCO-FaHardNegatives | 0.6256 | 0.6871 | |
| NLPTwitterAnalysisClassification.v2 | 0.7582 | 0.7659 | |
| NQ-FaHardNegatives | 0.3873 | 0.4983 | |
| PerShopDomainClassification | 0.2709 | 0.5517 | |
| PerShopIntentClassification | 0.8724 | 0.9069 | |
| PersianTextEmotion.v2 | 0.3933 | 0.6091 | |
| QuoraRetrieval-Fa.v2 | 0.7783 | 0.7788 | |
| SCIDOCS-Fa.v2 | 0.1404 | 0.1222 | |
| SIDClassification.v2 | 0.5819 | 0.6137 | |
| SciFact-Fa.v2 | 0.5552 | 0.6037 | |
| StyleClassification | 0.5604 | 0.6492 | |
| SynPerTextToneClassification.v3 | 0.6149 | 0.8412 | |
| TRECCOVID-Fa.v2 | 0.6498 | 0.7177 | |
| Touche2020-Fa.v2 | 0.4352 | 0.4978 | |
| WebFAQRetrieval | 0.7224 | 0.7459 | 0.7813 |
| Average | 0.5354 | 0.5975 | 0.7813 |
Results for intfloat/multilingual-e5-base
| task_name | intfloat/multilingual-e5-base | intfloat/multilingual-e5-large | Max result |
|---|---|---|---|
| ArguAna-Fa.v2 | 0.3384 | 0.4127 | |
| DeepSentiPers.v2 | 0.5897 | 0.5769 | |
| FEVER-FaHardNegatives | 0.5419 | 0.4615 | |
| FiQA2018-Fa.v2 | 0.2298 | 0.2946 | |
| HotpotQA-FaHardNegatives | 0.5657 | 0.6153 | |
| MSMARCO-FaHardNegatives | 0.6667 | 0.6871 | |
| NLPTwitterAnalysisClassification.v2 | 0.7492 | 0.7659 | |
| NQ-FaHardNegatives | 0.4497 | 0.4983 | |
| PerShopDomainClassification | 0.5054 | 0.5517 | |
| PerShopIntentClassification | 0.9003 | 0.9069 | |
| PersianTextEmotion.v2 | 0.5281 | 0.6091 | |
| QuoraRetrieval-Fa.v2 | 0.7468 | 0.7788 | |
| SCIDOCS-Fa.v2 | 0.1182 | 0.1222 | |
| SIDClassification.v2 | 0.6073 | 0.6137 | |
| SciFact-Fa.v2 | 0.5818 | 0.6037 | |
| StyleClassification | 0.6172 | 0.6492 | |
| SynPerTextToneClassification.v3 | 0.8018 | 0.8412 | |
| TRECCOVID-Fa.v2 | 0.6291 | 0.7177 | |
| Touche2020-Fa.v2 | 0.4594 | 0.4978 | |
| Average | 0.5593 | 0.5897 | |
Results for intfloat/multilingual-e5-large
| task_name | google/gemini-embedding-001 | intfloat/multilingual-e5-large | Max result |
|---|---|---|---|
| ArguAna-Fa.v2 | nan | 0.4127 | |
| DeepSentiPers.v2 | nan | 0.5769 | |
| FEVER-FaHardNegatives | nan | 0.4615 | |
| FiQA2018-Fa.v2 | nan | 0.2946 | |
| HotpotQA-FaHardNegatives | nan | 0.6153 | |
| MIRACLRetrievalHardNegatives | 0.6163 | 0.5923 | 0.6257 |
| MSMARCO-FaHardNegatives | nan | 0.6871 | |
| NLPTwitterAnalysisClassification.v2 | nan | 0.7659 | |
| NQ-FaHardNegatives | nan | 0.4983 | |
| NeuCLIR2023RetrievalHardNegatives | nan | 0.5059 | 0.5950 |
| PerShopDomainClassification | nan | 0.5517 | |
| PerShopIntentClassification | nan | 0.9069 | |
| PersianTextEmotion.v2 | nan | 0.6091 | |
| QuoraRetrieval-Fa.v2 | nan | 0.7788 | |
| SCIDOCS-Fa.v2 | nan | 0.1222 | |
| SIDClassification.v2 | nan | 0.6137 | |
| SciFact-Fa.v2 | nan | 0.6037 | |
| StyleClassification | nan | 0.6492 | |
| SynPerTextToneClassification.v3 | nan | 0.8412 | |
| TRECCOVID-Fa.v2 | nan | 0.7177 | |
| Touche2020-Fa.v2 | nan | 0.4978 | |
| Average | 0.6163 | 0.5858 | 0.6103 |
Results for jinaai/jina-embeddings-v3
| task_name | intfloat/multilingual-e5-large | jinaai/jina-embeddings-v3 | Max result |
|---|---|---|---|
| ArguAna-Fa.v2 | 0.4127 | 0.3852 | |
| DeepSentiPers.v2 | 0.5769 | 0.6407 | |
| FEVER-FaHardNegatives | 0.4615 | 0.7050 | |
| FiQA2018-Fa.v2 | 0.2946 | 0.3551 | |
| HotpotQA-FaHardNegatives | 0.6153 | 0.5401 | |
| MSMARCO-FaHardNegatives | 0.6871 | 0.6740 | |
| NLPTwitterAnalysisClassification.v2 | 0.7659 | 0.7677 | |
| NQ-FaHardNegatives | 0.4983 | 0.5305 | |
| NeuCLIR2023RetrievalHardNegatives | 0.5059 | 0.5896 | 0.5950 |
| PerShopDomainClassification | 0.5517 | 0.5644 | |
| PerShopIntentClassification | 0.9069 | 0.8583 | |
| PersianTextEmotion.v2 | 0.6091 | 0.5217 | |
| QuoraRetrieval-Fa.v2 | 0.7788 | 0.5715 | |
| SCIDOCS-Fa.v2 | 0.1222 | 0.1426 | |
| SIDClassification.v2 | 0.6137 | 0.6165 | |
| SciFact-Fa.v2 | 0.6037 | 0.6123 | |
| StyleClassification | 0.6492 | 0.6133 | |
| SynPerTextToneClassification.v3 | 0.8412 | 0.7064 | |
| TRECCOVID-Fa.v2 | 0.7177 | 0.6858 | |
| Touche2020-Fa.v2 | 0.4978 | 0.4901 | |
| Average | 0.5855 | 0.5785 | 0.5950 |
Results for m3hrdadfi/bert-zwnj-wnli-mean-tokens
| task_name | google/gemini-embedding-001 | intfloat/multilingual-e5-large | m3hrdadfi/bert-zwnj-wnli-mean-tokens | Max result |
|---|---|---|---|---|
| ArguAna-Fa.v2 | nan | 0.4127 | 0.2087 | |
| DeepSentiPers.v2 | nan | 0.5769 | 0.4511 | |
| FEVER-FaHardNegatives | nan | 0.4615 | 0.0355 | |
| FiQA2018-Fa.v2 | nan | 0.2946 | 0.0207 | |
| HotpotQA-FaHardNegatives | nan | 0.6153 | 0.0232 | |
| MIRACLRetrievalHardNegatives | 0.6163 | 0.5923 | 0.0797 | 0.6257 |
| MSMARCO-FaHardNegatives | nan | 0.6871 | 0.2251 | |
| NLPTwitterAnalysisClassification.v2 | nan | 0.7659 | 0.7129 | |
| NQ-FaHardNegatives | nan | 0.4983 | 0.0374 | |
| NeuCLIR2023RetrievalHardNegatives | nan | 0.5059 | 0.1744 | 0.5950 |
| PerShopDomainClassification | nan | 0.5517 | 0.6182 | |
| PerShopIntentClassification | nan | 0.9069 | 0.8858 | |
| PersianTextEmotion.v2 | nan | 0.6091 | 0.3833 | |
| QuoraRetrieval-Fa.v2 | nan | 0.7788 | 0.4677 | |
| SCIDOCS-Fa.v2 | nan | 0.1222 | 0.0181 | |
| SIDClassification.v2 | nan | 0.6137 | 0.4827 | |
| SciFact-Fa.v2 | nan | 0.6037 | 0.0742 | |
| StyleClassification | nan | 0.6492 | 0.8198 | |
| SynPerTextToneClassification.v3 | nan | 0.8412 | 0.8910 | |
| TRECCOVID-Fa.v2 | nan | 0.7177 | 0.1643 | |
| Touche2020-Fa.v2 | nan | 0.4978 | 0.1279 | |
| WebFAQRetrieval | nan | 0.7459 | 0.1780 | 0.7813 |
| Average | 0.6163 | 0.5931 | 0.3218 | 0.6673 |
Results for m3hrdadfi/roberta-zwnj-wnli-mean-tokens
| task_name | google/gemini-embedding-001 | intfloat/multilingual-e5-large | m3hrdadfi/roberta-zwnj-wnli-mean-tokens | Max result |
|---|---|---|---|---|
| ArguAna-Fa.v2 | nan | 0.4127 | 0.2202 | |
| DeepSentiPers.v2 | nan | 0.5769 | 0.4354 | |
| FEVER-FaHardNegatives | nan | 0.4615 | 0.0262 | |
| FiQA2018-Fa.v2 | nan | 0.2946 | 0.0321 | |
| HotpotQA-FaHardNegatives | nan | 0.6153 | 0.0266 | |
| MIRACLRetrievalHardNegatives | 0.6163 | 0.5923 | 0.0725 | 0.6257 |
| MSMARCO-FaHardNegatives | nan | 0.6871 | 0.3186 | |
| NLPTwitterAnalysisClassification.v2 | nan | 0.7659 | 0.7092 | |
| NQ-FaHardNegatives | nan | 0.4983 | 0.0440 | |
| NeuCLIR2023RetrievalHardNegatives | nan | 0.5059 | 0.1857 | 0.5950 |
| PerShopDomainClassification | nan | 0.5517 | 0.5375 | |
| PerShopIntentClassification | nan | 0.9069 | 0.8766 | |
| PersianTextEmotion.v2 | nan | 0.6091 | 0.3756 | |
| QuoraRetrieval-Fa.v2 | nan | 0.7788 | 0.4789 | |
| SCIDOCS-Fa.v2 | nan | 0.1222 | 0.0343 | |
| SIDClassification.v2 | nan | 0.6137 | 0.5041 | |
| SciFact-Fa.v2 | nan | 0.6037 | 0.0784 | |
| StyleClassification | nan | 0.6492 | 0.8182 | |
| SynPerTextToneClassification.v3 | nan | 0.8412 | 0.8930 | |
| TRECCOVID-Fa.v2 | nan | 0.7177 | 0.2010 | |
| Touche2020-Fa.v2 | nan | 0.4978 | 0.1387 | |
| WebFAQRetrieval | nan | 0.7459 | 0.1835 | 0.7813 |
| Average | 0.6163 | 0.5931 | 0.3268 | 0.6673 |
Results for myrkur/sentence-transformer-parsbert-fa
| task_name | google/gemini-embedding-001 | intfloat/multilingual-e5-large | myrkur/sentence-transformer-parsbert-fa | Max result |
|---|---|---|---|---|
| ArguAna-Fa.v2 | nan | 0.4127 | 0.2103 | |
| DeepSentiPers.v2 | nan | 0.5769 | 0.4116 | |
| FEVER-FaHardNegatives | nan | 0.4615 | 0.0265 | |
| FiQA2018-Fa.v2 | nan | 0.2946 | 0.0100 | |
| HotpotQA-FaHardNegatives | nan | 0.6153 | 0.0132 | |
| MIRACLRetrievalHardNegatives | 0.6163 | 0.5923 | 0.0537 | 0.6257 |
| MSMARCO-FaHardNegatives | nan | 0.6871 | 0.2412 | |
| NLPTwitterAnalysisClassification.v2 | nan | 0.7659 | 0.7464 | |
| NQ-FaHardNegatives | nan | 0.4983 | 0.0171 | |
| NeuCLIR2023RetrievalHardNegatives | nan | 0.5059 | 0.2394 | 0.5950 |
| PerShopDomainClassification | nan | 0.5517 | 0.6930 | |
| PerShopIntentClassification | nan | 0.9069 | 0.8582 | |
| PersianTextEmotion.v2 | nan | 0.6091 | 0.3882 | |
| QuoraRetrieval-Fa.v2 | nan | 0.7788 | 0.4700 | |
| SCIDOCS-Fa.v2 | nan | 0.1222 | 0.0206 | |
| SIDClassification.v2 | nan | 0.6137 | 0.5500 | |
| SciFact-Fa.v2 | nan | 0.6037 | 0.0496 | |
| StyleClassification | nan | 0.6492 | 0.7458 | |
| SynPerTextToneClassification.v3 | nan | 0.8412 | 0.7516 | |
| TRECCOVID-Fa.v2 | nan | 0.7177 | 0.1005 | |
| Touche2020-Fa.v2 | nan | 0.4978 | 0.0399 | |
| WebFAQRetrieval | nan | 0.7459 | 0.1252 | 0.7813 |
| Average | 0.6163 | 0.5931 | 0.3074 | 0.6673 |
Results for openai/text-embedding-3-small
| task_name | google/gemini-embedding-001 | intfloat/multilingual-e5-large | openai/text-embedding-3-small | Max result |
|---|---|---|---|---|
| ArguAna-Fa.v2 | nan | 0.4127 | 0.3328 | |
| BeytooteClustering | nan | 0.6150 | 0.6038 | 0.6252 |
| DeepSentiPers.v2 | nan | 0.5769 | 0.5044 | |
| DigikalamagClassification | nan | 0.8705 | 0.8493 | 0.8631 |
| DigikalamagClustering | nan | 0.3989 | 0.4597 | 0.4748 |
| FEVER-FaHardNegatives | nan | 0.4615 | 0.2722 | |
| FarsTail | nan | 0.7255 | 0.6885 | 0.7478 |
| FarsiParaphraseDetection | nan | 0.9757 | 0.9483 | 0.9706 |
| Farsick | nan | 0.7067 | 0.6085 | 0.7095 |
| FiQA2018-Fa.v2 | nan | 0.2946 | 0.1011 | |
| HamshahriClustring | nan | 0.6742 | 0.6633 | 0.6983 |
| HotpotQA-FaHardNegatives | nan | 0.6153 | 0.2813 | |
| MIRACLReranking | nan | 0.5936 | 0.3477 | 0.6026 |
| MIRACLRetrievalHardNegatives | 0.6163 | 0.5923 | 0.2724 | 0.6257 |
| MSMARCO-FaHardNegatives | nan | 0.6871 | 0.4767 | |
| MassiveIntentClassification | 0.8349 | 0.6549 | 0.5217 | 0.8349 |
| MassiveScenarioClassification | 0.8863 | 0.6859 | 0.5667 | 0.8863 |
| NLPTwitterAnalysisClassification.v2 | nan | 0.7659 | 0.7203 | |
| NLPTwitterAnalysisClustering | nan | 0.7848 | 0.7892 | 0.8082 |
| NQ-FaHardNegatives | nan | 0.4983 | 0.1996 | |
| NeuCLIR2023RetrievalHardNegatives | nan | 0.5059 | 0.3452 | 0.5950 |
| ParsinluEntail | nan | 0.6546 | 0.5747 | 0.6655 |
| ParsinluQueryParaphPC | nan | 0.8783 | 0.7701 | 0.8709 |
| PerShopDomainClassification | nan | 0.5517 | 0.4921 | |
| PerShopIntentClassification | nan | 0.9069 | 0.8870 | |
| PersianFoodSentimentClassification | nan | 0.8212 | 0.6615 | 0.8105 |
| PersianTextEmotion.v2 | nan | 0.6091 | 0.4459 | |
| PersianWebDocumentRetrieval | nan | 0.4676 | 0.3508 | 0.5067 |
| QuoraRetrieval-Fa.v2 | nan | 0.7788 | 0.6223 | |
| SAMSumFa | nan | 0.9242 | 0.8461 | 0.9247 |
| SCIDOCS-Fa.v2 | nan | 0.1222 | 0.0840 | |
| SIDClassification.v2 | nan | 0.6137 | 0.5272 | |
| SIDClustring | nan | 0.3865 | 0.3568 | 0.4102 |
| SciFact-Fa.v2 | nan | 0.6037 | 0.4125 | |
| StyleClassification | nan | 0.6492 | 0.6229 | |
| SynPerChatbotConvSAAnger | nan | 0.7193 | 0.8539 | 0.8661 |
| SynPerChatbotConvSAClassification | nan | 0.6077 | 0.7151 | 0.7472 |
| SynPerChatbotConvSAFear | nan | 0.7419 | 0.7641 | 0.7769 |
| SynPerChatbotConvSAFriendship | nan | 0.5283 | 0.5891 | 0.6268 |
| SynPerChatbotConvSAHappiness | nan | 0.5246 | 0.6398 | 0.7398 |
| SynPerChatbotConvSAJealousy | nan | 0.7034 | 0.7276 | 0.7621 |
| SynPerChatbotConvSALove | nan | 0.4629 | 0.5200 | 0.6086 |
| SynPerChatbotConvSASadness | nan | 0.6441 | 0.7922 | 0.8167 |
| SynPerChatbotConvSASatisfaction | nan | 0.6058 | 0.8664 | 0.8079 |
| SynPerChatbotConvSASurprise | nan | 0.5388 | 0.6826 | 0.7198 |
| SynPerChatbotConvSAToneChatbotClassification | nan | 0.5807 | 0.6666 | 0.8198 |
| SynPerChatbotConvSAToneUserClassification | nan | 0.5260 | 0.6037 | 0.6197 |
| SynPerChatbotRAGFAQPC | nan | 0.6303 | 0.6896 | 0.6677 |
| SynPerChatbotRAGFAQRetrieval | nan | 0.2348 | 0.2826 | 0.4405 |
| SynPerChatbotRAGSumSRetrieval | nan | 0.4981 | 0.4781 | 0.6037 |
| SynPerChatbotSatisfactionLevelClassification | nan | 0.2523 | 0.3664 | 0.3343 |
| SynPerChatbotSumSRetrieval | nan | 0.2760 | 0.2216 | 0.3678 |
| SynPerQAPC | nan | 0.9516 | 0.9056 | 0.9320 |
| SynPerQARetrieval | nan | 0.8735 | 0.6358 | 0.8681 |
| SynPerSTS | nan | 0.8798 | 0.7733 | 0.8691 |
| SynPerTextKeywordsPC | nan | 0.9479 | 0.9146 | 0.9640 |
| SynPerTextToneClassification.v3 | nan | 0.8412 | 0.7289 | |
| TRECCOVID-Fa.v2 | nan | 0.7177 | 0.2937 | |
| WebFAQRetrieval | nan | 0.7459 | 0.5044 | 0.7813 |
| WikipediaRerankingMultilingual | 0.9120 | 0.8932 | 0.8094 | 0.9120 |
| WikipediaRetrievalMultilingual | 0.9357 | 0.9040 | 0.7532 | 0.9357 |
| Average | 0.8370 | 0.6376 | 0.5735 | 0.7260 |
The model achieves high performance on these tasks: SynPerChatbotConvSASatisfaction, SynPerChatbotRAGFAQPC, SynPerChatbotSatisfactionLevelClassification
Results for sbunlp/fabert
| task_name | google/gemini-embedding-001 | intfloat/multilingual-e5-large | sbunlp/fabert | Max result |
|---|---|---|---|---|
| ArguAna-Fa.v2 | nan | 0.4127 | 0.1913 | |
| DeepSentiPers.v2 | nan | 0.5769 | 0.4175 | |
| FEVER-FaHardNegatives | nan | 0.4615 | 0.0463 | |
| FiQA2018-Fa.v2 | nan | 0.2946 | 0.0367 | |
| HotpotQA-FaHardNegatives | nan | 0.6153 | 0.0926 | |
| MIRACLRetrievalHardNegatives | 0.6163 | 0.5923 | 0.1285 | 0.6257 |
| MSMARCO-FaHardNegatives | nan | 0.6871 | 0.2718 | |
| NLPTwitterAnalysisClassification.v2 | nan | 0.7659 | 0.7106 | |
| NQ-FaHardNegatives | nan | 0.4983 | 0.0783 | |
| NeuCLIR2023RetrievalHardNegatives | nan | 0.5059 | 0.3032 | 0.5950 |
| PerShopDomainClassification | nan | 0.5517 | 0.5465 | |
| PerShopIntentClassification | nan | 0.9069 | 0.8962 | |
| PersianTextEmotion.v2 | nan | 0.6091 | 0.4884 | |
| QuoraRetrieval-Fa.v2 | nan | 0.7788 | 0.5246 | |
| SCIDOCS-Fa.v2 | nan | 0.1222 | 0.0411 | |
| SIDClassification.v2 | nan | 0.6137 | 0.5400 | |
| SciFact-Fa.v2 | nan | 0.6037 | 0.1848 | |
| StyleClassification | nan | 0.6492 | 0.9771 | |
| SynPerTextToneClassification.v3 | nan | 0.8412 | 0.9840 | |
| TRECCOVID-Fa.v2 | nan | 0.7177 | 0.1810 | |
| Touche2020-Fa.v2 | nan | 0.4978 | 0.0705 | |
| WebFAQRetrieval | nan | 0.7459 | 0.2758 | 0.7813 |
| Average | 0.6163 | 0.5931 | 0.3630 | 0.6673 |
Results for sentence-transformers/LaBSE
| task_name | intfloat/multilingual-e5-large | sentence-transformers/LaBSE | Max result |
|---|---|---|---|
| ArguAna-Fa.v2 | 0.4127 | 0.3812 | |
| DeepSentiPers.v2 | 0.5769 | 0.5805 | |
| FEVER-FaHardNegatives | 0.4615 | 0.1350 | |
| FiQA2018-Fa.v2 | 0.2946 | 0.0550 | |
| HotpotQA-FaHardNegatives | 0.6153 | 0.1619 | |
| MSMARCO-FaHardNegatives | 0.6871 | 0.3307 | |
| NLPTwitterAnalysisClassification.v2 | 0.7659 | 0.7536 | |
| NQ-FaHardNegatives | 0.4983 | 0.1221 | |
| PerShopDomainClassification | 0.5517 | 0.5665 | |
| PerShopIntentClassification | 0.9069 | 0.9284 | |
| PersianTextEmotion.v2 | 0.6091 | 0.5333 | |
| QuoraRetrieval-Fa.v2 | 0.7788 | 0.7026 | |
| SCIDOCS-Fa.v2 | 0.1222 | 0.0713 | |
| SIDClassification.v2 | 0.6137 | 0.5672 | |
| SciFact-Fa.v2 | 0.6037 | 0.3387 | |
| StyleClassification | 0.6492 | 0.5664 | |
| SynPerTextToneClassification.v3 | 0.8412 | 0.6849 | |
| TRECCOVID-Fa.v2 | 0.7177 | 0.2083 | |
| Touche2020-Fa.v2 | 0.4978 | 0.1362 | |
| WebFAQRetrieval | 0.7459 | 0.3671 | 0.7813 |
| Average | 0.5975 | 0.4095 | 0.7813 |
Results for sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
| task_name | intfloat/multilingual-e5-large | sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 | Max result |
|---|---|---|---|
| ArguAna-Fa.v2 | 0.4127 | 0.3900 | |
| DeepSentiPers.v2 | 0.5769 | 0.5602 | |
| FEVER-FaHardNegatives | 0.4615 | 0.2021 | |
| FiQA2018-Fa.v2 | 0.2946 | 0.0956 | |
| HotpotQA-FaHardNegatives | 0.6153 | 0.1341 | |
| MSMARCO-FaHardNegatives | 0.6871 | 0.4321 | |
| NLPTwitterAnalysisClassification.v2 | 0.7659 | 0.7547 | |
| NQ-FaHardNegatives | 0.4983 | 0.1650 | |
| PerShopDomainClassification | 0.5517 | 0.5642 | |
| PerShopIntentClassification | 0.9069 | 0.8562 | |
| PersianTextEmotion.v2 | 0.6091 | 0.4471 | |
| QuoraRetrieval-Fa.v2 | 0.7788 | 0.7148 | |
| SCIDOCS-Fa.v2 | 0.1222 | 0.0879 | |
| SIDClassification.v2 | 0.6137 | 0.5447 | |
| SciFact-Fa.v2 | 0.6037 | 0.3195 | |
| StyleClassification | 0.6492 | 0.5122 | |
| SynPerTextToneClassification.v3 | 0.8412 | 0.6079 | |
| TRECCOVID-Fa.v2 | 0.7177 | 0.3495 | |
| Touche2020-Fa.v2 | 0.4978 | 0.3442 | |
| Average | 0.5897 | 0.4254 | |
Hi @KennethEnevoldsen, could you please take a look at this PR and let me know if everything looks good or if I should make changes? Thanks!

thanks for the ping :)
for embeddings-benchmark/mteb#3157