update youtu embedding model metadata by spring-quan · Pull Request #292 · embeddings-benchmark/results

spring-quan · 2025-10-07T15:36:11Z

Checklist

My model has a model sheet, report or similar
My model has a reference implementation in mteb/models/; this can be an API. Instructions on how to add a model can be found here
- No, but there is an existing PR ___
The results submitted were obtained using the reference implementation
My model is available, either as a publicly accessible API or publicly on e.g. Hugging Face
I solemnly swear that for all results submitted I have not trained on the evaluation dataset, including training splits. If I have, I have disclosed it clearly.

github-actions · 2025-10-07T15:39:33Z

Model Results Comparison

Reference models: intfloat/multilingual-e5-large, google/gemini-embedding-001
New models evaluated: tencent/Youtu-Embedding, tencent/Youtu-Embedding
Tasks: AFQMC, ATEC, BQ, CLSClusteringP2P, CLSClusteringS2S, CMedQAv1-reranking, CMedQAv2-reranking, CmedqaRetrieval, Cmnli, CovidRetrieval, DuRetrieval, EcomRetrieval, IFlyTek, JDReview, LCQMC, MMarcoReranking, MMarcoRetrieval, MedicalRetrieval, MultilingualSentiment, Ocnli, OnlineShopping, PAWSX, QBQTC, STSB, T2Reranking, T2Retrieval, TNews, ThuNewsClusteringP2P, ThuNewsClusteringS2S, VideoRetrieval, Waimai

Results for `tencent/Youtu-Embedding`

task_name	google/gemini-embedding-001	tencent/Youtu-Embedding	tencent/Youtu-Embedding	intfloat/multilingual-e5-large	Max result
Revisions		1	32e04afc24817c187a8422e7bdbb493b19796d47
AFQMC	nan	0.6711	0.7219	0.3301	0.7225
ATEC	nan	0.5989	0.6170	0.3981	0.6464
BQ	nan	0.7401	0.7227	0.4850	0.8125
CLSClusteringP2P	nan	0.7580	0.8153	nan	0.8225
CLSClusteringS2S	nan	0.7131	0.7627	nan	0.7408
CMedQAv1-reranking	nan	0.9162	0.9109	0.6765	0.9434
CMedQAv2-reranking	nan	0.9211	0.9256	0.6678	0.9353
CmedqaRetrieval	nan	0.5742	0.5272	0.2866	0.5658
Cmnli	nan	0.9015	0.8773	nan	0.9579
CovidRetrieval	0.7913	0.9291	0.9194	0.7561	0.9606
DuRetrieval	nan	0.9107	0.9198	0.8530	0.9423
EcomRetrieval	nan	0.7328	0.7447	0.5467	0.7881
IFlyTek	nan	0.5273	0.5973	0.4186	0.5799
JDReview	nan	0.9054	0.8923	0.8054	0.9214
LCQMC	nan	0.7997	0.7748	0.7595	0.8354
MMarcoReranking	nan	0.3890	0.4358	0.2912	0.4689
MMarcoRetrieval	nan	0.8957	0.8845	0.7920	0.9033
MedicalRetrieval	nan	0.7324	0.7379	0.5144	0.7562
MultilingualSentiment	nan	0.8089	0.7985	0.7090	0.8536
Ocnli	nan	0.8923	0.8452	nan	0.9518
OnlineShopping	nan	0.9479	0.9413	0.9045	0.9716
PAWSX	nan	0.6782	0.5932	0.1463	0.7331
QBQTC	nan	0.5958	0.5560	nan	0.7145
STSB	0.8550	0.8576	0.8318	0.8236	0.9199
T2Reranking	0.6795	0.7277	0.7315	0.6632	0.7283
T2Retrieval	nan	0.8902	0.8750	0.7607	0.8926
TNews	nan	0.6010	0.6005	0.4880	0.6090
ThuNewsClusteringP2P	nan	0.8698	0.8973	nan	0.8976
ThuNewsClusteringS2S	nan	0.8459	0.8955	nan	0.8790
VideoRetrieval	nan	0.8105	0.8085	0.5828	0.8384
Waimai	nan	0.8980	0.8933	0.8630	0.9231
Average	0.7753	0.7755	0.7760	0.6051	0.8134
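The Average row appears to skip `nan` entries rather than count them as zero: the three scores reported for google/gemini-embedding-001 (CovidRetrieval, STSB, T2Reranking) average to the 0.7753 shown. A minimal sketch of that calculation, assuming this is how the comparison is computed; the helper name `nan_mean` is hypothetical and not taken from the bot's code:

```python
import math


def nan_mean(scores):
    """Mean of the scores that are not NaN; NaN if none remain."""
    valid = [s for s in scores if not math.isnan(s)]
    return sum(valid) / len(valid) if valid else math.nan


# google/gemini-embedding-001 column from the table above: every task is
# nan except CovidRetrieval, STSB, and T2Reranking (one nan kept here to
# show that it is ignored).
gemini_scores = [math.nan, 0.7913, 0.8550, 0.6795]
gemini_average = round(nan_mean(gemini_scores), 4)
```

Averaging only over available scores means columns with different coverage are not directly comparable, which is worth keeping in mind when reading the Average row.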

The model has high performance on these tasks: CmedqaRetrieval

Results for `tencent/Youtu-Embedding`

task_name	google/gemini-embedding-001	tencent/Youtu-Embedding	tencent/Youtu-Embedding	intfloat/multilingual-e5-large	Max result
Revisions		1	32e04afc24817c187a8422e7bdbb493b19796d47
AFQMC	nan	0.6711	0.7219	0.3301	0.7225
ATEC	nan	0.5967	0.6170	0.3980	0.6464
BQ	nan	0.7295	0.7227	0.4644	0.8125
CLSClusteringP2P	nan	0.7580	0.8153	nan	0.8225
CLSClusteringS2S	nan	0.7131	0.7627	nan	0.7408
CMedQAv1-reranking	nan	0.9162	0.9109	0.6765	0.9434
CMedQAv2-reranking	nan	0.9211	0.9256	0.6678	0.9353
CmedqaRetrieval	nan	0.5742	0.5272	0.2866	0.5658
Cmnli	nan	0.9015	0.8773	nan	0.9579
CovidRetrieval	0.7913	0.9291	0.9194	0.7561	0.9606
DuRetrieval	nan	0.9107	0.9198	0.8530	0.9423
EcomRetrieval	nan	0.7328	0.7447	0.5467	0.7881
IFlyTek	nan	0.5273	0.5973	0.4186	0.5799
JDReview	nan	0.9054	0.8923	0.8054	0.9214
LCQMC	nan	0.7997	0.7748	0.7595	0.8354
MMarcoReranking	nan	0.3890	0.4358	0.2912	0.4689
MMarcoRetrieval	nan	0.8957	0.8845	0.7920	0.9033
MedicalRetrieval	nan	0.7324	0.7379	0.5144	0.7562
MultilingualSentiment	nan	0.8089	0.7985	0.7090	0.8536
Ocnli	nan	0.8923	0.8452	nan	0.9518
OnlineShopping	nan	0.9479	0.9413	0.9045	0.9716
PAWSX	nan	0.6782	0.5932	0.1463	0.7331
QBQTC	nan	0.5958	0.5560	nan	0.7145
STSB	0.8465	0.8484	0.8318	0.8108	0.9140
T2Reranking	0.6795	0.7277	0.7315	0.6632	0.7283
T2Retrieval	nan	0.8902	0.8750	0.7607	0.8926
TNews	nan	0.6010	0.6005	0.4880	0.6090
ThuNewsClusteringP2P	nan	0.8698	0.8973	nan	0.8976
ThuNewsClusteringS2S	nan	0.8459	0.8955	nan	0.8790
VideoRetrieval	nan	0.8105	0.8085	0.5828	0.8384
Waimai	nan	0.8980	0.8933	0.8630	0.9231
Average	0.7725	0.7748	0.7760	0.6037	0.8132

The model has high performance on these tasks: CLSClusteringS2S, IFlyTek, T2Reranking, ThuNewsClusteringS2S
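The tasks listed as high performance coincide with the rows where one revision's score is above the Max result column. A minimal sketch of such a check, assuming that is the rule being applied; the function name `tasks_above_max` is hypothetical, and the scores are copied from the 32e04afc24817c187a8422e7bdbb493b19796d47 column of the table:

```python
def tasks_above_max(model_scores, max_scores):
    """Return tasks where the model's score is strictly above the known max."""
    return [
        task
        for task, score in model_scores.items()
        if task in max_scores and score > max_scores[task]
    ]


# Scores for revision 32e04afc... (a subset of rows from the table).
youtu_scores = {
    "AFQMC": 0.7219,
    "CLSClusteringS2S": 0.7627,
    "IFlyTek": 0.5973,
    "T2Reranking": 0.7315,
    "ThuNewsClusteringS2S": 0.8955,
}
# "Max result" column for the same rows.
max_results = {
    "AFQMC": 0.7225,
    "CLSClusteringS2S": 0.7408,
    "IFlyTek": 0.5799,
    "T2Reranking": 0.7283,
    "ThuNewsClusteringS2S": 0.8790,
}

flagged = tasks_above_max(youtu_scores, max_results)
```

A score above the previous maximum is a prompt for reviewers to look more closely (for example, at possible training-set overlap), not proof of a problem.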

Samoed · 2025-10-07T15:45:56Z

@spring-quan You need to rename the folder too

results/tencent__Youtu-Embedding/1/model_meta.json
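The path quoted above suggests a layout of `results/<organization>__<model>/<revision>/model_meta.json`, with the `/` in the model name replaced by `__`. A minimal sketch that builds such a path, assuming this layout holds in general; the helper `model_meta_path` is hypothetical and not part of the repository:

```python
from pathlib import PurePosixPath


def model_meta_path(model_name, revision, root="results"):
    """Build the model_meta.json path for a model name and revision folder."""
    # "tencent/Youtu-Embedding" becomes "tencent__Youtu-Embedding".
    folder = model_name.replace("/", "__")
    return PurePosixPath(root) / folder / revision / "model_meta.json"


path = str(model_meta_path("tencent/Youtu-Embedding", "1"))
```

Because the revision is a folder name, changing the revision recorded in the metadata also means moving the files into a matching folder.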

springxchen and others added 3 commits September 30, 2025 23:54

update results for youtu embedding model

0b66571

Merge branch 'embeddings-benchmark:main' into youtu_llm_embedding

1222b5a

Update youtu embedding model_meta.json

8c5c88b

spring-quan mentioned this pull request Oct 7, 2025

update results for youtu embedding model #284

Merged

6 tasks

rename folder

fe9dcb6

Samoed reviewed Oct 7, 2025

results/tencent__Youtu-Embedding/1/model_meta.json (resolved)

Update model_meta.json

527441c

Samoed approved these changes Oct 7, 2025

Samoed enabled auto-merge (squash) October 7, 2025 16:28

Samoed merged commit c2427a0 into embeddings-benchmark:main Oct 7, 2025
3 checks passed

Samoed merged 5 commits into embeddings-benchmark:main from spring-quan:youtu_llm_embedding
