add xyz model #207

Merged
KennethEnevoldsen merged 4 commits into embeddings-benchmark:main from fangxiaoquan:fxq-xyz
Jun 10, 2025

fangxiaoquan (Contributor) commented May 29, 2025

Checklist

  • My model has a model sheet, report, or similar
  • My model has a reference implementation in mteb/models/ (this can be an API). Instructions on how to add a model can be found here
    • No, but there is an existing PR ___
  • The results submitted were obtained using the reference implementation
  • My model is available, either as a publicly accessible API or publicly on e.g., Huggingface
  • I solemnly swear that for all results submitted I have not trained on the evaluation dataset, including training splits. If I have, I have disclosed it clearly.

Samoed requested a review from KennethEnevoldsen Jun 10, 2025, 07:22
KennethEnevoldsen (Contributor) commented

Results for fangxq/XYZ-embedding

| task_name | fangxq/XYZ-embedding | google/gemini-embedding-001 | intfloat/multilingual-e5-large |
|---|---|---|---|
| CmedqaRetrieval | 0.48 | nan | 0.29 |
| CovidRetrieval | 0.91 | 0.79 | 0.76 |
| DuRetrieval | 0.91 | nan | 0.85 |
| EcomRetrieval | 0.70 | nan | 0.55 |
| MMarcoRetrieval | 0.83 | nan | 0.79 |
| MedicalRetrieval | 0.68 | nan | 0.51 |
| T2Retrieval | 0.86 | nan | 0.76 |
| VideoRetrieval | 0.81 | nan | 0.58 |
| Average | 0.77 | 0.79 | 0.64 |

| task_name | Qwen/Qwen3-Embedding-8B | fangxq/XYZ-embedding |
|---|---|---|
| CmedqaRetrieval | 0.53 | 0.48 |
| CovidRetrieval | 0.88 | 0.91 |
| DuRetrieval | 0.91 | 0.91 |
| EcomRetrieval | 0.73 | 0.70 |
| MMarcoRetrieval | 0.87 | 0.83 |
| MedicalRetrieval | 0.65 | 0.68 |
| T2Retrieval | 0.89 | 0.86 |
| VideoRetrieval | 0.80 | 0.81 |
| Average | 0.78 | 0.77 |
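
A note on reading the Average row: the numbers are consistent with a mean that skips `nan` entries, which would make the 0.79 for google/gemini-embedding-001 an average over a single task (CovidRetrieval) and not directly comparable to the eight-task averages. A minimal sketch of that arithmetic, assuming nan-skipping is what the comparison script does (the script itself is not shown here, and `nan_mean` is a hypothetical helper):

```python
import math

# Scores copied from the first table; nan marks a task with no result.
scores = {
    "fangxq/XYZ-embedding": [0.48, 0.91, 0.91, 0.70, 0.83, 0.68, 0.86, 0.81],
    "google/gemini-embedding-001": [math.nan, 0.79] + [math.nan] * 6,
    "intfloat/multilingual-e5-large": [0.29, 0.76, 0.85, 0.55, 0.79, 0.51, 0.76, 0.58],
}


def nan_mean(values):
    """Return (mean over non-nan entries, number of non-nan entries)."""
    present = [v for v in values if not math.isnan(v)]
    if not present:
        return math.nan, 0
    return sum(present) / len(present), len(present)


for model, values in scores.items():
    mean, n_tasks = nan_mean(values)
    print(f"{model}: mean={mean:.4f} over {n_tasks} task(s)")
```

Reporting the task count next to each mean makes it visible when an average rests on one result.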

CmedqaRetrieval looks especially suspicious; this seems to be because the model was trained on this dataset (medqav2). @Samoed, I assume this PR will fix that? If not, we should update the ModelMeta.

Otherwise, the model generally performs surprisingly well only in cases where it was trained on the data, so once the annotation is resolved, we can merge.
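
To illustrate why the annotation matters, here is a rough sketch of how the average changes once tasks the model was trained on are set aside. It does not use the mteb API; `trained_on` and `split_by_training` are hypothetical names, and only CmedqaRetrieval is listed because it is the only task named above as overlapping with the training data.

```python
# XYZ-embedding scores copied from the tables in this comment.
xyz_scores = {
    "CmedqaRetrieval": 0.48,
    "CovidRetrieval": 0.91,
    "DuRetrieval": 0.91,
    "EcomRetrieval": 0.70,
    "MMarcoRetrieval": 0.83,
    "MedicalRetrieval": 0.68,
    "T2Retrieval": 0.86,
    "VideoRetrieval": 0.81,
}

# Assumption: stands in for the training-data annotation on the model's metadata.
trained_on = {"CmedqaRetrieval"}


def split_by_training(scores, trained_on):
    """Split task scores into (not trained on, trained on)."""
    held_out = {t: s for t, s in scores.items() if t not in trained_on}
    seen = {t: s for t, s in scores.items() if t in trained_on}
    return held_out, seen


def mean(values):
    values = list(values)
    return sum(values) / len(values)


held_out, seen = split_by_training(xyz_scores, trained_on)
print(f"all tasks:      {mean(xyz_scores.values()):.4f}")
print(f"not trained on: {mean(held_out.values()):.4f} ({len(held_out)} tasks)")
print(f"trained on:     {sorted(seen)}")
```

If more tasks turn out to overlap with the training data, they would be added to `trained_on` in the same way.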

Samoed (Member) commented Jun 10, 2025

Yes, the PR should fix this.

KennethEnevoldsen merged commit 005e085 into embeddings-benchmark:main Jun 10, 2025
2 checks passed