Contributor

@abdurrahmanbutler abdurrahmanbutler commented Jul 21, 2025

Hi,
I’m submitting this pull request to push the results of intfloat/multilingual-e5-small and sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 on BarExamQA.

This PR is connected to embeddings-benchmark/mteb#2916, which adds BarExamQA to MTEB.

This pull request is being submitted courtesy of Isaacus, a legal AI research company.

Checklist

  • My model has a model sheet, report or similar
  • My model has a reference implementation in mteb/models/; this can be an API. Instructions on how to add a model can be found here
    • No, but there is an existing PR ___
  • The results submitted were obtained using the reference implementation
  • My model is available, either as a publicly accessible API or publicly on, e.g., Hugging Face
  • I solemnly swear that for all results submitted I have not trained on the evaluation dataset, including training splits. If I have, I have disclosed it clearly.

@umarbutler umarbutler left a comment

I can confirm that I have reviewed and approve this PR on behalf of Isaacus.

Member

I think you've run the model incorrectly. You should load the model with mteb.get_model.

Hey @Samoed,
These results were generated by following the instructions for adding a dataset to MTEB: https://github.com/embeddings-benchmark/mteb/blob/main/docs/adding_a_dataset.md#submit-a-pr

The exact same code was used:

from mteb import MTEB
from sentence_transformers import SentenceTransformer

# Define the sentence-transformers model name
model_name = "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"

model = SentenceTransformer(model_name)
evaluation = MTEB(tasks=[YourNewTask()])

Member

I see. I think those instructions are a bit outdated. Thanks for pointing that out!

Member

Can you rerun this model using the updated instructions in embeddings-benchmark/mteb#2922?

Contributor Author

Yep, I reran the model using mteb.get_model and it seems to produce the right model metadata. I believe the sentence-transformers version of the model is newer, which led to the different metadata.
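
For anyone following along, a rerun of this kind might look roughly like the sketch below. It is not the exact script used for this PR. It assumes the mteb 1.x API (mteb.get_model, mteb.get_tasks, mteb.MTEB), that the task is registered under the name "BarExamQA", and a hypothetical helper name and output folder chosen purely for illustration.

```python
# Models whose results are submitted in this PR.
MODEL_NAMES = [
    "intfloat/multilingual-e5-small",
    "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2",
]

# Assumption: the task name as registered in mteb.
TASK_NAME = "BarExamQA"


def run_with_reference_implementation(model_name, task_name=TASK_NAME, output_folder="results"):
    """Evaluate one model through mteb's registry (hypothetical helper)."""
    # Imported lazily so this module can be loaded without mteb installed.
    import mteb

    # get_model loads the model via mteb's registered implementation, so the
    # saved results carry mteb's own model metadata rather than metadata
    # inferred from a bare SentenceTransformer object.
    model = mteb.get_model(model_name)
    tasks = mteb.get_tasks(tasks=[task_name])
    evaluation = mteb.MTEB(tasks=tasks)
    return evaluation.run(model, output_folder=output_folder)
```

Calling run_with_reference_implementation(name) for each entry in MODEL_NAMES would download the model and the dataset, so it needs network access and can take a while.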

@Samoed Samoed merged commit f4a723d into embeddings-benchmark:main Jul 21, 2025
2 of 3 checks passed