
Add ChemTEB results #89

Merged
Samoed merged 3 commits into embeddings-benchmark:main from HSILA:main on Jan 12, 2025

Conversation

@HSILA (Contributor) commented Jan 8, 2025

Adding ChemTEB results, as requested in this PR.

Checklist

  • Run tests locally to make sure nothing is broken, using make test.
  • Run the results file checker with make pre-push.

Adding a model checklist

  • I have added the model implementation to the mteb/models/ directory. Instructions on how to add a model can be found here, in the following PR: ____

@Samoed (Member) commented Jan 11, 2025

@HSILA can you create a table that compares your results with the results of the benchmark?

@HSILA (Contributor, Author) commented Jan 11, 2025

> @HSILA can you create a table that compares your results with the results of the benchmark?

Hi, hope you're doing well.

As I mentioned earlier here, a direct comparison with the paper is not possible because per-task, per-model scores are not reported in the paper. The paper reports an average score per category, but the combination of tasks has changed in my PR, so we cannot directly compare per-category averages either.

However, one approach we can take is to compare the tasks that are present in both my PR and the original ChemTEB results. I have shared my local JSON results, which lets us compare the shared tasks and evaluate how performance has changed on average.

To verify that the JSON files I shared in chemteb-results correspond to the results used to produce Table 2, you can refer to table2.ipynb and reproduce it.

The mteb.ipynb notebook compares the main score for the shared tasks (averaging scores for tasks that were merged as subsets of a larger task in MTEB) and reports the difference.
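
For reference, here is a minimal sketch of the kind of comparison the notebook performs. It assumes each result is a JSON file shaped roughly like {"task_name": ..., "scores": {"test": [{"main_score": ...}]}}; the directory names, keys, and helper names below are illustrative, not the exact layout of the files in this PR.

```python
import json
from pathlib import Path


def load_main_scores(results_dir: Path) -> dict[str, float]:
    """Map task name -> main score for every JSON result file under results_dir."""
    scores: dict[str, float] = {}
    for path in results_dir.glob("*.json"):
        data = json.loads(path.read_text())
        task = data["task_name"]
        # Average over the test-split entries, so tasks that were merged as
        # subsets of a bigger MTEB task (one entry per subset) get one score.
        entries = data["scores"]["test"]
        scores[task] = sum(e["main_score"] for e in entries) / len(entries)
    return scores


def compare(paper_results: Path, pr_results: Path) -> float:
    """Print per-task absolute differences on shared tasks and return the mean."""
    paper = load_main_scores(paper_results)
    pr = load_main_scores(pr_results)
    shared = sorted(set(paper) & set(pr))
    diffs = [abs(paper[t] - pr[t]) for t in shared]
    for task, diff in zip(shared, diffs):
        print(f"{task:40s} {diff:.4f}")
    return sum(diffs) / len(diffs)


# Example usage (hypothetical paths):
# mean_diff = compare(Path("chemteb-results"), Path("results/new"))
# print(f"mean difference: {mean_diff:.4f}")
```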

The observed difference is 0.0045 overall (averaged across all models and tasks), and it ranges from roughly 0 to 0.02 for most tasks. Notably, the PubChemWikiParagraphsPC task shows a larger difference because it was updated later: the update masked exact chemical compound names in each text pair to make the problem more challenging.

Other changes, particularly in the Classification and Clustering tasks, can be attributed to updates that supplemented the label column (which was not sufficiently descriptive) with a label_text column. Additionally, these tasks may have been reordered, which could affect the train-test splits.

That said, we can always revert all revisions to match those used in the paper. However, the current revisions in the PR provide more detailed information, such as the label_text column and an updated README.

@Samoed merged commit 4f6a9fc into embeddings-benchmark:main on Jan 12, 2025
2 checks passed