Add Gemma-Embeddings-v0.8 Retrieval Results #59

Merged
KennethEnevoldsen merged 17 commits into embeddings-benchmark:main from nicholasmonath:main
Dec 11, 2024

@nicholasmonath
Contributor

Add Gofer-Embeddings-v0.8 results on the retrieval subset of tasks in MTEB.

@KennethEnevoldsen
Contributor

Thanks for the PR, @nicholasmonath. As the tests show, multiple fields, such as the MTEB version, are not specified. It also seems that the model_meta is not filled out.

@Samoed
Member

Samoed commented Dec 3, 2024

Could you provide the script you used to run MTEB? It seems a bit unusual that the original results didn't include the MTEB version and evaluation time.

@nicholasmonath
Contributor Author

Hi @KennethEnevoldsen and @Samoed. Thank you for your comments. We wrote a sanitizer to remove sensitive info like timings, but we realized that it was overly aggressive and removed even necessary fields. We are working on updating the pull request. We will add back the model_meta and the MTEB version that we used (we noticed that it is actually an older version, 1.0.3). However, we are still required to omit evaluation time due to the sensitivity of the infrastructure that we use.
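
For readers following along, the kind of narrowly scoped sanitizer described above could look roughly like the sketch below. It is not the script used for this PR. The key names (`task_name`, `mteb_version`, `evaluation_time`, `scores`) are assumptions about the result file layout, the example values are made up, and whether a sensitive field should be nulled or dropped entirely depends on what the repository's tests accept.

```python
import json

# Assumption: only these top-level keys are sensitive. Every other key,
# including the MTEB version, is passed through untouched.
SENSITIVE_KEYS = {"evaluation_time"}


def sanitize_result(result: dict, sensitive_keys=SENSITIVE_KEYS) -> dict:
    """Return a copy of `result` with the sensitive top-level keys set to None."""
    return {
        key: (None if key in sensitive_keys else value)
        for key, value in result.items()
    }


# Made-up example record; the task name and timing are placeholders.
example = {
    "task_name": "ExampleRetrievalTask",
    "mteb_version": "1.21.7",
    "evaluation_time": 123.4,
    "scores": {"test": []},
}

cleaned = sanitize_result(example)
print(json.dumps(cleaned, indent=2))
```

Listing the keys to remove explicitly, rather than pattern-matching on anything that looks sensitive, makes it harder to strip required fields by accident.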

@KennethEnevoldsen
Contributor

Excluding runtime and CO2 emissions is fine. However, 1.0.3 is quite an old version, and I would strongly recommend running on the latest version of mteb. The scores should be approximately the same (expect minor differences, since the seed and other parts of the code have changed since the older versions). We also standardized the result format in later versions of MTEB. If your model is prompt-based, newer versions of the benchmark allow you to integrate that as well.

@nicholasmonath
Contributor Author

Hi @KennethEnevoldsen, thank you for your comments and for your time reviewing this PR. We have now updated our MTEB version to 1.21.7. The latest files record this version, and the only field we have sanitized is the evaluation time. Please let us know if you have any questions or concerns.
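
As an aside, one way to confirm that a recorded version is new enough is to compare version strings numerically rather than as text. The sketch below is only illustrative: the helper names are hypothetical, it handles plain dotted numeric versions only (no pre-release suffixes), and the threshold simply reuses the version number mentioned in this thread.

```python
def parse_version(version: str) -> tuple:
    """Turn a dotted numeric version such as '1.21.7' into (1, 21, 7)."""
    return tuple(int(part) for part in version.split("."))


def is_at_least(found: str, minimum: str) -> bool:
    """Compare two dotted numeric versions component by component."""
    return parse_version(found) >= parse_version(minimum)


print(is_at_least("1.0.3", "1.21.7"))   # the old version fails the check
print(is_at_least("1.21.7", "1.21.7"))  # the updated version passes
```

Comparing the raw strings would be misleading here, since "1.3.0" sorts after "1.21.7" as text even though it is the older release.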

@nicholasmonath changed the title from "Add Gofer-Embeddings-v0.8 Retrieval Results" to "Add Gemma-Embeddings-v0.8 Retrieval Results" on Dec 9, 2024
Contributor

@KennethEnevoldsen KennethEnevoldsen left a comment

Thanks for the update. There are a few issues remaining.

@nicholasmonath
Contributor Author

Thank you so much @KennethEnevoldsen! Sorry about those remaining issues. I believe I have resolved them all now. Please do let me know if there is anything else.

@KennethEnevoldsen merged commit 2a8b9de into embeddings-benchmark:main on Dec 11, 2024

@nicholasmonath
Contributor Author

Hi @KennethEnevoldsen, thank you again for all of your help with this pull request. I wanted to check in about when the results will appear on the leaderboard. I thought that they might appear after today's update, but I don't see them yet. Please let me know if you need anything more from my side.

Thanks very much!

@KennethEnevoldsen
Contributor

For them to appear on the current leaderboard, you will have to update paths.json (see the snippet in results.py). If you are adding new models, also add their names to results.py.

(we are close to having a new leaderboard ready where this will no longer be necessary)

@nicholasmonath
Contributor Author

Thank you for your quick reply and the information, @KennethEnevoldsen! I have updated paths.json here: #69

Note that the MODELS in results.py appear to be pulled automatically from this line: https://github.com/embeddings-benchmark/results/blob/main/results.py#L295, so I did not modify this file.
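
For anyone making a similar update, the sketch below shows one way a model-to-files mapping could be regenerated from a results directory. It is not the snippet from results.py. It assumes, without confirmation from this thread, that paths.json maps each model name to a list of result file paths and that results are stored as `results/<model>/<revision>/<task>.json`. The function name and the example model, revision, and task names are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def collect_paths(results_root: Path) -> dict:
    """Map each model directory name to its sorted list of JSON result files.

    Paths are reported relative to the parent of `results_root`, so they
    start with the results directory name itself.
    """
    mapping = {}
    for model_dir in sorted(p for p in results_root.iterdir() if p.is_dir()):
        files = sorted(
            f.relative_to(results_root.parent).as_posix()
            for f in model_dir.rglob("*.json")
        )
        if files:
            mapping[model_dir.name] = files
    return mapping


# Demonstration on a throwaway directory tree with placeholder names.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "results"
    task_file = root / "example-model" / "no_revision" / "ExampleTask.json"
    task_file.parent.mkdir(parents=True)
    task_file.write_text("{}")

    paths = collect_paths(root)
    print(json.dumps(paths, indent=2))
```

In a real checkout, the output of `collect_paths` would need to be checked against the existing paths.json before overwriting it, since the actual file may include or exclude entries that this sketch does not account for.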

3 participants