init automate script #220
Conversation
Model Results Comparison

Reference models: Results for
| task_name | ai-forever/FRIDA | google/gemini-embedding-001 | intfloat/multilingual-e5-large | Max result |
|---|---|---|---|---|
| MassiveIntentClassification | 0.79 | 0.82 | 0.6 | 0.85 |
| Average | 0.79 | 0.82 | 0.6 | 0.85 |
KennethEnevoldsen
left a comment
Looks good.
A few things we could add, but which are not required to merge this (probably out of scope):
- Would love to also include the "max" score for each task
- Would love to be able to trigger this command using something like `@bot result --models intfloat/e5-large-v2`
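The per-task "max" column could be derived with a small helper along these lines (a hypothetical sketch; the nested-dict shape of `results` is an assumption for illustration, not the script's actual data model):

```python
# Sketch: compute a per-task "max" score across the reference models.
# The dict layout here is assumed, not taken from the actual script.
results = {
    "MassiveIntentClassification": {
        "ai-forever/FRIDA": 0.79,
        "google/gemini-embedding-001": 0.82,
        "intfloat/multilingual-e5-large": 0.60,
    },
}

def max_per_task(results):
    """Return {task_name: best score across all compared models}."""
    return {task: max(scores.values()) for task, scores in results.items()}

print(max_per_task(results))  # {'MassiveIntentClassification': 0.82}
```

In the actual table the "Max result" column may also cover models beyond the three reference columns shown, so the true maximum can differ from the row-wise one computed here.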
To use it like a bot, we need to host it somewhere. I can host it, but I'm not sure that's the best option. We could make it trigger on each comment and check whether there's a command to run, but I'm not sure that's a good idea. Another solution could be to parametrize workflow dispatch (manual trigger) to add models there. I will try this approach.
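A parametrized manual trigger could look roughly like this (a hypothetical workflow sketch; the file name, input name, and script path are assumptions, not the repository's actual configuration):

```yaml
# .github/workflows/model-results.yml (hypothetical name)
name: Model results comparison
on:
  workflow_dispatch:
    inputs:
      models:
        description: "Comma-separated model names to compare"
        required: false
        default: "intfloat/multilingual-e5-large"
jobs:
  compare:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: python scripts/compare_models.py --models "${{ inputs.models }}"
```

With `workflow_dispatch` inputs, maintainers can run the comparison for arbitrary models from the Actions tab without hosting a bot anywhere.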
I've added support for the max score and for comparison with different models.
@KennethEnevoldsen Can you review this PR? The script's actual output is in this comment: #220 (comment)
Looks good! Feel free to merge
> To use it like a bot, we need to host it somewhere. I can host it, but I'm not sure if that's the best option. We could make it trigger on each comment and check if there's a command to run, but I'm not sure if that's a good idea. Another solution could be to parametrize workflow dispatch (manually trigger) to add models here. I will try this approach
I think for now it is perfectly fine to just have it as a default comment, and then we can run any additional stuff ourselves.
Closes https://github.com/embeddings-benchmark/results/issues/196
I've created a CI action that creates result tables for PRs automatically. If there are no results for `multilingual-e5` or `gemini-embedding-001`, only the new model's results will be added. With each new commit, the CI will update the comment with the results.
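The comment body the CI posts could be assembled by a helper along these lines (a hedged sketch; the function name and column layout are illustrative, not the action's actual code):

```python
def render_table(rows, model_names):
    """Render a GitHub-flavored markdown table of per-task scores.

    `rows` maps task name -> {model name: score}; a "Max result"
    column is appended from the best score in each row.
    """
    header = "| task_name | " + " | ".join(model_names) + " | Max result |"
    sep = "|" + "---|" * (len(model_names) + 2)
    lines = [header, sep]
    for task, scores in rows.items():
        cells = [str(scores[m]) for m in model_names]
        cells.append(str(max(scores.values())))
        lines.append("| " + task + " | " + " | ".join(cells) + " |")
    return "\n".join(lines)

table = render_table(
    {"MassiveIntentClassification": {"ai-forever/FRIDA": 0.79}},
    ["ai-forever/FRIDA"],
)
print(table)
```

Updating (rather than re-posting) the table on each commit would then amount to finding the bot's earlier comment via the GitHub API and editing its body with the freshly rendered markdown.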
Checklist
- The model is implemented in `mteb/models/` (this can be as an API). Instructions on how to add a model can be found here.