[v2] introduce AbsAnyClassification and refactor mieb classification#2537
Copilot reviewed 14 out of 14 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
mteb/abstasks/Image/AbsTaskImageClassification.py:63
- Ensure that the new 'values_column_name' attribute is consistently documented across the class, updating any outdated references (like in comments or docstrings) that mention 'image_column_name'.
values_column_name: str = "image"
isaac-chung left a comment
Thanks for the effort. However, this is not what we discussed in #2078 (comment). I'd prefer not to add more layers of abstraction in a refactor, so that the code stays easy to maintain and easy for newcomers to understand. I believe @KennethEnevoldsen shared the same view as well.
But you also said
Sounds good to me 😸 thanks!
But there will be a slight change in scores, e.g.
Good find. Maybe we can go with the following,
@isaac-chung I've changed to
isaac-chung left a comment
Amazing work so far, thanks for putting this together. Got a few questions. The rest looks good 🚀
isaac-chung left a comment
Looks good! I feel it's good to merge. @KennethEnevoldsen feel free to still review post-hoc
# Conflicts:
#	tests/test_benchmark/mock_tasks.py
KennethEnevoldsen left a comment
This looks great!
There are a few things I would like to clarify, but overall I think it looks very good.
I am totally fine with merging this - tests seem to fail though,
Yes, the issue was that I hadn’t updated the test to use the new stats format. I've now updated it, so you can clearly see how it will look. If this format looks good to you, I can update everything to use the new nested statistics format consistently. @KennethEnevoldsen
I've created `AbsAnyClassification` for classification tasks by merging `AbsTaskClassification` and `AbsTaskImageClassification`. I also updated the evaluator to support both modalities.

Closes #2432
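To illustrate the idea behind the merge, here is a minimal, self-contained sketch of how one task class can serve both modalities once the input column is configurable. It is not the mteb implementation: `AnyClassificationSketch`, `label_column_name`, `split_columns`, and `accuracy` are hypothetical names chosen for this example, and only `values_column_name` is taken from this PR.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Callable, Sequence


@dataclass
class AnyClassificationSketch:
    """Hypothetical stand-in (not the mteb class): one task type for any modality."""

    # Column holding the model inputs, e.g. "text" or "image" depending on the dataset.
    values_column_name: str = "text"
    # Hypothetical attribute name for the label column.
    label_column_name: str = "label"

    def split_columns(self, rows: Sequence[dict[str, Any]]) -> tuple[list[Any], list[Any]]:
        """Return (inputs, labels) without inspecting the modality of the inputs."""
        values = [row[self.values_column_name] for row in rows]
        labels = [row[self.label_column_name] for row in rows]
        return values, labels

    def accuracy(self, rows: Sequence[dict[str, Any]], predict: Callable[[Any], Any]) -> float:
        """Score a prediction function; the same code path is used for text and images."""
        values, labels = self.split_columns(rows)
        hits = sum(predict(value) == label for value, label in zip(values, labels))
        return hits / len(labels)


# The same class handles both modalities; only the column name differs.
text_task = AnyClassificationSketch(values_column_name="text")
image_task = AnyClassificationSketch(values_column_name="image")

text_rows = [{"text": "good", "label": 1}, {"text": "bad", "label": 0}]
image_rows = [{"image": [0.9, 0.8], "label": 1}, {"image": [0.1, 0.2], "label": 0}]

# Toy predictors standing in for a real encoder plus classifier.
text_acc = text_task.accuracy(text_rows, lambda t: int(t == "good"))
image_acc = image_task.accuracy(image_rows, lambda px: int(sum(px) > 1.0))
```

The point of the sketch is that no subclass per modality is needed: the column name is plain configuration, which keeps the number of abstraction layers down.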
Results for `openai/clip-vit-base-patch16`. There is a small mismatch because I've changed the seeds a bit.

Results for `minishlab/potion-base-2M`

Code Quality
- `make lint` to maintain consistent style.

Documentation

Testing
- `make test-with-coverage`.
- `make test` or `make test-with-coverage` to ensure no existing functionality is broken.

Adding datasets checklist
Reason for dataset addition: ...
- `mteb -m {model_name} -t {task_name}` command.
  - `sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2`
  - `intfloat/multilingual-e5-small`
- `self.stratified_subsampling() under dataset_transform()`
- `make test`.
- `make lint`.

Adding a model checklist
- `mteb.get_model(model_name, revision)` and `mteb.get_model_meta(model_name, revision)`