Add ESC50 and zero-shot classification by RahulSChand · Pull Request #2133 · embeddings-benchmark/mteb

RahulSChand · 2025-02-21T18:49:49Z

For #2069

Code Quality

[✅] Code Formatted: Format the code using make lint to maintain consistent style.

Documentation

Updated Documentation: Add or update documentation to reflect the changes introduced in this PR.

Testing

New Tests Added: Write tests to cover new functionality. Validate with make test-with-coverage.
Tests Passed: Run tests locally using make test or make test-with-coverage to ensure no existing functionality is broken.

Adding datasets checklist

Reason for dataset addition: ...

[✅] I have run the following models on the task (adding the results to the PR). These can be run using the mteb -m {model_name} -t {task_name} command.
- sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
- intfloat/multilingual-e5-small
[✅] I have checked that the performance is neither trivial (both models gain close to perfect scores) nor random (both models gain close to random scores).
If the dataset is too big (e.g. >2048 examples), consider using self.stratified_subsampling() under dataset_transform()
[✅] I have filled out the metadata object in the dataset file (find documentation on it here).
Run tests locally to make sure nothing is broken using make test.
[✅] Run the formatter to format the code using make lint.
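
The subsampling item in the checklist above can be illustrated with a small standalone sketch. This is not mteb's self.stratified_subsampling() implementation; the function name, its parameters, and the toy labels are hypothetical and only show the general idea of keeping label proportions roughly intact while shrinking a dataset.

```python
import random
from collections import defaultdict


def stratified_subsample(labels, n_samples, seed=42):
    """Return sorted indices of a subsample that roughly preserves label proportions.

    Hypothetical helper for illustration only; it is not part of mteb.
    """
    if n_samples >= len(labels):
        return list(range(len(labels)))

    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    chosen = []
    for label in sorted(by_label):
        indices = by_label[label]
        # Proportional share for this class, keeping at least one example.
        # Because of rounding, the total can differ slightly from n_samples.
        k = max(1, round(n_samples * len(indices) / len(labels)))
        chosen.extend(rng.sample(indices, min(k, len(indices))))
    return sorted(chosen)


# Made-up labels: 60% "dog", 30% "rain", 10% "siren".
labels = ["dog"] * 60 + ["rain"] * 30 + ["siren"] * 10
subset = stratified_subsample(labels, n_samples=20)
```

With these toy labels the subsample keeps the 60/30/10 split, so a score computed on the subset stays comparable to one computed on the full set.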

Adding a model checklist

[✅] I have filled out the ModelMeta object to the extent possible
[✅] I have ensured that my model can be loaded using
- [✅] mteb.get_model(model_name, revision) and
- [✅] mteb.get_model_meta(model_name, revision)
[✅] I have tested the implementation works on a representative set of tasks.

Co-authored-by: rahulschand <rahulsc@stanford.edu>

RahulSChand · 2025-03-05T07:19:50Z

@Samoed Hey, I finally got around to fixing this PR. I was waiting for PR #2082 to merge first, since many of the file changes were similar. I have updated the PR as follows:

Removed all unnecessary files
Removed ESC50 from the multi-label task; it's only in zero-shot for now
Added F1 score
Made sure make test passes
Ran the code end to end using the script below

Currently only the fused CLAP model is included. Once this PR is merged, I will create another small PR that adds the unfused model as well.

import mteb
model_name = "laion/clap-htsat-fused"
model = mteb.get_model(model_name=model_name)

tasks = mteb.get_tasks(tasks=["ESC50_Zeroshot"])

evaluation = mteb.MTEB(tasks=tasks)
results = evaluation.run(model)

print(results[0].scores)
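
For readers unfamiliar with the task type, here is a minimal sketch of what zero-shot classification with a CLAP-style model amounts to: each audio embedding is compared against the text embeddings of the candidate labels, and the most similar label wins. This is not the evaluator added in this PR. The helper names and the 2-D toy embeddings are made up for illustration, and the macro-averaged F1 is one common way to compute the F1 score mentioned above.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def zero_shot_predict(audio_embeddings, label_embeddings):
    """Assign each audio clip the index of the most similar label embedding."""
    predictions = []
    for audio in audio_embeddings:
        scores = [cosine(audio, label) for label in label_embeddings]
        predictions.append(scores.index(max(scores)))
    return predictions


def macro_f1(y_true, y_pred, n_classes):
    """Unweighted mean of the per-class F1 scores."""
    f1_scores = []
    for c in range(n_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(f1_scores) / n_classes


# Toy 2-D embeddings standing in for real model outputs (made-up values).
label_embeddings = [[1.0, 0.0], [0.0, 1.0]]  # text embeddings for class 0 and class 1
audio_embeddings = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.7, 0.3]]
y_true = [0, 1, 1, 0]

y_pred = zero_shot_predict(audio_embeddings, label_embeddings)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = macro_f1(y_true, y_pred, n_classes=2)
```

No training data is involved: the label set is defined purely by the text prompts, which is why ESC50 can be evaluated this way without a classifier head.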

Samoed

Great job! Here are a few small suggestions

mteb/evaluation/evaluators/Audio/ZeroshotClassificationEvaluator.py

mteb/models/clap_models.py

mteb/tasks/Audio/__init__.py

tests/test_benchmark/mock_tasks.py

mteb/evaluation/evaluators/Audio/ZeroshotClassificationEvaluator.py

RahulSChand · 2025-03-05T12:58:33Z

Great job! Here are a few small suggestions

Is this now in a state where it's good to be approved and merged?

Samoed

I can approve, but maybe @KennethEnevoldsen or @isaac-chung have something to add.

mteb/models/clap_models.py

anime-sh and others added 30 commits February 16, 2025 22:27

init audio

56ce4ca

some encoder related changes

5d76e4d

some more abs task defs

b8a45b2

Co-authored-by: rahulschand <rahulsc@stanford.edu>

evaluators and classification

d4a34c1

remove rahul changes to generate first PR

72f526a

make lint

a15e64c

init audio

c5744bf

some encoder related changes

64ccf50

some more abs task defs

1a744c0

Co-authored-by: rahulschand <rahulsc@stanford.edu>

evaluators and classification

c26ebae

remove rahul changes to generate first PR

1289d9b

make lint

bb2b4d0

add dataset/tasks skeleton

705664e

readd changes lost in rebase

07eda3c

add fsd50k

ebae179

add task categories for audio

d51c5d1

slight updates to fsd50k

e3b89fa

make lint

849323c

wav2vec2 model

395b833

add fsd50k metadata

efd7095

rename folder

f97f9a3

add metric

6d61f3a

add torchaudio in req

fa61ea6

reigster wav2vec2 models

b03a28f

Merge branch 'maeb' of https://github.com/anime-sh/mteb into maeb

3b57aeb

fixes

e4aaf9d

add audio in valid task types

d3c20a0

Merge branch 'maeb' of https://github.com/anime-sh/mteb into maeb

20a45ad

mock interface changes

c92073a

my 0 shot

1b97605

silky1708 and others added 16 commits March 1, 2025 17:32

update mock_tasks; make lint

a2d31e7

remove train_split from fn parameters

d6fdc00

define fsd2019k to be multilingual

7ff54be

inherit from MultilingualTask in fsd2019K

1700f5a

Merge branch 'new-maeb' into maeb

c9f9aa4

fix tests

1282cf2

inherit correct multingial task class

6f5e0ba

remove MockAudioMultilabelClassificationLogRegTask

4712cf0

rm other instances of MockAudioMultilabelClassificationLogRegTask

11ba946

merged maeb animesh branch

75b14df

merged with maeb upstream

9ca4837

removed unncessary files

d1bb88d

removed unncrssary files

2ecd849

removed uncrssary files part 3

5083bc1

deleted esc50 from multi label classification

25481b9

fixed errors

388849d

Samoed reviewed Mar 5, 2025


RahulSChand added 2 commits March 5, 2025 00:23

fixed lintng, added precision and recall. Removed extra comments

429ac3b

fixed double loading of model

1d5e987

RahulSChand marked this pull request as ready for review March 5, 2025 08:35

RahulSChand changed the title from "Added ESC50 and zero-shot classification (WIP)" to "Add ESC50 and zero-shot classification" Mar 5, 2025

Samoed approved these changes Mar 5, 2025

mteb/models/clap_models.py (outdated, resolved)

RahulSChand added 2 commits March 5, 2025 05:25

filled in missing meta-data

a9b0605

fixed linting

dc2c28d

RahulSChand merged commit 0620c58 into embeddings-benchmark:maeb Mar 5, 2025
9 checks passed

This was referenced Mar 6, 2025

Add unfused clap models for zero-shot audio classification (closed wrong source branch) #2265

Closed

Add unfused clap model for zero-shot #2269

Merged

added large, music and speech clap models #2284

Merged

RahulSChand merged 111 commits into embeddings-benchmark:maeb from anime-sh:zero_shot


5 participants
