Add new and complete version of FSD50K multi-label audio classification task by RahulSChand · Pull Request #2285 · embeddings-benchmark/mteb

RahulSChand · 2025-03-08T07:25:42Z

All existing HuggingFace datasets for FSD50K were broken: they either contained corrupt data or lacked a complete test+train split. We downloaded the original data from FreeSound and uploaded it to HF. Tested with wav2vec (results in the next comment).

Code Quality

[✅] Code Formatted: Format the code using make lint to maintain consistent style.

Documentation

[✅] Updated Documentation: Add or update documentation to reflect the changes introduced in this PR.

Testing

[❌] New Tests Added: Write tests to cover new functionality. Validate with make test-with-coverage.
Tests Passed: Run tests locally using make test or make test-with-coverage to ensure no existing functionality is broken.

Adding datasets checklist

Reason for dataset addition: ...

All existing HuggingFace datasets for FSD50K were broken: they either contained corrupt data or lacked a complete test+train split. We downloaded the original data from FreeSound and uploaded it to HF. Tested with wav2vec (results in the next comment).

[✅] I have run the following models on the task (adding the results to the PR). These can be run using the mteb -m {model_name} -t {task_name} command.
- sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
- intfloat/multilingual-e5-small
[✅] I have checked that the performance is neither trivial (both models gain close to perfect scores) nor random (both models gain close to random scores).
[✅] If the dataset is too big (e.g. >2048 examples), consider using self.stratified_subsampling() under dataset_transform()
[✅] I have filled out the metadata object in the dataset file (find documentation on it here).
[✅] Run tests locally to make sure nothing is broken using make test.
[✅] Run the formatter to format the code using make lint.

Adding a model checklist

I have filled out the ModelMeta object to the extent possible
I have ensured that my model can be loaded using
- mteb.get_model(model_name, revision) and
- mteb.get_model_meta(model_name, revision)
I have tested the implementation works on a representative set of tasks.

RahulSChand · 2025-03-08T07:26:24Z

Test results using wav2vec model on multi-label classification task

mteb/tasks/Audio/AudioMultilabelClassification/eng/FSD50HF.py

diffunity · 2025-03-14T03:07:27Z

I have one question regarding this dataset.

I noticed that your labels are in comma-separated string format (huggingface link). However, judging from the mteb multi-label classification tasks, the labels should be in list[int] format. I think this format mismatch may cause problems.

For example,

mteb/mteb/abstasks/Audio/AbsTaskAudioMultilabelClassification.py

Lines 258 to 260 in ef30e3d

    
        label_counter = defaultdict(int)
        for i in idxs:
            if any((label_counter[label] < samples_per_label) for label in y[i]):

My understanding is that label_counter should count the unique label values, but when running this task an instance of label_counter looks like this: defaultdict(<class 'int'>, {'W': 3, 'i': 19, 'n': 29, 'd': 18, '_': 21, 's': 26, 't': 15, 'r': 14, 'u': 19, 'm': 15, 'e': 16, 'a': 15, 'w': 2, 'o': 27, ',': 15, 'M': 4, 'c': 10, 'l': 5, 'K': 2, 'k': 1, 'D': 5, 'h': 5, 'T': 4, 'R': 1, 'B': 1, 'p': 2, 'g': 2, 'P': 1, 'y': 2, 'b': 1, '(': 1, ')': 1, 'H': 1, 'S': 1}). The keys are single characters, not label names, because iterating over a string yields its characters.
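A minimal sketch of why the counter ends up keyed by characters. This is a simplified stand-in, not the mteb sampling code: count_labels is a hypothetical helper, and the label values "Wind" and "Rain" are made up for illustration.

```python
from collections import defaultdict


def count_labels(y):
    """Count every element obtained by iterating each entry of y."""
    label_counter = defaultdict(int)
    for labels in y:
        # Iterating a str yields single characters;
        # iterating a list yields whole label names.
        for label in labels:
            label_counter[label] += 1
    return dict(label_counter)


# Hypothetical labels: comma-separated strings versus split lists.
as_strings = ["Wind,Rain", "Rain"]
as_lists = [s.split(",") for s in as_strings]

chars = count_labels(as_strings)  # keyed by characters, including ","
names = count_labels(as_lists)    # keyed by label names
print(chars)
print(names)
```

With the string form, the comma itself becomes a counted "label", which matches the ',' key in the counter shown above.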

Same happens in

mteb/mteb/abstasks/Audio/AbsTaskAudioMultilabelClassification.py

Lines 213 to 215 in ef30e3d

    
        test_audio = eval_split[self.audio_column_name]
        binarizer = MultiLabelBinarizer()
        y_test = binarizer.fit_transform(eval_split[self.label_column_name])

Unless this was intended, I think the labels should be mapped to a list of their unique IDs.
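One way the mapping could look, as a rough sketch only. encode_labels is a hypothetical helper and not part of mteb; the actual fix may be structured differently (for example inside the task's dataset_transform()).

```python
def encode_labels(label_strings, sep=","):
    """Split separator-joined label strings and map label names to integer IDs."""
    # Split each string into label names, dropping empty fragments.
    split = [
        [name.strip() for name in s.split(sep) if name.strip()]
        for s in label_strings
    ]
    # Build a deterministic vocabulary: sorted unique label names.
    vocab = sorted({name for names in split for name in names})
    label2id = {name: i for i, name in enumerate(vocab)}
    ids = [[label2id[name] for name in names] for names in split]
    return ids, label2id


# Hypothetical label values for illustration.
ids, label2id = encode_labels(["Wind,Rain", "Rain"])
print(label2id)
print(ids)
```

Sorting the vocabulary keeps the IDs stable across runs, assuming the same set of label names appears each time.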

Samoed · 2025-03-14T07:32:05Z

I think the labels column should additionally be split @RahulSChand

RahulSChand · 2025-03-14T21:19:26Z

I think the labels column should additionally be split @RahulSChand

Yes, makes sense. I will open a PR with the fix.

anime-sh · 2025-03-15T05:53:29Z

#2369 should fix this

RahulSChand added 4 commits March 7, 2025 21:26

Added fsd50k dataset on huggingface

40f61c3

added correct hf version of fsd50k dataset

5cfba55

added correct hf version of fsd50k dataset

1073336

removed extra imports

fadeb6d

RahulSChand added the maeb Audio extension label Mar 8, 2025

RahulSChand self-assigned this Mar 8, 2025

Samoed reviewed Mar 8, 2025

mteb/tasks/Audio/AudioMultilabelClassification/eng/FSD50HF.py (outdated)

removed unecessary load_data fn

4ef82bf

Samoed approved these changes Mar 8, 2025

Samoed merged commit 2188585 into embeddings-benchmark:maeb Mar 8, 2025
8 checks passed

anime-sh mentioned this pull request Mar 15, 2025

fix FSD-50K Task Metadata, Label handling and add stratified subsampling #2369

Merged

4 tasks

Samoed merged 5 commits into embeddings-benchmark:maeb from anime-sh:fsd50k_hf_upload