Add new dataset: LIMIT #3093

orionw · 2025-08-28T20:49:13Z

This adds a new dataset, LIMIT, that is available today from my internship with GDM:

Paper: https://arxiv.org/abs/2508.21038
Code: https://github.com/google-deepmind/limit
I have run the following models on the task (adding the results to the PR). These can be run using the mteb run -m {model_name} -t {task_name} command.
- sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
- intfloat/multilingual-e5-small
I have checked that the performance is neither trivial (both models score close to perfect) nor random (both models score close to random).
I have considered the size of the dataset and reduced it if it is too big (2048 examples is typically large enough for most tasks).
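
For anyone reproducing the two runs listed in the checklist, the command template above can be expanded per model. This is only a sketch: the task name `LIMITRetrieval` is an assumption guessed from the file name `LIMITRetrieval.py` and may differ from the name the task is registered under, and the helper `build_run_command` is hypothetical, not part of mteb.

```python
import shlex

# Model names are taken from the checklist in this PR description.
MODELS = [
    "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2",
    "intfloat/multilingual-e5-small",
]

# ASSUMPTION: task name guessed from the file name LIMITRetrieval.py;
# check the task metadata for the registered name before running.
TASK_NAME = "LIMITRetrieval"


def build_run_command(model_name: str, task_name: str) -> list[str]:
    """Return the argv for `mteb run -m {model_name} -t {task_name}`."""
    return ["mteb", "run", "-m", model_name, "-t", task_name]


if __name__ == "__main__":
    for model in MODELS:
        argv = build_run_command(model, TASK_NAME)
        # Print a shell-safe command line; to execute it instead, pass argv
        # to subprocess.run(argv, check=True) with mteb installed.
        print(shlex.join(argv))
```

Building the argument list rather than a single string keeps model names with slashes or other special characters safe if the command is later passed to `subprocess.run`.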


isaac-chung

Congrats on the paper!

orionw · 2025-08-30T19:02:54Z

Thanks @isaac-chung! It's still failing the tests which is a bit weird.

@Samoed is this cuz of my PR or is this good to merge? I see you've been doing a lot of work to fix the branch's CI, thank you!

isaac-chung · 2025-08-30T19:10:21Z

I think the missing piece here is to merge #3098 into the v2 branch. @Samoed has a PR open now: #3102

KennethEnevoldsen

Once tests pass this is good to merge - congrats on the paper, very happy to see it!

init commit

c857842

orionw requested review from KennethEnevoldsen and Samoed August 28, 2025 20:49

orionw commented Aug 28, 2025

mteb/abstasks/AbsTaskRetrieval.py (outdated, resolved)

Samoed reviewed Aug 28, 2025

mteb/tasks/Retrieval/eng/LIMITRetrieval.py (resolved)

Samoed reviewed Aug 28, 2025

mteb/tasks/Retrieval/eng/LIMITRetrieval.py (resolved)

orionw added 3 commits August 28, 2025 17:04

revert k=2 add to main

e4f9e27

add paper link

1f63baa

fix bibtex

7634dd5

Samoed reviewed Aug 29, 2025

mteb/tasks/Retrieval/eng/LIMITRetrieval.py (outdated, resolved)

mteb/tasks/Retrieval/eng/LIMITRetrieval.py (outdated, resolved)

Fix eval_langs

7f09a31

isaac-chung approved these changes Aug 29, 2025

orionw enabled auto-merge (squash) August 30, 2025 18:54

Merge branch 'v2.0.0' into add_limit

2c4bdb6

KennethEnevoldsen approved these changes Sep 1, 2025

Samoed and others added 4 commits September 1, 2025 16:34

fix eol

171d719

fix eol

0649ce0

fix citation

f397f07

add sample creation

b3ef2df

orionw merged commit b7d0bec into v2.0.0 Sep 1, 2025
7 of 8 checks passed

orionw deleted the add_limit branch September 1, 2025 16:49
