
feat: add search encoder backend#3492

Merged
Samoed merged 18 commits into main from search_barckend
Nov 28, 2025

Conversation


@Samoed Samoed commented Oct 25, 2025

Close #3406

I've implemented a SearchEncoder protocol, and the backend can now be either FAISS or direct search (as before).

class IndexEncoderSearchProtocol(Protocol):
    """Protocol for search backends used in encoder-based retrieval."""

    def add_document(
        self,
        embeddings: Array,
        idxs: list[str],
    ) -> None:
        """Add documents to the search backend.

        Args:
            embeddings: Embeddings of the documents to add.
            idxs: IDs of the documents to add.
        """

    def search(
        self,
        embeddings: Array,
        top_k: int,
        similarity_fn: Callable[[Array, Array], Array],
        top_ranked: TopRankedDocumentsType | None = None,
        query_idx_to_id: dict[int, str] | None = None,
    ) -> tuple[list[list[float]], list[list[int]]]:
        ...
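For illustration, a minimal in-memory backend satisfying this protocol could look like the numpy sketch below. The class name, the ignored keyword arguments, and the `similarity_fn` calling convention are my assumptions based on the protocol above, not the PR's actual implementation:

```python
import numpy as np


class NumpyEncoderSearchBackend:
    """Minimal in-memory backend satisfying the protocol above (sketch)."""

    def __init__(self):
        self.embeddings = None  # (n_docs, dim) array once documents are added
        self.idxs = []          # document IDs, parallel to embedding rows

    def add_document(self, embeddings, idxs):
        # Append new document embeddings and remember their string IDs.
        self.embeddings = (
            embeddings
            if self.embeddings is None
            else np.vstack([self.embeddings, embeddings])
        )
        self.idxs.extend(idxs)

    def search(self, embeddings, top_k, similarity_fn,
               top_ranked=None, query_idx_to_id=None):
        # Score every document against every query, then keep top_k per query.
        scores = similarity_fn(embeddings, self.embeddings)  # (n_queries, n_docs)
        order = np.argsort(-scores, axis=1)[:, :top_k]
        top_scores = [s[o].tolist() for s, o in zip(scores, order)]
        top_indices = [o.tolist() for o in order]
        return top_scores, top_indices


backend = NumpyEncoderSearchBackend()
backend.add_document(np.eye(3, dtype=np.float32), ["d0", "d1", "d2"])
scores, indices = backend.search(
    np.eye(3, dtype=np.float32)[1:2],    # query identical to document "d1"
    top_k=2,
    similarity_fn=lambda q, d: q @ d.T,  # plain dot-product similarity
)
```

Here `top_ranked` and `query_idx_to_id` are accepted but ignored; the real backends presumably use them for the reranking path.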

I've kept the "batched" approach for retrieval, to use less memory during evaluation. The backend can be changed like this:

import mteb
from mteb.models.search_encoder_index import (
    DefaultEncoderSearchBackend,
    FaissEncoderSearchBackend,
)
from mteb.models import SearchEncoderWrapper


model = mteb.get_model("baseline/random-encoder-baseline")

python_backend = SearchEncoderWrapper(
    model, index_backend=DefaultEncoderSearchBackend()
)
faiss_backend = SearchEncoderWrapper(
    model, index_backend=FaissEncoderSearchBackend(model)
)

I've tested on SciFact using potion-base-2M and got a 2 s evaluation with the default search and 3 s with FAISS.

Script to test
import mteb
from mteb.cache import ResultCache
from mteb.models import SearchEncoderWrapper
from mteb.models.search_encoder_index import (
    DefaultEncoderSearchBackend,
    FaissEncoderSearchBackend,
)

model = mteb.get_model("minishlab/potion-base-2M")

python_backend = SearchEncoderWrapper(
    model, index_backend=DefaultEncoderSearchBackend()
)
faiss_backend = SearchEncoderWrapper(
    model, index_backend=FaissEncoderSearchBackend(model)
)

task = mteb.get_task("SciFact")

python_cache = ResultCache("python_backend_cache")
faiss_cache = ResultCache("faiss_backend_cache")

# warmup
mteb.evaluate(
    model,
    task,
    cache=None,
)

mteb.evaluate(
    python_backend,
    task,
    cache=python_cache,
)

mteb.evaluate(
    faiss_backend,
    task,
    cache=faiss_cache,
)

@Samoed Samoed changed the title "add search backend" → "feat: add search encoder backend" Oct 25, 2025
@orionw orionw left a comment

Looks great overall!

I think it doesn't take advantage of faiss's built-in scoring functionality, but by not doing that we keep more control (so that can be ignored). If we wanted to use faiss, we could do something like:

import numpy as np

# Batch reconstruct candidate embeddings from the index
candidate_embs = np.vstack([
    self.index.reconstruct(idx) for idx in candidate_indices
])

# Create a temporary index over just the candidates to let FAISS handle scoring
d = candidate_embs.shape[1]
temp_index = self.index_type(d)
temp_index.add(candidate_embs)

# Search returns scores and indices in one call
scores, local_indices = temp_index.search(
    query_emb.reshape(1, -1).astype(np.float32),
    min(top_k, len(candidate_indices)),
)

But I think it just does dot product. So it looks great as is; just mentioning this in case it's helpful.


Samoed commented Oct 27, 2025

Yes, I think that's better. I've added support for cosine and dot-product similarity, and the scores are nearly identical (equal to within 1e-6).
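The near-identical scores make sense: cosine similarity equals the dot product of L2-normalized vectors, so an inner-product index over normalized embeddings reproduces cosine scores up to float32 rounding. A standalone numpy illustration (not the PR's code):

```python
import numpy as np

rng = np.random.default_rng(0)
queries = rng.standard_normal((4, 8)).astype(np.float32)
docs = rng.standard_normal((16, 8)).astype(np.float32)

# Cosine similarity computed directly on the raw vectors.
q_norms = np.linalg.norm(queries, axis=1, keepdims=True)
d_norms = np.linalg.norm(docs, axis=1, keepdims=True)
cos = (queries @ docs.T) / (q_norms * d_norms.T)

# Dot product over L2-normalized vectors (what an inner-product index scores).
dot = (queries / q_norms) @ (docs / d_norms).T

# The two agree up to float32 rounding.
assert np.allclose(cos, dot, atol=1e-6)
```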

@KennethEnevoldsen KennethEnevoldsen left a comment


Looks good, though I would probably restructure it a bit.

I would probably separate out the implementations from the protocol.

We also need to add documentation on these backends, as well as some discussion of the trade-offs between them.


Samoed commented Oct 27, 2025

> We also need to add documentation

Yes, I wanted to add it after your review of the PR.


Samoed commented Oct 28, 2025

I've run this script and both evaluation methods took about the same time, so I'm a bit unsure what to list as FAISS's advantages, other than dumping the index, and we're clearing that after evaluation anyway.

| task | Stream | FAISS |
|---|---|---|
| SWEbenchVerifiedRR | 536 | 541 |
| ClimateFEVERHardNegatives | 9 | 12 |
import logging

import mteb
from mteb.cache import ResultCache
from mteb.models import SearchEncoderWrapper
from mteb.models.search_encoder_index import StreamingSearchIndex, FaissSearchIndex

logging.basicConfig(level=logging.INFO)

model = SearchEncoderWrapper(mteb.get_model("minishlab/potion-base-2M"))
tasks = mteb.get_tasks(
    tasks=[
        "ClimateFEVERHardNegatives",
        "SWEbenchVerifiedRR",
    ],
)

cache = ResultCache("stream")

mteb.evaluate(
    model,
    tasks,
    cache=cache,
)

### FAISS
index_backend = FaissSearchIndex(model)
model = SearchEncoderWrapper(
    mteb.get_model("minishlab/potion-base-2M"),
    index_backend=index_backend
)
cache = ResultCache("FAISS")

mteb.evaluate(
    model,
    tasks,
    cache=cache,
)


orionw commented Oct 28, 2025

I think faiss is not ideal for smaller reranking cases (~100-1000 docs to search over). We should see dramatic gains for retrieval, though, with a large enough corpus. For ClimateFEVERHardNegatives it could just be initialization differences. Maybe try MS MARCO for retrieval?

I asked Claude what it thinks we should do for reranking, and it suggested retrieving the vectors from faiss but doing the reranking itself in standard numpy. We could do this, but if faiss is roughly the same speed, we might as well keep what we have for that.

If large scale retrieval is much faster I think that's the main benefit
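The numpy-reranking idea can be sketched as follows: gather only the candidate vectors for a query and score them directly, with no temporary index involved. This is a hypothetical helper for illustration, not code from the PR:

```python
import numpy as np


def rerank_with_numpy(query_emb, doc_embs, candidate_indices, top_k):
    """Re-score only a query's candidate docs with plain numpy (sketch)."""
    cand = doc_embs[candidate_indices]   # gather candidate vectors
    scores = cand @ query_emb            # dot-product scores, shape (n_candidates,)
    order = np.argsort(-scores)[: min(top_k, len(candidate_indices))]
    # Map local positions back to the original document indices.
    return scores[order].tolist(), [candidate_indices[i] for i in order]


doc_embs = np.eye(4, dtype=np.float32)
# Rerank candidates 1..3 for a query identical to document 2.
scores, doc_ids = rerank_with_numpy(doc_embs[2], doc_embs, [1, 2, 3], top_k=2)
```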


Samoed commented Oct 29, 2025

I tried running it on MSMARCO, and both backends showed similar times on sub-batches. If we remove the search over each sub-corpus batch, FAISS would probably show a speedup, but I’m not sure how to do that while still supporting the "streaming" backend.
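For context, the streaming constraint means the corpus is scored batch by batch, with only a running top-k kept per query, so no backend ever holds the full index at once. A numpy sketch of that merge (hypothetical, not the PR's implementation):

```python
import numpy as np


def streaming_topk(query_embs, corpus_batches, top_k):
    """Score corpus batches one at a time, merging a running top-k per query."""
    best_scores, best_ids, offset = None, None, 0
    for batch in corpus_batches:
        scores = query_embs @ batch.T  # (n_queries, batch_size)
        ids = np.broadcast_to(
            np.arange(offset, offset + batch.shape[0]), scores.shape
        )
        if best_scores is not None:
            # Merge this batch with the running top-k from earlier batches.
            scores = np.hstack([best_scores, scores])
            ids = np.hstack([best_ids, ids])
        keep = np.argsort(-scores, axis=1)[:, :top_k]
        best_scores = np.take_along_axis(scores, keep, axis=1)
        best_ids = np.take_along_axis(ids, keep, axis=1)
        offset += batch.shape[0]
    return best_scores, best_ids


corpus = np.eye(6, dtype=np.float32)
# One query identical to document 4, corpus streamed in two batches of 3.
best_scores, best_ids = streaming_topk(corpus[4:5], [corpus[:3], corpus[3:]], top_k=2)
```

A per-batch FAISS index inside the loop would be rebuilt for every batch, which is presumably why the batched setting erases FAISS's advantage.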

@KennethEnevoldsen KennethEnevoldsen left a comment


I think we can improve the docs a bit, but codewise I think we are there

Contributor

shouldn't we also add documentation on the search backend?

Maybe we should also add something here so people can discover what has been added:

[Screenshot, 2025-11-25: docs navigation sidebar]

A kind of user-friendly changelog

@Samoed Samoed Nov 25, 2025

It will be shown in the advanced usage section.

Contributor

Yeah, but people won't know what has happened since 2.0.0.

I would probably change "New in v2.0" to:

- What is new
  - v2.3
  - v2.2
  - v2.1
  - v2.0

Member Author

I think this is more about the changelog, #3401

Contributor

Fair, but we still need the API docs.

@KennethEnevoldsen KennethEnevoldsen left a comment

We are still missing the API docs

Reviewed line in the Makefile:

build-docs: build-docs-overview
Contributor

oO does this work?

Member Author

Yes, everything after `:` is a prerequisite, and prerequisites run before the target's own recipe:

targets: prerequisites
	command

@KennethEnevoldsen

I think this is good to merge

@Samoed Samoed merged commit 4ed7ef4 into main Nov 28, 2025
10 checks passed
@Samoed Samoed deleted the search_barckend branch November 28, 2025 15:29


Development

Successfully merging this pull request may close these issues.

Add similarity search backend to Retrieval tasks

3 participants