Conversation
orionw
left a comment
Looks great overall!
I think it doesn't take advantage of faiss's built-in scoring functionality, but by not doing that we have more control (so this can be ignored). If we wanted to use faiss, we could do something like:
```python
# Batch-reconstruct candidate embeddings from the main index
candidate_embs = np.vstack([
    self.index.reconstruct(int(idx)) for idx in candidate_indices
])

# Create a temporary index so FAISS handles the scoring
temp_index = self.index_type(candidate_embs.shape[1])
temp_index.add(candidate_embs)

# Search returns scores and local indices in one call
scores, local_indices = temp_index.search(
    query_emb.reshape(1, -1).astype(np.float32),
    min(top_k, len(candidate_indices)),
)
```
But I think it just does dot product. So it looks great as is; just mentioning this in case it's helpful.
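On the dot-product point: inner-product search can emulate cosine similarity if the embeddings are L2-normalized before indexing. A minimal sketch of the equivalence in plain NumPy (generic illustration, not the PR's implementation), so it runs without faiss:

```python
import numpy as np

def cosine_scores(query, docs):
    # Cosine similarity equals the inner product of L2-normalized vectors,
    # which is why normalizing before an inner-product index gives cosine ranking.
    q = query / np.linalg.norm(query)
    d = docs / np.linalg.norm(docs, axis=1, keepdims=True)
    return d @ q

rng = np.random.default_rng(0)
query = rng.normal(size=8).astype(np.float32)
docs = rng.normal(size=(5, 8)).astype(np.float32)

# Cosine computed directly, one document at a time
direct = np.array([
    float(np.dot(query, d) / (np.linalg.norm(query) * np.linalg.norm(d)))
    for d in docs
])
assert np.allclose(cosine_scores(query, docs), direct, atol=1e-6)
```

So the two metrics produce the same ranking on normalized vectors, which would explain the scores coming out nearly identical.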
Yes, I think that's better. I've added support for cosine and dot-product similarity, and the scores are nearly the same (same for …)
KennethEnevoldsen
left a comment
Looks good. I would probably restructure it a bit, though.
I would probably separate out the implementations from the protocol.
We also need to add documentation on these backends, as well as some discussion of the trade-offs between them.
Yes, I wanted to add that after your check on the PR.
I've run this script and both evaluation methods took the same time, so I'm a bit unsure what to list as advantages of FAISS, except dumping the index, but we're clearing it after evaluation anyway.
```python
import logging

import mteb
from mteb.cache import ResultCache
from mteb.models import SearchEncoderWrapper
from mteb.models.search_encoder_index import StreamingSearchIndex, FaissSearchIndex

logging.basicConfig(level=logging.INFO)

model = SearchEncoderWrapper(mteb.get_model("minishlab/potion-base-2M"))
tasks = mteb.get_tasks(
    tasks=[
        "ClimateFEVERHardNegatives",
        "SWEbenchVerifiedRR",
    ],
)

cache = ResultCache("stream")
mteb.evaluate(
    model,
    tasks,
    cache=cache,
)

### FAISS
index_backend = FaissSearchIndex(model)
model = SearchEncoderWrapper(
    mteb.get_model("minishlab/potion-base-2M"),
    index_backend=index_backend,
)
cache = ResultCache("FAISS")
mteb.evaluate(
    model,
    tasks,
    cache=cache,
)
```
I think faiss is not ideal for smaller reranking cases (~100–1000 docs to search over). We should see dramatic gains for retrieval though, with a large enough corpus. I asked Claude what it thinks we should do for reranking and it suggested we retrieve the vectors from … If large-scale retrieval is much faster, I think that's the main benefit.
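To make the reranking point concrete: at a few hundred candidates, a single brute-force matrix product plus a partial sort is typically as fast as building and querying an index, so faiss has little to offer there. A sketch of that baseline (the function name `topk_dot` is illustrative, not from the PR):

```python
import numpy as np

def topk_dot(query: np.ndarray, doc_embs: np.ndarray, k: int) -> np.ndarray:
    """Return indices of the k highest dot-product scores, best first."""
    scores = doc_embs @ query
    k = min(k, len(scores))
    # argpartition is O(n); only the selected k entries are then fully sorted
    idx = np.argpartition(-scores, k - 1)[:k]
    return idx[np.argsort(-scores[idx])]

rng = np.random.default_rng(0)
docs = rng.normal(size=(500, 16)).astype(np.float32)  # a "small" rerank pool
query = rng.normal(size=16).astype(np.float32)
top10 = topk_dot(query, docs, 10)
```

For a large corpus, where this full matrix product no longer fits comfortably in memory or time, an ANN index is where the gains should show up.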
I tried running it on …
# Conflicts:
#	mteb/models/search_wrappers.py
mteb/models/search_encoder_index/search_indexes/streaming_search_index.py
KennethEnevoldsen
left a comment
I think we can improve the docs a bit, but code-wise I think we are there.
It will be shown in advanced usage
Yeah, but people will not know what has happened since 2.0.0.
I would probably change "New in v2.0" to:
- What is new
  - v2.3
  - v2.2
  - v2.1
  - v2.0
Fair, but we still need the API docs.
KennethEnevoldsen
left a comment
We are still missing the API docs
```diff
 make test
-build-docs:
+build-docs: build-docs-overview
```
oO does this work?
Yes, everything after `:` lists prerequisites, which are built before the target's own recipe runs:

```makefile
targets: prerequisites
	command
```
I think this is good to merge.

Close #3406
I've implemented the SearchEncoder protocol, and now either faiss or direct search (as before) can be selected.
mteb/mteb/models/search_encoder_index/search_backend_protocol.py
Lines 7 to 29 in c8b2bd3
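For readers without the permalink handy, the shape of such a backend protocol can be sketched with `typing.Protocol` (structural typing, so implementations need no inheritance). The method names below are illustrative guesses, not the actual interface in `search_backend_protocol.py`:

```python
from typing import Protocol, runtime_checkable

import numpy as np

@runtime_checkable
class SearchBackend(Protocol):
    # Hypothetical method names; see search_backend_protocol.py for the real ones.
    def add_documents(self, embeddings: np.ndarray, ids: list[str]) -> None: ...
    def search(self, query_embedding: np.ndarray, top_k: int) -> list[tuple[str, float]]: ...

class BruteForceBackend:
    """Toy implementation: keeps everything in memory, scores by dot product."""

    def __init__(self) -> None:
        self._ids: list[str] = []
        self._embs: list[np.ndarray] = []

    def add_documents(self, embeddings: np.ndarray, ids: list[str]) -> None:
        self._ids.extend(ids)
        self._embs.extend(embeddings)

    def search(self, query_embedding: np.ndarray, top_k: int) -> list[tuple[str, float]]:
        scores = np.stack(self._embs) @ query_embedding
        order = np.argsort(-scores)[:top_k]
        return [(self._ids[i], float(scores[i])) for i in order]

backend: SearchBackend = BruteForceBackend()  # satisfies the protocol structurally
```

Separating the protocol from concrete backends like this is what makes the faiss/direct-search swap a one-argument change on the wrapper.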
I've kept the "batched" approach for retrieval to use less memory during evaluation. The backend can be changed by …
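The batched idea can be sketched as follows: score the corpus chunk by chunk and keep only a running top-k in a min-heap, so peak memory is one batch of embeddings plus k results rather than the whole score matrix (names and structure are illustrative, not the PR's code):

```python
import heapq

import numpy as np

def batched_topk(query, batches, k):
    """batches yields (doc_ids, embeddings); returns top-k (doc_id, score), best first."""
    heap = []  # min-heap of (score, doc_id); the root is the worst score kept so far
    for doc_ids, embs in batches:
        scores = embs @ query
        for doc_id, score in zip(doc_ids, scores):
            item = (float(score), doc_id)
            if len(heap) < k:
                heapq.heappush(heap, item)
            elif item > heap[0]:
                heapq.heapreplace(heap, item)  # evict the current worst
    return [(d, s) for s, d in sorted(heap, reverse=True)]

rng = np.random.default_rng(0)
query = rng.normal(size=8).astype(np.float32)
corpus = rng.normal(size=(100, 8)).astype(np.float32)
ids = [f"doc{i}" for i in range(100)]
# Feed the corpus in batches of 25 documents
batches = [(ids[i:i + 25], corpus[i:i + 25]) for i in range(0, 100, 25)]
top5 = batched_topk(query, batches, 5)
```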
I've tested on SciFact using `potion-2M` and got a 2 s evaluation for default search and 3 s for FAISS.
Script to test