PoC - Vector rescoring in kNN #116350
…after an approximate search
…, which are not using the original source
    if (knnCollectorManager instanceof TimeLimitingKnnCollectorManager timeLimitingKnnCollectorManager) {
        queryTimeout = timeLimitingKnnCollectorManager.getQueryTimeout();
    }
    return exactSearch(context, bitSetIterator, queryTimeout);
I don't think this will work: exactSearch would use the quantized scorers, and it would also run per segment, which would be bad.
I think what we need to do is override rewrite to return another query that can be further rewritten but that scores the previously scored documents against the raw floating-point vectors.
We only want to rescore for the entire shard. Doing each segment would be very expensive.
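To make the suggestion concrete, here is a minimal, Lucene-free sketch of shard-level rescoring. Everything in it is hypothetical and for illustration only: the class RescoreSketch, the ScoredDoc record, the in-memory rawVectors map, the dot-product similarity, and the assumption that the oversample factor multiplies k to decide how many approximate candidates to collect. It is not the code from this PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: rescore approximate kNN candidates once, over the merged
// shard-level candidate list, using raw float vectors instead of quantized scores.
public class RescoreSketch {

    // A document id paired with a similarity score (illustrative type).
    record ScoredDoc(int doc, float score) {}

    // Assumption: the oversample factor multiplies k to size the candidate pool.
    static int candidateCount(int k, float oversample) {
        return (int) Math.ceil(k * oversample);
    }

    // Plain dot product on raw float vectors (stand-in for the real similarity).
    static float dotProduct(float[] a, float[] b) {
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    // Recompute each candidate's score from its raw vector, then keep the top k.
    static List<ScoredDoc> rescore(List<ScoredDoc> candidates,
                                   Map<Integer, float[]> rawVectors,
                                   float[] query,
                                   int k) {
        List<ScoredDoc> rescored = new ArrayList<>();
        for (ScoredDoc candidate : candidates) {
            float exact = dotProduct(query, rawVectors.get(candidate.doc()));
            rescored.add(new ScoredDoc(candidate.doc(), exact));
        }
        // Highest exact score first.
        rescored.sort((x, y) -> Float.compare(y.score(), x.score()));
        return new ArrayList<>(rescored.subList(0, Math.min(k, rescored.size())));
    }
}
```

The point of the sketch is the ordering of the steps: candidates from all segments are gathered first, and the exact scoring pass then runs a single time over that merged list, so the cost of reading raw vectors is paid per shard rather than per segment.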
I see, thanks for explaining Ben. I'll give it another shot 🔫
    public static final ParseField NUM_CANDS_FIELD = new ParseField("num_candidates");
    public static final ParseField QUERY_VECTOR_FIELD = new ParseField("query_vector");
    public static final ParseField VECTOR_SIMILARITY_FIELD = new ParseField("similarity");
    public static final ParseField RESCORE_VECTOR_OVERSAMPLE = new ParseField("rescore_vector_oversample");
I do think an object with a new parameter is best. We will likely have separate options to provide (rescore field, rescore kind, etc.)
Agreed - this was just me not wanting to deal with the parser yet 😁
Closing this in favour of #116663
PoC that adds a new parameter to the kNN query (rescore_vector_oversample) that is used to:

This is done by overriding the approximateSearch() method in ESKnnFloatVectorQuery. It could be pushed down to the Lucene query if needed.

Usage: