[8.18] Knn vector rescoring to sort score docs (#122653) #122679

Merged
elasticsearchmachine merged 2 commits into elastic:8.18 from javanna:backport/8.18/122653
Feb 15, 2025

@javanna (Contributor) commented Feb 15, 2025

RescoreKnnVectorQuery rewrites to KnnScoreDocQuery, which takes a sorted array of doc ids and a parallel array holding the scores of those docs. KnnScoreDocQuery runs a binary search over the docs array, and the global doc ids are converted back to segment-level ids (by subtracting the context docbase) when scoring docs.

RescoreKnnVectorQuery did not sort the array of docs, so the binary search returned non-deterministic results, which in turn made us look up the wrong docs, sometimes using out-of-bounds ids. One symptom was a DFSProfilerIT test failure that tripped a Lucene assertion because a doc id fell outside the range of the live docs bitset.
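To illustrate the lookup and the failure mode described above, here is a minimal sketch in plain Java. It is not the actual KnnScoreDocQuery code: `SegmentDocLookup`, `lowerBound`, and `segmentLocalDocs` are hypothetical names, and `maxDoc` stands in for the number of docs in the segment.

```java
/** Hypothetical stand-in for the per-segment lookup; not the real KnnScoreDocQuery. */
class SegmentDocLookup {

    /** Index of the first element >= target. Only meaningful if the array is sorted ascending. */
    static int lowerBound(int[] docs, int target) {
        int lo = 0;
        int hi = docs.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (docs[mid] < target) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    /**
     * Finds the slice of global doc ids that belongs to the segment
     * [docBase, docBase + maxDoc) and converts it to segment-level ids.
     */
    static int[] segmentLocalDocs(int[] globalDocs, int docBase, int maxDoc) {
        int from = lowerBound(globalDocs, docBase);
        int to = lowerBound(globalDocs, docBase + maxDoc);
        int[] local = new int[to - from];
        for (int i = from; i < to; i++) {
            // Subtracting docBase assumes globalDocs[i] really is in this segment.
            local[i - from] = globalDocs[i] - docBase;
        }
        return local;
    }
}
```

With the sorted input `{3, 12, 15, 27}` and a segment covering global ids 10 to 19 (`docBase = 10`, `maxDoc = 10`), the sketch returns the local ids `{2, 5}`. With the unsorted input `{3, 27, 12, 15}`, the same call returns `{17, 2, 5}`, and 17 lies outside a 10-doc segment, which is the kind of out-of-bounds id that can trip an assertion on the live docs bitset.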

The fix is to sort the score docs array before extracting the doc ids and scores and passing them to KnnScoreDocQuery upon rewrite.
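The shape of the fix can be sketched as follows. This is an illustration under assumptions, not the code in this PR: `Hit` is a hypothetical stand-in for Lucene's `ScoreDoc`, and `SortedScoreDocs` is a made-up holder for the two parallel arrays.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical sketch: sort hits by doc id, then split them into parallel arrays. */
class SortedScoreDocs {

    /** Minimal stand-in for a score doc: a global doc id plus its score. */
    static final class Hit {
        final int doc;
        final float score;

        Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    final int[] docs;
    final float[] scores;

    SortedScoreDocs(Hit[] hits) {
        // Sort a copy so the caller's array, typically ordered by score, is left untouched.
        Hit[] sorted = hits.clone();
        Arrays.sort(sorted, Comparator.comparingInt((Hit h) -> h.doc));
        docs = new int[sorted.length];
        scores = new float[sorted.length];
        for (int i = 0; i < sorted.length; i++) {
            // Doc ids and scores are extracted together so each score stays with its doc.
            docs[i] = sorted[i].doc;
            scores[i] = sorted[i].score;
        }
    }
}
```

Sorting the hits as pairs, rather than sorting the doc id array on its own, keeps each score aligned with its doc id.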

Relates to #116663

Closes #119711

@javanna added the backport, auto-merge-without-approval, and :Search Relevance/Vectors labels Feb 15, 2025
@elasticsearchmachine merged commit 84ec26a into elastic:8.18 Feb 15, 2025
15 checks passed
@javanna deleted the backport/8.18/122653 branch February 15, 2025 23:29

Labels

auto-merge-without-approval, backport, :Search Relevance/Vectors, v8.18.1
