#15024: Improve prefix sum in Lucene99HnswVectorsReader#15790
Merged
kaivalnp merged 3 commits intoapache:mainfrom Mar 9, 2026
Merged
#15024: Improve prefix sum in Lucene99HnswVectorsReader#15790kaivalnp merged 3 commits intoapache:mainfrom
kaivalnp merged 3 commits intoapache:mainfrom
Conversation
kaivalnp
reviewed
Mar 5, 2026
Contributor
kaivalnp
left a comment
There was a problem hiding this comment.
LGTM
Looks like this improvement is in the range of noise for knnPerfTest.py, but is good-to-have anyways.
| dataIn.seek(graphLevelNodeOffsets.get(targetIndex + graphLevelNodeIndexOffsets[level])); | ||
| arcCount = dataIn.readVInt(); | ||
| assert arcCount <= currentNeighborsBuffer.length : "too many neighbors: " + arcCount; | ||
| int sum = 0; |
Contributor
There was a problem hiding this comment.
nit: Would prefer this variable inside the if block below
Contributor
Author
There was a problem hiding this comment.
Done, move inside the if block
lucene/CHANGES.txt
Outdated
|
|
||
| Optimizations | ||
| --------------------- | ||
| * GITHUB#15024: Improve prefix sum computation in Lucene99HnswVectorsReader for faster neighbor decoding. (Luis Negrin) |
Contributor
There was a problem hiding this comment.
This entry is under 11.0.0 -- can you move it to 10.5.0? (I can help with merge + backport)
Contributor
Author
There was a problem hiding this comment.
Move into 10.5.0, would appreciate help with the merge + backport
kaivalnp
pushed a commit
that referenced
this pull request
Mar 9, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
This PR implements the optimization suggested in #15024, replacing the two-step prefix sum loop in
Lucene99HnswVectorsReaderwith a single-pass accumulator variant that avoids redundant memory reads.Before:
After:
This is a follow-up to #15027 by @yossev who proposed the same fix. Since that PR went stale (merge conflicts, formatting), I'm resubmitting with conflicts resolved, formatting fixed via
./gradlew tidy, and benchmark results included.I found this while looking for a good first issue to learn the contribution process — happy to adjust anything based on feedback!
Benchmark Results
Benchmarks were run using luceneutil KNN benchmark (
knnPerfTest.py).Machine: Intel Core i5-10210U, 8 logical cores, ~15 GB RAM
Dataset: cohere-v3-wikipedia-en 1024d, 400k docs, 10k queries, 8-bit quantized, dot_product
Baseline:
Candidate (this PR):
Recall is identical. Results are from a single run so small differences may fall within normal measurement variance.