#15024: Improve prefix sum in Lucene99HnswVectorsReader #15790

Merged: kaivalnp merged 3 commits into apache:main from leng25:improve-hnsw-prefix-sum on Mar 9, 2026
@leng25 commented Mar 2, 2026

Summary

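This PR implements the optimization suggested in #15024: the prefix sum loop in Lucene99HnswVectorsReader, which reads the previous element back from currentNeighborsBuffer on every iteration, is replaced with a single-pass variant that keeps the running sum in a local accumulator and so avoids those redundant array reads.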

Before:

currentNeighborsBuffer[0] = dataIn.readVInt();
for (int i = 1; i < arcCount; i++) {
  currentNeighborsBuffer[i] = currentNeighborsBuffer[i - 1] + dataIn.readVInt();
}

After:

int sum = 0;
for (int i = 0; i < arcCount; i++) {
  sum += dataIn.readVInt();
  currentNeighborsBuffer[i] = sum;
}
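
To make the equivalence of the two loops concrete, here is a minimal stand-alone sketch. It is not the Lucene reader code: the class name `PrefixSumSketch`, both method names, and the `int[] deltas` input (standing in for successive `dataIn.readVInt()` results) are assumptions made for illustration, and the input values are made up.

```java
import java.util.Arrays;

/** Hypothetical stand-alone sketch; not the Lucene reader code. */
final class PrefixSumSketch {

  /** "Before" shape: each step reads the previous sum back from the output array. */
  static int[] decodeReadingBack(int[] deltas) {
    int[] out = new int[deltas.length];
    if (deltas.length > 0) {
      out[0] = deltas[0];
      for (int i = 1; i < deltas.length; i++) {
        out[i] = out[i - 1] + deltas[i];
      }
    }
    return out;
  }

  /** "After" shape: the running sum lives in a local variable. */
  static int[] decodeWithAccumulator(int[] deltas) {
    int[] out = new int[deltas.length];
    int sum = 0;
    for (int i = 0; i < deltas.length; i++) {
      sum += deltas[i];
      out[i] = sum;
    }
    return out;
  }

  public static void main(String[] args) {
    int[] deltas = {3, 2, 7, 1}; // made-up gaps between sorted neighbor ids
    int[] before = decodeReadingBack(deltas);
    int[] after = decodeWithAccumulator(deltas);
    System.out.println(Arrays.toString(after)); // [3, 5, 12, 13]
    System.out.println(Arrays.equals(before, after)); // true
  }
}
```

Both methods write the same values; the only difference is whether the previous sum comes from the array or from a local variable.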

This is a follow-up to #15027 by @yossev who proposed the same fix. Since that PR went stale (merge conflicts, formatting), I'm resubmitting with conflicts resolved, formatting fixed via ./gradlew tidy, and benchmark results included.

I found this while looking for a good first issue to learn the contribution process — happy to adjust anything based on feedback!

Benchmark Results

Benchmarks were run using luceneutil KNN benchmark (knnPerfTest.py).

Machine: Intel Core i5-10210U, 8 logical cores, ~15 GB RAM
Dataset: cohere-v3-wikipedia-en 1024d, 400k docs, 10k queries, 8-bit quantized, dot_product

Baseline:

recall  latency(ms)  netCPU  avgCpuCount    nDoc  topK  fanout  maxConn  beamWidth  quantized  visited  index(s)  index_docs/s  force_merge(s)  num_segments  index_size(MB)
 0.977        9.920   9.893        0.997  400000   100     100       64        250     8 bits     7955    486.32        822.50          437.90             1         2015.68

Candidate (this PR):

recall  latency(ms)  netCPU  avgCpuCount    nDoc  topK  fanout  maxConn  beamWidth  quantized  visited  index(s)  index_docs/s  force_merge(s)  num_segments  index_size(MB)
 0.977        9.861   9.833        0.997  400000   100     100       64        250     8 bits     7955    486.32        822.50          437.90             1         2015.68

Recall is identical. Results are from a single run so small differences may fall within normal measurement variance.
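
For scale, the gap between the two rows can be expressed as a relative change. The sketch below only redoes that arithmetic on the numbers reported above; the class and method names are hypothetical and not part of luceneutil.

```java
/** Hypothetical helper for reading the two result rows; not part of luceneutil. */
final class LatencyDeltaSketch {

  /** Relative change of candidate versus baseline, in percent (negative means lower). */
  static double relativeChangePercent(double baseline, double candidate) {
    return (candidate - baseline) / baseline * 100.0;
  }

  public static void main(String[] args) {
    // Values copied from the baseline and candidate rows above.
    double latency = relativeChangePercent(9.920, 9.861);
    double netCpu = relativeChangePercent(9.893, 9.833);
    System.out.printf("latency: %.2f%%, netCPU: %.2f%%%n", latency, netCpu);
  }
}
```

Both latency and netCPU come out roughly 0.6% lower for the candidate, which is small enough that a single run cannot separate it from noise.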

@kaivalnp left a comment

LGTM

Looks like this improvement is in the range of noise for knnPerfTest.py, but is good-to-have anyways.

dataIn.seek(graphLevelNodeOffsets.get(targetIndex + graphLevelNodeIndexOffsets[level]));
arcCount = dataIn.readVInt();
assert arcCount <= currentNeighborsBuffer.length : "too many neighbors: " + arcCount;
int sum = 0;

@kaivalnp commented on this line:

nit: Would prefer this variable inside the if block below

@leng25 (author) replied:


Done, moved inside the if block.
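
A rough sketch of the shape this nit asks for, with the accumulator declared inside the guarded block. Everything here is illustrative: the guard `arcCount > 0` is an assumption (the thread only mentions "the if block"), and `ScopedSumSketch`, `VIntReader`, and `readNeighbors` are hypothetical stand-ins for the reader and `dataIn`, decoding the usual 7-bits-per-byte variable-length int layout from a byte array.

```java
import java.util.Arrays;

/** Hypothetical stand-alone sketch; not the Lucene reader code. */
final class ScopedSumSketch {

  /** Minimal stand-in for dataIn: 7 data bits per byte, high bit set means more bytes follow. */
  static final class VIntReader {
    private final byte[] bytes;
    private int pos;

    VIntReader(byte[] bytes) {
      this.bytes = bytes;
    }

    int readVInt() {
      int value = 0;
      int shift = 0;
      byte b;
      do {
        b = bytes[pos++];
        value |= (b & 0x7F) << shift;
        shift += 7;
      } while ((b & 0x80) != 0);
      return value;
    }
  }

  /** Reads arcCount, then arcCount deltas, writing prefix sums into buffer; returns arcCount. */
  static int readNeighbors(VIntReader in, int[] buffer) {
    int arcCount = in.readVInt();
    if (arcCount > 0) { // assumed guard condition
      int sum = 0; // scoped to the block, as the review nit requests
      for (int i = 0; i < arcCount; i++) {
        sum += in.readVInt();
        buffer[i] = sum;
      }
    }
    return arcCount;
  }

  public static void main(String[] args) {
    // arcCount = 3, then deltas 5, 130 (encoded as 0x82 0x01), and 1
    byte[] encoded = {3, 5, (byte) 0x82, 0x01, 1};
    int[] buffer = new int[8];
    int n = readNeighbors(new VIntReader(encoded), buffer);
    System.out.println(n + " " + Arrays.toString(Arrays.copyOf(buffer, n))); // 3 [5, 135, 136]
  }
}
```

Keeping `sum` inside the block limits its scope to the only place it is used, which is the point of the nit; behavior is unchanged.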


Optimizations
---------------------
* GITHUB#15024: Improve prefix sum computation in Lucene99HnswVectorsReader for faster neighbor decoding. (Luis Negrin)

@kaivalnp commented on this entry:


This entry is under 11.0.0 -- can you move it to 10.5.0? (I can help with merge + backport)

@leng25 (author) replied:


Moved to 10.5.0. I would appreciate help with the merge and backport.

@github-actions github-actions bot modified the milestones: 11.0.0, 10.5.0 Mar 9, 2026
@kaivalnp left a comment

LGTM

@kaivalnp linked an issue Mar 9, 2026 that may be closed by this pull request
@kaivalnp merged commit fb2b916 into apache:main Mar 9, 2026
13 checks passed
kaivalnp pushed a commit that referenced this pull request Mar 9, 2026