CAGRA to HNSW serialization and search on CPU #16

Merged
rapids-bot[bot] merged 22 commits into rapidsai:branch-25.10 from SearchScale:searchscale/cagra-to-hnsw-serialization-search-feature on Sep 26, 2025

@narangvivek10
Collaborator

Introducing a new Codec that uses CAGRA for building the index on GPU and serializing to Lucene-compatible HNSW index segments. The Lucene-compatible segments are searchable via the Lucene99HnswVectorsReader (which is the default in Lucene 10.x).

Note: This is based on top of #14 and should be rebased once that is merged.

TODO:

  • Benchmarks and more tests
  • Further refactoring to split the CuVSVectorsFormat into GPU and CPU-specific formats.

Fixes #13

 > Co-authored-by: Ishan Chattopadhyaya <ichattopadhyaya@gmail.com>
 > Co-authored-by: Puneet Ahuja <puneet@searchscale.com>
@narangvivek10 narangvivek10 self-assigned this Aug 15, 2025
@narangvivek10 narangvivek10 added the feature request, improvement, and non-breaking labels Aug 15, 2025
@copy-pr-bot

copy-pr-bot bot commented Aug 15, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

@narangvivek10 narangvivek10 added the non-breaking label and removed the non-breaking and feature request labels Aug 15, 2025
@chatman chatman changed the title from "CAGRA to HNSW serialization and search on CPU" to "[WIP] CAGRA to HNSW serialization and search on CPU" Aug 15, 2025
@chatman
Collaborator

chatman commented Aug 15, 2025

Another important TODO is to remove the existing hnswlib-based HNSW support. And, of course, a lot of refactoring is needed to make the codebase more manageable.

@chatman
Collaborator

chatman commented Aug 15, 2025

I ran preliminary benchmarks on SIFT 1M (1 million vectors, 128D) with NoMergePolicy and topK=100.
Total queries: 1000. QueryTime is the total time for running all 1000 queries sequentially in a single thread.
For CAGRA->HNSW, cuvsWriterThreads was 8, and the client-side IndexThreads was varied.

Note: Because of NoMergePolicy, as many segments are created as there are IndexThreads.
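
For readers unfamiliar with the two metrics reported below, here is a minimal Python sketch of how a total sequential query time and a recall percentage could be computed. It assumes RecallAccuracy means recall at topK, in percent, against exact ground-truth neighbors, and that times are in milliseconds; neither is stated in this thread. `search_fn`, `toy_search`, and the toy data are hypothetical stand-ins, not the benchmark suite's code.

```python
import time


def recall_percent(results, ground_truth, k):
    """Average fraction of the true top-k neighbors that were found, in percent."""
    hits = 0
    for found, truth in zip(results, ground_truth):
        hits += len(set(found[:k]) & set(truth[:k]))
    return 100.0 * hits / (k * len(ground_truth))


def run_queries(search_fn, queries, k):
    """Run all queries sequentially on one thread; return (results, total ms)."""
    start = time.perf_counter()
    results = [search_fn(q, k) for q in queries]
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return results, elapsed_ms


# Hypothetical stand-in for an index: each "query" already carries its answer.
def toy_search(query, k):
    return query["approx"][:k]


queries = [
    {"approx": [1, 2, 3, 9]},  # misses one true neighbor (4)
    {"approx": [4, 5, 6, 7]},  # finds all four true neighbors
]
ground_truth = [[1, 2, 3, 4], [4, 5, 6, 7]]

results, total_ms = run_queries(toy_search, queries, k=4)
recall = recall_percent(results, ground_truth, k=4)
print(f"recall={recall:.2f}% total_ms={total_ms:.3f}")
```

With this definition, 7 of the 8 true neighbors are found in the toy data, so the recall is 87.5%.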

CAGRA -> HNSW Results

GraphDegree IntGraphDeg HnswLayers IndexThreads IndexingTime QueryTime RecallAccuracy
16 32 1 1 10535 1898 85.52
16 32 2 1 14228 1965 85.39
32 64 1 1 14064 2313 94.88
32 64 2 1 15751 2234 94.84
48 96 1 1 16916 2469 97.58
48 96 2 1 17993 3039 97.53
16 32 1 4 10146 3136 92.40
16 32 2 4 12631 3030 91.88
32 64 1 4 14080 4250 98.52
32 64 2 4 16038 4232 98.08
48 96 1 4 17160 5064 85.65
48 96 2 4 18683 5040 99.22

Lucene HNSW Results

IndexThreads MaxConn BeamWidth IndexingTime QueryTime RecallAccuracy
8 12 64 31837 3918 92.22
8 12 128 53646 4342 94.61
8 12 256 93810 4449 95.43
8 12 512 162199 4465 95.53
8 16 64 34062 4334 94.54
8 16 128 59424 4907 96.46
8 16 256 108894 5298 97.24
8 16 512 193887 5246 97.66
8 24 64 33506 4991 95.82
8 24 128 63504 5550 97.69
8 24 256 116333 6050 98.49
8 24 512 217195 6313 98.78
8 36 64 34037 5306 96.17
8 36 128 65455 5761 98.03
8 36 256 124144 6697 98.81
8 36 512 225417 7160 99.15
16 12 64 16057 4662 89.73
16 12 128 26876 4997 91.93
16 12 256 44286 5290 92.82
16 12 512 84553 5400 93.13
16 16 64 16637 5234 92.02
16 16 128 28918 5789 94.03
16 16 256 51598 6314 95.22
16 16 512 84021 6317 95.70
16 24 64 17436 6089 93.38
16 24 128 30400 6652 95.69
16 24 256 55510 7228 96.81
16 24 512 91794 7310 97.31
16 36 64 18637 6239 93.77
16 36 128 31330 6910 96.19
16 36 256 57253 7689 97.48
16 36 512 96139 8607 97.93

Steps to run:

  1. git clone -b ishan/cagra-lucene-upgrade https://github.com/searchscale/vectorsearch-benchmarks
  2. Edit the LD_LIBRARY_PATH in both scripts appropriately
  3. ./run_cagra_hnsw_grid.sh
  4. ./run_lucene-grid.sh

Links to suites:

  1. https://github.com/SearchScale/vectorsearch-benchmarks/blob/ishan/cagra-lucene-upgrade/run_cagra_hnsw_grid.sh
  2. https://github.com/SearchScale/vectorsearch-benchmarks/blob/ishan/cagra-lucene-upgrade/run_lucene-grid.sh

@punAhuja
Contributor

Complete Vector Search Benchmark Results (58 Runs)

Algorithm Dataset VectorDim NumDocs FlushFreq CagraDegree CagraIntermediate IndexingThreads CuvsWriterThreads LuceneM LuceneEF IndexingTime QueryTime RecallAccuracy BenchmarkID ErrorDetails
CAGRA_HNSW openai_4M 1536 4600000 1000000 64 128 4 8 N/A N/A 150366 42726 98.084 OPT_openai_4M_CAGRA Success
LUCENE_HNSW openai_4M 1536 4600000 1000000 N/A N/A N/A N/A 16 128 2065637 4459 92.873 OPT_openai_4M_LUCENE Success
CAGRA_HNSW wiki88M 768 87555327 1000000 64 128 4 8 N/A N/A 2086884 13907381 92.904 OPT_wiki88M_CAGRA Success
LUCENE_HNSW wiki88M 768 87555327 1000000 N/A N/A N/A N/A 16 128 19417780 1066578 79.703 OPT_wiki88M_LUCENE Success
CAGRA_HNSW openai_4M 1536 4600000 100000 32 64 4 4 N/A N/A 73113 39611 94.915 ANN_openai_4M_F100000_CAGRA_HNSW_32_T4_C4 Success
CAGRA_HNSW openai_4M 1536 4600000 100000 32 64 4 8 N/A N/A 73135 40334 95.052 ANN_openai_4M_F100000_CAGRA_HNSW_32_T4_C8 Success
CAGRA_HNSW openai_4M 1536 4600000 100000 32 64 4 12 N/A N/A 74055 40437 95.294 ANN_openai_4M_F100000_CAGRA_HNSW_32_T4_C12 Success
CAGRA_HNSW openai_4M 1536 4600000 100000 32 64 8 4 N/A N/A 67345 40112 95.359 ANN_openai_4M_F100000_CAGRA_HNSW_32_T8_C4 Success
CAGRA_HNSW openai_4M 1536 4600000 100000 32 64 8 8 N/A N/A 115800 39891 94.948 ANN_openai_4M_F100000_CAGRA_HNSW_32_T8_C8 Success
CAGRA_HNSW openai_4M 1536 4600000 100000 32 64 8 12 N/A N/A 177898 40448 94.900 ANN_openai_4M_F100000_CAGRA_HNSW_32_T8_C12 Success
CAGRA_HNSW openai_4M 1536 4600000 100000 64 128 4 4 N/A N/A 186752 79655 97.699 ANN_openai_4M_F100000_CAGRA_HNSW_64_T4_C4 Success
CAGRA_HNSW openai_4M 1536 4600000 100000 64 128 4 8 N/A N/A 169849 81434 97.759 ANN_openai_4M_F100000_CAGRA_HNSW_64_T4_C8 Success
CAGRA_HNSW openai_4M 1536 4600000 100000 64 128 4 12 N/A N/A 181421 83828 97.600 ANN_openai_4M_F100000_CAGRA_HNSW_64_T4_C12 Success
CAGRA_HNSW openai_4M 1536 4600000 100000 64 128 8 4 N/A N/A 171208 84179 97.415 ANN_openai_4M_F100000_CAGRA_HNSW_64_T8_C4 Success
CAGRA_HNSW openai_4M 1536 4600000 100000 64 128 8 8 N/A N/A 172237 81017 97.562 ANN_openai_4M_F100000_CAGRA_HNSW_64_T8_C8 Success
CAGRA_HNSW openai_4M 1536 4600000 100000 64 128 8 12 N/A N/A 161637 81258 97.652 ANN_openai_4M_F100000_CAGRA_HNSW_64_T8_C12 Success
CAGRA_HNSW openai_4M 1536 4600000 1000000 32 64 4 4 N/A N/A 161903 21648 95.619 ANN_openai_4M_F1000000_CAGRA_HNSW_32_T4_C4 Success
CAGRA_HNSW openai_4M 1536 4600000 1000000 32 64 4 8 N/A N/A 169990 21967 95.857 ANN_openai_4M_F1000000_CAGRA_HNSW_32_T4_C8 Success
CAGRA_HNSW openai_4M 1536 4600000 1000000 32 64 4 12 N/A N/A 170449 21262 95.844 ANN_openai_4M_F1000000_CAGRA_HNSW_32_T4_C12 Success
CAGRA_HNSW openai_4M 1536 4600000 1000000 32 64 8 4 N/A N/A 172529 21678 95.753 ANN_openai_4M_F1000000_CAGRA_HNSW_32_T8_C4 Success
CAGRA_HNSW openai_4M 1536 4600000 1000000 32 64 8 8 N/A N/A 161095 22523 95.358 ANN_openai_4M_F1000000_CAGRA_HNSW_32_T8_C8 Success
CAGRA_HNSW openai_4M 1536 4600000 1000000 32 64 8 12 N/A N/A 160944 22070 95.687 ANN_openai_4M_F1000000_CAGRA_HNSW_32_T8_C12 Success
CAGRA_HNSW openai_4M 1536 4600000 1000000 64 128 4 4 N/A N/A 161681 43350 97.711 ANN_openai_4M_F1000000_CAGRA_HNSW_64_T4_C4 Success
CAGRA_HNSW openai_4M 1536 4600000 1000000 64 128 4 8 N/A N/A 168252 41989 98.094 ANN_openai_4M_F1000000_CAGRA_HNSW_64_T4_C8 Success
CAGRA_HNSW openai_4M 1536 4600000 1000000 64 128 4 12 N/A N/A 138998 41967 97.994 ANN_openai_4M_F1000000_CAGRA_HNSW_64_T4_C12 Success
CAGRA_HNSW openai_4M 1536 4600000 1000000 64 128 8 4 N/A N/A 132342 50925 78.516 ANN_openai_4M_F1000000_CAGRA_HNSW_64_T8_C4 Success
CAGRA_HNSW openai_4M 1536 4600000 1000000 64 128 8 8 N/A N/A 137720 51993 75.526 ANN_openai_4M_F1000000_CAGRA_HNSW_64_T8_C8 Success
CAGRA_HNSW openai_4M 1536 4600000 1000000 64 128 8 12 N/A N/A 133855 50823 75.384 ANN_openai_4M_F1000000_CAGRA_HNSW_64_T8_C12 Success
CAGRA_HNSW wiki_5M 2048 5000000 100000 32 64 4 4 N/A N/A 99430 10322 98.339 ANN_wiki_5M_F100000_CAGRA_HNSW_32_T4_C4 Success
CAGRA_HNSW wiki_5M 2048 5000000 100000 32 64 4 8 N/A N/A 120173 10368 98.255 ANN_wiki_5M_F100000_CAGRA_HNSW_32_T4_C8 Success
CAGRA_HNSW wiki_5M 2048 5000000 100000 32 64 4 12 N/A N/A 96693 10126 98.286 ANN_wiki_5M_F100000_CAGRA_HNSW_32_T4_C12 Success
CAGRA_HNSW wiki_5M 2048 5000000 100000 32 64 8 4 N/A N/A 107418 10512 98.282 ANN_wiki_5M_F100000_CAGRA_HNSW_32_T8_C4 Success
CAGRA_HNSW wiki_5M 2048 5000000 100000 32 64 8 8 N/A N/A 111274 10403 98.326 ANN_wiki_5M_F100000_CAGRA_HNSW_32_T8_C8 Success
CAGRA_HNSW wiki_5M 2048 5000000 100000 32 64 8 12 N/A N/A 119159 10508 98.178 ANN_wiki_5M_F100000_CAGRA_HNSW_32_T8_C12 Success
CAGRA_HNSW wiki_5M 2048 5000000 100000 64 128 4 4 N/A N/A 122974 19243 98.352 ANN_wiki_5M_F100000_CAGRA_HNSW_64_T4_C4 Success
CAGRA_HNSW wiki_5M 2048 5000000 100000 64 128 4 8 N/A N/A 123551 18541 98.363 ANN_wiki_5M_F100000_CAGRA_HNSW_64_T4_C8 Success
CAGRA_HNSW wiki_5M 2048 5000000 100000 64 128 4 12 N/A N/A 102642 18848 98.366 ANN_wiki_5M_F100000_CAGRA_HNSW_64_T4_C12 Success
CAGRA_HNSW wiki_5M 2048 5000000 100000 64 128 8 4 N/A N/A 129123 19078 96.718 ANN_wiki_5M_F100000_CAGRA_HNSW_64_T8_C4 Success
CAGRA_HNSW wiki_5M 2048 5000000 100000 64 128 8 8 N/A N/A 125643 19154 98.484 ANN_wiki_5M_F100000_CAGRA_HNSW_64_T8_C8 Success
CAGRA_HNSW wiki_5M 2048 5000000 100000 64 128 8 12 N/A N/A 129809 19021 98.643 ANN_wiki_5M_F100000_CAGRA_HNSW_64_T8_C12 Success
CAGRA_HNSW wiki_5M 2048 5000000 1000000 32 64 4 4 N/A N/A 140163 6594 97.095 ANN_wiki_5M_F1000000_CAGRA_HNSW_32_T4_C4 Success
CAGRA_HNSW wiki_5M 2048 5000000 1000000 32 64 4 8 N/A N/A 111620 6739 98.352 ANN_wiki_5M_F1000000_CAGRA_HNSW_32_T4_C8 Success
CAGRA_HNSW wiki_5M 2048 5000000 1000000 32 64 4 12 N/A N/A 131620 6674 98.253 ANN_wiki_5M_F1000000_CAGRA_HNSW_32_T4_C12 Success
CAGRA_HNSW wiki_5M 2048 5000000 1000000 32 64 8 4 N/A N/A 139468 7136 98.366 ANN_wiki_5M_F1000000_CAGRA_HNSW_32_T8_C4 Success
CAGRA_HNSW wiki_5M 2048 5000000 1000000 32 64 8 8 N/A N/A 126528 6912 98.352 ANN_wiki_5M_F1000000_CAGRA_HNSW_32_T8_C8 Success
CAGRA_HNSW wiki_5M 2048 5000000 1000000 32 64 8 12 N/A N/A 131120 7036 98.408 ANN_wiki_5M_F1000000_CAGRA_HNSW_32_T8_C12 Success
CAGRA_HNSW wiki_5M 2048 5000000 1000000 64 128 4 4 N/A N/A 122783 11064 97.919 ANN_wiki_5M_F1000000_CAGRA_HNSW_64_T4_C4 Success
CAGRA_HNSW wiki_5M 2048 5000000 1000000 64 128 4 8 N/A N/A 140311 11113 98.610 ANN_wiki_5M_F1000000_CAGRA_HNSW_64_T4_C8 Success
CAGRA_HNSW wiki_5M 2048 5000000 1000000 64 128 4 12 N/A N/A 160572 11147 98.711 ANN_wiki_5M_F1000000_CAGRA_HNSW_64_T4_C12 Success
CAGRA_HNSW wiki_5M 2048 5000000 1000000 64 128 8 4 N/A N/A 147775 11804 98.339 ANN_wiki_5M_F1000000_CAGRA_HNSW_64_T8_C4 Success
CAGRA_HNSW wiki_5M 2048 5000000 1000000 64 128 8 8 N/A N/A 144290 11494 98.698 ANN_wiki_5M_F1000000_CAGRA_HNSW_64_T8_C8 Success
CAGRA_HNSW wiki_5M 2048 5000000 1000000 64 128 8 12 N/A N/A 162375 11678 97.258 ANN_wiki_5M_F1000000_CAGRA_HNSW_64_T8_C12 Success
LUCENE_HNSW wiki_5M 2048 5000000 100000 N/A N/A N/A N/A 16 128 2981154 5820 92.027 ANN_wiki_5M_F100000_LUCENE_HNSW_M16_EF128 Success
LUCENE_HNSW wiki_5M 2048 5000000 100000 N/A N/A N/A N/A 16 256 5360833 6206 92.962 ANN_wiki_5M_F100000_LUCENE_HNSW_M16_EF256 Success
CAGRA_HNSW OpenAI4M 1536 4600000 500000 N/A N/A N/A N/A N/A N/A 84458 5325 98.02 CAGRA_HNSW_OpenAI4M_F500000 Success
LUCENE_HNSW OpenAI4M 1536 4600000 500000 N/A N/A N/A N/A N/A N/A 3953573 3251 95.19 LUCENE_HNSW_OpenAI4M_F500000 Success
CAGRA_HNSW OpenAI4M 1536 4600000 2000000 N/A N/A N/A N/A N/A N/A 80963 5261 98.09 CAGRA_HNSW_OpenAI4M_F2000000 Success
LUCENE_HNSW OpenAI4M 1536 4600000 2000000 N/A N/A N/A N/A N/A N/A 3958972 3002 95.12 LUCENE_HNSW_OpenAI4M_F2000000 Success

script: https://github.com/SearchScale/vectorsearch-benchmarks/blob/ishan/cagra-lucene-upgrade/3-datasets-benchmarks.sh

Steps to run benchmarks:

  1. Clone the vectorsearch-benchmarks repo:
     git clone -b ishan/cagra-lucene-upgrade https://github.com/searchscale/vectorsearch-benchmarks
  2. Make sure you have cuvs built locally (or in a conda environment)
  3. Edit the LD_LIBRARY_PATH based on your cuvs path
  4. Run the script:
     ./3-datasets-benchmarks.sh

Results will be stored in the results/ directory in a csv file.
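
As a rough illustration of how the CSV output could be compared across algorithms, the sketch below parses two rows copied from the table above and computes the indexing-time ratio. The CSV header names are an assumption taken from the table's column headings; the actual file layout written by the script is not shown in this thread.

```python
import csv
import io

# Two rows copied from the results table; the header names are assumed.
sample_csv = """Algorithm,Dataset,IndexingTime,QueryTime,RecallAccuracy
CAGRA_HNSW,openai_4M,150366,42726,98.084
LUCENE_HNSW,openai_4M,2065637,4459,92.873
"""


def indexing_time_by_algorithm(csv_text, dataset):
    """Map each algorithm to its IndexingTime for the given dataset."""
    times = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["Dataset"] == dataset:
            times[row["Algorithm"]] = int(row["IndexingTime"])
    return times


times = indexing_time_by_algorithm(sample_csv, "openai_4M")
ratio = times["LUCENE_HNSW"] / times["CAGRA_HNSW"]
print(f"Lucene HNSW indexing took {ratio:.1f}x as long as CAGRA->HNSW")
```

To run this against a real results file, replace `io.StringIO(csv_text)` with an opened file handle, after checking that the column names match.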

Ensure you have the datasets downloaded:
wiki88mx768d: https://docs.rapids.ai/api/cuvs/stable/cuvs_bench/wiki_all_dataset/

@cjnolet
Member

cjnolet commented Aug 29, 2025

Thanks for providing the updated benchmarks @punAhuja.

I notice the querytime seems to be significantly higher for the CAGRA->HNSW than for the HNSW versions. Any idea why this is the case? Are we comparing similar parameter ranges here (apples to apples)?

[Image: Screenshot from 2025-08-29 08-55-52]

@punAhuja
Contributor

> 1066578

This is extremely slow; we are investigating this.

@chatman
Collaborator

chatman commented Aug 29, 2025

> I notice the querytime seems to be significantly higher for the CAGRA->HNSW than for the HNSW versions. Any idea why this is the case? Are we comparing similar parameter ranges here (apples to apples)?

This is very likely a bug or something fundamentally wrong. 13 seconds per query is unacceptable. We all discussed this internally, and Puneet is going to investigate the built index. We need to see in detail why this is so slow.
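
The per-query figure can be reproduced from the wiki88M rows of the results table with a short calculation. This sketch assumes QueryTime is in milliseconds and that 1000 queries were run, as in the SIFT 1M benchmark; neither is confirmed for the wiki88M run, but together they give the roughly 13 seconds quoted here.

```python
# QueryTime values copied from the wiki88M rows of the results table.
query_time = {"CAGRA_HNSW": 13907381, "LUCENE_HNSW": 1066578}

NUM_QUERIES = 1000      # assumption: same query count as the SIFT 1M run
MS_PER_SECOND = 1000.0  # assumption: QueryTime is reported in milliseconds

per_query_s = {
    algo: total / NUM_QUERIES / MS_PER_SECOND
    for algo, total in query_time.items()
}
slowdown = query_time["CAGRA_HNSW"] / query_time["LUCENE_HNSW"]

for algo, seconds in per_query_s.items():
    print(f"{algo}: {seconds:.2f} s per query")
print(f"CAGRA_HNSW / LUCENE_HNSW query-time ratio: {slowdown:.1f}x")
```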

Also, Puneet is planning to add query warm-up before the benchmark begins. We will need to see whether adding more layers makes this situation any better.
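
A warm-up phase usually means running some queries untimed first, so that one-time costs (for example JIT compilation or cold file caches) are not charged to the measured run. The following is a minimal sketch of that pattern only; `search_fn`, `toy_search`, and the query values are hypothetical, and this is not the change that was planned for the benchmark suite.

```python
import time


def timed_queries_with_warmup(search_fn, warmup_queries, queries, k):
    """Run untimed warm-up queries, then time the real queries sequentially."""
    for q in warmup_queries:
        search_fn(q, k)  # result discarded; only the side effects matter
    start = time.perf_counter()
    results = [search_fn(q, k) for q in queries]
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return results, elapsed_ms


# Hypothetical stand-in for an index that records how often it is called.
call_log = []


def toy_search(query, k):
    call_log.append(query)
    return list(range(k))


results, elapsed_ms = timed_queries_with_warmup(
    toy_search, warmup_queries=["w1", "w2"], queries=["q1", "q2", "q3"], k=5
)
print(f"calls={len(call_log)} timed_results={len(results)}")
```

Only the three real queries contribute to `elapsed_ms`, although the search function is called five times in total.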

@cjnolet
Member

cjnolet commented Sep 15, 2025

/ok to test 1a01a6c

@chatman
Collaborator

chatman commented Sep 15, 2025

@cjnolet This one still doesn't have the CI fixes I added to #14. So, I was thinking if that one goes in first, this one would be tested properly.

@chatman chatman marked this pull request as ready for review September 22, 2025 17:32
@chatman chatman requested review from a team as code owners September 22, 2025 17:32
@chatman chatman requested a review from a team as a code owner September 24, 2025 23:00
@chatman chatman requested a review from AyodeAwe September 24, 2025 23:00
@cjnolet
Member

cjnolet commented Sep 25, 2025

/ok to test b23d6ec

@narangvivek10
Collaborator Author

/ok to test fed1efb

@narangvivek10
Collaborator Author

/ok to test dc7cc60

Collaborator

@chatman chatman left a comment

There are some performance improvements that I want to get at, but we can do so in a subsequent PR.

@narangvivek10
Collaborator Author

/ok to test a78dd29

…and add version update marker for cuvs-java dependency
@narangvivek10 narangvivek10 requested a review from a team as a code owner September 25, 2025 20:53
@narangvivek10
Collaborator Author

/ok to test 8ff4c40

@narangvivek10 narangvivek10 changed the title from "[WIP] CAGRA to HNSW serialization and search on CPU" to "CAGRA to HNSW serialization and search on CPU" Sep 26, 2025
@cjnolet
Member

cjnolet commented Sep 26, 2025

/merge

@rapids-bot rapids-bot bot merged commit ee25eda into rapidsai:branch-25.10 Sep 26, 2025
12 checks passed

Labels

improvement (Improves an existing functionality), non-breaking (Introduces a non-breaking change)

Development

Successfully merging this pull request may close these issues.

CAGRA to Lucene HNSW serialization feature

5 participants