[QST] seamless execution between CPU and GPU, both for vector index building and search tasks in Rust #500
Comments
Hi @dmccloskey,
Standalone, cuVS itself is a GPU-focused library, and its main focus is enabling faster GPU algorithms for vector search and indexing. For interop, we rely on compatibility with other libraries and standards on the CPU. As you point out, we can build a graph-based index similar to HNSW (using the CAGRA algorithm) really fast on the GPU, and we enable converting it for CPU use, mostly in hnswlib. Since the article you linked was published, we have gained the capability to construct the full HNSW hierarchy, which means we can now produce a proper hnswlib serialized format and no longer require the user to search the HNSW index on CPU through cuVS.
A small note: the reason cuVS contains a search API for hnswlib is that when the hierarchy is not needed (e.g., when the graph doesn't need to be mutable), we had to make a small change to hnswlib itself so that we could bypass the hierarchy during search. Whether to build the hierarchy is now an option, and this is exposed through C and Python but not yet through Rust. cuVS is also integrated into the Faiss library, which provides more seamless interoperability between CPU and GPU for the supported cuVS indexes. Question: do you require the HNSW hierarchy to be mutable? Is using cuVS through Faiss an option?
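The hierarchy-bypass idea mentioned above can be illustrated with a toy greedy search over a flat nearest-neighbor graph, which is roughly what searching only the base layer of an HNSW-style index amounts to. This is a pure-Python sketch for intuition, not the cuVS or hnswlib implementation:

```python
def sqdist(a, b):
    # Squared Euclidean distance between two vectors
    return sum((x - y) ** 2 for x, y in zip(a, b))

def greedy_search(graph, vectors, query, entry=0):
    """Greedy walk over a flat neighbor graph (no hierarchy):
    repeatedly hop to any neighbor closer to the query, and stop
    when no neighbor improves on the current node."""
    current = entry
    current_dist = sqdist(vectors[current], query)
    improved = True
    while improved:
        improved = False
        for nbr in graph[current]:
            d = sqdist(vectors[nbr], query)
            if d < current_dist:
                current, current_dist = nbr, d
                improved = True
                break  # restart the scan from the new node
    return current, current_dist

# Toy data: 1-D points connected in a simple chain graph
vectors = [[0.0], [1.0], [2.0], [3.0], [4.0]]
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
node, d = greedy_search(graph, vectors, [3.2])
# node 3 (value 3.0) is the closest point to the query 3.2
```

A real base-layer search maintains a candidate beam rather than a single current node, but the key point is the same: with a well-built graph, no upper layers are needed to navigate to the query's neighborhood.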
Hi @cjnolet, thanks for your reply.
That is pretty cool.
Not necessarily. The primary use case was for building an index that could then be served to multiple users of an application. The updates to the index would not be continuous while the users are using the application, but instead would be done by the developers prior to serving the application when major updates are made to the data underlying the index (e.g., new embedding model or significant update to the raw documents).
Yes, this could be an option. I believe FAISS has Python bindings, but no Rust bindings. Much of our application is written in Python, but we are exploring migrating to Rust in the future and currently conducting small pilot experiments to better quantify the benefits, hence the interest in Rust bindings. A follow-up question on my side: is saving CAGRA indexes supported for Python or Rust? I noticed that this is possible in C++, but the bindings for that functionality do not appear to be available in Python or Rust (or perhaps I am just missing them).
From my understanding, it is possible to use either CPU or GPU for vector index building and search with the C++ API. On Medium (https://medium.com/rapids-ai/rapids-24-08-better-scalability-performance-and-cpu-gpu-interoperability-f88086386da6), there was mention that this will be extended to C and Python in the future. Will this also be extended to Rust?
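For context on what both backends must agree on: whichever device builds or searches the index, the reference answer is exact k-nearest-neighbors over the dataset. A minimal CPU-only brute-force baseline (pure Python, not the cuVS API) makes the contract concrete; accelerated CPU/GPU index types approximate or speed up exactly this computation:

```python
import heapq

def knn_brute_force(dataset, query, k):
    """Exact k-NN by scanning every vector: the slow baseline
    that GPU index building and search are meant to accelerate."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    # Keep the k closest (distance, index) pairs
    scored = ((sqdist(v, query), i) for i, v in enumerate(dataset))
    return [i for _, i in heapq.nsmallest(k, scored)]

dataset = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [0.5, 0.5]]
neighbors = knn_brute_force(dataset, [0.6, 0.6], k=2)
# → [3, 1]
```

Seamless CPU/GPU execution then means the same build/search calls can run against either device while returning results consistent with this baseline (exactly, or within the index's recall target).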