Add IVF changes to support Faiss byte vector #2002
Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
    faiss::write_index_binary(&idMap, indexPathCpp.c_str());
}

void knn_jni::faiss_wrapper::CreateByteIndexFromTemplate(knn_jni::JNIUtilInterface * jniUtil, JNIEnv * env, jintArray idsJ,
Should we move these methods to IndexService, as we have for the other indices?
Yes, we need to refactor these methods and move them to IndexService. However, including that refactoring in this PR would make it too large, so I want to do it later. Junqiu has already started refactoring the train indices in #1918. I will collaborate with Junqiu and add my changes to that PR.
@naveentatikonda can we add a small benchmark run for this, to show that ByteVector with IVF works and that we see a drop in memory usage?
These are the RSS metrics for the Cohere-wiki-768 dataset (475,858 vectors); the memory usage is as expected.
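For a sense of scale, the expected saving can be estimated from the dataset size alone. The sketch below is a back-of-the-envelope calculation, not the measured RSS figures from the benchmark: it assumes 768 dimensions (inferred from the dataset name), 4 bytes per float component versus 1 byte per byte component, and it ignores IVF overhead such as centroids and IDs. The helper name `raw_vector_bytes` is hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper: raw bytes needed to hold `count` vectors of `dim`
// components at `bytes_per_component` bytes each. Index overhead is ignored.
constexpr std::uint64_t raw_vector_bytes(std::uint64_t count,
                                         std::uint64_t dim,
                                         std::uint64_t bytes_per_component) {
    return count * dim * bytes_per_component;
}

constexpr std::uint64_t kVectors = 475858;  // vector count quoted in the comment
constexpr std::uint64_t kDim = 768;         // assumed from "Cohere-wiki-768"

// Raw storage for float vectors (4 bytes per component).
constexpr std::uint64_t kFloatBytes = raw_vector_bytes(kVectors, kDim, 4);
// Raw storage for byte vectors (1 byte per component).
constexpr std::uint64_t kByteBytes = raw_vector_bytes(kVectors, kDim, 1);
```

Under these assumptions the raw vector data shrinks from about 1.36 GiB to about 349 MiB, a 4x reduction; the actual process RSS also includes other allocations.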
* Add HNSW changes to support Faiss byte vector
  Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
* Address Review Comments
  Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
* Add IVF changes to support Faiss byte vector
  Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
* Address Review Comments
  Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
---------
Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
(cherry picked from commit 2b303d9)

Description
Add IVF changes to support Faiss byte vectors, which behind the scenes use the Faiss SQ8_direct_signed scalar quantizer.
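As a rough illustration of the idea behind a direct signed 8-bit quantizer, the sketch below stores each signed byte component (-128 to 127) as a single unsigned 8-bit code by shifting it by 128, and shifts it back on decode. This is a conceptual sketch under that assumed mapping, not the code from this PR or from Faiss; the function names `encode_signed_byte`, `decode_signed_byte`, and `encode_vector` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: map a signed byte value (-128..127) to an unsigned
// 8-bit code (0..255) by shifting it by 128. No training is needed because
// the value is stored directly.
inline std::uint8_t encode_signed_byte(int value) {
    assert(value >= -128 && value <= 127);
    return static_cast<std::uint8_t>(value + 128);
}

// Inverse mapping: recover the original component as a float, the type
// typically used for distance computation.
inline float decode_signed_byte(std::uint8_t code) {
    return static_cast<float>(static_cast<int>(code) - 128);
}

// Encode a whole vector, using one byte per dimension.
inline std::vector<std::uint8_t> encode_vector(const std::vector<int>& values) {
    std::vector<std::uint8_t> codes;
    codes.reserve(values.size());
    for (int v : values) {
        codes.push_back(encode_signed_byte(v));
    }
    return codes;
}
```

Because each component occupies one byte instead of a four-byte float, a vector encoded this way needs roughly a quarter of the storage, which is the memory reduction the byte vector support is aiming for.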
Related Issues
#1659
Check List
Commits are signed off using --signoff.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.