Add IVF changes to support Faiss byte vector#2002

Merged
naveentatikonda merged 4 commits into opensearch-project:main from
naveentatikonda:add-support-faiss-byte-vector-ivf
Aug 30, 2024

@naveentatikonda
Member

Description

Add IVF changes to support Faiss byte vectors, which use the Faiss SQ8_direct_signed scalar quantizer behind the scenes.
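
As a rough illustration of what a "direct signed" 8-bit quantizer does, the sketch below stores each signed byte value as an unsigned one-byte code by shifting it by 128, and recovers it by shifting back. The names encodeSignedByte, decodeSignedByte, and encodeVector are hypothetical and this is not the Faiss implementation; it only assumes the input values already lie in [-128, 127].

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: map a value in [-128, 127] to a one-byte code in [0, 255].
inline std::uint8_t encodeSignedByte(std::int8_t value) {
    return static_cast<std::uint8_t>(static_cast<int>(value) + 128);
}

// Hypothetical helper: recover the original value as a float from its code.
inline float decodeSignedByte(std::uint8_t code) {
    return static_cast<float>(static_cast<int>(code) - 128);
}

// Encode a whole vector; each dimension costs one byte instead of four.
inline std::vector<std::uint8_t> encodeVector(const std::vector<std::int8_t>& values) {
    std::vector<std::uint8_t> codes;
    codes.reserve(values.size());
    for (std::int8_t v : values) {
        codes.push_back(encodeSignedByte(v));
    }
    return codes;
}
```

Because no training or rescaling is involved in this scheme, the round trip is lossless for inputs that are already bytes.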

Related Issues

#1659

Check List

  • New functionality includes testing.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@naveentatikonda naveentatikonda added the Features (Introduces a new unit of functionality that satisfies a requirement), backport 2.x, and v2.17.0 labels Aug 22, 2024
@naveentatikonda naveentatikonda self-assigned this Aug 22, 2024
@naveentatikonda naveentatikonda force-pushed the add-support-faiss-byte-vector-ivf branch from 17a4e8b to 60f21fb Compare August 22, 2024 04:54
Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
@naveentatikonda naveentatikonda force-pushed the add-support-faiss-byte-vector-ivf branch 2 times, most recently from e4fd8c6 to 3b0c8da Compare August 26, 2024 16:53
Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
@naveentatikonda naveentatikonda force-pushed the add-support-faiss-byte-vector-ivf branch from 3b0c8da to 121912f Compare August 26, 2024 17:04
faiss::write_index_binary(&idMap, indexPathCpp.c_str());
}

void knn_jni::faiss_wrapper::CreateByteIndexFromTemplate(knn_jni::JNIUtilInterface * jniUtil, JNIEnv * env, jintArray idsJ,
Collaborator

Should we move these methods to IndexService, just like we have for other indices?

Member Author

Yes, we need to refactor these methods and move them to IndexService. However, including that refactoring in this PR would make it too big, so I want to do it later. Junqiu has already started refactoring the train indices in #1918. I will collaborate with Junqiu and add my changes to that PR.

@navneet1v
Collaborator

@naveentatikonda can we add a small benchmark run for this to show that ByteVector with IVF is working and that we are seeing a drop in memory usage?

@naveentatikonda
Member Author

@naveentatikonda can we add a small benchmark run for this to show that ByteVector with IVF is working and that we are seeing a drop in memory usage?

These are the RSS metrics for the Cohere-wiki-768 dataset (475,858 vectors); the memory usage is as expected.
IVF memory requirement: approx. 0.4 GB
Heap: 1 GB
Total memory provided: 1.7 GB
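
The 0.4 GB figure can be sanity-checked with a back-of-the-envelope estimate. The sketch below is an assumption, not the plugin's code: it takes one byte per dimension for byte vectors (four for float), float32 centroids, and a 10% overhead factor. The function name estimateIvfBytes and the nlist value of 128 used in the note after it are hypothetical; the actual nlist of the benchmark run is not stated here.

```cpp
#include <cstdint>

// Rough IVF memory estimate in bytes (illustrative assumption only).
// bytesPerDim: 1 for byte vectors, 4 for float vectors.
// nlist: number of IVF centroids, assumed to be stored as float32.
inline std::uint64_t estimateIvfBytes(std::uint64_t numVectors,
                                      std::uint64_t dimension,
                                      std::uint64_t bytesPerDim,
                                      std::uint64_t nlist) {
    const std::uint64_t vectorBytes = numVectors * dimension * bytesPerDim;
    const std::uint64_t centroidBytes = nlist * dimension * 4;
    // Apply an assumed 10% overhead using integer arithmetic.
    return (vectorBytes + centroidBytes) * 11 / 10;
}
```

With 475,858 vectors of 768 dimensions and the hypothetical nlist of 128, estimateIvfBytes(475858, 768, 1, 128) gives 402,437,376 bytes, or about 0.4 GB, while the same call with 4 bytes per dimension gives roughly 1.6 GB.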

[Image: RSS metrics graph]

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
Member

@jmazanec15 jmazanec15 left a comment

LGTM

Member

@junqiu-lei junqiu-lei left a comment

LGTM

@naveentatikonda naveentatikonda merged commit 2b303d9 into opensearch-project:main Aug 30, 2024
opensearch-trigger-bot bot pushed a commit that referenced this pull request Aug 30, 2024
* Add HNSW changes to support Faiss byte vector

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>

* Address Review Comments

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>

* Add IVF changes to support Faiss byte vector

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>

* Address Review Comments

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>

---------

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
(cherry picked from commit 2b303d9)
naveentatikonda added a commit that referenced this pull request Aug 30, 2024
* Add HNSW changes to support Faiss byte vector

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>

* Address Review Comments

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>

* Add IVF changes to support Faiss byte vector

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>

* Address Review Comments

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>

---------

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
(cherry picked from commit 2b303d9)

Co-authored-by: Naveen Tatikonda <navtat@amazon.com>
akashsha1 pushed a commit to akashsha1/k-NN that referenced this pull request Sep 16, 2024
* Add HNSW changes to support Faiss byte vector

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>

* Address Review Comments

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>

* Add IVF changes to support Faiss byte vector

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>

* Address Review Comments

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>

---------

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
Signed-off-by: Akash Shankaran <akash.shankaran@intel.com>
jingqimao77-spec pushed a commit to jingqimao77-spec/k-NN that referenced this pull request Mar 15, 2026
* Add HNSW changes to support Faiss byte vector

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>

* Address Review Comments

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>

* Add IVF changes to support Faiss byte vector

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>

* Address Review Comments

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>

---------

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
Labels

backport 2.x, Features (Introduces a new unit of functionality that satisfies a requirement), v2.17.0


4 participants