Skip to content

Add faiss support in jni#28

Merged
jmazanec15 merged 14 commits intoopensearch-project:faiss-developfrom
jmazanec15:faiss-initial
Jun 1, 2021
Merged

Add faiss support in jni#28
jmazanec15 merged 14 commits intoopensearch-project:faiss-developfrom
jmazanec15:faiss-initial

Conversation

@jmazanec15
Copy link
Copy Markdown
Member

@jmazanec15 jmazanec15 commented May 24, 2021

Description

This change refactors the jni so that it can support both nmslib and faiss in a single library. The interface revolves around a new class, JNIService, which includes several methods to make calls to the jni for creating indices, training indices, and searching indices. The interface is intended to be future proof (i.e. it will not change if we add a new library).

At the moment, we do not have the functionality to test the native code directly (i.e. for memory leaks). This can be done in the future using Google tests. However, mocking JNI specific components is very tricky. We may consider separating jni components from the rest of the native code so that we can use Google tests to test the rest of the native code.

Included changes in this PR:

  1. Add faiss v1.6.5 as submodule
  2. Refactor build logic to support faiss
  3. Refactor interface to jni in java layer into JNIService
  4. Add jni utilities to abstract out shared functionality
  5. Refactor jni consumers so build passes
  6. Fix existing tests so build passes
  7. Add JNIServiceTests to test JNI
  8. Add test data

This PR DOES NOT add faiss support to the OpenSearch interface. This will be done in a future PR.

Issues Resolved

#29

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed as per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: John Mazanec <jmazane@amazon.com>
Signed-off-by: John Mazanec <jmazane@amazon.com>
Signed-off-by: John Mazanec <jmazane@amazon.com>
@jmazanec15 jmazanec15 requested review from VijayanB, dblock and vamshin and removed request for dblock May 24, 2021 22:44
Copy link
Copy Markdown
Member

@dblock dblock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a cursory look paying more attention to the C++ code. I think you want to cleanup the allocation story to make sure nothing is leaking on exceptions, then we can re-review in detail.

Lack of C++ tests is a major problem. Not sure what people use today, Google tests, CPPUNIT or something, but feels like at least a beginning of something is needed, even if it's to test utility functions.

Comment thread jni/src/faiss_wrapper.cpp Outdated

// Train index if needed -- check if there is a case where index needs part trained but is trained
if(!indexWriter->is_trained) {
//TODO: What needs to be freed???
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything allocated should be wrapped in an auto_ptr or similar smart pointer construct to do things like env->DeleteLocalRef. I'd ensure this before this is merged.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. I will ensure all native allocations are wrapped in smart pointers. For calls to free references in jenv, I will work on some kind of wrapper to ensure everything is freed.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, DeleteLocalRef is more of an optimization. Local references will automatically be freed when jni call returns.

Comment thread jni/src/faiss_wrapper.cpp

// Write the index to disk
std::string indexPathCpp(ConvertJavaStringToCppString(env, indexPathJ));
faiss::write_index(&idMap, indexPathCpp.c_str());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need CatchCppExceptionAndThrowJava(env); around the whole function?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the future goal of separating java specific functionality from native only functionality, I left this wrapper to the jni functions here: https://github.com/jmazanec15/k-NN-1/blob/faiss-initial/jni/src/org_opensearch_knn_index_JNIService.cpp

Comment thread jni/src/faiss_wrapper.cpp Outdated
for (int i = 0; i < indexBytesCount; i++) {
vectorIoReader.data.push_back((uint8_t) indexBytesJ[i]);
}
env->ReleaseByteArrayElements(templateIndexJ, indexBytesJ, 0);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaks if any of the above fails? indexBytesJ needs to be inside a smart pointer that calls ReleaseByteArrayElements when it's deleted.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will fix.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this, I am not sure smart pointer is necessary. ReleaseByteArrayElements requires the Java object to be passed as the first argument. So, we would need to construct a wrapper that includes a reference to the Java object. Because the only code between allocation and release is appending to a vector, I think it is okay to keep this as a raw pointer.

Comment thread jni/src/faiss_wrapper.cpp Outdated

// if result is not enough, padded with -1s
int resultSize = kJ;
for(int i = 0; i < kJ; i++) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::find maybe a 1-liner?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will fix

Signed-off-by: John Mazanec <jmazane@amazon.com>
Signed-off-by: John Mazanec <jmazane@amazon.com>
Signed-off-by: John Mazanec <jmazane@amazon.com>
Signed-off-by: John Mazanec <jmazane@amazon.com>
@jmazanec15
Copy link
Copy Markdown
Member Author

I did a cursory look paying more attention to the C++ code. I think you want to cleanup the allocation story to make sure nothing is leaking on exceptions, then we can re-review in detail.

Sounds good. I will work on that today.

Lack of C++ tests is a major problem. Not sure what people use today, Google tests, CPPUNIT or something, but feels like at least a beginning of something is needed, even if it's to test utility functions.

Makes sense. I was debating whether or not to try to include C++ tests on this PR. I was thinking that because this is a development branch, this can be pushed back to a future PR so as not to add too many changes in one PR. What are your thoughts @dblock?

Signed-off-by: John Mazanec <jmazane@amazon.com>
Signed-off-by: John Mazanec <jmazane@amazon.com>
Signed-off-by: John Mazanec <jmazane@amazon.com>
/**
* Service to interact with plugin's jni layer. Class dependencies should be minimal
*
* In order to compile C++ header file, run:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i remember that we have to comment KNNConstants to generate. is that still the case?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to comment out. It will work as is.

// loadIndex from different library
String engineName;
Map<String, Object> parameters;
if (indexPathUrl.endsWith(KNNEngine.NMSLIB.getExtension())
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shall we create a method "KNNEngine getEngineNameFromPath()", so that it will be easy to extend later? what do you think?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that makes sense to add. Will add.

throw new IllegalArgumentException("[KNN] Invalid engine type for path: " + indexPathUrl);
}

final long indexPointer = JNIService.loadIndex(indexPathUrl, parameters, engineName);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may be indexReference? i feel pointer is more of cpp jargon.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I use pointer here because it is generated in the C++ layer and used to access meemory exclusively in C++ layer. So, the value it contains is in fact a valid pointer in C++ and cannot be used to access memory in Java.


private KNNIndex getKnnIndex() {
return knnIndex;
private boolean isClosed() {
Copy link
Copy Markdown
Member

@VijayanB VijayanB May 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to have private getter method?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This call is used exclusively in the cache. So private is fine. I guess I prefer to use getters as opposed to direct access to the members so that variables dont accidentally get changed, but I understand that this does not prevent a section of code in the cache from accessing directly a value and changing it, its more for convention. That being said, what are your thoughts?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand. I prefer method than variable mostly outside the class. I guess it comes in preference category, you can keep it as it is.

indices = knnIndexCache.getIndices(getHNSWPaths(searcher.getIndexReader()), getIndexName());
} finally {
searcher.close();
try (Engine.Searcher searcher = indexShard.acquireSearcher("knn-warmup")) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FAR: shall we create constant?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does FAR stand for?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For Another review

Comment thread jni/src/faiss_wrapper.cpp
Comment on lines +41 to +56
if (idsJ == nullptr) {
throw std::runtime_error("IDs cannot be null");
}

if (vectorsJ == nullptr) {
throw std::runtime_error("Vectors cannot be null");
}

if (indexPathJ == nullptr) {
throw std::runtime_error("Index path cannot be null");
}

if (parametersJ == nullptr) {
throw std::runtime_error("Parameters cannot be null");
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will these exceptions crash the jvm as these are cpp Exceptions. My understanding Java does not understand these exceptions and crash? If this crashes JVM, then we may need to type case to Java Exception. Same comment for all other places where we throw cpp Exceptions.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We wrap all of the calls to wrappers in https://github.com/jmazanec15/k-NN-1/blob/faiss-initial/jni/src/org_opensearch_knn_index_JNIService.cpp#L20. So, this will ensure that exception is handled correctly. Additionally, some of the tests test exceptions.

}
);
} catch (Exception ex) {
throw new RuntimeException("Unable to query the index: " + ex);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: Add indexName to the Exception message to help in debugging.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will add here

KNNConstants.HNSW_ALGO_EF_SEARCH, KNNSettings.getEfSearchParam(indexName)
);
} else {
throw new IllegalArgumentException("[KNN] Invalid engine type for path: " + indexPathUrl);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indexPathUrl should be hidden to the users. May be just have indexName in the message. We can
indexPathUrl to a debug log for debugging.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, makes sense.

readLock.lock();
try {
if (knnIndexCacheEntry.isClosed()) {
throw new IOException("Index is already closed");
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: Add indexName, engineName to the message. If possible add spaceType as well. You may want to check other Exception places as well and see if we can add information to the error messages to help debug.

Signed-off-by: John Mazanec <jmazane@amazon.com>
Signed-off-by: John Mazanec <jmazane@amazon.com>
Copy link
Copy Markdown
Member

@vamshin vamshin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks

Signed-off-by: John Mazanec <jmazane@amazon.com>
Signed-off-by: John Mazanec <jmazane@amazon.com>
@jmazanec15
Copy link
Copy Markdown
Member Author

I decided to refactor jni utils into an interface so that neither wrappers make calls directly to the JNIEnv. This will make it easier to add unit tests because we will just be able to mock the interface. This comes at a minimal performance cost of 1 extra function call, but, when compared to the other operations, this cost is negligible.

@jmazanec15 jmazanec15 merged commit adad71b into opensearch-project:faiss-develop Jun 1, 2021
@dblock dblock mentioned this pull request Jun 2, 2021
jmazanec15 added a commit to jmazanec15/k-NN-1 that referenced this pull request Oct 22, 2021
Signed-off-by: Jack Mazanec <jmazane1@nd.edu>
jmazanec15 added a commit that referenced this pull request Oct 22, 2021
Signed-off-by: Jack Mazanec <jmazane1@nd.edu>
martin-gaievski pushed a commit to martin-gaievski/k-NN that referenced this pull request Mar 7, 2022
Signed-off-by: Jack Mazanec <jmazane1@nd.edu>
martin-gaievski pushed a commit to martin-gaievski/k-NN that referenced this pull request Mar 7, 2022
Signed-off-by: Jack Mazanec <jmazane1@nd.edu>
Signed-off-by: Martin Gaievski <gaievski@amazon.com>
martin-gaievski pushed a commit to martin-gaievski/k-NN that referenced this pull request Mar 30, 2022
Signed-off-by: Jack Mazanec <jmazane1@nd.edu>
jingqimao77-spec pushed a commit to jingqimao77-spec/k-NN that referenced this pull request Mar 15, 2026
Signed-off-by: Jack Mazanec <jmazane1@nd.edu>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants