Fix Tests/CI, refactor merge to call CAGRA's merge(), implement CAGRA prefiltering #14
Conversation
> Co-authored-by: Vivek Narang <vivek@searchscale.com>
- Code cleanup and simplifications
- Fix EOFException in CAGRA merge with deletions

I think this is ready for review now. It has many tests that should help us catch regressions in future PRs.

/ok to test 433c485

@narangvivek10 @dantegd @cjnolet The tests got skipped in CI via the assumeTrue() check, since the .so file isn't found.
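For context, the skip guard being discussed can be sketched roughly like this (a minimal standalone sketch, not the project's actual test code; the library name `cuvs_java` and the class/method names are assumptions). In the real tests, a check like this would be passed to JUnit's `assumeTrue()`, so the tests are reported as skipped rather than failed when the native library is absent:

```java
public class CuVSGuardDemo {
  // Hypothetical sketch of the guard discussed above: the real tests pass a
  // check like this to JUnit's assumeTrue(), so they are skipped (not failed)
  // in CI when the native cuvs library cannot be loaded.
  static boolean nativeAvailable() {
    try {
      System.loadLibrary("cuvs_java"); // library name is an assumption
      return true;
    } catch (UnsatisfiedLinkError e) {
      return false; // .so not found on java.library.path
    }
  }

  public static void main(String[] args) {
    System.out.println(nativeAvailable() ? "native available" : "skipping GPU tests");
  }
}
```

On a machine without the native library on `java.library.path`, this prints "skipping GPU tests", mirroring how the CI run skips instead of failing.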

I'm hoping the inability to find the jars will reduce somewhat, after rapidsai/cuvs#1296. I've yet to take

/ok to test 9498678

/ok to test 4b09035

/ok to test 8c8e009

/ok to test c4f1abb

/ok to test fa7cdb1

/ok to test f4bc2e6

@narangvivek10 I would like to avoid building libcuvs from source as that will be too slow. My preference would be to:

For 2), I've just pushed the fat jars to searchscale maven temporarily: https://maven.searchscale.com/snapshots/com/nvidia/cuvs/cuvs-java/25.10.0-07e42-SNAPSHOT/ Can you please revert your recent changes so that we can move forward quickly with the python wheels for now? If possible, please add 12.9.1 support to the pip package as well, based on https://pypi.anaconda.org/rapidsai-wheels-nightly/simple/libcuvs-cu12/

Actually, yes. That would be quickest. I'm supportive of @chatman's suggestion to use pip wheels for the short term. It is less than ideal in the long run, but it will allow faster iteration for

/ok to test 06e39dd

/ok to test 76010c0

/ok to test d8bb38d

I have made the changes to use a cleaner approach to using libcuvs from conda instead, and I see the

ci/build_java.sh (outdated)

```sh
rapids-print-env

# Locates the libcuvs.so file path and appends it to LD_LIBRARY_PATH
rapids-logger "Find libcuvs so file and append paths to LD_LIBRARY_PATH"
```

I don't think we need this anymore, do we? The conda environment should already be part of the system path, right?

I think we need this because when the job starts, I saw the LD_LIBRARY_PATH set to the value seen here. After this step the LD_LIBRARY_PATH is updated to this value, where libcuvs.so and libcuvs_c.so exist.
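For readers following along, the step under discussion might look roughly like this (a hedged sketch only; the actual `ci/build_java.sh` logic, the variable names, and the `/opt/conda` fallback are assumptions for illustration):

```shell
# Sketch: locate libcuvs.so under the conda env and append its directory to
# LD_LIBRARY_PATH so libcuvs.so and libcuvs_c.so are found at runtime.
# CONDA_PREFIX and the /opt/conda fallback are assumptions, not the real script.
LIBCUVS_SO="$(find "${CONDA_PREFIX:-/opt/conda}" -name 'libcuvs.so' -print -quit 2>/dev/null || true)"
LIBCUVS_DIR="$(dirname "${LIBCUVS_SO:-/opt/conda/lib/libcuvs.so}")"
export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$LIBCUVS_DIR"
echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH"
```

As the later review comments note, mutating `LD_LIBRARY_PATH` like this is a blunt instrument; it affects every library lookup in the process, not just the cuvs libraries.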
```yaml
# To make changes, edit ../../dependencies.yaml and run `rapids-dependency-file-generator`.
channels:
  - conda-forge
  - rapidsai-nightly
```
We need to be careful because this could tie our release packaging to nightly dependencies. Will let the build team comment on the best way to do this
In other RAPIDS packages, these environment files are used to create the test environment. They do not go into built packages. I think the same is true for you here, but you know better than me.
To the best of my understanding, my reasoning for setting this to rapidsai-nightly was to ensure the latest cuvs changes are picked up, so that any required updates at the cuvs-lucene level (especially those needed due to breaking changes in cuvs) are attended to in time, if not taken care of already.
It's a mix. There's benefits and drawbacks to each method. With nightlies, you will catch new issues. You may spend more time fixing breaking changes in smaller chunks. You might update your software in ways that are incompatible with the previous release. This last issue is not a concern for us, because we do not support mixing different release versions of different packages.
I think your setting here is fine and safe. If you find the breaks come too often and you want to sort them out in larger batches, then remove it.
msarahan left a comment:
Minor suggestions. Main thing is to try to avoid LD_LIBRARY_PATH if you can. Just remember, the library search path matters more for what your library searches for (its dependencies) than it does for finding the .so's you're expecting to work with. You must ensure that libstdc++ from conda is always loaded first, before the system libstdc++ gets a chance to be otherwise found.
```java
private static CuVSResources cuVSResourcesOrNull() {
  try {
    System.loadLibrary(
```
If you can pass arguments to loadLibrary in your Java code and not rely on LD_LIBRARY_PATH, things will probably be more robust. The hard part is that it must be recursive. Loading a conda library should load all conda dependencies, or else you get weird runtime errors. This is especially problematic with libstdc++.
One way to ensure that libstdc++ does not accidentally come from the system is to explicitly load it from the conda environment before loading any other (conda-based) libraries.
I don't know much at all about Java, but this site seemed to have some helpful alternatives to LD_LIBRARY_PATH. The more you can change just your process and not the environment, the fewer strange issues you'll have. LD_LIBRARY_PATH is a foot gun. Use it if you must, but only after exhausting other options.
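To make the suggestion above concrete, here is one possible shape for it (a hedged sketch, not the project's actual code; the class name, paths, and fallback behavior are all assumptions). The idea is to `System.load()` conda's `libstdc++.so.6` by absolute path before loading the target library, so nothing relies on `LD_LIBRARY_PATH`:

```java
import java.nio.file.Files;
import java.nio.file.Path;

public class CondaNativeLoader {
  // Hypothetical sketch: pin conda's libstdc++ first, then load the target
  // library by absolute path, avoiding LD_LIBRARY_PATH entirely.
  static void loadFromConda(String condaPrefix, String libName) {
    Path stdcpp = Path.of(condaPrefix, "lib", "libstdc++.so.6");
    if (Files.exists(stdcpp)) {
      System.load(stdcpp.toString()); // ensure conda's libstdc++ is bound first
    }
    Path lib = Path.of(condaPrefix, "lib", "lib" + libName + ".so");
    if (Files.exists(lib)) {
      System.load(lib.toString()); // absolute path, no search path involved
    } else {
      System.loadLibrary(libName); // fall back to java.library.path
    }
  }

  public static void main(String[] args) {
    String prefix = System.getenv().getOrDefault("CONDA_PREFIX", "/opt/conda");
    // Only print the resolved location here; actually calling loadFromConda
    // would require the conda environment (and the library) to be present.
    System.out.println("would load from: " + Path.of(prefix, "lib"));
  }
}
```

Note that `System.load()` does not resolve the loaded library's own dependencies from the same directory; as the comment above says, the load order has to be arranged so each dependency (libstdc++ in particular) is already mapped before anything that needs it.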
Co-authored-by: Mike Sarahan <msarahan@gmail.com>

/ok to test f27f907

I'm going to go ahead and merge this since it fixes CI. If we need further updates to the way cuvs is installed, we can open follow-up PRs.

/merge
Introducing a new Codec that uses CAGRA for building the index on GPU and serializing to Lucene-compatible HNSW index segments. The Lucene-compatible segments are searchable via the `Lucene99HnswVectorsReader` (which is the default in Lucene 10.x).

Note: This is based on top of #14 and should be rebased once that is merged.

TODO:
- Benchmarks and more tests
- Further refactoring to split the `CuVSVectorsFormat` into GPU and CPU-specific formats.

Fixes #13

Authors:
- Vivek Narang (https://github.com/narangvivek10)
- Puneet Ahuja (https://github.com/punAhuja)
- Ishan Chattopadhyaya (https://github.com/chatman)

Approvers:
- Ishan Chattopadhyaya (https://github.com/chatman)
- Corey J. Nolet (https://github.com/cjnolet)
- Kyle Edwards (https://github.com/KyleFromNVIDIA)

URL: #16

Refactoring, CI fixes (pulling libcuvs from PyPI if not found), prefiltering support.
Added tests: