[REVIEW] [Java] Option to build fat-jars with native dependencies included by mythrocks · Pull Request #1296 · rapidsai/cuvs

mythrocks · 2025-08-27T18:49:45Z

This commit introduces an option to include the native libraries as part of a new Java JAR artifact.

In addition, this commit also adds scripts to build the libraries included in the fat-jars using gcc-toolset, to allow the libraries to be portable across several Linux / libstdc++ versions. (This was earlier attempted in #1264, but will now reside in this commit.)

Note that for the initial cut, the "fat" jars will include only the following libraries:

libcuvs.so
libcuvs_c.so
librmm.so
librapids_logger.so

The resultant JARs will still be dependent on LD_LIBRARY_PATH for other dependencies (cublas, cusparse, cusolver, nccl, etc.).

Two new profiles have been introduced in the pom.xml:

x86_64-cuda12
x86_64-cuda13 (Although this is more of an example than anything.)

The main JAR artifact (cuvs-java-25.x.x.jar) remains unmodified. But when -P x86_64-cuda12 is employed, an additional cuvs-java-25.x.x-x86_64-cuda12.jar is produced, containing libcuvs.so, libcuvs_c.so, and some (minimal) additional dependencies.

The idea is that the JAR artifact build for x86_64 + cuda12 would look something like:

# On an x86 build box, using the cuda-12 conda env, from the project root directory:

# Build `libcuvs.so` first.
LIBCUVS_BUILD_DIR=`pwd`/cpp/build/cuda12 ./build.sh libcuvs

# Now the Java build.
cd java/
CMAKE_PREFIX_PATH=`pwd`/../cpp/build/cuda12 ./build.sh

java/build.sh detects the CPU platform and the CUDA version, and automatically chooses the matching profile (e.g. x86_64-cuda12).
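As a rough illustration of that auto-detection (a minimal sketch, not the actual contents of java/build.sh): the helper name `select_build_profile` and the alias normalization are assumptions made for this example. Only the `<arch>-cuda<major>` naming scheme comes from the profiles listed above, and this PR defines x86_64 profiles only, so the aarch64 branch is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not taken from java/build.sh): compose a Maven profile
# name from a CPU architecture and a CUDA major version.
select_build_profile() {
    local arch="$1" cuda_major="$2"
    # Normalize common architecture aliases; this mapping is an assumption.
    case "$arch" in
        x86_64|amd64)  arch="x86_64" ;;
        aarch64|arm64) arch="aarch64" ;;
        *) echo "Unsupported architecture: $arch" >&2; return 1 ;;
    esac
    # Require a non-empty, purely numeric major version.
    case "$cuda_major" in
        ''|*[!0-9]*) echo "Invalid CUDA major version: '$cuda_major'" >&2; return 1 ;;
    esac
    echo "${arch}-cuda${cuda_major}"
}

# Possible use in a build script (commented out; CUDA_MAJOR_VERSION is assumed
# to have been detected earlier):
# BUILD_PROFILE=$(select_build_profile "$(uname -m)" "$CUDA_MAJOR_VERSION")
# mvn clean verify -P "$BUILD_PROFILE"
```

Rejecting unknown inputs, rather than falling back to a default profile, keeps a misdetected platform from silently producing a mislabeled JAR.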

Note that there are tangential changes to the pom.xml:

Fixes #1293 ([BUG] [Java] Race condition between building src/main/java and src/main/java22): the Java 22 portion of the build now runs after the Java 21 portion, to prevent races between the two builds.
The maven-compiler-plugin version has been dropped to 3.11.0, to prevent build errors regarding a "0-byte module-info.class". This is apparently a known issue in 3.13, to be fixed in 3.14.

Signed-off-by: MithunR <mithunr@nvidia.com>

copy-pr-bot · 2025-08-27T18:49:55Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

mythrocks · 2025-08-27T18:52:18Z

Note that the work in #1264 will depend on these changes.

mythrocks · 2025-08-28T22:29:33Z

java/cuvs-java/src/assembly/native-with-deps.xml

+    <!-- Include native libraries from separate directory -->
+    <fileSets>
+        <fileSet>
+            <directory>${project.build.directory}/native-libs</directory>


This is the assembly file that packages native-libraries into the native-jar artifact.

mythrocks · 2025-08-28T22:31:21Z

java/cuvs-java/src/main/java22/com/nvidia/cuvs/spi/JDKProvider.java

 final class JDKProvider implements CuVSProvider {

+  static {
+    OptionalNativeDependencyLoader.loadLibraries();


This is the call into the optional native-dependency loader.

If the jar includes native dependency libraries, they will be loaded at startup. If not, the load is skipped, and it runs as before (i.e. depending on $LD_LIBRARY_PATH, etc.)

ldematte

I have concentrated my review and comments on the Java part, but I've also tried build-in-docker with success. LGTM

ldematte · 2025-09-08T08:53:39Z

java/build.sh

-mvn verify "${MAVEN_VERIFY_ARGS[@]}" \
+mvn clean verify "${MAVEN_VERIFY_ARGS[@]}" -P "$BUILD_PROFILE" \
  && mvn install:install-file -Dfile=./target/cuvs-java-$VERSION.jar -DgroupId=$GROUP_ID -DartifactId=cuvs-java -Dversion=$VERSION -Dpackaging=jar \
+  && mvn install:install-file -Dfile=./target/cuvs-java-$VERSION-"$BUILD_PROFILE".jar -DgroupId=$GROUP_ID -DartifactId=cuvs-java -Dversion=$VERSION -Dclassifier="$BUILD_PROFILE" -Dpackaging=jar \


Nice, and thank you for keeping the slim jar too!

ldematte · 2025-09-08T08:57:26Z

java/cuvs-java/src/main/java22/com/nvidia/cuvs/spi/OptionalNativeDependencyLoader.java

+/**
+ * A class that loads native dependencies if they are available in the jar.
+ */
+public class OptionalNativeDependencyLoader {


Reporting what we discussed offline and expanding it:
I like the idea; as you see, I've gone down a very similar route in #1316 to check if libcuvs can be loaded, and if not why, before proceeding.
I think it would be great if we can merge the efforts, and have an OptionalNativeDependencyLoader that works for both cases (either a conditional implementation -- we can introspect the jar to see if it's the slim or fat version, or with 2 different implementations).

I think we should merge this as-is; I have verified that with these changes the slim jar keeps working and nothing breaks. However, if you don't mind, I have opened an issue to keep track of the follow up work, so we can have both this loader (which is necessary for the fat-jar to work) and another similar to/derived from #1316 for the slim jar (which is necessary for the slim-jar to work).

mythrocks · 2025-09-08T21:39:46Z

I'm considering merging this tomorrow, unless @ldematte and @jameslamb have objections. I think I've addressed the concerns we had for the moment.

jameslamb · 2025-09-08T21:46:46Z

unless @ldematte and @jameslamb have objections

Sorry, I'm not able to review this soon. Please talk with @robertmaynard about it, and don't wait on me.

Some quick notes on what I see:

Nothing in the docker-build/ directory added here appears to be run in CI... and I think you'll have a hard time doing docker-in-docker when you try to get that running in CI in future PRs.
I still feel that it'd be easier to maintain, and would produce more-portable packages, to instead use the existing Rocky Linux 8 images the rest of RAPIDS uses to build Python wheels, as I started prototyping in #1311 ([DO NOT MERGE] cuvs-java: fat jars with libcuvs.so (proof-of-concept)).

mythrocks · 2025-09-08T22:02:54Z

Thank you, @jameslamb. I'll wait to see if @robertmaynard has any objections to the contents of docker-build/. (The other stuff, I'll leave in @ldematte's able hands.)

Re docker-in-docker: ACK. That will need resolving, but I agree with your assessment: It would be good to bake this into the CUVS CI container. Devs could then use that. I'll pick that thread up in a couple of weeks.

ldematte · 2025-09-09T06:49:40Z

I'm considering merging this tomorrow, unless @ldematte and @jameslamb have objections. I think I've addressed the concerns we had for the moment.

On the strictly Java (non-docker) part: GTG for me. We have plans on how to extend it, but we'll address that separately after this and #1314 are merged.

robertmaynard · 2025-09-09T20:35:38Z

Some follow-up work yall should think about is to update the C++ build flags used once #1317 is merged. That would allow you to build rmm/logger/fmt/spdlog statically and only have to bundle libcuvs.so and libcuvs_c.so

mythrocks · 2025-09-09T20:42:15Z

That would allow you to build rmm/logger/fmt/spdlog statically and only have to bundle libcuvs.so and libcuvs_c.so

Thank you for the heads-up. Agreed. That would be to C++ libs what the fat-jar is for Java.

Come to that, I'm not quite sure why we have a separate libcuvs.so and a libcuvs_c.so. The latter can't be used without the former.

This is worth investigating further.

msarahan · 2025-09-10T16:35:20Z

java/build.sh

+CUDA_VERSION_FROM_NVCC=$(nvcc --version | grep -oP 'release [0-9]+' | awk '{print $2}')
+CUDA_MAJOR_VERSION=${CUDA_VERSION_FROM_NVCC:-12}


Suggested change

-CUDA_VERSION_FROM_NVCC=$(nvcc --version | grep -oP 'release [0-9]+' | awk '{print $2}')
-CUDA_MAJOR_VERSION=${CUDA_VERSION_FROM_NVCC:-12}
+CUDA_MAJOR_VERSION=$(nvcc --version | grep -oP 'release [0-9]+' | awk '{print $2}')

Any reason to have two variables here? My inclination would be to error out if the inner expression fails. The pipefail setting at the top of the script takes care of propagating the status.
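To make the error-out behaviour concrete, here is a minimal sketch under a few assumptions: the helper name `cuda_major_from_nvcc_output` is hypothetical, and it reads the `nvcc --version` text from stdin so it can be exercised without a CUDA toolkit installed. It uses `grep -oE` instead of the `-oP` in the original line, which is equivalent for this pattern.

```shell
#!/usr/bin/env bash
set -o pipefail   # a failing stage (e.g. grep finding nothing) fails the whole pipeline

# Hypothetical helper: read `nvcc --version` text on stdin and print the CUDA
# major version. Returns non-zero if no "release N" token is present.
cuda_major_from_nvcc_output() {
    grep -oE 'release [0-9]+' | awk '{print $2}'
}

# Intended use in a script (commented out because it needs nvcc on PATH):
# CUDA_MAJOR_VERSION=$(nvcc --version | cuda_major_from_nvcc_output) \
#     || { echo "Could not determine the CUDA version from nvcc" >&2; exit 1; }
```

Without `pipefail`, the pipeline's status would be that of `awk`, which succeeds even on empty input, so the variable would silently end up empty.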

msarahan · 2025-09-10T16:38:13Z

java/docker-build/Dockerfile

+ARG CUDA_VERSION=12.9.1
+ARG OS_RELEASE=9
+ARG TARGETPLATFORM=linux/amd64


Suggested change

-ARG CUDA_VERSION=12.9.1
-ARG OS_RELEASE=9
-ARG TARGETPLATFORM=linux/amd64
+ARG CUDA_VERSION
+ARG OS_RELEASE
+ARG TARGETPLATFORM

Redeclaring doesn't need default values

msarahan · 2025-09-10T16:46:47Z

java/cuvs-java/src/main/java22/com/nvidia/cuvs/spi/JDKProvider.java

 final class JDKProvider implements CuVSProvider {

+  static {
+    OptionalNativeDependencyLoader.loadLibraries();


Is there a way to pre-load specific paths? With the conda-pack option, we'd need to force pre-load the bundled libstdc++ library. I'm mentioning this because this looks like it might be the right place to do that pre-loading. If you just rely on LD_LIBRARY_PATH, you would likely end up with the system libstdc++, which will be too old for the conda packages.

msarahan · 2025-09-10T16:48:48Z

java/docker-build/Dockerfile

+ARG CCACHE_VERSION=4.11.2
+
+# Default x86_64 from x86 build, aarch64 cmake for arm build
+ARG CMAKE_ARCH=x86_64


It would be better to consume the Docker platform, so that things are not defined in multiple places where they can get out of sync. Code generated by Cursor.

Suggested change

-ARG CMAKE_ARCH=x86_64
+    # Extract architecture part (remove linux/ prefix and any variant suffix)
+    local arch="${TARGETPLATFORM#linux/}"
+    arch="${arch%%/*}"  # Remove variant suffix like /v8, /v1, etc.
+    # Convert to CMake platform naming
+    case "$arch" in
+        "amd64"|"x86_64")
+            echo "x86_64"
+            ;;
+        "arm64"|"aarch64")
+            echo "aarch64"
+            ;;
+        *)
+            echo "Error: Unsupported architecture '$arch' in platform '$platform'" >&2
+            echo "Supported architectures: amd64, x86_64, arm64, aarch64" >&2
+            return 1
+            ;;
+    esac

msarahan · 2025-09-10T16:53:05Z

java/docker-build/run-in-docker

+fi
+
+if [ "$(uname -m)" == "aarch64" ]; then
+    DOCKER_BUILD_EXTRA_ARGS=(--build-arg TARGETPLATFORM=linux/arm64 --build-arg CMAKE_ARCH=aarch64 "${DOCKER_BUILD_EXTRA_ARGS[@]}")


This could be cleaned up to one line by using the code above to translate TARGETPLATFORM to CMAKE_ARCH. I would not base the TARGETPLATFORM on the host CPU type. Better to pass it in as a parameter/env var so you don't need specific hardware to build a docker image.

I agree with your point. Cross-compilers would break this.

I had to start somewhere.
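A minimal sketch of what that cleanup might look like, assuming the translation lives in a shell function in the run-in-docker script. The suggested snippet in the Dockerfile thread uses `local` and `return`, so it presumably belongs inside a function, and it refers to `$platform` without defining it. The function name `platform_to_cmake_arch` is hypothetical, and the `linux/amd64` fallback simply mirrors the Dockerfile's existing ARG default.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a Docker platform string such as "linux/arm64/v8"
# to the architecture name used for CMAKE_ARCH.
platform_to_cmake_arch() {
    local platform="$1"
    local arch="${platform#linux/}"   # drop the "linux/" prefix
    arch="${arch%%/*}"                # drop any variant suffix such as "/v8"
    case "$arch" in
        amd64|x86_64)  echo "x86_64" ;;
        arm64|aarch64) echo "aarch64" ;;
        *)
            echo "Error: unsupported architecture '$arch' in platform '$platform'" >&2
            return 1
            ;;
    esac
}

# The platform comes from the environment rather than from `uname -m`, so an
# image for another architecture can be requested from any build host.
TARGETPLATFORM="${TARGETPLATFORM:-linux/amd64}"
CMAKE_ARCH="$(platform_to_cmake_arch "$TARGETPLATFORM")"
DOCKER_BUILD_EXTRA_ARGS=(--build-arg "TARGETPLATFORM=$TARGETPLATFORM" --build-arg "CMAKE_ARCH=$CMAKE_ARCH")
```

With this shape, `TARGETPLATFORM=linux/arm64 ./run-in-docker` would be enough to select the aarch64 CMake download, and the architecture mapping is defined in exactly one place.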

mythrocks · 2025-09-10T17:03:48Z

@msarahan Would it be alright if we left further changes to this in a separate follow up PR? I'm afraid I'm away, and will be unavailable for the next 3 weeks. There are PRs queued up behind this one.

mythrocks · 2025-09-10T20:56:47Z

/merge

…to UnsupportedProvider/UnsupportedOperationExceptions (#1316)

This PR further extends #1296 and #1314 to give meaningful error messages in case libcuvs fails to load.

The `jextract`-generated bindings we use in cuvs-java use `SymbolLookup#libraryLookup` to load the `cuvs_c` dynamic library; this uses `RawNativeLibraries#load` (see https://github.com/openjdk/jdk/blob/master/src/java.base/share/native/libjava/RawNativeLibraries.c#L58), which in turn calls `JVM_LoadLibrary`. `JVM_LoadLibrary` does a good job of putting together a useful error message (e.g. calling `dlerror`, trying to locate and inspect the file for platform mismatch, etc.). Unfortunately, `RawNativeLibraries#load` calls it passing false for the `throwException` parameter, which means that the detailed error messages are not surfaced.

This PR follows the pattern introduced in #1296 and preloads libcuvs (and dependencies) using `JVM_LoadLibrary` directly with `throwException` true; preloading it will also cause the OS to look for and load all dependencies. In case of error we can see what's broken in better detail; e.g. if `libcuvs_c.so` is present, but `librmm.so` is missing:

```
java.lang.UnsupportedOperationException: cannot create JDKProvider: libcuvs_c.so: librmm.so: cannot open shared object file: No such file or directory
    at com.nvidia.cuvs@25.10.0/com.nvidia.cuvs.spi.UnsupportedProvider.newCuVSResources(UnsupportedProvider.java:35)
    at com.nvidia.cuvs@25.10.0/com.nvidia.cuvs.CuVSResources.create(CuVSResources.java:90)
    at com.nvidia.cuvs@25.10.0/com.nvidia.cuvs.CuVSResources.create(CuVSResources.java:79)
```

Fixes #1321

Authors:
- Lorenzo Dematté (https://github.com/ldematte)

Approvers:
- Ishan Chattopadhyaya (https://github.com/chatman)
- Chris Hegarty (https://github.com/ChrisHegarty)
- MithunR (https://github.com/mythrocks)

URL: #1316

mythrocks added 11 commits August 18, 2025 14:49

WIP: Fat jar with native libs.

38dd2b7

Signed-off-by: MithunR <mithunr@nvidia.com>

WIP: Added resources plugin to package native libs.

9b08099

Signed-off-by: MithunR <mithunr@nvidia.com>

WIP: Fixed pom.xml to optionally build the native jars.

d293578

Signed-off-by: MithunR <mithunr@nvidia.com>

Support different CUDA versions in different profiles.

9b9454c

Signed-off-by: MithunR <mithunr@nvidia.com>

More descriptive assembly file name.

91134fe

Signed-off-by: MithunR <mithunr@nvidia.com>

Converged assembly files.

963d4c7

Signed-off-by: MithunR <mithunr@nvidia.com>

Can build for different CUDA versions.

0e74c22

Signed-off-by: MithunR <mithunr@nvidia.com>

mvn clean verify

2bfd452

Signed-off-by: MithunR <mithunr@nvidia.com>

Fixed compiler order issue.

b8603b4

Signed-off-by: MithunR <mithunr@nvidia.com>

Added auto-detection of platform.

c35578d

Signed-off-by: MithunR <mithunr@nvidia.com>

Fixed manifests for fat jars.

fbbe6bf

Signed-off-by: MithunR <mithunr@nvidia.com>

mythrocks self-assigned this Aug 27, 2025

mythrocks requested a review from a team as a code owner August 27, 2025 18:49

mythrocks added the feature request New feature or request label Aug 27, 2025

mythrocks requested a review from a team as a code owner August 27, 2025 18:49

mythrocks added the Java label Aug 27, 2025

mythrocks requested a review from msarahan August 27, 2025 18:49

mythrocks added the Build label Aug 27, 2025

github-project-automation bot moved this to Todo in Vector Search, ML, & Data Mining Release Board Aug 27, 2025

github-project-automation bot added this to Vector Search, ML, & Data Mining Release Board Aug 27, 2025

mythrocks marked this pull request as draft August 27, 2025 18:49

mythrocks added the non-breaking Introduces a non-breaking change label Aug 27, 2025

mythrocks added 4 commits August 27, 2025 16:48

Added library loader for explicit library loads, for included native …

2e7c7a4

…libs. Signed-off-by: MithunR <mithunr@nvidia.com>

Moved dependency loader to a separate class.

194a926

Signed-off-by: MithunR <mithunr@nvidia.com>

Removed rapids-logger from native deps.

0407740

Signed-off-by: MithunR <mithunr@nvidia.com>

Merge remote-tracking branch 'origin/branch-25.10' into fat-jar-fixed…

2806b08

…-compile

mythrocks commented Aug 28, 2025

cjnolet moved this from Todo to In Progress in Vector Search, ML, & Data Mining Release Board Sep 5, 2025

mythrocks added 2 commits September 5, 2025 08:59

README.md tweaks.

c237252

Signed-off-by: MithunR <mithunr@nvidia.com>

Merge branch 'branch-25.10' into fat-jar-fixed-compile

938fc09

mythrocks mentioned this pull request Sep 5, 2025

[WIP][Java] Add libcuvs version check #1315

Closed

ldematte approved these changes Sep 8, 2025

ldematte mentioned this pull request Sep 8, 2025

[FEA] Expand/customize dependecy loading introduced in fat-jars to work with the slim-jar too #1321

Closed

ldematte reviewed Sep 8, 2025

ldematte reviewed Sep 8, 2025

mythrocks added 2 commits September 8, 2025 10:18

Switching java22 back to the "compile" phase.

90ac2a2

Signed-off-by: MithunR <mithunr@nvidia.com>

Review comments: Naming convention, plus final-ize data member.

1d3500e

Signed-off-by: MithunR <mithunr@nvidia.com>

robertmaynard reviewed Sep 9, 2025

mythrocks added 2 commits September 9, 2025 15:14

Review: Corrected repo for rocky8.

80e7572

Plus, corrected project name. Signed-off-by: MithunR <mithunr@nvidia.com>

Review: Better defaults for build.sh.

16a6eee

Signed-off-by: MithunR <mithunr@nvidia.com>

mythrocks requested a review from robertmaynard September 9, 2025 22:25

Merge branch 'branch-25.10' into fat-jar-fixed-compile

ea90557

msarahan approved these changes Sep 10, 2025

benfred approved these changes Sep 10, 2025

rapids-bot bot merged commit 60244ca into rapidsai:branch-25.10 Sep 10, 2025
156 of 164 checks passed

github-project-automation bot moved this from In Progress to Done in Vector Search, ML, & Data Mining Release Board Sep 10, 2025

ldematte mentioned this pull request Sep 12, 2025

[Review][Java] Add detailed error message for libcuvs load failure to UnsupportedProvider/UnsupportedOperationExceptions #1316

Merged

ldematte mentioned this pull request Sep 16, 2025

[Java] CUDA 13 support #1276

Open
