Java maven snapshots and releases #2675
CC @Craigacp
I'm looking into the Maven Central requirements now, but there are a bunch of things to work out around the artifact signing and group id before it can be uploaded there. It should be straightforward to build the next release for Java on RHEL7/OL7/CentOS7, Ubuntu (provided you don't use gcc 9), and macOS. It also builds fine on Windows using the latest release of Visual Studio's build tools. It's not going to be possible to build it for each individual execution provider and host it on Maven Central, so I expect the binary will probably be CPU only.
Since this is the first release of the API, it's still in preview and will take some time to mature before publishing to Maven. Are you able to build from source as suggested? In the future when it's available in Maven, it will still be limited to a few build flavors.
After playing around for a bit (disclaimer: I work primarily in Java and am not a regular user of CMake), I was able to compile it, albeit for a single architecture (Linux 64-bit). My Mac build environment is a mess for reasons unrelated to anything here, so I have not built it there yet. I did briefly test the Java functionality. We are going to stress test and do additional checks before we consider using your library.

In regards to the longer-term vision of how this should be distributed, here are my thoughts. It should be released to Maven, preferably Maven Central. This makes it very easy for users to incorporate it into their projects. Snapshots are nice to have, but technically not a requirement. I see that the Azure pipeline seems to have a release and snapshot mechanism for the Python library; ideally, Java would have a similar mechanism. I guess the project maintainers would probably handle this.

There are some nuances in the actual artifacts that are published to Maven. Specifically, there is the question of how different architectures (Linux, Windows, Mac, etc.) are handled. Typically the Maven classifier is used to differentiate artifacts based on OS/arch. Additionally, there is the issue of how much to bundle in the artifact. Looking at "onnxruntime4j-1.1.0-with-binaries.jar", it contains both libonnxruntime.so and libonnxruntime4j_jni.so. I think we should approach this more like bytedeco's JavaCPP presets handle it: http://repo1.maven.org/maven2/org/bytedeco/opencv/4.1.2-1.5.2/ Feel free to download a few different jars and look inside. There is a base artifact with the Java classes; this would be like the plain onnxruntime4j-1.1.0.jar. Then there is a series of artifacts with classifiers for the different OS/arch combinations. Each of those would contain libonnxruntime4j_jni.so, libonnxruntime4j_jni.dylib, or onnxruntime4j_jni.dll respectively.
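To make the classifier idea concrete, here is a sketch of what a consumer's pom.xml could look like under that layout. The coordinates and classifier names are illustrative assumptions, not published artifacts:

```xml
<!-- Hypothetical coordinates mirroring the javacpp-presets-style layout
     described above; group/artifact ids and classifiers are illustrative. -->
<dependencies>
  <!-- base artifact: Java classes only -->
  <dependency>
    <groupId>com.example.onnxruntime</groupId>
    <artifactId>onnxruntime4j</artifactId>
    <version>1.1.0</version>
  </dependency>
  <!-- platform-specific JNI artifact, selected via classifier -->
  <dependency>
    <groupId>com.example.onnxruntime</groupId>
    <artifactId>onnxruntime4j</artifactId>
    <version>1.1.0</version>
    <classifier>linux-x86_64</classifier>
  </dependency>
</dependencies>
```

A build could also wire the classifier to the host platform automatically (e.g. via a property set by a profile), so users don't have to hardcode their OS/arch.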
Alternatively, you could forego the classifiers and put them all in one jar, but this is probably against best practice (polluting the Java classpath with unused data) and may be difficult for your Azure pipeline to do, since it seems to be split up by arch. There is another question about whether whatever is published should contain a compiled onnxruntime library. For most users this would probably be desired as an easy quick start; however, power users may want to provide their own onnxruntime library. I believe it may be possible to accomplish this by breaking out another artifact: one with just the onnxruntime library that depends on a core artifact containing the Java classes and points towards classified artifacts containing the onnxruntime4j_jni library. Ideally, these are the following artifacts:
Use cases:
To accommodate that last case, it may be necessary to refactor OnnxRuntime.java to try loading onnxruntime from the resources first, if it is present; onnxruntime4j_jni would always be loaded. Let me know if you have any other questions or ideas.
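The resource-first loading idea could be sketched roughly like this. Everything here is an assumption for illustration (the resource layout `/native/<os>-<arch>/` and the class name are made up), not the actual OnnxRuntime.java implementation:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Sketch of a resource-first native loader: try the copy bundled in the jar,
// fall back to java.library.path if nothing is bundled.
final class NativeLoader {

    /** Map os.name/os.arch system properties to a classpath resource path. */
    static String resourcePath(String libName, String osName, String osArch) {
        String lower = osName.toLowerCase();
        String os = lower.startsWith("mac") ? "macos"
                  : lower.startsWith("win") ? "windows"
                  : "linux";
        String ext = os.equals("macos") ? ".dylib"
                   : os.equals("windows") ? ".dll"
                   : ".so";
        String prefix = os.equals("windows") ? "" : "lib";
        return "/native/" + os + "-" + osArch + "/" + prefix + libName + ext;
    }

    /** Try the bundled copy first; fall back to java.library.path. */
    static void load(String libName) throws IOException {
        String path = resourcePath(libName,
                System.getProperty("os.name"), System.getProperty("os.arch"));
        try (InputStream in = NativeLoader.class.getResourceAsStream(path)) {
            if (in == null) {
                System.loadLibrary(libName); // not bundled: use java.library.path
                return;
            }
            // Extract the bundled library to a temp file and load it.
            Path tmp = Files.createTempFile(libName, null);
            tmp.toFile().deleteOnExit();
            Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
            System.load(tmp.toAbsolutePath().toString());
        }
    }
}
```

With this shape, a classified artifact only has to place its library at the right resource path, and the core jar needs no knowledge of which platforms exist.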
We could take inspiration from the TensorFlow Maven configuration: https://github.com/tensorflow/tensorflow/tree/master/tensorflow/java/maven They have an artifact with the classes (libtensorflow) and another two artifacts for the JNI code (libtensorflow_jni and libtensorflow_jni_gpu). It appears both contain both the main library (like
I may attempt to package the artifacts in the manner posted above and post a gist, or maybe open a PR. Is there a preference amongst the maintainers for Maven versus Gradle? I guess the question is which is easier to work with in the Azure pipeline environments. I am more accustomed to Gradle. Gradle has a nice built-in wrapper; Maven has one as well, but it is not built in.
The TF Java config is changing towards a JavaCPP-style one, as the project is moving to use JavaCPP for its native bindings. I think it would be preferable to have a pom file which gets integrated into the jar by calling
It's very easy to get this working with GitHub Actions and JavaCPP, indeed. Check tensorflow/java#25 for an example of what we can do with about 100 lines of YAML these days. If the only thing missing from JavaCPP is a plugin for Gradle, and there is demand for it, that is something I can work on. BTW, there isn't any issue with calling the CMake build from Maven, as with these snapshots here: https://github.com/bytedeco/javacpp-presets/tree/master/onnxruntime /cc @frankfliu
@saudet that's the wrong way around. It needs to call Maven from CMake with the appropriate flags. @yuzawa-san has already refactored the build to use Gradle for the Java parts; it's awaiting merging.
@Craigacp That's not an answer to my question. I asked you that question before, and I also asked before why you didn't like JavaCPP, and you didn't answer. Maybe that's one of the reasons, so please elaborate more if you could; it would be helpful.
The aim is to drive the whole build through the build.py script and CMake combination that the rest of the library uses. The JNI binding needs to be compiled with the same flags as the ONNX Runtime native library, so it's easier to leave that in CMake. I originally had CMake compile the Java parts and generate the native headers too, but CMake isn't very good at building Java projects, so it's tricky to make the build more flexible. The new PR moves the Java build over to Gradle while still using CMake to drive the whole build and to compile the JNI parts. Calling CMake from Maven or Gradle doesn't help, as Maven/Gradle isn't the start of the build process. The aim is to build the ONNX Runtime native library along with whichever of the C#, Python, and Java bindings are requested at the same time.
I see where you're coming from, but no, it doesn't. You're wrong. In fact, JNI bindings sometimes require a different set of flags than the rest of the code. CMake, Bazel, Autoconf, etc. don't come with any special support for JNI for that. We need to tack these things on manually anyway, so why not do everything in one single place, such as with a tool like JavaCPP, with its own little "build system" for JNI, which we can also use as a plugin for Maven, Gradle, etc. to build alongside Java source files? That makes much more sense to me than adding proper support for JNI and Java to all native build tools! I also discussed the value of this with @karllessard previously; see karllessard/tensorflow-java-sandbox#2 (comment). Anyway, thinking about it some more, the C API is fine for what we need for our own applications. Given your employer, you should know about the importance of all this for enterprises. In the meantime, the C API is fine for us, but for the sake of the community, I do hope one day you understand the issues at play here.
The JNI bindings are compiled with the same flags as the native library, as that allows them to use conditional compilation to turn the various execution providers on or off. When a provider is turned off, requesting it throws a catchable exception saying that provider isn't available and you need to recompile the native library. There is currently no way to probe the available providers from the C or C++ APIs. Without this functionality the Java binding could not present a sensible error message when the user requested a provider which was not available, and as Java doesn't have conditional compilation, the Java code has to present entry points for all the providers. If you know of a cross-platform way to probe the functions exported by a native library from Java, then I'd be happy to replace this code. I note that the JavaCPP preset does not have any of this functionality, and so is limited to the CPU provider. It doesn't even appear to use the DNNL provider that it's compiled against, as the function that enables DNNL on a SessionOptions is not present. The native loading procedure is properly documented in the

If you have performance concerns or other issues that aren't simply a cover for complaints about not using your JavaCPP library, then open an issue or file a PR.
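The pattern described (entry points for every provider, with a catchable exception for ones the native library wasn't compiled with) could look roughly like this. The types here are stand-ins for illustration, not the real onnxruntime Java API:

```java
import java.util.EnumSet;
import java.util.Set;

// Mock of the provider-availability pattern: Java exposes all providers,
// and ones missing from the native build throw a catchable exception.
final class ProviderCheck {

    enum Provider { CPU, CUDA, DNNL }

    static class OrtException extends Exception {
        OrtException(String msg) { super(msg); }
    }

    // In the real binding this would reflect what the native library
    // was conditionally compiled with; here it is hardcoded for the demo.
    static final Set<Provider> COMPILED_IN = EnumSet.of(Provider.CPU);

    /** Entry point exists for every provider; unavailable ones throw. */
    static void addProvider(Provider p) throws OrtException {
        if (!COMPILED_IN.contains(p)) {
            throw new OrtException(p + " provider is not available;"
                    + " recompile the native library with it enabled");
        }
        // ... enable the provider on the session options ...
    }

    /** Typical caller: try the fast provider, fall back on failure. */
    static String tryProvider(Provider p) {
        try {
            addProvider(p);
            return "enabled " + p;
        } catch (OrtException e) {
            return "fallback: " + e.getMessage();
        }
    }
}
```

The point of the design is that the caller can catch the exception and fall back gracefully, rather than hitting an `UnsatisfiedLinkError` for a symbol that was never compiled in.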
That's not an issue with JavaCPP, but thanks for pointing this out! The C/C++ APIs of ONNX Runtime do in fact support that just fine. As @EmergentOrder found out quickly enough when he started to use them, the only line missing was an
I'm perfectly fine with not using JavaCPP, but neither you nor your employer are interested in providing an alternative. For Python, we have tools like conda, pip, cython, pybind11, etc. that everyone uses. But for some reason that I still don't understand, people like you at Oracle and OpenJDK feel that, in the case of Java, it's perfectly alright for everyone to reinvent the wheel and build everything from scratch. That isn't going to work! In my opinion, you'll need to do something about it, sooner or later.
Hello, any update on this?
@Craigacp FYI, I've added a CUDA-enabled build to the JavaCPP Presets and as expected it works just fine. If we try to call
Moreover, snapshots are readily available for Linux, Mac, and Windows, both with and without CUDA: |
I've pushed commit bytedeco/javacpp-presets@f1e2aa4, and the Java API created by @Craigacp now gets bundled with the JavaCPP Presets for ONNX Runtime, so we can use it alongside the C and C++ APIs, all from Java.

@Craigacp With that I believe I've shown that JavaCPP has everything we need, but if you still feel there is something missing or something that needs to be fixed, please let me know. I will be happy to work on making JavaCPP better!
I've released artifacts on Maven Central, which work just as well with Gradle, sbt, etc.:
BTW, there are now Gradle plugins for JavaCPP. It's very easy to use them for any native library, including ONNX Runtime, either to build it from source or to use existing precompiled artifacts: |
Binaries are now available on Maven Central for CPU and GPU with the v1.3.1 release. |
Is your feature request related to a problem? Please describe.
No
System information
Linux x86_64, Mac x86_64
Describe the solution you'd like
The Java API #2215 was merged recently, and it seems likely it will be in a release in the near future. I would like to try testing this in my development environment and with some production traffic. Historically, we have had trouble with JNI integrations and would like to ensure there are no leaks or other bugs. The easiest way would be to use a snapshot build published to Maven (ideally Maven Central). Is this work underway? If not, could you please prioritize it? Additionally, when the time comes, a published Maven release artifact would be most desired.
Describe alternatives you've considered
Building locally and placing snapshots on our corporate internal Maven repository. This would require us to pull periodically; ideally your CI job would build the snapshots.
Additional context
n/a