Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reflection NPE using native-image on a compile Clojure jar #385

Closed
Gastove opened this issue Apr 29, 2018 · 14 comments
Closed

Reflection NPE using native-image on a compile Clojure jar #385

Gastove opened this issue Apr 29, 2018 · 14 comments
Assignees

Comments

@Gastove
Copy link

Gastove commented Apr 29, 2018

I do know that Clojure isn't officially supported per se; that said, this particular failure case surprised me, so I figured I'd stop by and mention it.

If I have a pure Clojure standard library project (e.g. no dependencies), and compile it into a jar, I can build it into a native image flawlessly with Graal. It's fast, starts instantly -- all I ever dreamed of. This works whether I try it on a regular or shaded/fat/uber jar (which makes sense; with no dependencies, they should be basically identical).

On the other hand, if I include a dependency, on another Clojure library, and compile that, the resulting jar always NPEs during native-image generation, regardless of how I try to do it. I've tried building a shaded/fat/uber jar; I've also tried building a normal jar and making sure all dependencies are on the classpath. No dice. Regardless of what I do, I get what for me is a somewhat obtuse stack trace -- which is to say, I've got no clue on earth where to start digging in to help fix it.

Here's the verbose build output:

StartServer [
/opt/graaal/graalvm-1.0.0-rc1/bin/java \
-Xbootclasspath/a:/opt/graaal/graalvm-1.0.0-rc1/jre/lib/boot/graaljs-scriptengine.jar:/opt/graaal/graalvm-1.0.0-rc1/jre/lib/boot/graal-sdk.jar \
-cp \
/opt/graaal/graalvm-1.0.0-rc1/jre/lib/svm/builder/svm.jar:/opt/graaal/graalvm-1.0.0-rc1/jre/lib/svm/builder/pointsto.jar:/opt/graaal/graalvm-1.0.0-rc1/jre/lib/svm/builder/objectfile.jar:/opt/graaal/graalvm-1.0.0-rc1/jre/lib/jvmci/jvmci-api.jar:/opt/graaal/graalvm-1.0.0-rc1/jre/lib/jvmci/jvmci-hotspot.jar:/opt/graaal/graalvm-1.0.0-rc1/jre/lib/jvmci/graal.jar \
-server \
-d64 \
-noverify \
-XX:+UnlockExperimentalVMOptions \
-XX:+EnableJVMCI \
-XX:-UseJVMCIClassLoader \
-XX:+UseJVMCICompiler \
-Dgraal.CompileGraalWithC1Only=false \
-XX:CICompilerCount=2 \
-Dgraal.VerifyGraalGraphs=false \
-Dgraal.VerifyGraalGraphEdges=false \
-Dgraal.VerifyGraalPhasesSize=false \
-Dgraal.VerifyPhases=false \
-Dgraal.EagerSnippets=true \
-Xss10m \
-Duser.country=US \
-Duser.language=en \
-Dsubstratevm.version=68c7c1073a86a3d541ffb82434acc664f3096079:substratevm \
-Dgraalvm.version=1.0.0-rc1 \
-Dorg.graalvm.version=1.0.0-rc1 \
-Dcom.oracle.graalvm.isaot=true \
-Djvmci.class.path.append=/opt/graaal/graalvm-1.0.0-rc1/jre/lib/jvmci/graal.jar \
-Xmx13386191664 \
-Xms1g \
com.oracle.svm.hosted.server.NativeImageBuildServer \
-port=26682 \
-logFile=/home/rdonaldson/.native-image/machine-id-60cc043f721b40558c35b66ac3a767f9/session-id-1711/server-id-a5c19bf68c58aff3fdb29998ada9d914cc3a9456591fccc0d94132bcba77df6a2b5538d41b2e899dde246cf4cea395b6864a1a85b5a57820f92190a3fd93d013/server.log
]
Build on Server(pid: 2687, port: 26682)*
SendBuildRequest [
-task=com.oracle.svm.hosted.NativeImageGeneratorRunner
-imagecp
/opt/graaal/graalvm-1.0.0-rc1/jre/lib/svm/builder/svm.jar:/opt/graaal/graalvm-1.0.0-rc1/jre/lib/svm/builder/pointsto.jar:/opt/graaal/graalvm-1.0.0-rc1/jre/lib/svm/builder/objectfile.jar:/opt/graaal/graalvm-1.0.0-rc1/jre/lib/jvmci/jvmci-api.jar:/opt/graaal/graalvm-1.0.0-rc1/jre/lib/jvmci/jvmci-hotspot.jar:/opt/graaal/graalvm-1.0.0-rc1/jre/lib/jvmci/graal.jar:/opt/graaal/graalvm-1.0.0-rc1/jre/lib/boot/graaljs-scriptengine.jar:/opt/graaal/graalvm-1.0.0-rc1/jre/lib/boot/graal-sdk.jar:/opt/graaal/graalvm-1.0.0-rc1/jre/lib/svm/library-support.jar:/home/rdonaldson/Code/language-tour/reference-impl:/home/rdonaldson/Code/language-tour/reference-impl/target/reference-impl-0.1.0-SNAPSHOT-standalone.jar
-H:Path=/home/rdonaldson/Code/language-tour/reference-impl
-H:Class=reference_impl.core
-H:Name=reference-impl-0.1.0-SNAPSHOT-standalone
-H:CLibraryPath=/opt/graaal/graalvm-1.0.0-rc1/jre/lib/svm/clibraries/linux-amd64
]
   classlist:   3,355.82 ms
       (cap):   2,763.61 ms
       setup:   4,988.67 ms
    analysis:  17,697.96 ms
fatal error: java.lang.NullPointerException
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:598)
	at java.util.concurrent.ForkJoinTask.get(ForkJoinTask.java:1005)
	at com.oracle.svm.hosted.NativeImageGenerator.run(NativeImageGenerator.java:398)
	at com.oracle.svm.hosted.NativeImageGeneratorRunner.buildImage(NativeImageGeneratorRunner.java:240)
	at com.oracle.svm.hosted.NativeImageGeneratorRunner.build(NativeImageGeneratorRunner.java:337)
	at com.oracle.svm.hosted.server.NativeImageBuildServer.executeCompilation(NativeImageBuildServer.java:378)
	at com.oracle.svm.hosted.server.NativeImageBuildServer.lambda$processCommand$8(NativeImageBuildServer.java:315)
	at com.oracle.svm.hosted.server.NativeImageBuildServer.withJVMContext(NativeImageBuildServer.java:396)
	at com.oracle.svm.hosted.server.NativeImageBuildServer.processCommand(NativeImageBuildServer.java:312)
	at com.oracle.svm.hosted.server.NativeImageBuildServer.processRequest(NativeImageBuildServer.java:256)
	at com.oracle.svm.hosted.server.NativeImageBuildServer.lambda$serve$7(NativeImageBuildServer.java:216)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException
	at com.oracle.graal.pointsto.ObjectScanner.scanField(ObjectScanner.java:113)
	at com.oracle.graal.pointsto.ObjectScanner.doScan(ObjectScanner.java:263)
	at com.oracle.graal.pointsto.ObjectScanner.finish(ObjectScanner.java:307)
	at com.oracle.graal.pointsto.ObjectScanner.scanBootImageHeapRoots(ObjectScanner.java:78)
	at com.oracle.graal.pointsto.ObjectScanner.scanBootImageHeapRoots(ObjectScanner.java:60)
	at com.oracle.graal.pointsto.BigBang.checkObjectGraph(BigBang.java:581)
	at com.oracle.graal.pointsto.BigBang.finish(BigBang.java:552)
	at com.oracle.svm.hosted.NativeImageGenerator.doRun(NativeImageGenerator.java:653)
	at com.oracle.svm.hosted.NativeImageGenerator.lambda$run$0(NativeImageGenerator.java:381)
	at java.util.concurrent.ForkJoinTask$AdaptedRunnableAction.exec(ForkJoinTask.java:1386)
	at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
	at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
	at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
	at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
Error: Processing image build request failed

Thanks for the amazing project :D Let me know if there's something helpful I can to do help sort this out.

@cstancu cstancu self-assigned this Apr 30, 2018
@ckampfe
Copy link

ckampfe commented Apr 30, 2018

@Gastove Have you tried doing lein compile :all (or the boot equiavlent) and then pointing native-image at the resulting class files, rather than the jar?

Like:

native-image \
-H:+ReportUnsupportedElementsAtRuntime \
-cp .:target \
your_ns.core

I've had good luck with this where I previously was getting similar errors using the -jar option.
I realize this doesn't help you solve your issue, but might help you keep moving :)

@cstancu
Copy link
Member

cstancu commented Apr 30, 2018

@Gastove thank you for trying out GraalVM. That NPE was reported before and fixed in the meantime. The NPE was hiding another issue: trying to load a jar from a static initializer, which I suspect might happen when you include a dependency on another Clojure library in your second configuration. See this section on limitations regarding static initializers. You should try building from source (see the quick start guide), the NPE should go away and you should get a more helpful error message.

@Gastove
Copy link
Author

Gastove commented Apr 30, 2018

@ckampfe Ah! Interesting -- I'll try that, thanks! I hadn't considered approaching it that way, and I appreciate the guidance!

@cstancu Ahhh, good to know. I'll see what I can do with a fresher build of Graal.

@Gastove
Copy link
Author

Gastove commented Apr 30, 2018

Ah, well. @ckampfe, your approach gets me to a NoClassDefFoundError, but no hints as to which class. And @cstancu, the master branch of graal does not build for me. I can report this or provide detail if you like, but I assume it's a side-effect of an open-source project under active development, and I can check back later. Your preference?

@cstancu
Copy link
Member

cstancu commented Apr 30, 2018

Please report them and we'll try to fix. Is the NoClassDefFoundError a runtime or compile time issue?
What issue do you encounter when you try to build from source?

@Gastove
Copy link
Author

Gastove commented Apr 30, 2018

I'm hitting NoClassDefFoundError at compile-time, while attempting to run native-image against a compiled Clojure class. I note that if I, for instance, miss-spell the name of the class I'm trying to compile, I get an error much earlier -- the error I'm seeing comes from the build server, but I have no notion of which class it can't find.

When I try to build Graal from source, it goes like this:

  1. I clone Graal
  2. I clone mx
  3. The quickstart guide indicates I need JAVA_HOME set to a "labsjdk". For the sake of following instructions exactly, I'm currently using the Graal "Enterprise Edition" 1.0.0rc1 release.

Now, from the substratevm dir in the Graal repo, I do like so:

env JAVA_HOME=/opt/graaal/graalvm-1.0.0-rc1 mx build

The first time I ran this, this worked for a while, compiling C libs and working merrily; now, unless I do an mx clean, I hit errors immediately. The first thing that crops up is this:

java.util.ServiceConfigurationError: com.oracle.truffle.api.TruffleRuntimeAccess: Provider org.graalvm.compiler.truffle.runtime.hotspot.HotSpotTruffleRuntimeAccess not a subtype
	at java.util.ServiceLoader.fail(ServiceLoader.java:239)

But the stack traces and error reporting are... intense. Rather than paste them here, please peruse this gist for the full output: https://gist.github.com/Gastove/288ba000ba73052ab383a9393318a77c

What other information can I provide? What can I do to help?

@Gastove
Copy link
Author

Gastove commented Apr 30, 2018

(Ah, just for reference: I'm using Fedora 27.)

@cstancu
Copy link
Member

cstancu commented Apr 30, 2018

@Gastove the build issue is known, it is a bootstrapping issue. The problem is that you try to build Graal/Truffle with GraalVM. Your JAVA_HOME needs to point to a LabsJDK.

@dougxc
Copy link
Member

dougxc commented Apr 30, 2018

Just to be clear, @cstancu is referring to the downloads at the bottom of the page under "JVMCI JDK Downloads".

@Gastove
Copy link
Author

Gastove commented Apr 30, 2018

@dougxc AH. Okay, that makes more sense. I did see that instruction/requirement, and thought I had followed it -- clearly, I got confused. Will re-try and report back.

@Gastove
Copy link
Author

Gastove commented May 1, 2018

Okay, yep: this works considerably better. I don't have a clear error message about what, specifically, is using a static initializer, and in perfect world one would be very nice -- would make working around it much easier. But! Still happy to get this far, and I appreciate the guidance. I personally would consider this closed.

@cstancu
Copy link
Member

cstancu commented May 1, 2018

@Gastove what does the error message look like? We are working on improving the error messages and make them more actionable.

@Gastove
Copy link
Author

Gastove commented May 2, 2018

@cstancu holy moly -- I don't..... understand why, but I'm seeing two different stacktraces built out of the same jar; one is very helpful, the other less so. In both cases, here's the command I'm using to execute native-image:

$ env JAVA_HOME=/opt/labsjdk/labsjdk1.8.0_161-jvmci-0.42/ ~/Code/open-source/gr
aal/substratevm/native-image -jar target/refeError: Detected a ZipFile object in the image heap. This is not supported. The object was reac
hed from a static initializer. All static class initialization is done during native image con
struction, thus a static initializer cannot contain code that captures state dependent on the 
build machine. Write your own initialization methods and call them explicitly from your main e
ntry point.
Trace:  object java.util.WeakHashMap$Entry
        object java.util.WeakHashMap$Entry[]
        object java.util.WeakHashMap
        object java.net.URLClassLoader
        object org.joda.time.tz.ZoneInfoProvider
        object java.util.concurrent.atomic.AtomicReference
        method org.joda.time.DateTimeZone.getProvider()
Call path from entry point to org.joda.time.DateTimeZone.getProvider():
        at org.joda.time.DateTimeZone.getProvider(DateTimeZone.java:429)
        at org.joda.time.DateTimeZone.forID(DateTimeZone.java:216)
        at org.joda.time.format.DateTimeFormatterBuilder$TimeZoneId.parseInto(DateTimeFormatte
rBuilder.java:2407)
        at org.joda.time.format.DateTimeFormatter.parseLocalDateTime(DateTimeFormatter.java:88
4)
        at clj_time.format$parse_local.invokeStatic(format.clj:171)
        at clj_time.format$parse_local.invoke(format.clj:167)
        at clj_time.format$parse_local$iter__699__703$fn__704$fn__705$fn__706.invoke(format.cl
j:175)
rence-impl-0.1.0-SNAPSHOT-standalone.jar 

The first stack trace looks something like this:

rror: com.oracle.graal.pointsto.constraints.UnsupportedFeatureException: Unsupported construc
tor java.lang.ClassLoader.<init>(ClassLoader) is reachable: The declaring class of this elemen
t has been substituted, but this element is not present in the substitution class             
To diagnose the issue, you can add the option -H:+ReportUnsupportedElementsAtRuntime. The unsu
pported element is then reported at run time when it is accessed the first time.              
Trace:                                                                                        
        at parsing java.security.SecureClassLoader.<init>(SecureClassLoader.java:76)          
Call path from entry point to java.security.SecureClassLoader.<init>(ClassLoader):            
        at java.security.SecureClassLoader.<init>(SecureClassLoader.java:76)                  
        at java.net.URLClassLoader.<init>(URLClassLoader.java:100)                            
        at clojure.lang.DynamicClassLoader.<init>(DynamicClassLoader.java:41)                 
        at clojure.main$repl.invokeStatic(main.clj:219)                                       
        at clojure.core.server$repl.invokeStatic(server.clj:177)                              
        at clojure.core.server$repl.invoke(server.clj:177)                                    
        at clojure.lang.AFn.run(AFn.java:22)                                                  
        at java.lang.Thread.run(Thread.java:748)                                              
        at com.oracle.svm.core.posix.thread.PosixJavaThreads.pthreadStartRoutine(PosixJavaThre
ads.java:222)                                                                                 
        at Lcom/oracle/svm/core/code/CEntryPointCallStubs;.com_002eoracle_002esvm_002ecore_002
eposix_002ethread_002ePosixJavaThreads_002epthreadStartRoutine_0028com_002eoracle_002esvm_002e
core_002eposix_002ethread_002ePosixJavaThreads_0024ThreadStartData_0029(generated:0)          
Error: com.oracle.graal.pointsto.constraints.UnsupportedFeatureException: Unsupported field ja
va.lang.reflect.Proxy.proxyClassCache is reachable                                            
To diagnose the issue, you can add the option -H:+ReportUnsupportedElementsAtRuntime. The unsu
pported element is then reported at run time when it is accessed the first time.              
Trace:                                                                                        
        at parsing java.lang.reflect.Proxy.isProxyClass(Proxy.java:791)                       
Call path from entry point to java.lang.reflect.Proxy.isProxyClass(Class):                    
        at java.lang.reflect.Proxy.isProxyClass(Proxy.java:791)             

So, I know what the problem is, but not where in the code to dig to fix it. The other goes like so:

Error: Detected a ZipFile object in the image heap. This is not supported. The object was reac
hed from a static initializer. All static class initialization is done during native image con
struction, thus a static initializer cannot contain code that captures state dependent on the 
build machine. Write your own initialization methods and call them explicitly from your main e
ntry point.
Trace:  object java.util.WeakHashMap$Entry
        object java.util.WeakHashMap$Entry[]
        object java.util.WeakHashMap
        object java.net.URLClassLoader
        object org.joda.time.tz.ZoneInfoProvider
        object java.util.concurrent.atomic.AtomicReference
        method org.joda.time.DateTimeZone.getProvider()
Call path from entry point to org.joda.time.DateTimeZone.getProvider():
        at org.joda.time.DateTimeZone.getProvider(DateTimeZone.java:429)
        at org.joda.time.DateTimeZone.forID(DateTimeZone.java:216)
        at org.joda.time.format.DateTimeFormatterBuilder$TimeZoneId.parseInto(DateTimeFormatte
rBuilder.java:2407)
        at org.joda.time.format.DateTimeFormatter.parseLocalDateTime(DateTimeFormatter.java:88
4)
        at clj_time.format$parse_local.invokeStatic(format.clj:171)
        at clj_time.format$parse_local.invoke(format.clj:167)
        at clj_time.format$parse_local$iter__699__703$fn__704$fn__705$fn__706.invoke(format.cl
j:175)

This is absolutely perfect -- it doesn't ever hit code I actually wrote, but it does point to a specific call stack within a specific library I can now work around.

I do not at all understand why I got one, then the other. Possibly I didn't manage to kill the build server after rebuilding graal myself?

@cstancu
Copy link
Member

cstancu commented May 2, 2018

It is "normal" to get multiple "stack traces" during image building. Each of them corresponds to an unsupported feature error. When the analysis runs we don't stop on the first discovered unsupported feature but keep going on unexplored code paths. This way you get a complete image of all the problems. So it has nothing to do with the build server; you would get the same output if you run without the server. When we report unsupported features discovered by the analysis the stack traces are not actually execution stack traces, but constructed call paths from the entry method to the location of the issue, i.e., the Call path from entry point to ... element in the output. The analysis finds the shortest call path by traversing the call graph.

I actually see three different unsupported feature errors in the output you pasted:

The Trace element for the first issue, Unsupported constructor java.lang.ClassLoader.<init>(ClassLoader) is reachable, gives you the location where that constructor is called from: at parsing java.security.SecureClassLoader.<init>(SecureClassLoader.java:76). On that line there is a call to super(parent), the constructor that we don't currently support. This issue is a side effect of the fact that we don't allow ClassLoaders in the boot image heap. We are working on relaxing this limitation since class loaders are also used to load resources and it should be possible to "load" a class that was previously included in the image.

The Trace element for the second issue, Unsupported field java.lang.reflect.Proxy.proxyClassCache is reachable similarly show you where the unsupported field java.lang.reflect.Proxy.proxyClassCache is reachable from: at parsing java.lang.reflect.Proxy.isProxyClass(Proxy.java:791).

The Trace element for the third issue, Detected a ZipFile object in the image heap., is a bit more interesting: it shows the types corresponding to the object chain from the booth image heap root method org.joda.time.DateTimeZone.getProvider() to the ZipFile constant object. Since we run the static initializers at image build time we deal with concrete objects here, that we then add in the boot image heap. The root here is a static method rather than a static final field because Graal embeds some constant nodes in the graphs.

Thank you for providing these details. This helps us understand how the error messages are perceived by people not familiar with the internals of the system and improve them by clearing out the confusing parts.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants