system tests failing on aix due to libcrypto could not be loaded #129

Closed
pshipton opened this issue Oct 11, 2018 · 25 comments
@pshipton
Member

https://ci.eclipse.org/openj9/job/Test-sanity.system-JDK8-aix_ppc-64_cmprssptrs/107/console
https://ci.eclipse.org/openj9/job/Test-extended.system-JDK8-aix_ppc-64_cmprssptrs/58/console

03:47:33 clone_stf:
03:47:33      [exec] Cloning into 'stf'...
03:47:33      [exec] exec(): 0509-036 Cannot load program /opt/freeware/libexec/git-core/git-remote-https because of the following errors:
03:47:33      [exec] 	0509-022 Cannot load module /usr/lib/libcurl.a(libcurl.so.4).
03:47:33      [exec] 	0509-150   Dependent module /home/u0020236/workspace/Test-sanity.system-JDK8-aix_ppc-64_cmprssptrs/openjdkbinary/j2sdk-image/jre/lib/ppc64/libcrypto.a(libcrypto.so) could not be loaded.
03:47:33      [exec] 	0509-152   Member libcrypto.so is not found in archive 
03:47:33      [exec] 	0509-022 Cannot load module git-remote-https.
03:47:33      [exec] 	0509-150   Dependent module /usr/lib/libcurl.a(libcurl.so.4) could not be loaded.
03:47:33      [exec] 	0509-022 Cannot load module .
03:47:33      [exec] Result: 128
@pshipton
Member Author

@mbvreddy

@pshipton
Member Author

@mbvreddy @enasser any ideas about this one?

@mbvreddy
Contributor

The loader is trying to load libcurl.a and its dependent module libcrypto.so (a member of libcrypto.a). The libcrypto.a from OpenSSL 1.1.1 that is bundled with the JDK contains only libcrypto.so.1.1, not libcrypto.so, so the load fails. I am wondering why the libcrypto.a from the JDK directory is being loaded at all. The loader should pick up a libcrypto.a that contains the required version of libcrypto.so.
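
To make the failure mode concrete, here is a minimal Python sketch of the lookup described in this comment. It is a simplified model, not the AIX loader: it assumes the search stops at the first LIBPATH directory that contains an archive with the requested name, even when that archive lacks the requested member. The directory names and the `ARCHIVES` table are hypothetical stand-ins; only the fact that the JDK copy holds `libcrypto.so.1.1` and not `libcrypto.so` comes from this thread.

```python
# Simplified model of archive-member resolution; NOT the real AIX loader.
# Assumption: the first directory on the search path that holds the archive
# wins, even if that archive does not contain the requested member.

def resolve(archive, member, search_path, archives):
    """Return the directory whose copy of `archive` provides `member`.

    `archives` maps (directory, archive name) -> set of member names.
    Raises LookupError if the first matching archive lacks the member,
    or if no directory on the path holds the archive at all.
    """
    for directory in search_path:
        members = archives.get((directory, archive))
        if members is None:
            continue  # archive not in this directory, keep searching
        if member in members:
            return directory
        raise LookupError(
            f"Member {member} is not found in archive {directory}/{archive}"
        )
    raise LookupError(f"{archive} not found on search path")


# Hypothetical archive contents, for illustration only.
ARCHIVES = {
    ("/jdk/jre/lib/ppc64", "libcrypto.a"): {"libcrypto.so.1.1"},
    ("/usr/lib", "libcrypto.a"): {"libcrypto.so"},
}

jdk_first = ["/jdk/jre/lib/ppc64", "/usr/lib"]
system_first = ["/usr/lib", "/jdk/jre/lib/ppc64"]

# With the system directory first, the dependency resolves.
print(resolve("libcrypto.a", "libcrypto.so", system_first, ARCHIVES))

# With the JDK directory first, the JDK copy shadows the system one.
try:
    resolve("libcrypto.a", "libcrypto.so", jdk_first, ARCHIVES)
except LookupError as err:
    print(err)
```

Under these assumptions the same request succeeds or fails depending only on the order of the search path, which matches the symptom in the log.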

@mbvreddy mbvreddy self-assigned this Oct 12, 2018
@pshipton
Member Author

@Mesbah-Alam @smlambert can you please help with this one? I suspect the tests are setting LIBPATH or similar to contain Test-sanity.system-JDK8-aix_ppc-64_cmprssptrs/openjdkbinary/j2sdk-image/jre/lib/ppc64 before the system locations, and this is causing the problem.

@Mesbah-Alam

Mesbah-Alam commented Oct 15, 2018

openj9Settings.mk does set LIBPATH here: https://github.com/eclipse/openj9/blob/b03cae14f7322dc04e72874128adc8150d91028f/test/TestConfig/openj9Settings.mk#L80.

Looking at the code in that mk file, one of the values added to LIBPATH is "$(JAVA_LIB_DIR)$(D)$(ARCH_DIR)$(D)$(VM_SUBDIR)", which, for this test, would be: Test-sanity.system-JDK8-aix_ppc-64_cmprssptrs/openjdkbinary/j2sdk-image/jre/lib/ppc64/compressedrefs


@pshipton - this code has been there for a while. Is this a new problem on AIX? Could it be an AIX issue with the curl installation?

I googled the error and found this: nodejs/build#551

There is a comment by @sxa555 here that might be of interest:

It should work if LIBPATH is pointing at /usr/lib (although that could well cause other issues of course!) - the libcrypto.a in /opt/freeware/lib doesn't have a non-version-suffixed version of libcrypto.so in it but the one in /usr/lib does. I'm not sure why the one in /opt/freeware/lib doesn't ... You'd expect it to work with the version of curl in there. You could just override it for the execution of curl for now.

@pshipton
Member Author

pshipton commented Oct 15, 2018

@Mesbah-Alam this is a new problem as libcrypto was recently added to the VM directory.

  • do we still need to add the VM directory to the LIBPATH?
  • if so, can we add it to the end rather than the beginning? I'm not sure this will resolve the problem, but it's worth a shot.
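
The reordering proposed in the second bullet can be sketched in Python as follows. This is only an illustration of prepending versus appending a directory to a colon-separated search path; the helper name `with_vm_dir` and the example paths are hypothetical and are not taken from openj9Settings.mk.

```python
def with_vm_dir(libpath, vm_dir, append=True):
    """Return a colon-separated search path with vm_dir added.

    append=True puts vm_dir after the existing entries, so system
    directories are searched first; append=False puts it in front.
    Any existing occurrence of vm_dir is dropped to avoid duplicates.
    """
    entries = [e for e in libpath.split(":") if e and e != vm_dir]
    if append:
        entries = entries + [vm_dir]
    else:
        entries = [vm_dir] + entries
    return ":".join(entries)


# Hypothetical VM directory, for illustration only.
VM_DIR = "/jdk/jre/lib/ppc64/compressedrefs"

print(with_vm_dir("/usr/lib", VM_DIR, append=False))  # VM directory searched first
print(with_vm_dir("/usr/lib", VM_DIR, append=True))   # system directory searched first
```

Appending keeps the VM directory available to tests that need it while letting the system copy of a library win when both directories contain an archive with the same name.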

@Mesbah-Alam

Mesbah-Alam commented Oct 15, 2018

@pshipton -

do we still need to add the VM directory to the LIBPATH?

  • I spoke to Sophia Guo, and she confirmed that lots of functional tests use this value in LIBPATH. So, we do need it.

if so can we add it to the end rather than the beginning? Not sure if this will resolve the problem but it's worth a shot.

@pshipton
Member Author

@Mesbah-Alam Weird, because when I look at a machine the LIBPATH is set to /usr/lib and this directory does contain libcrypto.a. Are you sure the changes took effect in the grinder? I don't see any confirmation in the output.

@Mesbah-Alam

The changes should have taken effect, from the job output:

12:51:57 get testKitGen and functional test material...
12:51:57 git clone -b openj9-openjdk-jdk8-issues-129-fliplibpathorder https://github.com/Mesbah-Alam/openj9.git

So it's checking out openj9 from the branch where the change was made.

@Mesbah-Alam

Mesbah-Alam commented Oct 15, 2018

Logging into the machine pll011.rtp.raleigh.ibm.com where this Grinder ran, I also see the change present in /home/j9build/workspace/Grinder/openjdk-tests/TestConfig/openj9Settings.mk:

        endif
        ADD_JVM_LIB_DIR_TO_LIBPATH:=export LIBPATH=$(Q)$(LIBPATH)$(P)$(JAVA_LIB_DIR)$(D)$(VM_SUBDIR)$(P)$(JAVA_SHARED_LIBRARIES_DIR)$(P)$(JAVA_BIN)$(D)j9vm$(Q);
    else

@pshipton
Member Author

@keithc-ca @mikezhang1234567890 any ideas?

@Mesbah-Alam

Logging into the machine and running the make commands locally to compile the tests works. It could be a Grinder issue that is causing the change not to get picked up! I've created a PR with the change: https://github.com/eclipse/openj9/pull/3289/files

@keithc-ca
Member

Some environment variables get set as part of an interactive login; perhaps the jenkins slave doesn't have LIBPATH set as expected.

@pshipton
Member Author

I think we are good, Mesbah's fix is working. The initial tests were not running with the fix.

@pshipton
Member Author

Fixed by eclipse-openj9/openj9#3289

@pshipton
Member Author

Still failing, I've put comments into eclipse-openj9/openj9#3289

@Mesbah-Alam

@pshipton
Member Author

@Mesbah-Alam can you please scour the code again for any places we might have missed?

@Mesbah-Alam

I did another search. I cannot find any place where LIBPATH is exported other than the places we have already changed. The problem seems to be somewhere else.

@pshipton
Member Author

I believe it's the VM itself that is setting the LIBPATH and causing this problem. The VM sets the LIBPATH as follows, and I expect anything exec'ed from the VM will inherit this.

2CIENVVAR LIBPATH=/bluebird/builds/bld_399771/sdk/ap6480/jre/lib/ppc64:/bluebird/builds/bld_399771/sdk/ap6480/jre/lib/ppc64/compressedrefs:/bluebird/builds/bld_399771/sdk/ap6480/jre/lib/ppc64/j9vm:/bluebird/builds/bld_399771/sdk/ap6480/jre/lib/ppc64:/bluebird/builds/bld_399771/sdk/ap6480/jre/../lib/ppc64:/bluebird/builds/bld_399771/sdk/ap6480/jre/lib/icc:/usr/lib:/usr/lib
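
To see which directories are searched ahead of /usr/lib, the line above can be split into its entries. The following is a rough Python sketch; the helper names are hypothetical, and the input is the quoted javacore line with the stray spaces removed on the assumption that they are line-wrap artifacts.

```python
PREFIX = "2CIENVVAR LIBPATH="


def libpath_entries(javacore_line):
    """Split a '2CIENVVAR LIBPATH=...' javacore line into ordered entries."""
    if not javacore_line.startswith(PREFIX):
        raise ValueError("not a LIBPATH environment line")
    return javacore_line[len(PREFIX):].strip().split(":")


def entries_before(entries, directory):
    """Return the entries searched before the first occurrence of directory."""
    return entries[:entries.index(directory)]


# The line quoted in this comment, rebuilt from a common base path.
BASE = "/bluebird/builds/bld_399771/sdk/ap6480/jre"
JAVACORE_LINE = (
    "2CIENVVAR LIBPATH="
    f"{BASE}/lib/ppc64:{BASE}/lib/ppc64/compressedrefs:{BASE}/lib/ppc64/j9vm:"
    f"{BASE}/lib/ppc64:{BASE}/../lib/ppc64:{BASE}/lib/icc:/usr/lib:/usr/lib"
)

entries = libpath_entries(JAVACORE_LINE)
for directory in entries_before(entries, "/usr/lib"):
    print(directory)  # every JDK directory listed here is searched before /usr/lib
```

Every directory printed sits ahead of /usr/lib on the path, so any archive in one of them with the same name as a system archive would be found first by a process that inherits this LIBPATH.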

@pshipton
Member Author

pshipton commented Oct 17, 2018

Disabling openssl on AIX at OpenJ9 eclipse-openj9/openj9#3334

@pshipton
Member Author

Created an issue at Adopt as well adoptium/temurin-build#658

@sxa
Contributor

sxa commented Oct 18, 2018

I've hit issues before with the AIX JVM replacing things in the LIBPATH and causing problems. Splatting LIBPATH before the operations spawned from java would work too. Disabling it seems a bit drastic though - could we not just stop it bundling, or even just require the user to manually adjust the LIBPATH if they want to use it?
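
The idea of overriding LIBPATH only for the spawned operation can be sketched as below. Python stands in for the parent process purely for illustration (in the scenario from this thread the parent is the JVM), and the helper name `run_with_libpath` and the default value "/usr/lib" are assumptions, not something the test framework provides.

```python
import os
import subprocess
import sys


def run_with_libpath(cmd, libpath="/usr/lib"):
    """Run cmd with LIBPATH replaced in the child's environment only.

    The parent's environment is copied, not modified, so the override
    applies to this one child process and nothing else.
    """
    env = dict(os.environ)
    env["LIBPATH"] = libpath
    return subprocess.run(cmd, env=env, capture_output=True, text=True, check=False)


# Stand-in child that reports the LIBPATH it actually received.
child = [sys.executable, "-c", "import os; print(os.environ['LIBPATH'])"]
result = run_with_libpath(child)
print(result.stdout.strip())  # prints "/usr/lib"
```

Scoping the override to the child avoids changing how the parent itself finds its own libraries.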

@pshipton
Member Author

pshipton commented Oct 18, 2018

@sxa555 We could just stop bundling. I believe this will cause the VM to go looking for a libcrypto.a on the system. If everything works correctly, it would typically find the wrong version on the system, fail to load the JNI library, and fall back to the Java crypto code. There is some risk of it not working correctly, and then I'm not sure what will happen, although @mbvreddy tells me it works. If you want to try not bundling at Adopt, we can see if any problems occur.
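
The "try the native library, otherwise fall back" behaviour described here follows a common pattern, sketched below in Python with `ctypes`. This is a generic illustration and not how OpenJ9 implements it; the function names, the provider labels, and the library path are all hypothetical.

```python
import ctypes


def load_native_crypto(candidates):
    """Return the first candidate library that loads, or None if none do."""
    for name in candidates:
        try:
            return ctypes.CDLL(name)
        except OSError:
            continue  # missing or unloadable library, try the next candidate
    return None


def choose_provider(candidates):
    """Pick the native provider if a library loads, else the fallback."""
    if load_native_crypto(candidates) is not None:
        return "native"
    return "fallback"


# Hypothetical path that does not exist, so the fallback is selected.
print(choose_provider(["/nonexistent/libcrypto-does-not-exist.so"]))
```

The risk raised in the comment corresponds to the case this sketch does not cover: a library that loads successfully but is the wrong version and misbehaves afterwards.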

For OpenJ9, since we don't have the correct libcrypto library on the machines, disabling the openssl support seemed the safer short-term alternative to ensure that AIX testing would proceed without problems for the 0.11.0 release.

If the JVM is built with openssl support enabled, a user (including OpenJ9) could build an openssl 1.1 library and make it available on the machine for the VM to find. However, this may just result in problems similar to those described in this issue, depending on how the user makes the crypto library available to the VM. I expect there are workarounds, which will require experimentation and documentation. I expect this can be sorted out in the next week or so.

@groeges
Member

groeges commented Jun 5, 2019

OpenSSL should now be working on all platforms. Closing issue.

@groeges groeges closed this as completed Jun 5, 2019