OS X: Odd loader issues when a shared library depends on a static-only library #507

Closed
kayasoze opened this issue Oct 14, 2015 · 14 comments
Labels
P3 (We're not considering working on this, but happy to review a PR. No assignee), team-Rules-CPP (Issues for C++ rules), type: bug

Comments

@kayasoze

I haven't narrowed this down to a succinct repro case, but I've observed odd linker errors in the following scenario:

cc_test -> cc_library(foo) (supports both shared and static) -> cc_library(bar) (wraps autoconf built static library)

Adding linkstatic=1 to foo fixes all linker issues. It seems that Bazel doesn't support a shared-library dependency that itself depends on a static library.

Note that in this case, bar is static because it is built via a cross-platform genrule, and Bazel can't support a .so output on Linux and a .dylib output on OS X; select() is prohibited in outs. Thus, it only produces a .a.
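
For illustration, the scenario above might be laid out as in the following BUILD sketch. The target and file names (foo_test, libbar.a, and so on) are hypothetical and not taken from the reporter's project, and the archive is assumed to be produced elsewhere, for example by a genrule.

```starlark
# Hypothetical BUILD sketch; every name here is illustrative.

# Wraps a static archive built outside the C++ rules. Only a .a exists.
cc_library(
    name = "bar",
    srcs = ["lib/libbar.a"],
    hdrs = ["include/bar.h"],
    linkstatic = 1,
)

# Ordinary library that can be linked statically or as a shared library.
cc_library(
    name = "foo",
    srcs = ["foo.cc"],
    hdrs = ["foo.h"],
    deps = [":bar"],
    # linkstatic = 1,  # the workaround described above
)

cc_test(
    name = "foo_test",
    srcs = ["foo_test.cc"],
    deps = [":foo"],
)
```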

@ulfjack
Contributor

ulfjack commented Oct 14, 2015

What's the error message from the linker?

Wild guess: configure autoconf to generate a PIC static library (IIRC, pass -fPIC to the compiler); non-PIC code cannot be linked into dynamic libraries.

@kayasoze
Author

That's a great guess, but I've already compiled most, if not all, of my static dependencies as PIC. I'm out sick today, but one example is that dyld won't be able to find the shared library for gtest-main. Bizarrely, that's a Bazel-built shared library completely unrelated to the other dependency with the static child (which happens to be gRPC). So in this unit test example:

cc_test(
    ...
    deps = [
        "gtest-main",
        "gRPC",  # depends on static foo.a
    ],
)

... the .so for gtest-main will not be found unless I set linkstatic=1 on gRPC. Strange but true.

@kayasoze kayasoze reopened this Oct 14, 2015
@kayasoze kayasoze changed the title OS X: Odd linker issues when a shared library depends on a static-only library OS X: Odd loader issues when a shared library depends on a static-only library Oct 14, 2015
@ulfjack
Contributor

ulfjack commented Oct 15, 2015

Can you post the actual error message? It would help immensely. Are you saying that gtest-main is found if you remove the gRPC dependency?

@kayasoze
Author

Symbols are found when gRPC is compiled with linkstatic=1; they're not found otherwise. Here's another example, where a symbol from a static library is not found. Which symbols are missing, and whether they come from a static or shared dependency, doesn't seem to be deterministic, and may depend on the order of the dependencies as listed in the test.

Without linkstatic=1 on gRPC:

bazel test ... --test_output=streamed
WARNING: Streamed test output requested so all tests will be run locally, without sharding, one at a time.
INFO: Found 3 targets and 1 test target...
dyld: Symbol not found: _deflate
  Referenced from: /private/var/tmp/_bazel_..../_solib_darwin//libthird-party_Sgoogle-grpc_Slibgrpc.so
  Expected in: flat namespace
 in /private/var/tmp/_bazel_.../bazel-out/local_darwin-fastbuild/bin/.../../_solib_darwin//libthird-party_Sgoogle-grpc_Slibgrpc.so
tools/test/test-setup.sh: line 44:  6884 Trace/BPT trap: 5       "$@"

With linkstatic=1 there is no issue.

In this case, _deflate is a symbol exported by a statically linked sub-dependency of gRPC, zlib:

test -> grpc++ -> gpr -> zlib

Here, 'gpr' is another cc_library() in gRPC that contains the dependency on zlib.

zlib's BUILD file:

filegroup(
  name = "zlib-srcs",
  srcs = glob(["**"]),
)

genrule(
  name = "build-zlib",
  srcs = [
    ":zlib-srcs",
  ],
  outs = [
    "lib/libz.a",
    "include/zconf.h",
    "include/zlib.h",
  ],
  cmd = "export PATH=$$PATH:/bin && export WORKSPACE=$$PWD && " +
    "rsync -a %s/ $(GENDIR)/%s && " % (PACKAGE_NAME, PACKAGE_NAME) +
    "cd $(GENDIR)/%s && " % PACKAGE_NAME +
    "./configure --prefix=$$PWD && " +
    "make -j8 install"
)

cc_library(
  name = "zlib",
  srcs = [
    ":build-zlib",
    "empty.cc",
  ],
  includes = ["include"],
  visibility = ["//visibility:public"],
  linkstatic = 1,
)
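
For reference, the -fPIC suggestion from earlier in the thread could be applied to a genrule like this one by setting CFLAGS for configure. This is only a sketch: it assumes zlib's configure script reads CFLAGS from the environment, and since the static dependencies are reported to be PIC already, it is not expected to resolve the dyld failure by itself.

```starlark
# Variant of the build-zlib genrule above; only the configure line differs.
genrule(
    name = "build-zlib",
    srcs = [":zlib-srcs"],
    outs = [
        "lib/libz.a",
        "include/zconf.h",
        "include/zlib.h",
    ],
    cmd = "export PATH=$$PATH:/bin && export WORKSPACE=$$PWD && " +
          "rsync -a %s/ $(GENDIR)/%s && " % (PACKAGE_NAME, PACKAGE_NAME) +
          "cd $(GENDIR)/%s && " % PACKAGE_NAME +
          # Assumption: configure honors CFLAGS from the environment.
          "CFLAGS=-fPIC ./configure --prefix=$$PWD && " +
          "make -j8 install",
)
```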

@ulfjack
Contributor

ulfjack commented Oct 20, 2015

I have no idea why that would happen. Can you run 'otool -L' on the test binary?

@damienmg
Contributor

Closing as obsolete without answers. Please ping back if you are still experiencing the issue.

@kayasoze
Author

The bug didn't fix itself, no. I still have many examples where bazel test on Linux will succeed, but bazel test of the same target on Mac OS X will fail, e.g.:

dyld: lazy symbol binding failed: Symbol not found: __ZN6google17InitGoogleLoggingEPKc

Another instance of the bug arises when a test depends on two or more shared libraries which share a dependency linked with linkstatic=1. For example:

foo_test -> bar (shared) -> google-glog (static)
         -> baz (shared) -> google-glog (static)

... is problematic. Removing linkstatic=1 from google-glog fixes the problem. Adding linkstatic=1 to the test itself also fixes the issue, so it would appear that a shared target depending on a static target is a necessary condition to reproduce the bug. A diamond dependency may also be necessary.
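
A minimal BUILD sketch of this diamond follows. The source file names and the prebuilt libglog.a are hypothetical, since the thread does not show how google-glog is actually built. Note that cc_test defaults to linkstatic = 0, so bar and baz are linked as shared libraries unless the test overrides it.

```starlark
# Hypothetical BUILD sketch of the diamond; file names are illustrative.

# Static-only dependency shared by both branches of the diamond.
cc_library(
    name = "google-glog",
    srcs = ["lib/libglog.a"],
    hdrs = glob(["include/glog/*.h"]),
    includes = ["include"],
    linkstatic = 1,
)

cc_library(
    name = "bar",
    srcs = ["bar.cc"],
    hdrs = ["bar.h"],
    deps = [":google-glog"],
)

cc_library(
    name = "baz",
    srcs = ["baz.cc"],
    hdrs = ["baz.h"],
    deps = [":google-glog"],
)

cc_test(
    name = "foo_test",
    srcs = ["foo_test.cc"],
    deps = [
        ":bar",
        ":baz",
    ],
    # linkstatic = 1,  # reported above to avoid the failure
)
```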

@damienmg damienmg reopened this Nov 27, 2015
@damienmg damienmg added type: bug P3 We're not considering working on this, but happy to review a PR. (No assignee) labels Nov 27, 2015
@ulfjack
Contributor

ulfjack commented Jun 15, 2016

We'll need help from someone who better understands C++ linking on OSX to fix this. I'm out of my depth here.

@ulfjack ulfjack removed their assignment Jun 15, 2016
@ulfjack ulfjack added this to the 1.0 milestone Jun 15, 2016
@ulfjack
Contributor

ulfjack commented Jun 15, 2016

Also see #492.

@david-german-tri

david-german-tri commented Jan 25, 2017

I have a potentially related problem on RobotLocomotion/drake#4896 at RobotLocomotion/drake@360cf61. On OS X, I get dyld failures like the following. On Linux, no problems.

$ bazel test drake/solvers:constraint_test
..........
INFO: Found 1 test target...
FAIL: //drake/solvers:constraint_test (see /private/var/tmp/_bazel_davidgerman/45b8dc5f243a1640ca673b124a0d188d/execroot/drake-distro/bazel-out/local-opt/testlogs/drake/solvers/constraint_test/test.log).
INFO: From Testing //drake/solvers:constraint_test:
==================== Test output for //drake/solvers:constraint_test:
dyld: Library not loaded: /private/var/tmp/_bazel_davidgerman/45b8dc5f243a1640ca673b124a0d188d/bazel-sandbox/c399b4d0-6ee8-4c39-9479-63cfda35259d-0/execroot/drake-distro/bazel-out/local-opt/genfiles/external/ipopt/lib/libcoinasl.0.dylib
  Referenced from: /private/var/tmp/_bazel_davidgerman/45b8dc5f243a1640ca673b124a0d188d/bazel-sandbox/dc9ada7b-dbd5-4ffe-a4f8-3160a2fd082a-0/execroot/drake-distro/bazel-out/local-opt/bin/drake/solvers/constraint_test.runfiles/drake/drake/solvers/constraint_test
  Reason: image not found
external/bazel_tools/tools/test/test-setup.sh: line 114:   689 Trace/BPT trap: 5       "${TEST_PATH}" "$@"
================================================================================
Target //drake/solvers:constraint_test up-to-date:
  bazel-bin/drake/solvers/constraint_test
INFO: Elapsed time: 10.468s, Critical Path: 0.81s
//drake/solvers:constraint_test                                          FAILED in 0.2s

libcoinasl.0.dylib is an output from a genrule that builds an external dependency using autotools. It follows the advice from #281 (comment): different genrules for different platforms. The dependency graph is:

genrule 
  <- wrapper cc_library "ipopt", uses select to pick platform-appropriate genrule
    <- cc_library "ipopt_solver", linkstatic = 1 
      <- cc_library "mathematical_program", linkstatic = 1
        <- cc_test "constraint_test"
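
The graph above could be sketched in a BUILD file roughly as follows. This is not Drake's actual configuration: the config_setting, the genrule names, the Linux output name, and the build_ipopt.sh wrapper script are all assumptions made for illustration.

```starlark
# Hypothetical BUILD sketch; every name here is illustrative.

config_setting(
    name = "macos",
    constraint_values = ["@platforms//os:macos"],
)

# One genrule per platform, because genrule outs cannot use select().
genrule(
    name = "build_ipopt_macos",
    srcs = glob(["src/**"]),
    outs = ["lib/libcoinasl.0.dylib"],
    tools = ["build_ipopt.sh"],  # hypothetical wrapper around configure/make
    cmd = "$(location build_ipopt.sh) $@",
)

genrule(
    name = "build_ipopt_linux",
    srcs = glob(["src/**"]),
    outs = ["lib/libcoinasl.so.0"],
    tools = ["build_ipopt.sh"],
    cmd = "$(location build_ipopt.sh) $@",
)

# Wrapper library that picks the platform-appropriate genrule output.
cc_library(
    name = "ipopt",
    srcs = select({
        ":macos": [":build_ipopt_macos"],
        "//conditions:default": [":build_ipopt_linux"],
    }),
)

cc_library(
    name = "ipopt_solver",
    srcs = ["ipopt_solver.cc"],
    deps = [":ipopt"],
    linkstatic = 1,
)

cc_library(
    name = "mathematical_program",
    srcs = ["mathematical_program.cc"],
    deps = [":ipopt_solver"],
    linkstatic = 1,
)

cc_test(
    name = "constraint_test",
    srcs = ["constraint_test.cc"],
    deps = [":mathematical_program"],
)
```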

Here's my only clue so far. As you can see, the test executable is in bazel-sandbox/dc9ada7b-dbd5-4ffe-a4f8-3160a2fd082a-0, but it's looking for the shared library in bazel-sandbox/c399b4d0-6ee8-4c39-9479-63cfda35259d-0, which does not exist. The shared library can be found under bazel-genfiles, which of course is a symlink into execroot, not into any bazel-sandbox directory. otool -L confirms that the mysterious c399 sandbox path is actually baked into the test binary in bazel-bin.

@Helcaraxan

A friendly ping on this issue, as we are also facing it and it prevents us from switching from our old test flow to Bazel only.

@hlopko
Member

hlopko commented Aug 17, 2017

It seems to me that what @david-german-tri reported is actually #3450, for which I'm submitting the fix right now. @Helcaraxan are you affected by #3450 or #507? Can you please verify?

Now to the original issue, @kayasoze is it still broken? Could you compose a repro case? I can take a look then.

@Helcaraxan

@mhlopko I am currently trying to reproduce it. That said, some of our project structure has changed significantly since last April, so I am not sure that I will be able to. If I don't post any further updates here in the next few days, assume I was not able to reproduce it.

@sgowroji
Member

Hi there! We're doing a clean up of old issues and will be closing this one. Please reopen if you’d like to discuss anything further. We’ll respond as soon as we have the bandwidth/resources to do so.
