Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Big buildsystem -> cargo rollup #2518

Merged

Conversation

cgwalters
Copy link
Member

@cgwalters cgwalters commented Jan 29, 2021

build-sys: Explicit workspace, move libdnf deps to crate

First explicitly state that we're a workspace. AIUI
this is actually implicit today via our use of a path
dependency, but in the future we may have other sub-crates.
So let's make it explicit now.

Also move the libdnf dependencies directly to that sub-crate.


Move libdnf build over to Cargo

This is now further migration towards Cargo/Rust possible
because we switched our main binary. We've had an internal
libdnf-sys crate for a while, but now it can take over
the build of the underlying library too (like many -sys
crates support).

This itself is just an incremental step towards migrating
the main rpm-ostree build system to e.g. cmake too (or
perhaps directly with the cc crate, not sure yet) and
driving it via cargo too.


build-sys: Drop Fedora 25+ rpm version check

I think we can just assume this nowadays.


build-sys: Delete duplicate Rust pkg-config dependencies

We didn't need this after switching to a Rust main.


build-sys: Cleanly split up deps of public shlib vs internals

First, the public shared library only depends on a few
things (not the libdnf dependencies) so let's ensure we
only link it to those libraries.

And then, I realized we don't actually need the libdnf
dependencies here - I think I only added those back here
when trying vainly to keep the C unit tests working. But
we don't have those anymore! So we can delete the duplication
and fully rely on Cargo taking care of libdnf.

Conceptually for a static library we don't "link" it against
anything in Automake, that happens at the final stage with
the Rust linker.

@cgwalters
Copy link
Member Author

Draft since this depends on #2502

dnl These are the dependencies of the public librpmostree-1.0.0 shared library
PKG_CHECK_MODULES(PKGDEP_LIBRPMOSTREE, [gio-unix-2.0 >= 2.50.0 json-glib-1.0 ostree-1 >= 2020.7 rpm])
dnl And these additional ones are used by for the rpmostreeinternals C/C++ library
PKG_CHECK_MODULES(PKGDEP_RPMOSTREE, [polkit-gobject-1 libarchive])
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can clearly see this one with e.g.:

$ eu-readelf -a /usr/lib64/librpmostree-1.so.1.0.0 |grep NEEDED | wc -l
16
$ eu-readelf -a .libs/librpmostree-1.so.1.0.0 |grep NEEDED | wc -l
9

@cgwalters cgwalters force-pushed the libdnf-sys-build-cargo branch 2 times, most recently from 0f0c8ac to bccde5f Compare February 1, 2021 13:46
@cgwalters cgwalters marked this pull request as ready for review February 1, 2021 13:46
@cgwalters
Copy link
Member Author

And lifting draft on this one!

@cgwalters
Copy link
Member Author

= note: /usr/bin/ld: /usr/lib64/librpm.so: undefined reference to symbol 'sqlite3_column_text'
/usr/bin/ld: /usr/lib64/libsqlite3.so: error adding symbols: DSO missing from command line

This...is interesting. This only fails in the test suite, not the main build (I carefully checked every change along the way here builds but I didn't run the test suite).

I think this may be related to rust-lang/rust#53945

We could try using gold or the LLVM linker but...eh. I think it'll be simpler to add a workaround of just adding a usage of an sqlite function.

@cgwalters
Copy link
Member Author

cgwalters commented Feb 1, 2021

Yeah building with CC=clang CXX=clang++ LD=/usr/bin/ld.lld RUSTFLAGS="-C link-arg=-fuse-ld=lld" works fine. So...OK I tried the "fake out the linker" thing but that really just led into a pile of hacks.

I updated this with another commit:

build-sys: Switch to clang by default

Eventually we want to do cross-language LTO:
https://developers.redhat.com/blog/2020/03/18/cross-language-link-time-optimization-using-red-hat-developer-tools/

But in the immediate term, using clang/lld instead of ld.bfd (or ld.gold)
works around a linker bug: #2518 (comment)

cgwalters added a commit to cgwalters/rpm-ostree that referenced this pull request Feb 1, 2021
Eventually we want to do cross-language LTO:
https://developers.redhat.com/blog/2020/03/18/cross-language-link-time-optimization-using-red-hat-developer-tools/

But in the immediate term, using clang/lld instead of ld.bfd (or ld.gold)
works around a linker bug: coreos#2518 (comment)
cgwalters added a commit to cgwalters/rpm-ostree that referenced this pull request Feb 1, 2021
Eventually we want to do cross-language LTO:
https://developers.redhat.com/blog/2020/03/18/cross-language-link-time-optimization-using-red-hat-developer-tools/

But in the immediate term, using clang/lld instead of ld.bfd (or ld.gold)
works around a linker bug: coreos#2518 (comment)
@cgwalters cgwalters force-pushed the libdnf-sys-build-cargo branch 2 times, most recently from 844413e to 8a6ae18 Compare February 1, 2021 18:37
@cgwalters
Copy link
Member Author

Ohh I see we're not generating the versioned shlib for some reason, probably related to the linker swap. But not seeing this locally, hmm...

@cgwalters cgwalters force-pushed the libdnf-sys-build-cargo branch from 8a6ae18 to 65fc470 Compare February 1, 2021 19:46
@cgwalters
Copy link
Member Author

Man I am so confused. It's like libtool is just silently skipping running the actual clang -shared invocation. I can't reproduce...ah wait I just did.

@cgwalters
Copy link
Member Author

OK yeah it seems like there are just known issues with libtool and lld:

But but but...it somehow works in my pet dev container (also f33/x86_64 etc.)

@cgwalters
Copy link
Member Author

OK I figured out at least one issue...trying to omit the dependency between the private shlib and the public one from Automake's POV breaks libtool because in that case it only "materializes" the shared library at install time, which is odd but whatever. We can just roll those libraries in Automake-side rather than doing it in build.rs, it's fine.

cgwalters added a commit to cgwalters/rpm-ostree that referenced this pull request Feb 1, 2021
Eventually we want to do cross-language LTO:
https://developers.redhat.com/blog/2020/03/18/cross-language-link-time-optimization-using-red-hat-developer-tools/

But in the immediate term, using clang/lld instead of ld.bfd (or ld.gold)
works around a linker bug: coreos#2518 (comment)
@cgwalters cgwalters force-pushed the libdnf-sys-build-cargo branch from 65fc470 to db4fe45 Compare February 1, 2021 22:20
@@ -36,10 +36,6 @@ librpmostree_1_la_CFLAGS = $(AM_CFLAGS) -I$(srcdir)/libglnx -I$(srcdir)/src/libp
librpmostree_1_la_LDFLAGS = $(AM_LDFLAGS) -version-number 1:0:0 -Bsymbolic-functions
librpmostree_1_la_LIBADD = $(PKGDEP_LIBRPMOSTREE_LIBS) libglnx.la

# XXX: work around clang being passed -fstack-clash-protection which it doesn't understand
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cgwalters added a commit to cgwalters/rpm-ostree that referenced this pull request Feb 1, 2021
Eventually we want to do cross-language LTO:
https://developers.redhat.com/blog/2020/03/18/cross-language-link-time-optimization-using-red-hat-developer-tools/

But in the immediate term, using clang/lld instead of ld.bfd (or ld.gold)
works around a linker bug: coreos#2518 (comment)
@cgwalters cgwalters force-pushed the libdnf-sys-build-cargo branch 2 times, most recently from 336a805 to 78ae8ff Compare February 2, 2021 13:52
@cgwalters
Copy link
Member Author

OK I think moving the libdnf dependencies into the sub-crate fixed the linking problems.

Now I just have one more issue which is the initramfs test fails only when run as root in the buildroot container...

@cgwalters cgwalters force-pushed the libdnf-sys-build-cargo branch from 78ae8ff to 270cc00 Compare February 2, 2021 15:36
@cgwalters
Copy link
Member Author

OK I think this one is going to pass CI! 🙏

@cgwalters
Copy link
Member Author

OK I think we're still just spawning too many processes in CI. I'm looking at moving the unit test workload to Prow.

@cgwalters
Copy link
Member Author

🎉 CI passed!

# This currently needs to duplicate the libraries from libdnf
[package.metadata.system-deps]
rpm = "4"
# libdnf dependencies
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: guess this comment is redundant now that it lives in the libdnf crate.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Will fix this in a followup, hoping to get this in and avoid restarting CI 3-4 times again)

Comment on lines +9 to +10
// I picked /usr/libexec/rpm-ostree just because we need an
// arbitrary path - we don't actually install there.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we drop these :PATH values now that we build statically, or does the build still error out if we omit them?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd default to /usr/local then I think; the idea here is to use a should-be-nonexistent path so that if it ever shows up at runtime we know it's a bug. (And actually one thing we've broken is the libdnf translations, but ostree and rpm-ostree aren't translated, and we don't even expose a way to change the output language for our DBus API so...)

// This is an awful hack necessary because this header file
// is generated in the libdnf build, but we don't want to serialize
// our C++ build waiting for that. So just generate it ahead
// of time.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So IIUC, this is intended to fulfill requests for #include <libdnf/dnf-version.h> when compiling rpm-ostree? So then theoretically if we use those macros somewhere in rpm-ostree, we'd pull in the wrong ones? I feel like this will come back to bite us in the future (or cause some subtle bug that'll be really hard to debug).

Hmm, how much is this hack buying us? Aren't we parallelizing the Rust builds anyway? For the developer case at least, I think it's OK if a fresh build takes a bit longer. And as we move more things from C++ to Rust, we'll naturally speed up C++ compilation. I'd vote for dropping this hack, but fine with it if others don't mind it.

Copy link
Member Author

@cgwalters cgwalters Feb 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could wire things up so that make → cargo build -p libdnf-sys → C++ and in parallel make -> cargo build, because cargo will do locking internally.

But...another ugly part is that cargo puts the output in a target directory only known to the build.rs script, so we'd have to write a file with that and load it into make somehow.

If you look at e.g. rg 'LIBDNF_.*VERSION' the version macros are not used by us, and are basically not used by libdnf either. The only actual usage in code is that they're exposed in the Python "hawkey" bindings which we don't even build.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you look at e.g. rg 'LIBDNF_.*VERSION' the version macros are not used by us, and are basically not used by libdnf either. The only actual usage in code is that they're exposed in the Python "hawkey" bindings which we don't even build.

Yeah, saw that. I agree it's safe to do this now, but mocking build output just feels very underhanded. It's the kind of thing no one will notice until 3 years from now when someone comes upon it after debugging for 4 hours because something started relying on those macros not lying and breaking something else. (Offhand, I could also imagine rpm-ostree --version printing the version of libdnf that we're vendoring using these macros so that you don't have to do the lookup in GitHub.)

Anyway, fine as is too and maybe I'm over-estimating the likelihood of this causing trouble in the future. But again I'd vote for optimizing the build system in other ways instead.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the values here currently come from the spec file rpm-software-management/libdnf@a240bf2 so...it'd be quite ugly to replicate that, and in any case it's really going to be a lie anyways.

OK I changed things so that the header file is empty - that way if e.g. libdnf starts depending on the value in any way it becomes a hard compile error.

Another thing we could do I guess is validate the sha256 checksum of the vendored header copy and fail if it changed, but eh.

@jlebon
Copy link
Member

jlebon commented Feb 3, 2021

/approve

Curious what others think about the convo above (and feel free to /lgtm if you'd like!).

@cgwalters cgwalters force-pushed the libdnf-sys-build-cargo branch from 2c4f579 to cb6b666 Compare February 3, 2021 22:29
First explicitly state that we're a workspace.  AIUI
this is actually implicit today via our use of a `path`
dependency, but in the future we may have other sub-crates.
So let's make it explicit now.

Also move the libdnf dependencies directly to that sub-crate.
This is now further migration towards Cargo/Rust possible
because we switched our main binary.  We've had an internal
`libdnf-sys` crate for a while, but now it can take over
the build of the underlying library too (like many `-sys`
crates support).

This itself is just an incremental step towards migrating
the main rpm-ostree build system to e.g. cmake too (or
perhaps directly with the `cc` crate, not sure yet) and
driving it via `cargo` too.
I think we can just assume this nowadays.
We didn't need this after switching to a Rust main.
First, the public shared library only depends on a few
things (not the libdnf dependencies) so let's ensure we
only link it to those libraries.

And then, I realized we don't actually need the libdnf
dependencies here - I think I only added those back here
when trying vainly to keep the C unit tests working.  But
we don't have those anymore!  So we can delete the duplication
and fully rely on Cargo taking care of libdnf.

Conceptually for a static library we don't "link" it against
anything in Automake, that happens at the final stage with
the Rust linker.
@cgwalters cgwalters force-pushed the libdnf-sys-build-cargo branch from cb6b666 to 6159820 Compare February 4, 2021 12:37
@cgwalters
Copy link
Member Author

OK rebased 🏄 this to test out the new CI from openshift/release#15461

@cgwalters
Copy link
Member Author

/refresh

@cgwalters
Copy link
Member Author

/override ci/prow/sanity
That CI context is stale.

@openshift-ci-robot
Copy link
Collaborator

@cgwalters: /override requires a failed status context to operate on.
The following unknown contexts were given:

  • ci/prow/sanity

Only the following contexts were expected:

  • ci/prow/build
  • ci/prow/build-clang
  • ci/prow/unit
  • continuous-integration/jenkins/pr-merge
  • tide

In response to this:

/override ci/prow/sanity
That CI context is stale.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@jlebon
Copy link
Member

jlebon commented Feb 4, 2021

Cool, new approach WFM!

/lgtm

@openshift-ci-robot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cgwalters, jlebon

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot merged commit ae3392f into coreos:master Feb 4, 2021
@cgwalters
Copy link
Member Author

Looks like this fails in Koji:

https://kojipkgs.fedoraproject.org//work/tasks/892/61310892/build.log

 -- Configuring incomplete, errors occurred!
  See also "/builddir/build/BUILD/rpm-ostree-2021.1.99.gae3392ff/target/release/build/libdnf-sys-c03445fbdba92219/out/build/CMakeFiles/CMakeOutput.log".
  --- stderr
  Running CMake on libdnf...
  CMake Deprecation Warning at CMakeLists.txt:8 (cmake_policy):
    The OLD behavior for policy CMP0005 will be removed from a future version
    of CMake.
    The cmake-policies(7) manual explains that the OLD behaviors of all
    policies are deprecated and that a policy should be set to OLD only under
    specific short-term circumstances.  Projects should be ported to the NEW
    behavior and not rely on setting a policy to OLD.
  Building libdnf version: 0.58.0
  CMake Warning (dev) at /usr/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:273 (message):
    The package name passed to `find_package_handle_standard_args`
    (LibSolv_ext) does not match the name of the calling package (LibSolv).
    This can lead to problems in calling code that expects `find_package`
    result variables (e.g., `_FOUND`) to follow a certain pattern.
  Call Stack (most recent call first):
    /usr/share/cmake/Modules/FindLibSolv.cmake:76 (FIND_PACKAGE_HANDLE_STANDARD_ARGS)
    CMakeLists.txt:54 (find_package)
  This warning is for project developers.  Use -Wno-dev to suppress it.
  You have changed variables that require your cache to be deleted.

Looking.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants