Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

crate_universe bootstrapping #650

Closed
sayrer opened this issue Mar 27, 2021 · 13 comments
Closed

crate_universe bootstrapping #650

sayrer opened this issue Mar 27, 2021 · 13 comments

Comments

@sayrer
Copy link
Contributor

sayrer commented Mar 27, 2021

I read #598, and I have some experience with this. What I currently do is run a shell script called "update_bazel.sh" that looks like this for cargo-raze:

#!/usr/bin/env bash

This command seems to require network access no matter what:

cargo generate-lockfile && \ 

The following command can be made to run in offline mode (which is not always offline) in the sandbox with a private registry created after the step above. You'll get a bunch of .crate files in the directory of your choice. To get this command to run as a genrule, you'll need to define RUSTC and LD_RUN_PATH env variables, the values for which you can track down this way. rustc needs some neighboring .so files to run this command, so I did LD_RUN_PATH=$$(dirname $(location :rustc))/../lib.

However, this still won't work, because cargo metadata assumes it can write files wherever it wants. I think it could probably be made to work if you could predict its output in a .bzl file, but I just gave up after I saw #598.

cargo metadata > metadata.json && \ 

If run on the host machine, cargo metadata inserts host machine absolute paths. So I did this awful thing:

sed  -i "s#$(pwd)#CARGO_CRATE_PATH#g" metadata.json && \

Then I use Bazel to run cargo-raze

bazel run @cargo_raze//:raze -- --manifest-path=$(realpath Cargo.toml)

I later reversed the sed to run cbindgen as a genrule:
https://github.com/grafica/crustls/blob/15152596ec726f3dc9389c5668672f544b6b783e/BUILD.bazel#L98

After all this, I did get a build that works with RBE, and running cargo metadata remotely almost worked.

cc: @illicitonion @gibfahn @hlopko @UebelAndre

@UebelAndre
Copy link
Collaborator

I'm also working on something but it's using cross. my thought is that users would be able to run a script to update binaries but CI would rebuild and assert that the outputs are exactly the same, ensuring no funny business was checked in. I have a good feeling this will work out since cross builds the binaries in docker so we should have great control over the outputs.

This command seems to require network access no matter what:

cargo generate-lockfile && \ 

You can pass --offline here but I'd expect that not work in cases where the local index is not updated.

Then I use Bazel to run cargo-raze

bazel run @cargo_raze//:raze -- --manifest-path=$(realpath Cargo.toml)

I think it'd be great to be able to build this in Bazel, if you get something working with Cargo raze I'd love to see it 😄 Though I think it'll be a bit challenging with it's dependencies on things like openssl, just a heads up 😅

@sayrer
Copy link
Contributor Author

sayrer commented Mar 27, 2021

I think it'd be great to be able to build this in Bazel, if you get something working with Cargo raze I'd love to see it 😄 Though I think it'll be a bit challenging with it's dependencies on things like openssl, just a heads up 😅

This is what it takes :)

https://github.com/grafica/dockerfiles/blob/main/ubuntu_20_04_cargo_raze/Dockerfile

This ran on BuildBuddy's RBE. (edit: you also need a close to main branch copy of @rules_foreign_cc because they just fixed some CMake bugs.)

@UebelAndre
Copy link
Collaborator

This is what it takes :)

Wait, you built crate_universe_resolver there?

@sayrer
Copy link
Contributor Author

sayrer commented Mar 27, 2021

No, built and ran cargo-raze with that specified as the platform. Something like this: https://docs.buildbuddy.io/docs/rbe-setup/#platforms. Once I found bazel-contrib/rules_foreign_cc#573 (from you, thanks), I just picked through the make/CMake errors until it compiled. So, I didn't get it all building with Bazel, I just created a Docker image that can build it with RBE using all the legacy stuff via @rules_foreign_cc.

Here's the build (I think with a 30-day expiry): https://app.buildbuddy.io/invocation/a00fa3a8-9067-4cd2-a475-bfcc36bbffee

//:docker_image_platform in the logs is the Dockerfile above.

I haven't tried things with #635 fixed, so it's possible that I won't need gcc/g++ in the Docker image, but I am not sure.

@illicitonion
Copy link
Collaborator

Super excited to see this conversation happening!!

A few general things to note:

  • crate_universe_resolver has no interesting deps other than those shared with cargo-raze, so if we get cargo-raze building, crate_universe_resolver should be a simple extension.
  • As far as I know the only complex dep chain is: crate_universe_resolver -> cargo-raze -> crates-index -> git2 -> libgit2-sys -> libssh2-sys -> openssl-sys. If we could drop cargo-raze's dep on crates-index, or drop crates-index's dep on git2, life would be a lot simpler (almost everything would then be pure rust, or trivial C). That dep chain is used to update a local crate index, and fetch the sha256s of crates. I believe @UebelAndre has made the Cargo.lock support in cargo-raze sufficient such that that dep is only actually used if you have extra binary crates, which crate_universe_resolver doesn't support, so possibly we could hide that behind a feature in cargo-raze to make those dependencies optional (or just delete it)?
  • @hlopko mentioned an interested in a vendored mode - that could potentially help with bootstrapping - if we have all of our deps checked in, we don't need to resolve them!
  • I know that we have Mac, Windows, and Linux CI builders, so if cross-compiling is proving problematic, we could possibly try to do multiple single-platform builds on CI somehow.

I've generally been imagining that we'd have CI build the binaries and publish them to Google Cloud Storage, and update a file which lists URLs and sha256s. I think that having a script which builds the binaries, and either verifies that they're up to date, or saves them somewhere, is a great direction to go - whether we have contributors run that locally and have CI verify they're up to date, or have CI somehow automatically do the update for you, we can work out once we have the script. I think it's also OK if local development involves a slightly different flow than CI.

In terms of the general approach, I've in the past gotten crate_universe_resolver building with Bazel individually on each of Linux and Mac, but not tried cross-compiling. IIRC this mostly meant pulling in a static openssl dep using rules_foreign_cc, and pointing libssh2 at that built openssl, and bootstrapped using the previous commit's crate_universe_resolver - this was a bit of a pain because it meant breaking changes were fiddly, but also, keeping ourselves honest with that (once we have stable APIs) maybe is ok.

MacOS is always the special corner, but as we have CI running MacOS, worst-case just building binaries there shouldn't be the end of the world...

@UebelAndre
Copy link
Collaborator

* As far as I know the only complex dep chain is: `crate_universe_resolver` -> `cargo-raze` -> `crates-index` -> `git2` -> `libgit2-sys` -> `libssh2-sys` -> `openssl-sys`. If we could drop `cargo-raze`'s dep on `crates-index`, or drop `crates-index`'s dep on `git2`, life would be a lot simpler (almost everything would then be pure rust, or trivial C). That dep chain is used to update a local crate index, and fetch the sha256s of crates. I _believe_ @UebelAndre has made the Cargo.lock support in cargo-raze sufficient such that that dep is only actually used if you have extra binary crates, which `crate_universe_resolver` doesn't support, so possibly we could hide that behind a feature in `cargo-raze` to make those dependencies optional (or just delete it)?

I mean, it doesn't get used but it's still compiled 😅. I opened google/cargo-raze#305 to hopefully just delete it because I couldn't find a super clean way to section off the functionality with a feature. But I've since just made a branch on my fork of cargo-raze and cut out all those dependencies. We aren't using them here and would hard assert that we should not. Whatever information we get from cargo metadata is what we should use. And to prevent a detour in the conversation, I'm not currently prepared to discuss anything about cutting out cargo entirely that's what raze and universe are using so that's what we should continue to rely on until we've answered enough questions about this implementation.

* @hlopko mentioned an interested in a vendored mode - that could potentially help with bootstrapping - if we have all of our deps checked in, we don't need to resolve them!

Worth a discussion but I'd like to first get to a point where users can use the rules and have it just work. My focus is on bootstrapping and I don't think vendoring works here. We need a compiled binary for the repository rule so something has to have done that work before Bazel starts up so just having the sources there doesn't really work IMO since I don't think it's acceptable to have the repository rule try to build the binary.

I might have something working by tomorrow but it's going to be using cargo for bootstrapping. I think this is acceptable given the experimental state we're in but I also feel we should have no issues using cargo for bootstrapping. The next step would be to have the resolver build another resolver, which is similar to what I did for cargo-raze.

@sayrer
Copy link
Contributor Author

sayrer commented Mar 28, 2021

I mean, it doesn't get used but it's still compiled

If that whole dependency chain is in there for a case that wouldn't really apply in Bazel, perhaps we could get the Cargo folks to add a feature for us. Agree that eliminating that stuff would make this problem much easier.

@illicitonion
Copy link
Collaborator

/cc @acmcarther - How would you feel about deprecating, deleting, or making optional, support for binary deps in cargo-raze?

I put together a couple of potential branches:

Just conditionaling out the crate-index dep: https://github.com/illicitonion/cargo-raze/tree/crate-index - binary deps in theory work but won't have checksums

Conditionaling out the whole binary deps feature: https://github.com/illicitonion/cargo-raze/tree/binary-deps - allows us to also conditional out a dep on cargo-clone-crate which pulls in reqwest which in turn depends on openssl by default on some platforms (and cargo-clone-crate doesn't allow you to configure how it depends on rustls, though probably could/should :))

@UebelAndre
Copy link
Collaborator

/cc @acmcarther - How would you feel about deprecating, deleting, or making optional, support for binary deps in cargo-raze?

So, while @acmcarther is the code-owner, I'm the author of that feature and advocate of removing it. I already have a branch with that feature totally isolated behind a feature. I only did this today and haven't had time to split it out into anything I'd expect to get merged. This is a non-blocker for the time being since I can continue to work off my branch.

Though, to directly address the question, I don't think we can totally remove the feature until we have a solution for the wasm-bindgen rules which rely on the binary_deps feature. google/cargo-raze#218 has some context on this.

@UebelAndre
Copy link
Collaborator

I have a working solution for bootstrapping. I still need to run some tests for the Buildkite CI builds but I'm able to build binaries for various platforms in a way that makes the resolver available to the rule. I'll try to iron out a few more things then open a PR soon.

@UebelAndre
Copy link
Collaborator

I was able to open #663 but am working with @hlopko and @illicitonion to improve the solution I came up with.

@UebelAndre
Copy link
Collaborator

I'm not sure if this is resolved but we do have some sort of "bootstrapping" for crate_universe now. You can currently look at for crate_universe-# releases (eg crate_universe-8) which contains a snippet for a crate_universe_defaults.bzl file which can be copied into your workspace.

This method though is subject to change since the rule is still extremely experimental.

@UebelAndre
Copy link
Collaborator

Closing this since the implementation for crate_universe has changed substantially since #1158

If there are issues with the new implementation lets track them in new tickets.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants