Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pin git repos to specific revisions #499

Closed
wants to merge 1 commit into from

Conversation

strugee
Copy link
Contributor

@strugee strugee commented Dec 14, 2018

Fixes upstreams potentially being able to break our builds, and also makes builds more deterministic which helps with reproducible builds.

Fixes #498

@tlaurion
Copy link
Collaborator

tlaurion commented Feb 1, 2019

git commits are still the latest.

@tlaurion
Copy link
Collaborator

tlaurion commented Feb 8, 2019

@strugee :

You can now run IASL ACPI compiler from /builds/tlaurion/heads-ci/heads/build/coreboot-4.8.1/util/crossgcc/xgcc.
make[1]: Leaving directory '/builds/tlaurion/heads-ci/heads/build/coreboot-4.8.1'
#echo '******* Building crossgcc-arm (this might take a while) ******'
#make -C "/builds/tlaurion/heads-ci/heads/build/coreboot-4.8.1" crossgcc-arm
2019-02-08 02:22:49+00:00 CONFIG coreboot
git clone https://github.com/GregorR/musl-cross --branch abf1dd54ac05f5b2d34d750027fa083b870e069e "/builds/tlaurion/heads-ci/heads/build/musl-cross-git"
Cloning into '/builds/tlaurion/heads-ci/heads/build/musl-cross-git'...
fatal: Remote branch abf1dd54ac05f5b2d34d750027fa083b870e069e not found in upstream origin
make: *** [Makefile:355: /builds/tlaurion/heads-ci/heads/build/musl-cross-git/.canary] Error 128
ERROR: Job failed: exit code 1

output src
src banch

Fixes upstreams potentially being able to break our builds, and also
makes builds more deterministic which helps with reproducible builds.

Fixes linuxboot#498
@strugee
Copy link
Contributor Author

strugee commented Feb 11, 2019

@tlaurion weird, I could have sworn this worked when I filed the PR... in any case it should be fixed now; I just force-pushed an amended commit and I'm running a test build on my laptop right now. Although now that I look at this PR again I wonder if it's better to use submodules?

@strugee
Copy link
Contributor Author

strugee commented Feb 11, 2019

Test build successfully cloned the musl-cross repository.

@tlaurion
Copy link
Collaborator

tlaurion commented Feb 28, 2019

@strugee: I'm wondering if pinning is the right direction for submodules generally.

For example, Nitrokey/nitrokey-hotp-verification#5 will be merged soon and it will require from Heads maintainers to validate that commit ids are the latest.

A better approach to me would be to validate that the building of modules occurs on top of git pulls to be on the latest versions for each git dependent submodules. I would prefer this to the current proposed approach, so that multiple CI builds can confirm the reproducibility of a Head's commit id for a specific produced rom.

For example, coreboot will probably depend on a fixed commit ID when measured boot will be merged for all Head's supported boards, until a new coreboot release is made including those changes.

What are your thoughts about it?

@tlaurion
Copy link
Collaborator

tlaurion commented Mar 6, 2019

@strugee Nitrokey hotp updated.

I still think this is not optimal strategy.

@strugee
Copy link
Contributor Author

strugee commented Mar 6, 2019

Hey, I'm sorry for the delayed response! It's midterms season for me at college so I've been kind of swamped. In any case:

A better approach to me would be to validate that the building of modules occurs on top of git pulls to be on the latest versions for each git dependent submodules. I would prefer this to the current proposed approach, so that multiple CI builds can confirm the reproducibility of a Head's commit id for a specific produced rom.

I'm having trouble parsing this sentence, can you rephrase?

The basic problem I see with what's in tree now is that there isn't a 1:1 correspondence between a Heads commit id and the resulting build artifact. Clearly that is problematic for reproducible builds because "built from xyz commit" is not enough to start confirming that a given binary was legitimately built from the original source code. There's other ways to work around that (e.g. we could embed the commit ids of the submodules in the final result, so you could extract them and use that information to reconstruct the source tree) but reproducible builds aren't the only problem. Having such a relation between commit id and binary has lots of other advantages I can think of too. E.g.:

  • git bisect works (better)
  • CI can't break because an upstream broke something and the new commits suddenly got picked up
  • Contributors can't run into "works on my laptop!" problems because everyone's using mismatched code
  • Probably some others I haven't thought of, idk I'm really tired at the moment

Note too that a lot of these problems aren't problems immediately, but manifest over time as upstreams add new commits.

I wonder if it would help if I added some automation to this PR? I can write a script that automatically checks that the vendored commit ids aren't out of date. We could even set up something where that script is run, the build is smoketested (since AFAICT there isn't anything beyond smoketesting in-tree at this juncture?), and the results are committed automatically.

It's possible I am completely missing something though so if I am let me know.

@strugee
Copy link
Contributor Author

strugee commented Mar 6, 2019

That all being said if we do end up wanting to go with a pinning approach I'm wondering if we should just nuke a lot of this custom code and directly use git submodules instead.

@tlaurion
Copy link
Collaborator

tlaurion commented Mar 14, 2019

@strugee : I'll try to be clearer.

Right now, if a clone is done for a specific module, it isn't verified to be the latest, nor forced pulled again on a subsequent build, layaing there forever. I think the problem lies there more then anything else, letting builds be different from one machine to another, depending on what commit id was pulled into the modules being build.

Making sure that pulls are done for each git dependent module on each make, would, IMOHO, resolve the reproducibility issues for a specific Heads commit id. Am I clearer?

I agree that some tracing of those commit could be added (maybe in the ./build/$BOARD/hashes.txt?), but a make real.clean clears any differences that could right now happen when differences are observed between what should be reproducible builds and what is actually built.

@tlaurion
Copy link
Collaborator

tlaurion commented Mar 20, 2019

@strugee
This block in main Makefile is what I consider problematic, i'm not a really good Makefile artist.

To create the .canary if it doesn't exist, we clone the repo once, after which we apply the patches, only once.

I do not know how to implement this. But a better way would be, if the destination directory exists, to validate if it's on the latest commit and pull the changes.

Thoughts?

@tlaurion
Copy link
Collaborator

#618

@osresearch
Copy link
Collaborator

@strugee you're absolutely right that "there should be a 1:1 correspondence between a Heads commit id and the resulting build artifact". Anything that breaks that is a bug that needs to be fixed.

I'm not a fan of git submodules, especially on things that have massive dependencies like the Linux kernel. This wouldn't be too bad if you could clone a specific hash, but shallow checkouts can't get anything other than the latest commit: https://twitter.com/qrs/status/956648929394905089

✓ git checkout --depth 1 --branch $BRANCH $URL
✓ git submodule add --branch $BRANCH  $URL
✓ git submodule add --depth 1 $URL
✘ git submodule add --depth 1 --branch $BRANCH $URL
"cannot update paths and switch to branch at the same time"

For things that are hosted on github or kernel.org, we can use the technique in #618 to download a specific version as a tar file and then both verify the sha256 of it as well as pin to a specific tag. This seems like a lower overhead way to handle it. Essentially we bypass all of git, unless you really want to work with a current head (which should be doable for devs, but not allowed in committed code).

@osresearch
Copy link
Collaborator

As of #618 all of the git checkouts are now pinned. And the recent #650 patch adds a pinned version of musl-cross-make as well.

@osresearch osresearch closed this Jan 9, 2020
@strugee
Copy link
Contributor Author

strugee commented Jan 15, 2020

Wonderful, that's great to hear. My apologies for being unresponsive!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Pin musl-cross-git to a particular commit for reproducible builds
3 participants