Reproducible builds #447

Closed
morozov opened this issue Dec 9, 2019 · 7 comments · Fixed by #1077

@morozov

morozov commented Dec 9, 2019

Feature Request

If box compile is run multiple times on the same sources using the same configuration, it produces different results:

diff --git a/phpbrew b/phpbrew
index c79c4a49..6283320d 100755
--- a/phpbrew
+++ b/phpbrew
@@ -7,1028 +7,329 @@
  * @link https://github.com/humbug/box
  */

-Phar::mapPhar('box-auto-generated-alias-817b78bd5c84.phar');
+Phar::mapPhar('box-auto-generated-alias-8598625848b9.phar');

-require 'phar://box-auto-generated-alias-817b78bd5c84.phar/bin/phpbrew';
+require 'phar://box-auto-generated-alias-8598625848b9.phar/bin/phpbrew';

 __HALT_COMPILER(); ?>

One of the factors is the usage of random_bytes() to generate the stub alias:

https://github.com/humbug/box/blob/efe97a7d48169d01b776cdfb1d43608cad3f13c2/src/Configuration/Configuration.php#L853-L855

When it comes to automation, it's important that the same sources produce the same artifacts in order to not build the same things over and over. See https://reproducible-builds.org/ for more details.

Instead of using random bytes, Box could use an approach similar to the one used in Composer and hash all the sources used during the build.
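
A minimal sketch of that idea in shell, not Box's actual implementation: hash the contents of every source file in a fixed order and use a prefix of the digest as the alias. The function name `deterministic_alias` is hypothetical, the 12-character length only mirrors the aliases in the diff above, and the script assumes GNU coreutils (`sha256sum`) and file names without newlines.

```shell
#!/bin/sh
# Sketch: derive a stable stub alias from file contents instead of random bytes.
# Assumptions: GNU coreutils (sha256sum), no newlines in file names.
deterministic_alias() {
    src_dir=$1
    hash=$(
        cd "$src_dir" || exit 1
        # Sort paths bytewise so the order does not depend on the
        # filesystem or the current locale.
        find . -type f | LC_ALL=C sort | while IFS= read -r path; do
            # Emits "<content hash>  <relative path>" for each file.
            sha256sum "$path"
        done | sha256sum | cut -c1-12
    ) || return 1
    printf 'box-auto-generated-alias-%s.phar\n' "$hash"
}
```

Running the function twice on an unchanged tree prints the same alias, while changing, adding, or renaming any file changes it.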

@theofidry
Member

theofidry commented Dec 9, 2019

I think there is more to it: if I recall correctly, the date of creation might be added somewhere as a header or metadata, which is also what Composer does. The hash you are pointing at is for the lock file only, not the Composer PHAR itself, which is not reproducible either. That said, I'm not against the idea of switching to reproducible builds by default.

It also means that when using PHP-Scoper, the prefix used should be deterministic as well.

@morozov
Author

morozov commented Dec 9, 2019

> I think there is more to it: if I recall correctly, the date of creation might be added somewhere as a header or metadata […]

Definitely. The rest is also different but this is the most obvious difference.

> The hash you are pointing at is for the lock file only, not the Composer PHAR itself which is not reproducible either.

Of course. This is just an example of how the contents are hashed (i.e. specific files in a specific order, etc.). What I was trying to say is that in Composer, the same configuration that results in the same dependency resolution will produce the same lock hash, effectively making the process of dependency resolution reproducible:

$ rm -rf vendor composer.lock
$ composer update
$ grep content-hash composer.lock
    "content-hash": "7f80fc47150ea0d769c4a2797c070a40",

$ rm -rf vendor composer.lock
$ composer update
$ grep content-hash composer.lock
    "content-hash": "7f80fc47150ea0d769c4a2797c070a40",

@theofidry
Member

/cc @ondrejmirtes

@ondrejmirtes

I've managed to get to a 100% reproducible build 🎉 This is the recipe:

  • PHP-Scoper needs to have a stable prefix instead of a generated one (https://github.com/humbug/php-scoper/blob/master/docs/configuration.md#prefix).
  • Before running composer install for the project, the environment variable COMPOSER_ROOT_VERSION needs to be set to a stable string; otherwise the ever-changing Git commit will become part of the PHAR.
  • Composer's autoloader-suffix also needs to be set to something stable, like this: composer config autoloader-suffix PHPStanChecksum
  • And the weirdest part - to stabilize timestamps of files included in the PHAR and to always have the same PHAR signature, the following trick needs to be used: phpstan/phpstan-src@cb1a1f4
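
One way to check a recipe like this is to build twice from the same sources and compare the results byte for byte. The following is a minimal sketch under stated assumptions: `check_reproducible` is a hypothetical helper, not part of Box or Composer, and the version string and artifact path in the usage comment are placeholders.

```shell
#!/bin/sh
# Sketch: run a build command twice and compare the artifacts byte for byte.
# Returns 0 if both builds are identical, 1 if they differ, 2 if a build fails.
check_reproducible() {
    build_cmd=$1   # command line that produces the artifact
    artifact=$2    # path of the file that command writes
    reference=$(mktemp) || return 2

    # First build: keep a copy to compare against.
    if ! sh -c "$build_cmd" || ! cp "$artifact" "$reference"; then
        rm -f "$reference"
        return 2
    fi
    # Second build from the same sources and configuration.
    if ! sh -c "$build_cmd"; then
        rm -f "$reference"
        return 2
    fi

    cmp -s "$reference" "$artifact"
    status=$?
    rm -f "$reference"
    return "$status"
}

# Usage sketch (version string and artifact path are placeholders):
#   export COMPOSER_ROOT_VERSION=1.0.0
#   composer config autoloader-suffix PHPStanChecksum
#   composer install
#   check_reproducible 'box compile' phpstan.phar && echo "reproducible"
```

Comparing with `cmp` rather than a checksum avoids depending on a particular hashing tool and catches any difference, including timestamps and signatures embedded in the PHAR.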

@theofidry
Member

Revisiting this issue.

From this, I think there are two action items:

  • Adding a doc entry about reproducible builds
  • The last point should be added within Box

I am also wondering whether that is everything. Are you using your own stub as well? Otherwise I would expect the standard stub to be a problem (due to the random alias).

@ondrejmirtes

I had to look up what a "stub file" is in the context of a PHAR file :) There's no specific stub setting in my Box configuration (https://github.com/phpstan/phpstan-src/blob/d194a471f9a88d4da0ae756c6664b008cf48b03c/compiler/build/box.json), and if I inspect the PHAR file to see what the stub looks like, there's nothing random in it:

#!/usr/bin/env php
<?php

Phar::mapPhar('phpstan.phar');

require 'phar://phpstan.phar/bin/phpstan';

__HALT_COMPILER(); ?>

@theofidry
Member

Aha, yes, but you have alias set, which is what replaces the random alias in the stub. That's fine; I just need to not miss it in the docs.
