-
Notifications
You must be signed in to change notification settings - Fork 96
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create .changes files for reproducible builds #3062
Create .changes files for reproducible builds #3062
Conversation
9e8917a
to
31e121c
Compare
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## master #3062 +/- ##
==========================================
- Coverage 28.31% 28.30% -0.02%
==========================================
Files 86 86
Lines 14799 14807 +8
==========================================
Hits 4191 4191
- Misses 10608 10616 +8 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this only needed for 000release-packages? If so, this probably fits better into
openSUSE-release-tools/pkglistgen/tool.py
Line 746 in 84b76ec
self.commit_package(release_dir) |
pkglistgen/tool.py
Outdated
date = datetime.strptime(product_version, "%Y%m%d") | ||
except ValueError: | ||
logging.warning( | ||
"Warning: This should only happen in staging. Can not parse product version %s as date, " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is also used for ALP, BCI, Leap, SLE, ... which do not use a date as version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a problem...
Like this PR is now, AFAIK the changes file gets copied when running PRODUCT_SERVICE to each release package, afterwards it is only used in 000release-packages. The current date, meaning wall-clock, can not be used as it is not reproducible, meaning know from source instead of depending on when the build was started. |
In all fairness: reproducibility on a package that changes its version daily sounds like not exactly something we should care for :) (at least on TW - Leap/SLE might be different and have a few days of the same files) |
Right, we can just use the wall-clock of generation time until the move to git when we can use a date from a commit. |
i.e. you want to make the pkglistgen output reproducible and not just the generated package sources?
As that way there will not be any change commited if the generated output does not change. |
31e121c
to
3ba6f3a
Compare
Ah it is also easy to get the date of the last commit to 000package-groups from OBS. Like this ok? |
d3dbe51
to
7c68a33
Compare
@@ -717,6 +720,14 @@ def update_and_solve_target( | |||
if not product_version: | |||
# for stagings the product version doesn't matter (I hope) | |||
product_version = '1' | |||
with open(product_dir + "/release.changes", 'w', encoding="utf-8") as f: | |||
date = package_source_changed(api.apiurl, project, group).replace(tzinfo=timezone.utc) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO the changelog date should really state when the change was done and that's datetime.now()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why?
AFAIK it was not done now, only the build was done now. It may also have been rebuilt resulting in no change if the input was the same. So I think it was done when the source (the group package) was commited (or earlier). I feel it is similar to that we don't change the changelog when we rebuild rpms. I do think we gain no information by storing when it was built.
The idea of reproducible builds is that the build date doesn't matter as the output will be the same no matter at what time the build is done. The advantage of reproducible builds is that to security review you only review the source and not the build output, to verify the build output you only need to bit wise compare it to a rebuild. Embedding $now would prevent that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#3062 (comment): Only if a change was actually performed, add a changes entry which contains datetime.now()
.
package_source_changed(api.apiurl, project, group)
is constant except for manual changes, so the current code will end up using a backdated changelog entry, resulting in mtimes which don't really make sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mh. Instead of just the date any input to the release package changed, a different date that seems useful to me is when any source package in the whole distro last changed. Sadly that is probably too work intensive to compute with OBS-vcs, but it hopefully will be easy when we move to git.
Your suggestion to only change the date when the output changes, but then to $now does not make sense to me, because that is neither of those and non-reproducible (by the definition of https://reproducible-builds.org/ ).
Any date is always sort of backdated due to latency until being observed. Latency from one CPU sending to it being observed, e.g. by an output device like a disk or another CPU core in a multi core machine. Time is not intuitive like you imply. So IMHO the date being backdated, doesn't make it make less sense, even when it is further back by magnitudes.
It seems to me that it being reproducible and knowing when a human with osc commit triggered the change and defined the full outcome of the computation has more value than knowing what clock the CPU perceived when it did the computation.
I don't know of any case where using $now here would actually break something, it just hinders making this build step reproducible, but at least not the next ones.
If you prefer I can change it to always $now and when we move to any distro from obs-vcs to git, we can discuss again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #3064 for the alternative with $now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your suggestion to only change the date when the output changes, but then to $now does not make sense to me, because that is neither of those and non-reproducible (by the definition of https://reproducible-builds.org/ ).
It is perfectly reproducible as the date is saved. When I do osc vc
, the same happens and it's fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There I'm not talkling about the build output of the release rpm, but the source of it, which is the build output of this script.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As pointed out in #3064 (comment) I missed that some of the input is from 000update-repos . I added that. It seems 000product is not an input. Did I miss anything else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the binary list of the FTP tree is part of the input. Any name that disappears is added to the weakremovers.inc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah it gets that from the solv modules. sigh. Then there is no easy way around it. Thank you.
Note for which ever solution is picked: the %changelog in the spec.in files in 000package-groups probably needs to be removed before the following run. |
release rpm files currently do not have .changes files, but these are needed to extract SOURCE_DATE_EPOCH for reproducible builds. Use the newest time stamp of the last commit of all inputs as the date for the changes entry. All the inputs are are in the packages 000package-groups or 000update-repos.
7c68a33
to
8efe201
Compare
release rpm files currently do not have .changes files, but these are needed to extract SOURCE_DATE_EPOCH for reproducible builds.
Use the time stamp of the last commit to the 000package-groups
package as the date for the changes entry.