systemd v250#150491

Closed
andir wants to merge 12 commits into NixOS:master from andir:systemd-250

@andir
Member

@andir andir commented Dec 13, 2021

Motivation for this change

Prepare the systemd upgrade to version 250.

Once the final release of systemd v250 has been made and our confidence in our packaging and system configuration is high enough, I will rebase this onto staging with the final version instead of the RC.

I intend to write proper commit messages for all the changes included in this WIP PR once I get to it.

Some of the changes get rid of rather old, carried-forward Bash scripts that patch various paths in the systemd expression. It turned out that a few of those were no longer required, wouldn't match anything, or didn't actually catch all the cases. I've written a bit of check code for each of the cases (which we can now declare in Nix rather than in Bash) that ensures that we aren't trying to patch non-existent files or missing a newly introduced (obvious) case. This is in line with how we deal with the shared libraries mechanism that I introduced some time ago.
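
As a rough illustration of the kind of check described above: the real check is declared in Nix inside the systemd expression, so the Python below is only a sketch of the idea. The function name `check_path_patches`, its arguments, and the three result lists are hypothetical and do not come from the PR.

```python
from pathlib import Path


def check_path_patches(src_root, declared, needle):
    """Compare declared patch targets against what the source tree contains.

    declared: paths (relative to src_root) that we intend to patch.
    needle:   the hard-coded string the patch is supposed to replace.

    Returns three sorted lists:
      missing    - declared files that do not exist in the tree
      stale      - declared files that exist but no longer contain needle
      undeclared - files containing needle that nobody declared
    """
    root = Path(src_root)
    declared = set(declared)

    missing = sorted(p for p in declared if not (root / p).is_file())
    stale = sorted(
        p
        for p in declared
        if (root / p).is_file()
        and needle not in (root / p).read_text(errors="replace")
    )

    # Scan the whole tree so newly introduced occurrences are not overlooked.
    found = {
        f.relative_to(root).as_posix()
        for f in root.rglob("*")
        if f.is_file() and needle in f.read_text(errors="replace")
    }
    undeclared = sorted(found - declared)
    return missing, stale, undeclared
```

A build could fail whenever any of the three lists is non-empty, which would correspond to the "patching a non-existing file" and "missing a newly introduced case" situations mentioned above.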

As usual I've a jobset on my personal hydra instance for this PR: https://hydra.h4ck.space/jobset/nixpkgs/systemdv250-small (ipv6-only)
Instructions for the binary cache are on the front page of that instance.

The rough steps that still have to be taken are:

  • Go through the changes and add / update option changes for
    • systemd itself
    • networkd
    • any other parts such as udev etc.
  • build the small nixos test suite as initial smoke test (ongoing)
  • build the full nixos test suite
  • test on personal machines / servers as "canary deployments"
    • tested GNOME on a notebook
    • tested on some server
  • investigate / report systemd-boot issues with secure boot enabled. @andir & @NickCao did confirm this issue. Secure boot doesn't work with both our and the Arch binaries.
Things done
  • Built on platform(s)
    • x86_64-linux

@andir andir requested a review from flokli December 13, 2021 00:03
@github-actions github-actions bot added the 6.topic: systemd Software suite that provides an array of system components for Linux operating systems. label Dec 13, 2021
Member Author

Note: systemd now only supports either gold or bfd as value here. We should test cross compiling.

Member

This definitely breaks cross-compilation without any target prefix. We might need a wrapper when systemd does support a full name here.

Member

Since those values are directly passed to binutils, they should not break when cross compiling: https://github.com/systemd/systemd/pull/21264/files#diff-ba74c03f6f96be2712bae81ef72761cea2ec521483d82e64e1eb174809eb84a2R259

@ofborg ofborg bot requested review from edolstra and kloenk December 13, 2021 00:55
@ofborg ofborg bot added 11.by: package-maintainer This PR was created by a maintainer of all the package it changes. 10.rebuild-darwin: 101-500 This PR causes between 101 and 500 packages to rebuild on Darwin. 10.rebuild-linux: 501+ This PR causes many rebuilds on Linux and should normally target the staging branches. 10.rebuild-linux: 5001+ This PR causes many rebuilds on Linux and must target the staging branches. labels Dec 13, 2021
Member

Does someone still care about these?

Member

If this is about musl support, that was recently introduced.

We should probably highlight @yu-re-ka to take a look if things still apply, but as per #141980 (comment), it shouldn't block this PR.

Member

@alyssais alyssais Dec 15, 2021

Patches no longer apply. :(

For the last major systemd release it looks like it took OpenEmbedded about a week to catch up, so hopefully we won't be waiting too long.

But again, no need to block the PR for Musl.

(Edit: here's the place to watch)

@mweinelt mweinelt mentioned this pull request Dec 24, 2021
13 tasks
@andir andir force-pushed the systemd-250 branch 3 times, most recently from 147d0bb to 948d620 Compare December 25, 2021 12:19
@Mic92
Member

Mic92 commented Dec 26, 2021

Tested. Works for me on my laptop.

We don't have to do that as we already set all the feature flags to
null. Setting individual libraries to null instead of disabling their
feature flag sets a bad example that will cause each of the
features to be disabled with multiple flags in the systemdMinimal
variant.

If a dependency is pulled in via another feature we should disable that
rather than setting it to null. Overriding a given package should be the
last resort.

This allows us to make test-only dependencies optional in builds that
aren't running tests (sadly all of our builds).
@andir
Member Author

andir commented Dec 31, 2021

Tested. Works for me on my laptop.

You aren't running with systemd-boot, are you? For me this branch killed systemd-boot (it doesn't load anymore).

NOTE: Never mind, somehow my Secure Boot signing key didn't sign the new bootloader. A bug, but somewhere else in the stack :)
NOTE2: It also boots up on my GNOME notebook without any issues so far. I'll continue using this, but perhaps this has been the least involved systemd bump in a while! 🥳

@andir
Member Author

andir commented Dec 31, 2021

Tested. Works for me on my laptop.

@Mic92 one issue that I ran into: My system time gets reset to Jan 01 1980. I've seen that in random nixos tests but wasn't sure if that was a previous impurity.

Since systemd commit ce4121c6ff92c1c368874bd451b73fa9b1ddec4a the option
is no longer known and we should remove it from our expression.

This has never been a valid option as far as I can tell. I am not sure
why we started adding it in the first place.

As of systemd commit add384dd4d2b96db6ace5ad9c52b1dd7553ebec2 that
option doesn't have any effect anymore. Systemd defaults to calling its
own systemctl instead. As of systemd commit
9a85778412fa3e3f8d4561064131ba69f3259b28 that option was finally removed
from the meson_options.txt.

When initializing a system (e.g. first boot / livecd) we have no good
reference source for time. systemd-timesyncd however would revert back
to its configured fallback time (in our case 01.01.1980). Since we
probably don't want to hardcode a specific date as fallback we are now
using the current system time (wherever that might have come from) to
initialize the reference clock file.

The only systems that might be remotely affected by this change are
machines that have highly unreliable RTCs or those where the battery
that backs the RTC is running empty.

Historically these systems always had a tough time with anything time
related and likely required manual intervention.

For stateless systems (those that wipe / between reboots or our
installer CDs) this has the consequence that time will always be reset
to whatever the system comes up with on boot. This is likely the correct
time coming from an RTC. No harm done here; the situation is likely
unchanged for them.

For stateful systems (those that retain the / partition across reboots)
there shouldn't be a change at all. They'll provide an initial clock
value once in their lifetime (during first boot / after installation).
From then onwards systemd-timesyncd will update the file with the newer
fallback time (that will be picked up on the next boot).
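
The initialization described in this commit message could be sketched roughly as follows. This is an illustration only, not the NixOS module code: the function name `init_clock_file` and its `now` parameter are hypothetical, and the sketch assumes that the reference clock is a file whose modification time carries the timestamp.

```python
import os
import time
from pathlib import Path


def init_clock_file(path, now=None):
    """Create the reference clock file once, stamped with the current time.

    Returns True if the file was created and False if it already existed.
    An existing file is left untouched so that a newer timestamp written
    later is never overwritten by this initialization step.
    """
    clock = Path(path)
    if clock.exists():
        return False

    clock.parent.mkdir(parents=True, exist_ok=True)
    clock.touch()

    # Use the current system time (wherever it came from) instead of a
    # hard-coded fallback date.
    stamp = time.time() if now is None else now
    os.utime(clock, (stamp, stamp))
    return True
```

On a stateful system this would take effect once, on first boot; on a stateless system the file is gone after every reboot, so it would be recreated from whatever time the system comes up with.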
@github-actions github-actions bot added 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 8.has: module (update) This PR changes an existing module in `nixos/` labels Dec 31, 2021
@AndersonTorres
Member

I am tracking Meson 0.60 adoption at #153082. New meson complains about invalid options.

@NickCao
Member

NickCao commented Jan 2, 2022

I just saw that we are building in developer mode. Does that have any implications?

@flokli
Member

flokli commented Jan 4, 2022

Linking to systemd/systemd#21964, which might cause a regression in some setups.

From the linked wireguard ML PSA:

This means that if you're currently using systemd-networkd v250 with
0.0.0.0/0 or ::/0 or similar in your allowed IPs, those allowed IPs
will be automatically added to the main routing table, which might
prove problematic for folks who are already manually doing fancy
fwmark things with systemd-networkd. If this applies to you, you may
want to set `RouteTable=off` explicitly.

At the moment, I suspect this mostly affects Arch Linux users who
followed fwmark instructions on their wiki.
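
To check whether a given configuration falls under the PSA quoted above, one could look for catch-all prefixes in the allowed IPs. The helper below is a hypothetical sketch and not part of systemd or WireGuard; it only flags zero-length prefixes such as `0.0.0.0/0` and `::/0`, not the "or similar" cases the PSA also mentions.

```python
import ipaddress


def catch_all_allowed_ips(allowed_ips):
    """Return the entries of allowed_ips that cover an entire address family."""
    return [
        entry
        for entry in allowed_ips
        # strict=False accepts entries that have host bits set.
        if ipaddress.ip_network(entry, strict=False).prefixlen == 0
    ]
```

If the returned list is non-empty and routing is already handled manually (for example via fwmark rules), that is the situation where the PSA suggests setting `RouteTable=off` explicitly.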

@github-actions github-actions bot added 8.has: changelog This PR adds or changes release notes 8.has: documentation This PR adds or changes documentation labels Jan 5, 2022
@ofborg ofborg bot added the 2.status: merge conflict This PR has merge conflicts with the target branch label Jan 5, 2022
@Foxboron

Foxboron commented Jan 6, 2022

@andir
Can you upload the failing bootloader?

I suspect you are encountering this: systemd/systemd@12caf72

Might not be an actual error in the bootloader.

@Izorkin
Contributor

Izorkin commented Jan 8, 2022

An error is displayed during system startup:

[   31.884744] systemd[2461]: /nix/store/iqms6zimc01336b0s0m0a30hg3nc48pp-systemd-250.1/lib/systemd/system-generators/systemd-gpt-auto-generator failed with exit status 1.

@NickCao
Member

NickCao commented Jan 9, 2022

An error is displayed during system startup:

[   31.884744] systemd[2461]: /nix/store/iqms6zimc01336b0s0m0a30hg3nc48pp-systemd-250.1/lib/systemd/system-generators/systemd-gpt-auto-generator failed with exit status 1.

Reproduced locally, but it should not be executed in the first place, per #146497

@Mic92
Member

Mic92 commented Jan 10, 2022

By the way, I no longer experience the DNS issues with resolved.

@Mic92 Mic92 mentioned this pull request Jan 10, 2022
13 tasks
@Mic92
Member

Mic92 commented Jan 10, 2022

While reviewing #153237
I noticed that aesmd now fails with:

Jan 10 02:18:33 turingmachine systemd[75093]: aesmd.service: Failed to set up mount namespacing: File exists
Jan 10 02:18:33 turingmachine systemd[75093]: aesmd.service: Failed at step NAMESPACE spawning /nix/store/n6mxmq4cc5apw9az1cz6qmidf7w4nj93-copy-aesmd-data-files.sh: File exists

Hopefully this is also reproducible in the aesmd test.

@andir
Member Author

andir commented Jan 10, 2022

An error is displayed during system startup:

[   31.884744] systemd[2461]: /nix/store/iqms6zimc01336b0s0m0a30hg3nc48pp-systemd-250.1/lib/systemd/system-generators/systemd-gpt-auto-generator failed with exit status 1.

Reproduced locally, but it should not be executed in the first place, per #146497

This feels orthogonal. Mind opening a PR for that?

@veehaitch
Member

Hopefully this is also reproducible in the aesmd test.

The aesmd NixOS test also doesn't pass due to the same error. Unfortunately, I don't really get why.

@jonringer
Contributor

With the 22.05 release, we should probably be seeking to get this in sooner than later. About 3 months / 6 more staging-next cycles until this will be disallowed from being merged until after 22.05 branch-off

@flokli
Member

flokli commented Jan 19, 2022

About 3 months / 6 more staging-next cycles until this will be disallowed from being merged until after 22.05 branch-off

Sorry for the nit, but that's a bit of a strong word for a volunteer effort.

I'd have appreciated it if this had simply been a "We should try to get this in well before the stabilization phase". Also, this should probably apply not only to systemd, but to all of the scarier world-rebuilds with possible fallout, such as glibc/binutils/coreutils upgrades etc… IMHO, a 3 months blocking/freeze phase is probably a bit too much for a project with a 6 months release cycle.

I'm not the author of this PR, but I'm well aware @andir is doing a lot of thorough testing, even runs his own Hydra jobset (on his own Hydra instance).

@flokli
Member

flokli commented Jan 19, 2022

@andir can you upgrade this to the v250.3 systemd-stable release, and maybe target the staging branch?

@andir
Member Author

andir commented Jan 19, 2022

With the 22.05 release, we should probably be seeking to get this in sooner than later. About 3 months / 6 more staging-next cycles until this will be disallowed from being merged until after 22.05 branch-off

Lol. I'll just stop working in this. Thanks.

@andir
Member Author

andir commented Jan 19, 2022

I'll close this as I don't intend to continue working on this. I'm only working on Nixpkgs while the freezes aren't in place. Asking for an even tighter schedule, or using strong wording like here, isn't something I want to waste my time on.

@andir andir closed this Jan 19, 2022
@NickCao
Member

NickCao commented Jan 19, 2022

Is there anything else blocking this other than the sgx issue?

@flokli
Member

flokli commented Jan 21, 2022

@NickCao see above, this needs to be updated to the latest 250.x point release, and tested again.

With @andir dropping this effort, there needs to be someone else taking this over, reviewing the upstream change since v250 and properly making sure there's no new regressions.

Running all these VM tests in a jobset is also pretty hefty on compute requirements.

@Artturin
Member

@jonringer if someone is willing to work on this, could you give them access to your server? I'm sure it would be of great help.

@jonringer
Contributor

Sorry for the nit, but that's a bit of a strong word for a volunteer effort.

It was meant to be a statement of fact: https://nixos.github.io/release-wiki/Release-Process-Walkthrough.html

Release managers are volunteers too. I'm just saying that mid-April will be the cutoff point for release critical packages for the 22.05 release. The window will open back up in mid-May for breaking changes to unstable.

If changes aren't in a good state to merge, they can always take some more time. But forcing merges of "scary, ecosystem wide changes" right before a small stabilization window isn't really good for other people's mental health either.

Lol. I'll just stop working in this. Thanks.

That's not my intended effect. The systemd bump for 20.09 was merged to master right before ZHF, and it caused a lot of issues with DEs and with getting the release into a usable state.

If we get this merged in 2 months, then that gives ~4 more weeks of time for edge case failures to be sorted out on unstable. Those weeks really help with release stabilization, especially if fixes need to go through staging cycles.

@jonringer if someone is willing to work on this, could you give them access to your server? I'm sure it would be of great help.

My server is open to anyone with nixpkgs commit bits.

@jonringer
Contributor

IMHO, a 3 months blocking/freeze phase is probably a bit too much for a project with a 6 months release cycle.

I must have worded that poorly: there are 3 months UNTIL it freezes. The freeze period is 4 weeks long for release critical packages. For the 22.05 release, this will be from mid-April to mid-May.

However, additional time before then is also welcome; as it allows for more issues to be discovered on unstable before ZHF.

@jonringer
Contributor

If I offended someone, I apologize. Just wanted to avoid a situation where April comes around, and we have to say "No, this is too much risk for stabilization".

@andir I really appreciate all the work you have done for nixpkgs.

I can take up this PR if you're discouraged from continuing it.
