initrd: Opt-in bare bones systemd-based initrd#164943

Merged
dasJ merged 20 commits into NixOS:master from ElvishJerricco:systemd-initrd-reuse-systemd-module
Apr 3, 2022
@ElvishJerricco ElvishJerricco commented Mar 20, 2022

Third time's the charm.

Motivation

This PR adds an option, boot.initrd.systemd.enable, that causes NixOS to generate an initrd that uses systemd for PID 1. This brings several benefits:

  • Upstream units can be used to implement almost all of stage 1 logic, eliminating the need for the many lines of shell scripting NixOS has in its traditional stage 1.
  • Early boot becomes parallelized.
  • systemd-ask-password can be used to request credentials, providing a more streamlined way for early boot tasks to communicate with users.
  • Configuration of stage 1 becomes declarative in exactly the same way that systemd in stage 2 is declarative, instead of adding imperative shell scripts in a variety of places.
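
For illustration, opting in might look like the minimal configuration fragment below. Only boot.initrd.systemd.enable comes from this PR; the surrounding module skeleton is ordinary NixOS boilerplate, and no other settings are implied.

```nix
# Minimal sketch of a NixOS configuration module that opts in to the
# systemd-based initrd. Everything except the option name is illustrative.
{ ... }:
{
  # Generate an initrd that uses systemd as PID 1 instead of the
  # traditional shell-script stage 1.
  boot.initrd.systemd.enable = true;
}
```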

How this compares to my previous attempts

This version of a systemd-based initrd has four major improvements over the previous attempts.

  1. It is an opt-in alternative to NixOS's traditional initrd, instead of a replacement.
  2. It uses the same systemd option types that NixOS uses for stage 2 systemd configuration.
  3. It will be implemented piecemeal, instead of in one large PR.
  4. The produced initrd appears to be quite a bit smaller than in the previous attempts, for whatever reason, at just 9.1M (which is quite good compared to NixOS's traditional initrd).

The third improvement is especially important. This PR only implements the core functionality needed to have a systemd-based initrd. There are many things that NixOS's traditional initrd supports that this does not yet. But with this framework in place, we can begin to add those features to the systemd-initrd one by one in separate, parallel PRs.

Notably, this PR does keep the make-initrd-ng concept from the previous one. You can check the README included with it in this PR for details, but the short version is that we don't copy full closures into initrd, and we don't patch things. Instead, we try to automate copying only the exactly correct things to their original paths to the best of our ability, and manually add what the automation misses in the boot.initrd.systemd.objects option. The benefits are outlined in the README.

This can be tested with nix-build ./nixos -A vm.

What this approach will not do

This approach will not include a compatibility layer for options like postDeviceCommands. Although this is possible in principle with special units that run the concatenated script with Before= and After= clauses on the appropriate targets, I think this is a bad idea for several reasons.

  1. It is antithetical to the purpose of using systemd. We want stage 1 to be parallelized, and declarative. These scripts are the opposite of both. It would slow down boot times and make the code for systemd-initrd less clean.
  2. These scripts sometimes expect to be run in the user interactive console. Trying to accommodate this would be troublesome.
  3. Madness this way lies. I can only imagine the plethora of complexity and edge cases involved in trying to keep extraUtils and the *Commands options alive in this initrd.
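
As a rough sketch of the declarative alternative, an early-boot step could be written as an ordinary unit under boot.initrd.systemd.services rather than as a shell fragment. The unit name and the command are hypothetical, and the sketch assumes the usual stage 2 unit options (wantedBy, before, unitConfig, serviceConfig) behave the same way here. It uses serviceConfig.ExecStart rather than the script option, since the script-style options are listed as not yet supported. Whether the referenced binary actually lands in the initrd depends on the copying logic described in the README, which is not shown here.

```nix
# Sketch only: "example-before-root" and its command are hypothetical.
{ pkgs, ... }:
{
  boot.initrd.systemd.enable = true;

  boot.initrd.systemd.services.example-before-root = {
    description = "Example early-boot step (hypothetical)";
    # Pull the unit into the initrd boot transaction.
    wantedBy = [ "initrd.target" ];
    # Order it before the real root is mounted; systemd runs everything
    # that has no ordering constraint against it in parallel.
    before = [ "sysroot.mount" ];
    unitConfig.DefaultDependencies = false;
    serviceConfig = {
      Type = "oneshot";
      # Placeholder command; a real unit would do actual work here.
      ExecStart = "${pkgs.coreutils}/bin/true";
    };
  };
}
```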

Next steps

Here are a number of things to do in future PRs, in very loose, decreasing order of how I view their importance.

  • Testing. Currently this will only boot extremely simple setups. It'd be good to know how simple that is, and it will be important to test the other things we work on as we go to make sure they work as well.
  • Support for the script style options in boot.initrd.systemd.services. Currently they do not copy the necessary scripts in.
  • Proper udev rules configuration.
  • Basic LUKS support.
  • ZFS / btrfs / LVM / mdadm support.
  • Plymouth support
  • Networking / SSH support
  • Advanced LUKS features (e.g. yubikeys).
  • Move activation into stage 1 so that we can immediately switch-root to systemd. This is nice because systemd can serialize and deserialize some useful information if it knows it's switch-root'ing to systemd.

Most of these should be doable independently of each other, and likely need little more API than the core functionality provided by this PR.


Things done
  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandbox = true set in nix.conf? (See Nix manual)
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 22.05 Release Notes (or backporting 21.11 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
    • (Release notes changes) Ran nixos/doc/manual/md-to-db.sh to update generated release notes
  • Fits CONTRIBUTING.md.

@ElvishJerricco ElvishJerricco requested a review from dasJ as a code owner March 20, 2022 08:04
@github-actions github-actions bot added 6.topic: kernel The Linux kernel 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 6.topic: systemd Software suite that provides an array of system components for Linux operating systems. 8.has: module (update) This PR changes an existing module in `nixos/` labels Mar 20, 2022
@ElvishJerricco ElvishJerricco mentioned this pull request Mar 20, 2022
20 tasks
@ofborg ofborg bot added 8.has: package (new) This PR adds a new package 10.rebuild-darwin: 1-10 This PR causes between 1 and 10 packages to rebuild on Darwin. 10.rebuild-darwin: 1 This PR causes 1 package to rebuild on Darwin. 10.rebuild-linux: 1-10 This PR causes between 1 and 10 packages to rebuild on Linux. labels Mar 20, 2022
@ElvishJerricco ElvishJerricco force-pushed the systemd-initrd-reuse-systemd-module branch 2 times, most recently from 57296ec to 5253f1c Compare March 20, 2022 09:03
Comment on lines 198 to +200
lib = import ./systemd-lib.nix { inherit lib config pkgs; };
unitOptions = import ./systemd-unit-options.nix { inherit lib systemdUtils; };
types = import ./systemd-types.nix { inherit lib systemdUtils; };
Member

Suggested change
- lib = import ./systemd-lib.nix { inherit lib config pkgs; };
- unitOptions = import ./systemd-unit-options.nix { inherit lib systemdUtils; };
- types = import ./systemd-types.nix { inherit lib systemdUtils; };
+ lib = import ./systemd/lib.nix { inherit lib config pkgs; };
+ unitOptions = import ./systemd/unit-options.nix { inherit lib systemdUtils; };
+ types = import ./systemd/types.nix { inherit lib systemdUtils; };

Maybe? I am not sure. Just an idea.

@ElvishJerricco ElvishJerricco force-pushed the systemd-initrd-reuse-systemd-module branch from d6f3bd9 to 4b4e589 Compare March 20, 2022 20:05
@ElvishJerricco ElvishJerricco force-pushed the systemd-initrd-reuse-systemd-module branch from f97afcd to 954b88f Compare March 21, 2022 00:50
Member

@ckiee ckiee left a comment

I started this review yesterday, there's a chance you already touched some of the things I reviewed.

Mostly code quality nitpicks, but more test{s,ing} for [this] core system component would be nice to see.

Comment on lines 328 to 361
mapAttrs' (n: v: nameValuePair "${n}.path" (pathToUnit n v)) cfg.paths
// mapAttrs' (n: v: nameValuePair "${n}.service" (initrdServiceToUnit n v)) cfg.services
// mapAttrs' (n: v: nameValuePair "${n}.slice" (sliceToUnit n v)) cfg.slices
// mapAttrs' (n: v: nameValuePair "${n}.socket" (socketToUnit n v)) cfg.sockets
// mapAttrs' (n: v: nameValuePair "${n}.target" (targetToUnit n v)) cfg.targets
// mapAttrs' (n: v: nameValuePair "${n}.timer" (timerToUnit n v)) cfg.timers
// listToAttrs (map
(v: let n = escapeSystemdPath v.where;
in nameValuePair "${n}.mount" (mountToUnit n v)) cfg.mounts)
// listToAttrs (map
(v: let n = escapeSystemdPath v.where;
in nameValuePair "${n}.automount" (automountToUnit n v)) cfg.automounts);
Member

This feels pretty repetitive but it does keep the code simpler vs. a higher level approach. Not sure what to make of that.

Contributor Author

Yea it's not the first time this is repeated in nixos. I suppose I could factor it out into a function. Would that be good?

Member

Yeah, that'd be nice! The generic-unit-ification being in what is effectively the consumer of the systemd lib seems weird.
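
One way the factoring discussed here could look is sketched below. The helper name generateUnits and the stand-in converter fakeToUnit are hypothetical; they are not taken from the PR, and the real serviceToUnit, socketToUnit, and so on would take their place. The mount and automount cases derive their names from v.where, so they would need a separate helper and are left out.

```nix
# Sketch only: generateUnits and fakeToUnit are hypothetical names.
let
  lib = import <nixpkgs/lib>;

  # Turn an attrset of definitions into "<name>.<suffix>" -> unit,
  # replacing one repeated mapAttrs' line per unit kind.
  generateUnits = suffix: toUnit: defs:
    lib.mapAttrs'
      (n: v: lib.nameValuePair "${n}.${suffix}" (toUnit n v))
      defs;

  # Stand-in for the real converters from the systemd lib.
  fakeToUnit = kind: name: def: {
    inherit kind name;
    inherit (def) description;
  };

  # Stand-in for the module's cfg value.
  cfg = {
    services.foo.description = "Foo service";
    sockets.bar.description = "Bar socket";
  };
in
  generateUnits "service" (fakeToUnit "service") cfg.services
  // generateUnits "socket" (fakeToUnit "socket") cfg.sockets
```

Evaluating this with nix-instantiate --eval --strict should give an attribute set keyed by foo.service and bar.socket.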

@ElvishJerricco ElvishJerricco force-pushed the systemd-initrd-reuse-systemd-module branch from 55a6600 to 9d86e36 Compare March 21, 2022 03:25
@ElvishJerricco ElvishJerricco force-pushed the systemd-initrd-reuse-systemd-module branch from 5d8aeef to 89ccc7d Compare March 21, 2022 14:25
@ElvishJerricco ElvishJerricco changed the title [WIP]: initrd: Opt-in bare bones systemd-based initrd initrd: Opt-in bare bones systemd-based initrd Mar 21, 2022
@ElvishJerricco
Contributor Author

#164016 has been merged, so I've rebased onto its last commit and removed the WIP label from this PR

Member

@dasJ dasJ left a comment

I haven't really looked at the Rust code apart from my one suggestion which helped me in debugging yesterday.

The approach is very promising and apart from the first commit not really splittable anymore. Most things I comment here are so that we nail some details right at the beginning instead of people starting to work on different aspects of the initrd while the core of it is still in flux.

I'm currently discussing stuff with @tfc to make sure we can get better test coverage. While I don't think this test needs more, I'd love future PRs with more features having the ability to add more stage 1 tests.

About the switch to the real system - I don't know if this PR should already contain code to switch to the new systemd directly without the shell script in between. While this approach in this PR already works of course, it's another thing people may make assumptions about ("this worked in systemd stage-1 before!!"). As you know, I'm experimenting with this detail and will provide my findings when they come up.

@ElvishJerricco ElvishJerricco force-pushed the systemd-initrd-reuse-systemd-module branch from 89ccc7d to c730ab0 Compare March 22, 2022 11:02
flokli added 2 commits March 24, 2022 18:47
Make this reachable from pkgs.fakeNss. This is useful outside docker
contexts, too.

NixOS#164943 (comment)
@ElvishJerricco ElvishJerricco requested a review from roberth as a code owner March 24, 2022 17:53
Member

@roberth roberth left a comment

Let's use module composition instead of generated modules.
It keeps the types simple.
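
To make the suggestion concrete, the sketch below composes a submodule type from small, separately defined modules instead of generating one module from a function. All option and module names are hypothetical, and this is not the code that ended up in the PR. It relies only on lib.types.submodule accepting a list of modules and on lib.evalModules.

```nix
# Sketch only: commonUnitOptions, serviceOnlyOptions and the option
# names inside them are hypothetical.
let
  lib = import <nixpkgs/lib>;
  inherit (lib) mkOption types;

  # Options shared by every unit kind.
  commonUnitOptions = {
    options.description = mkOption { type = types.str; default = ""; };
  };

  # Options that only make sense for services.
  serviceOnlyOptions = {
    options.execStart = mkOption { type = types.str; };
  };

  # Composition: the service type is the merge of both modules.
  serviceType = types.submodule [ commonUnitOptions serviceOnlyOptions ];

  evaluated = lib.evalModules {
    modules = [
      {
        options.services = mkOption {
          type = types.attrsOf serviceType;
          default = { };
        };
      }
      { services.demo = { description = "Demo"; execStart = "/bin/true"; }; }
    ];
  };
in
  evaluated.config.services.demo
```

A unit kind that should not expose an option leaves the corresponding module out of its list, which is how composition can drop unused options without extra generator logic.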

@dasJ dasJ force-pushed the systemd-initrd-reuse-systemd-module branch from 1ea78c4 to 1e5261f Compare April 1, 2022 07:57
@dasJ
Member

dasJ commented Apr 1, 2022

@roberth can you take another look? I added two commits, one that does the type composition as requested and another one that adds type composition to the unit types to get rid of options that are not used

@dasJ dasJ force-pushed the systemd-initrd-reuse-systemd-module branch from 22ac161 to 0d69a99 Compare April 1, 2022 09:51
@dasJ dasJ force-pushed the systemd-initrd-reuse-systemd-module branch from 0d69a99 to c465c8d Compare April 1, 2022 09:58
dasJ added 2 commits April 1, 2022 13:26
As requested by @roberth, we now have an option similar to
environment.etc. There's also extra store paths to copy and a way to
suppress store paths to make customizations possible.

We also link mount and umount to /bin to make recovery easier when
something fails.

This is more in line with what dracut does (it appends "Initramfs") and
makes it clear where the boot is currently at when it hangs.
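
The first commit message above does not name the new options. As a hedged illustration, the fragment below uses boot.initrd.systemd.contents and boot.initrd.systemd.storePaths, which are the names found in later NixOS releases; treat them as assumptions and check them against the module documentation for your version. The file path, its contents, and the chosen store path are made up, and the option for suppressing store paths is left out.

```nix
# Sketch only: option names are assumed from later NixOS releases and
# are not stated in this thread; the file and store path are made up.
{ pkgs, ... }:
{
  boot.initrd.systemd = {
    enable = true;

    # Similar in spirit to environment.etc: place a file at a fixed
    # path inside the initrd.
    contents."/etc/example.conf".text = ''
      key=value
    '';

    # Extra store paths to copy into the initrd.
    storePaths = [ "${pkgs.coreutils}/bin/ls" ];
  };
}
```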
@ElvishJerricco
Contributor Author

@dasJ Those changes look great!

@dasJ
Member

dasJ commented Apr 2, 2022

If there are no more objections and pushes, I'll merge this in the next 24-48h

@dasJ dasJ merged commit 7cdc4dd into NixOS:master Apr 3, 2022
@bobvanderlinden
Member

Awesome! This unlocks so many potential improvements. Thanks a lot to everyone for all the work! 🥳🥳🥳

@ncfavier
Member

Tried this out on my laptop, it's almost usable except for two issues:

@infinisil infinisil added the 1.severity: significant Novel ideas, large API changes, notable refactorings, issues with RFC potential, etc. label Apr 19, 2023
@nixos-discourse

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/is-gpt-partition-automounting-possible-such-that-dev-disk-by-id-is-not-necessary/34790/4
