Skip to content

nixos/test-driver: use vhost-device-vsock for SSH backdoor#453305

Open
Ma27 wants to merge 6 commits intoNixOS:staging-nixosfrom
Ma27:vsock-vhost
Open

nixos/test-driver: use vhost-device-vsock for SSH backdoor#453305
Ma27 wants to merge 6 commits intoNixOS:staging-nixosfrom
Ma27:vsock-vhost

Conversation

@Ma27
Copy link
Member

@Ma27 Ma27 commented Oct 18, 2025

A few notes for reviewers:

Since I'm on systemd 257, I used the following ssh config:

Host vsock-mux/*
        ProxyCommand /run/current-system/sw/lib/systemd/systemd-ssh-proxy %h %p
        ProxyUseFdpass yes
        CheckHostIP no

        # Disable all kinds of host identity checks, since these addresses are generally ephemeral.
        StrictHostKeyChecking no
        UserKnownHostsFile /dev/null

Because of that and with this being an invasive last-minute change, I'd prefer to target 26.05 instead of 25.11 here.


vhost-device-vsock[1] is a custom implementation of AF_VSOCK, but the
application on the host-side uses a UNIX domain-socket. This gives us
the following nice properties:

  • We don't need to do --arg sandbox-paths /dev/vhost-vsock anymore for
    debugging builds within the sandbox. That means, untrusted users can
    also debug these kinds of tests now.

  • This prevents CID conflicts on the host-side, i.e. there's no need for
    using sshBackdoor.vsockOffset for tests anymore.

A big shout-out goes to Allison Karlitskaya, the developer of test.thing[2]
who talked about this approach to do AF_VSOCK on All Systems Go 2025.

This patch requires systemd 258[3] because this contains vhost-mux in
its SSH config which is needed to connect to the VMs from now on.

To not blow up the patches even more, this only uses AF_VSOCK for the
debugger. A potential follow-up for the future would be a removal of the
current backdoor.service and replace it entirely by this
functionality.

[1] https://github.com/rust-vmm/vhost-device/blob/main/vhost-device-vsock/README.md
[2] https://codeberg.org/lis/test.thing
[3] #427968


Things done

  • Built on platform:
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • Tested, as applicable:
  • Ran nixpkgs-review on this PR. See nixpkgs-review usage.
  • Tested basic functionality of all binary files, usually in ./result/bin/.
  • Nixpkgs Release Notes
    • Package update: when the change is major or breaking.
  • NixOS Release Notes
    • Module addition: when adding a new NixOS module.
    • Module update: when the change is significant.
  • Fits CONTRIBUTING.md, pkgs/README.md, maintainers/README.md and other READMEs.

Add a 👍 reaction to pull requests you find important.

@Ma27 Ma27 requested review from roberth and tfc October 18, 2025 19:47
@Ma27 Ma27 added the 2.status: wait for branch‐off Waiting for the next Nixpkgs branch‐off label Oct 18, 2025
@nixpkgs-ci nixpkgs-ci bot added 10.rebuild-linux: 1-10 This PR causes between 1 and 10 packages to rebuild on Linux. 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. 10.rebuild-nixos-tests This PR causes rebuilds for all NixOS tests and should normally target the staging branches. 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 8.has: module (update) This PR changes an existing module in `nixos/` 6.topic: testing Tooling for automated testing of packages and modules 8.has: documentation This PR adds or changes documentation labels Oct 18, 2025
@@ -147,6 +150,8 @@ def cmd(
if not allow_reboot:
qemu_opts += " -no-reboot"

qemu_opts += vhost_vsock.qemu_args
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The StartCommand class is already extremely educated about qemu specific things.
The VHostVsock class encapsulates very qemu specific concerns.

The sum of LOC that we get with this abstraction is seemingly higher than if we would take a vsock_id: int | None as a parameter here and then did something like

if vsock_id is not None:
  qemu_opts += "--vsock-foo-bar-{vsock_id}"

At the same time this additional abstration doesn't buy us much and qualifies as shallow module or "classitis".

Especially looking at the future where we might have different backends for the Machine class, it might not be helpful to scatter qemu concerns more.

I suggest to remove the VHostVsock class and use the simpler approach. (Respectfully, assuming i did not miss any concerns while i was reading the code)

Copy link
Member Author

@Ma27 Ma27 Oct 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sum of LOC that we get with this abstraction is seemingly higher than if we would take a vsock_id: int | None as a parameter here and then did something like

Agreed.

The original problem I wanted to solve is that I had to pass (Path, Path, int) | None through a bunch of layers with if x is not None checks on every place I'll use this.

That being said, I think I ended up with a solution where the overhead is indeed not necessary anymore: one Path doesn't need to be exposed anyways [2] removing the ambiguity from the triple and the cleanup of VLans and machiens is done in the driver's context-manager, so I think we should use that vsocks as well[1].

Thanks for your input.

[1] That being said, allocating resources in __init__ and dropping them in __exit__ is kinda weird. I'd argue this should be pushed into __enter__ then.
[2] Whoops, both paths should be exposed. And I think that's where a dataclass makes sense, passing around a triple makes it ambiguous which of the two paths is being used. But still, it makes sense to remove the abstraction and turn this into a dumb data-class in the driver module.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the approach a little bit. As documented in the last commit, there are two new classes:

  • VsockPair which contains three properties. The goal here is to not use triples for a host-path/guest-path/guest-cid since having two times Path in it has the potential of mixing up things.
  • VsockDeviceVsock abstracts away the creation of the socket pairs and the start of the process itself. I think that's not necessarily a detail, the driver should take care about (similar to e.g. starting up VLANs). This isn't its own context-manager anymore, it uses the context-manager rom the driver, like everything else which seems much cleaner.
  • The QEMU stuff is part of StartCommand, as suggested. That wayh, we only have to pass guest_path to the machine (and start command), nothing else which makes another part of the abstraction go away.
  • I noticed it's trivial now to write a test to make sure the ssh backdoor works, so we have a test now :)

Ma27 added 5 commits October 21, 2025 18:37
This allows us to e.g. use `mkRemovedOptionModule` which will come in
handy in the upcoming commits.
This option configures a memfd backend for the VM's memory with the size
of `virtualisation.memorySize` and uses that as default memory backend.

This is required for e.g. vhost-device-vsock.

The motivation for making this an option is that I decided to enable
this by default for all NixOS tests to avoid changing essential QEMU
options based on whether or not debugging is enabled. However, there
should be an easy way to turn it off which is what this option provides.
The context manager's purpose is to allocate its resources in `__enter__` and
release them again in `__exit__`.

Right now, the approach is merely a hack since we allocate everything in
the constructor, but use the context-manager protocol as way to reliably
terminate everything.

This isn't a functional change, but merely a correctness change by using
the methods the way they were intended.
`vhost-device-vsock`[1] is a custom implementation of AF_VSOCK, but the
application on the host-side uses a UNIX domain-socket. This gives us
the following nice properties:

* We don't need to do `--arg sandbox-paths /dev/vhost-vsock` anymore for
  debugging builds within the sandbox. That means, untrusted users can
  also debug these kinds of tests now.

* This prevents CID conflicts on the host-side, i.e. there's no need for
  using `sshBackdoor.vsockOffset` for tests anymore.

A big shout-out goes to Allison Karlitskaya, the developer of test.thing[2]
who talked about this approach to do AF_VSOCK on All Systems Go 2025.

This patch requires systemd 258[3] because this contains `vhost-mux` in
its SSH config which is needed to connect to the VMs from now on.

To not blow up the patches even more, this only uses AF_VSOCK for the
debugger. A potential follow-up for the future would be a removal of the
current `backdoor.service` and replace it entirely by this
functionality.

The internal implementation tries to be consistent with how VLANs and
machines are handled, i.e. the processes are started when the Driver's
context is entered and cleaned up in __exit__().

I decided to push the process management and creation of sockets for
vhost-device-vsock into its own class, that's an implementation detail
and not a concern for the test-driver. In fact, `vhost-device-vsock` is
something we can drop once QEMU implements native support for using
AF_UNIX on the host-side[4]. `VsockPair` is its own class since
returning e.g. a triple of `(Path, Path, Int)` would be ambiguous in
what is the guest and what the host path (and frankly, I found it hard
to distinguish the two when reading the docs of `vhost-device-vsock`
initially).

Finally, now that we can do the SSH backdoor without adding additional
devices to the sandbox, I figured, it's time to write a test-case for
it.

[1] https://github.com/rust-vmm/vhost-device/blob/main/vhost-device-vsock/README.md
[2] https://codeberg.org/lis/test.thing
[3] NixOS#427968
[4] https://gitlab.com/qemu-project/qemu/-/issues/2095
@Ma27 Ma27 requested review from roberth and tfc October 22, 2025 10:36
@Ma27
Copy link
Member Author

Ma27 commented Nov 9, 2025

Friendly ping @tfc

@Ma27 Ma27 removed the 2.status: wait for branch‐off Waiting for the next Nixpkgs branch‐off label Dec 10, 2025
@nixpkgs-ci nixpkgs-ci bot added the 2.status: merge conflict This PR has merge conflicts with the target branch label Dec 10, 2025
@Ma27
Copy link
Member Author

Ma27 commented Dec 11, 2025

Talked with @tfc about this yesterday: another plan for the test-driver is to implement more backends other than QEMU and this adds code that's specific to QEMU (e.g. cloud-hypervisor does uds to vsock on its own) into the driver. What's missing to get this done is to move this part into the Machine-class.

Not entirely sure in which order the changes land (also, because I'm not sure when I'll be able to allocate time on this again). But our common goal is to have this land in 26.05.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

2.status: merge conflict This PR has merge conflicts with the target branch 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 6.topic: testing Tooling for automated testing of packages and modules 8.has: documentation This PR adds or changes documentation 8.has: module (update) This PR changes an existing module in `nixos/` 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. 10.rebuild-linux: 1-10 This PR causes between 1 and 10 packages to rebuild on Linux. 10.rebuild-nixos-tests This PR causes rebuilds for all NixOS tests and should normally target the staging branches.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants