Skip to content

nixos/test-driver: use breakpointHook to debug tests with pdb & ssh; add dump() function#392117

Closed
Ma27 wants to merge 2 commits intoNixOS:masterfrom
Ma27:testdriver-pdb-ssh-sandbox
Closed

nixos/test-driver: use breakpointHook to debug tests with pdb & ssh; add dump() function#392117
Ma27 wants to merge 2 commits intoNixOS:masterfrom
Ma27:testdriver-pdb-ssh-sandbox

Conversation

@Ma27
Copy link
Member

@Ma27 Ma27 commented Mar 22, 2025

Note: draft, depends on #392030, #390996 & #372979.

After a discussion with @tfc it was clear that the interactive part (#392030) is less controversial, so this is another PR on top. Only the last two commits are the relevant changes.


This uses the attach.sh from our breakpointHook to allow getting
into the sandbox of the VM test. In there, it's now possible to do two
things:

  • debugging the test script interactively with pdb[1] via

    telnet 127.0.0.1 4444

  • use the AF_VSOCK part to SSH into a VM.

Also, add a dump() function that outputs stuff, but with a log prefix to make it clear where it's coming from.

[1] https://docs.python.org/3/library/pdb.html

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 25.05 Release Notes (or backporting 24.11 and 25.05 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

Add a 👍 reaction to pull requests you find important.

@Ma27 Ma27 requested review from K900, RaitoBezarius, bew, ctheune and tfc March 22, 2025 15:09
@github-actions github-actions bot added 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 8.has: module (update) This PR changes an existing module in `nixos/` 6.topic: testing Tooling for automated testing of packages and modules 6.topic: systemd Software suite that provides an array of system components for Linux operating systems. 8.has: documentation This PR adds or changes documentation 10.rebuild-darwin: 1-10 This PR causes between 1 and 10 packages to rebuild on Darwin. 10.rebuild-linux: 5001+ This PR causes many rebuilds on Linux and must target the staging branches. 10.rebuild-linux: 501+ This PR causes many rebuilds on Linux and should normally target the staging branches. labels Mar 22, 2025
Copy link
Contributor

@bew bew Mar 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced by this new dump function, the print function is basically universal to print something out and I don't see the point of having it not have the 'provenance' prefix by default 🤔

I believe the default behavior should be to have the prefix.
And for special cases we could expose the non-prefix print as raw_print/direct_print/original_print ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, on one hand I agree. OTOH I think it's kinda weird to override print() and give it special behavior (unless file= is set).

For me, this was the less intrusive/controversial change, but I'm very open to do what @bew suggested. What do the maintainers think?

@wegank wegank added the 2.status: merge conflict This PR has merge conflicts with the target branch label Apr 2, 2025
@Ma27 Ma27 force-pushed the testdriver-pdb-ssh-sandbox branch from faba8f4 to e2d555e Compare April 26, 2025 09:07
@github-actions github-actions bot removed the 6.topic: systemd Software suite that provides an array of system components for Linux operating systems. label Apr 26, 2025
@ofborg ofborg bot removed the 2.status: merge conflict This PR has merge conflicts with the target branch label Apr 26, 2025
@github-actions github-actions bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. 10.rebuild-linux: 1-10 This PR causes between 1 and 10 packages to rebuild on Linux. and removed 10.rebuild-darwin: 1-10 This PR causes between 1 and 10 packages to rebuild on Darwin. 10.rebuild-linux: 5001+ This PR causes many rebuilds on Linux and must target the staging branches. 10.rebuild-linux: 501+ This PR causes many rebuilds on Linux and should normally target the staging branches. labels Apr 26, 2025
@Ma27 Ma27 force-pushed the testdriver-pdb-ssh-sandbox branch from e2d555e to 5ec80ef Compare May 9, 2025 08:31
This function prefixes each line of the string given to it and prints it
into the log. This is useful to make it clear where the output is
actually coming from which may is expected to be useful in long VM-tests
with `print()` statements in between.

I decided to not replace `print()` here since `print` has a `file=`
argument that allows it to write anywhere, so it's not necessarily a
replacement.

Essentially,

    dump(machine.succeed("ip a"))

results in output like

    machine: must succeed: ip a
    machine: (finished: must succeed: ip a, in 0.01 seconds)
    (test-script) 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    (test-script) [...]
    (test-script) 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    (test-script) [...]
    machine: ...
@Ma27 Ma27 force-pushed the testdriver-pdb-ssh-sandbox branch from 5ec80ef to 68e3527 Compare May 11, 2025 13:38
@github-actions github-actions bot removed the 8.has: module (update) This PR changes an existing module in `nixos/` label May 11, 2025
This uses the `attach.sh` from our `breakpointHook` to allow getting
into the sandbox of the VM test. In there, it's now possible to do two
things:

* debugging the test script interactively with pdb[1] via

    telnet 127.0.0.1 4444

* use the AF_VSOCK part to SSH into a VM.

[1] https://docs.python.org/3/library/pdb.html
@Ma27 Ma27 force-pushed the testdriver-pdb-ssh-sandbox branch from 68e3527 to ed15e11 Compare May 11, 2025 17:24
@Ma27 Ma27 marked this pull request as ready for review May 11, 2025 20:53
@Ma27
Copy link
Member Author

Ma27 commented May 11, 2025

@tfc this would be the last round of the OceanSprint stuff, the more controversial one adding the breakpointHook (+pdb) to the sandbox for debugging inside it.

I know that you had reservations about this addition, but I figured that having the breakpointHook thing as last resort is quite useful, also for VM tests. So here it is :)

Ma27 added a commit to Ma27/nixpkgs that referenced this pull request May 13, 2025
Right now it wrongly seems as if you can set
`sshBackdoor.enable = true;` for each test and not only for debugging
purposes.

This is wrong however since you'd need to pass /dev/vhost-vsock into the
sandbox for this (which is also a prerequisite for NixOS#392117).

To make that clear, two things were changed:

* add a warning to the manual to communicate this.
* exit both interactive and non-interactive driver early if
  /dev/vhost-vsock is missing and the ssh backdoor is enabled.

  If that's the case, we pass a CLI flag to the driver already in the
  interactive case. This change also sets the flag for the
  non-interactive case.

  That way we also get a better error if somebody tries to enable this
  on a system that doesn't support that.
@wegank wegank added the 2.status: merge conflict This PR has merge conflicts with the target branch label May 17, 2025
@Ma27
Copy link
Member Author

Ma27 commented Jul 18, 2025

Superseded by #422066.

@Ma27 Ma27 closed this Jul 18, 2025
@Ma27 Ma27 deleted the testdriver-pdb-ssh-sandbox branch July 18, 2025 15:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

2.status: merge conflict This PR has merge conflicts with the target branch 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 6.topic: testing Tooling for automated testing of packages and modules 8.has: documentation This PR adds or changes documentation 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. 10.rebuild-linux: 1-10 This PR causes between 1 and 10 packages to rebuild on Linux.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants