
Improve test VM workflow.#2

Merged
plietar merged 3 commits into main from fetch-vm-secrets
Sep 27, 2024

Conversation

Contributor

@plietar plietar commented Sep 26, 2024

The VM that is started by `nix run .#start-vm` now uses GitHub authentication, bringing it closer to the real thing. For this to work it needs a GitHub client ID and secret, which it fetches from Vault.

On the host machine, before starting the VM, a vault token is obtained using the `vault login` command. The token is passed to the VM as a firmware parameter, allowing it to be used inside the VM to fetch the secrets.

The file layout is tidied up a bit, and the VM tests are improved. These still use basic authentication, for now at least. I've added a GitHub Actions workflow to run the test.

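A minimal dry-run sketch of the host-side flow described above (the fw_cfg key name `opt/vault-token` is my assumption, not necessarily the PR's actual key):

```shell
# Hypothetical sketch, not the PR's actual script; prints the QEMU invocation
# instead of booting. In reality the token would come from `vault print token`
# after a `vault login` on the host.
TOKEN="s.example-token"
qemu_args="-machine accel=kvm -fw_cfg name=opt/vault-token,string=$TOKEN"
echo "qemu-system-x86_64 $qemu_args"
```

Inside the guest, an fw_cfg blob like this shows up under `/sys/firmware/qemu_fw_cfg/by_name/`, which is how the VM can read the token back.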
@plietar plietar requested a review from M-Kusumgar September 26, 2024 10:18

@M-Kusumgar M-Kusumgar left a comment


Looks good to me. Quick question though: the kipling instance worked, however for some reason the priority pathogens instance kept giving me an error fetching auth config. Is this expected? I've also left a couple of other questions.

Comment on lines +11 to +12
# The `virtualisation.vmVariant` setting we to import VM-specific settings
# doesn't for the test VMs.
Contributor


i cannot decipher this comment XD

Contributor Author


Wow where did all my verbs go

Contributor


So this is a bit like Vagrantfiles?

Comment on lines +25 to +28
# It's surprisingly easy to run qemu without hardware acceleration and not
# notice it, which makes the VM so slow the tests tend to fail. This forces
# KVM acceleration and will fail to start if missing.
virtualisation.qemu.options = [ "-machine" "accel=kvm" ];
Contributor


Okay, I've been reading up about KVM and QEMU. I get the general gist but couldn't find exactly how KVM acceleration works. Is it because QEMU simulates a full machine, and KVM hardware acceleration brings that abstraction closer to the metal of the actual machine the VM is running on?


@plietar plietar Sep 26, 2024


I'm not an expert, but here's my attempt: QEMU is a general virtual machine framework. It has multiple backends, mostly TCG and KVM. Regardless of the backend, it needs to emulate lots of peripherals, e.g. the network card and file storage.

TCG is an actual emulator, implemented as a JIT. It reads instructions from the guest machine and translates them into host machine instructions. It also needs to emulate a whole bunch of low-level stuff (e.g. memory management, interrupts, ...). Emulating stuff this way is super slow. Fine if you want to emulate a 90s video game console on a modern fast CPU, less fine if you want to emulate a modern fast CPU on a modern fast CPU.

KVM is the Linux API for hardware-accelerated VMs. It's the Linux equivalent of Hyper-V, I guess. It uses whatever the underlying CPU's acceleration is; on Intel that's Intel VT. In that mode, guest instructions run directly on the host CPU, which has a special VM mode that keeps guests properly isolated from the host and from each other. The performance of this is close to native (there's usually a small overhead, but not much). It's what everyone does these days (either via Hyper-V or KVM); cloud VMs would just be too slow without hardware acceleration.

Because KVM needs special hardware access, some Linux distros, including Ubuntu, restrict the permissions a little bit. Not all of them though; from my reading it seems some have /dev/kvm as mode 666, so writable by everyone.

I spent way too long figuring out why the tests were slow but nix run .#start-vm was fast. Turns out it was because the former runs inside the nix sandbox and doesn't have my permissions. This line makes it so that the tests fail with an obvious error, instead of timing out because they are way too slow.
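For context, a quick way to tell which failure mode you're in (a hypothetical helper, not part of the PR) is to check whether the current user can actually open /dev/kvm, which is what `-machine accel=kvm` ultimately needs:

```shell
# Hypothetical helper, not part of the PR: classify KVM availability for the
# current user. QEMU's "-machine accel=kvm" needs to open /dev/kvm read-write,
# so "no-access" is exactly the restricted-permissions case described above.
check_kvm() {
  dev="${1:-/dev/kvm}"
  if [ ! -e "$dev" ]; then
    echo "no-kvm"      # node absent: module not loaded or no hardware support
  elif [ -r "$dev" ] && [ -w "$dev" ]; then
    echo "ok"          # accel=kvm should work for this user
  else
    echo "no-access"   # node exists, but permissions/ACLs deny this user
  fi
}
check_kvm
```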

nativeBuildInputs = [ perl ];
installPhase = ''
  find $GRADLE_USER_HOME/caches/modules-2 -type f -regex '.*\.\(jar\|pom\|module\)' \
    | LC_ALL=C sort \
Contributor


Not just this line, but I have no clue what you're doing here. This is definitely beside the point of this PR, but could you leave a comment or something just briefly explaining this?

Contributor Author


Ahah yes, this line is dark magic stolen from occurrences I found in nixpkgs. See for example
https://github.com/NixOS/nixpkgs/blob/8121f3559a98259a8e767dedf4eaf3939442c54d/pkgs/applications/file-managers/mucommander/default.nix#L39-L48

So Nix builds are either sandboxed with no network, or they have network access but must produce output matching an exact, pre-declared hash. If we wanted to build a package that fetches external dependencies using stuff like gradle, cargo or npm, we'd have to set the hash of the output of the whole build, which is tedious since that hash would change any time our source changes just a tiny bit, or we change the build process a little.

The compromise in Nix is to split the build process in two: fetch the dependencies outside the sandbox with a fixed hash, then run the actual build in the sandbox without having to set the output hash. This is fine for sensible package managers, but gradle is not one of them. It doesn't really have a proper way of just fetching the dependencies.

What we do here is build packit-api twice. After the first build, we throw away the build output but keep the cache in $GRADLE_USER_HOME/caches/modules-2, and do some preprocessing to make the cache have the same file layout as Maven does. We set a hash for this (sources.gradleDepsHash). Then on the second build we replace the maven references with the output of the first build (it's what the gradleInit below does).
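The two-phase shape described above looks roughly like this (a hypothetical Nix sketch with made-up attribute names, not the PR's actual derivation):

```nix
# Hypothetical sketch, not the PR's actual code. Phase 1 is a fixed-output
# derivation: it may use the network, but its output must match outputHash.
gradleDeps = stdenv.mkDerivation {
  name = "packit-api-gradle-deps";
  # ... run the gradle build once, throw the build output away, and keep
  # $GRADLE_USER_HOME/caches/modules-2 reshaped into a Maven-style layout ...
  outputHashMode = "recursive";
  outputHashAlgo = "sha256";
  outputHash = sources.gradleDepsHash;  # the pinned hash mentioned above
};
# Phase 2: the real build runs in the normal no-network sandbox, with its
# Maven repository references rewritten to point at gradleDeps (the
# gradleInit step), so no output hash is needed.
```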

Now, the actual sort I added is because the same artifact might appear multiple times in the cache, with slightly different contents, e.g. one copy published to Maven Central and one to the Gradle plugin repository. Absolute madness, but here we are. Thankfully, from what I could tell, the differences were very minor.

We can only have a single copy in our fake Maven repo. find's output order isn't deterministic or stable, so we weren't always copying the same file; I was getting different results on my machine compared to CI. The sort keeps that deterministic.

LC_ALL=C is slang for "set the locale to naive English". Sorting technically depends on the configured locale, so this helps keep the result predictable. It probably doesn't matter here, since everything is ASCII and I doubt Nix lets the host system locale leak through.
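The locale point is easy to demonstrate with a toy input (nothing to do with the actual jar paths):

```shell
# In the C locale, sort compares raw byte values, so uppercase letters
# (A = 0x41) come before all lowercase ones (a = 0x61); locale-aware
# collation may interleave cases instead, and can differ between machines.
printf 'b\nA\na\n' | LC_ALL=C sort
# prints: A, a, b (one per line)
```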

I hate all of this, but it's the best I could find for now. nixpkgs has actually improved this quite a bit in the current master, but we need to wait until November for a stable release: NixOS/nixpkgs#272380

Contributor


that was actually fucked up

Comment on lines +105 to +107
sudo tee /etc/udev/rules.d/50-nixbld-kvm.rules <<EOF
KERNEL=="kvm", RUN+="/bin/setfacl -m g:nixbld:rw $env{DEVNAME}"
EOF
Contributor


Sorry for the questions, just trying to work out how this is equivalent. The /bin/setfacl -m g:nixbld:rw bit is fine, and then we want /dev/kvm; $env{DEVNAME} gives the /dev bit, but how does the /kvm come into play? You've done KERNEL=="kvm", but I couldn't find anything related to KERNEL in the setfacl man page.

Contributor Author


This is a udev rule, which is how modern (last 10-15 years) Linux distros manage devices in /dev. /dev is now a temporary in-memory filesystem, and any changes to it don't persist across reboots.

The rule says: for a device whose kernel name matches KERNEL=="kvm", run the following command, with DEVNAME set to /dev/kvm. This is more of a udev thing than a setfacl one.
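Annotated, the rule from the diff reads like this (same rule as above, just with comments added):

```
# Match keys (==) select devices; assignment keys (+=) queue actions.
# KERNEL=="kvm"   -> applies to the device whose kernel name is "kvm"
# RUN+="..."      -> command udev runs when the device node appears
# $env{DEVNAME}   -> substituted by udev with the node path, i.e. /dev/kvm
KERNEL=="kvm", RUN+="/bin/setfacl -m g:nixbld:rw $env{DEVNAME}"
```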

plietar and others added 2 commits September 26, 2024 16:16
Co-authored-by: M-Kusumgar <98405247+M-Kusumgar@users.noreply.github.com>

@M-Kusumgar M-Kusumgar left a comment


Looks good to me, thanks for all the explanations!

@plietar plietar merged commit aaa782c into main Sep 27, 2024
@plietar plietar deleted the fetch-vm-secrets branch September 27, 2024 00:07