qemu, runInLinuxVM: fix KVM availability check #125948
|
Result of 4 packages marked as broken and skipped:
2 packages skipped due to time constraints:
18 packages built successfully:
Result of 2 packages marked as broken and skipped:
1 package failed to build:
25 packages skipped due to time constraints:
19 packages built successfully:
3 suggestions:
Note that build failures may predate this PR, and could be nondeterministic or hardware dependent. |
|
My gut feeling is that the wrapper should probably warn on stderr if KVM isn't available. It's a bit weird having a wrapper called "qemu-kvm" that doesn't actually use kvm, in any case, but that's the situation we're in :P I guess |
|
I can try emitting a warning on stderr from the wrapper and check how visible that is later this week. It seems the right component to make sure people know what's going on with KVM would actually be Nix and the handling of the We could make sure it works and takes the permissions of the build user into account. This seems to be the code for detecting the system feature: I have not tested this yet because somehow my VM without I think the way to enable non-accelerated execution should be manually adding the To make things clearer to users
it could explicitly prompt you to either
That way at least as part of a build things only take the slower route if you set some deliberate action after a prompt that informs you about the consequences. (There is also some code which suggests that supplementary groups should actually stick to the build users here: I'm thinking about opening an issue about this whole |
puffnfresh
left a comment
Definitely an improvement on the current situation
Planning on finishing this? --run could be useful here |
Thanks for the hint. I was unsure how to best approach this and got busy with other things and your comment was very helpful. I added warnings now. |
dd8887d to
ad9a113
|
It seems the slow path of using QEMU without KVM when On my machine QEMU now gets stuck at This was working fine yesterday but I can't reproduce what changed. The following change fixes it.
diff --git a/nixos/lib/qemu-common.nix b/nixos/lib/qemu-common.nix
index 84f9060acd6..ebcdd61f587 100644
--- a/nixos/lib/qemu-common.nix
+++ b/nixos/lib/qemu-common.nix
@@ -22,7 +22,7 @@ rec {
else throw "Unknown QEMU serial device for system '${pkgs.stdenv.hostPlatform.system}'";
qemuBinary = qemuPkg: {
- x86_64-linux = "${qemuPkg}/bin/qemu-kvm -cpu max";
+ x86_64-linux = "${qemuPkg}/bin/qemu-kvm -cpu qemu64";
armv7l-linux = "${qemuPkg}/bin/qemu-system-arm -enable-kvm -machine virt -cpu host";
aarch64-linux = "${qemuPkg}/bin/qemu-system-aarch64 -enable-kvm -machine virt,gic-version=host -cpu host";
powerpc64le-linux = "${qemuPkg}/bin/qemu-system-ppc64 -machine powernv";
It's similar to what's reported in #141596 (comment) |
|
@mschwaig has anybody made a PR for that? LGTM — I doubt we have many CPU-bound NixOS tests, and that's all qemu-common is used for AFAICT. |
|
It looks like Would switching to I am not aware of another PR that changes the emulated CPU. If it just reduces the set of accelerated or supported instructions, I can add a commit for that here and I think we should merge this PR. Otherwise I'm not sure how to proceed. |
|
|
In the future I would also like to test this with your `qemu-6.2.0` branch @alyssais (#146526) to see if 6.2 will fix qemu getting stuck, so we could eventually revert to `-cpu max` in a future PR, but somehow I cannot fetch that branch from your fork right now.
That's very strange. Does `git fetch https://github.com/NixOS/nixpkgs pull/146526/head` work?
|
|
Working on this made me think it would make sense for a script that gets invoked before qemu to not only emit warnings but actually determine with which `-cpu` setting qemu gets invoked. For builds outside a VM the supported instruction set has an impact on the output anyway, and `-cpu max` already makes a machine-dependent determination of the emulated CPU based on the host CPU. If we made that determination ourselves we could
* write to the log what CPU was emulated,
* account for other factors, like access to `/dev/kvm` and
* tell qemu more specifically what to emulate.
The XML for describing CPUs and the `host-model` concept from libvirt (as seen in [this presentation](https://events19.linuxfoundation.org/wp-content/uploads/2017/12/Kashyap-Chamarthy_Effective-Virtual-CPU-Configuration-OSS-EU2018.pdf)) look interesting for that.
I think qemu-kvm originates from Fedora (they certainly also add it).
Maybe it would be worth looking into if they do anything like this?
One of the purposes of qemu-kvm is compatibility with other
distributions that provide it, so I'd be hesitant to change it too much
from what those other distros do.
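The idea quoted above, a script that runs before qemu and decides the `-cpu` setting itself, could look roughly like the sketch below. This is only an illustration: the function name `choose_cpu` and the policy (`max` when `/dev/kvm` is read-writable, `qemu64` otherwise) are assumptions made for the example, not what this PR implements, and the final line only prints a command instead of launching QEMU.

```shell
#!/bin/sh
# Hypothetical pre-launch helper: picks a value for QEMU's -cpu option and
# logs the decision. The selection policy is an assumption for illustration.
choose_cpu() {
    dev="${1:-/dev/kvm}"
    if [ -r "$dev" ] && [ -w "$dev" ]; then
        cpu="max"       # KVM device is usable by the current user
    else
        cpu="qemu64"    # no usable KVM: use the model reported as working in this thread
    fi
    # Log to stderr so that stdout carries only the chosen model name.
    echo "emulating CPU model '$cpu' (KVM device checked: $dev)" >&2
    echo "$cpu"
}

# Usage: build the command line without actually starting a VM.
cpu_model=$(choose_cpu)
echo "would run: qemu-kvm -cpu $cpu_model"
```

Writing the decision to stderr covers the first bullet (the log records which CPU was emulated), and the device check covers the second.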
|
Yes, that worked. Your commit from Since the release date for qemu will be December 7 at the earliest I will add a commit here to change That workaround should fix #141596 as well, and hopefully qemu 6.2.0 fixes the actual problem there. |
alyssais
left a comment
The important change here is fixing the definition of WARNCOL, but I think everything else I've pointed out is also worth considering.
|
Looks good on a first glance, but I'm too tired to say for sure tonight and will have a closer look tomorrow. Could you squash all the warning stuff into a single commit, and add appropriate "Fixes:" lines to your commit messages please? |
KVM should only be considered available if /dev/kvm exists and is read-writable by the user that is trying to launch it. The previous check, which tested only for existence, had the consequence that on some Linux distributions running VMs with Nix's QEMU only worked if KVM was NOT installed. fixes NixOS#124371
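The difference between the two checks described in this commit message can be sketched in plain shell. The function names `kvm_exists` and `kvm_usable` are hypothetical and chosen for this example; the actual wrapper code is not shown in the thread.

```shell
#!/bin/sh
# Hypothetical helpers contrasting the two checks.

# Older, weaker check: the device node merely exists.
kvm_exists() {
    dev="${1:-/dev/kvm}"
    [ -e "$dev" ]
}

# Stricter check: the current user can both read and write the device.
kvm_usable() {
    dev="${1:-/dev/kvm}"
    [ -r "$dev" ] && [ -w "$dev" ]
}

# Usage: report which situation applies on this machine.
if kvm_usable; then
    echo "KVM usable by this user"
elif kvm_exists; then
    echo "/dev/kvm exists but is not read-writable by this user"
else
    echo "no /dev/kvm"
fi
```

The middle branch is the case from the linked issue: an existence-only check would try to use KVM there even though the user lacks permission.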
9b75f9d to
c9d7163
I have reordered and squashed the commits so there is one commit per topic, added which issues they fix and moved the emitted warning from |
alyssais
left a comment
Sorry for the delay. I've now tested this and am happy with it apart from this small terminology issue.
This addresses the qemu-kvm wrapper, which we add for convenience, silently not using KVM when the system would support it: the wrapper now at least leaves an indication in the log that the build ran slower because it ran without KVM.
The flag -cpu max leaves QEMU 6.1.0 stuck on some systems, for example when /dev/kvm is not read-writable. This does not happen with -cpu qemu64. Getting stuck like that is a regression in 6.1.0 not yet present in 6.0.0 and should be fixed with 6.2.0 according to early testing with rc1. We should consider reverting this change when we merge QEMU 6.2.0. See NixOS#146526. fixes NixOS#141596
c9d7163 to
abbe8cb
|
@alyssais No worries, I have updated the terminology. I think it's better now. |
|
Backport failed for Please cherry-pick the changes locally.
git fetch origin release-21.05
git worktree add -d .worktree/backport-125948-to-release-21.05 origin/release-21.05
cd .worktree/backport-125948-to-release-21.05
git checkout -b backport-125948-to-release-21.05
ancref=$(git merge-base 716815ce2a1fcb135843c7441648a59d62fb6eb6 abbe8cbc4843c213947abef70eb11f10ebea96f1)
git cherry-pick -x $ancref..abbe8cbc4843c213947abef70eb11f10ebea96f1 |
|
Successfully created backport PR #148015 for |
|
These changes broke the Darwin build, but simply because there's no However, I wonder what I believe |
see #148251 |
This all sounds good. I'd be happy with a PR for any or all of these things. |
|
I have opened #148305, which should remove the warning on other platforms, but I was not able to test this change on Darwin myself. I don't know how we would need to adapt the wrapper and warning to be useful on Darwin, but it sounds like a nice idea. |
PR NixOS#125948 introduced a warning when KVM is not available, which should only be emitted on Linux. Similarly the -enable-kvm flag itself also never needs to be added on platforms other than Linux.
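As a rough illustration of the behavior this commit message describes, the sketch below adds `-enable-kvm` and emits the warning only on Linux. It is written as a runtime check with `uname -s` purely for the sake of a self-contained example; the function name `kvm_args` and the warning text are hypothetical, and the real wrapper may well make this decision when it is generated rather than at run time.

```shell
#!/bin/sh
# Hypothetical sketch: platform-guarded KVM flag and warning.
kvm_args() {
    os="${1:-$(uname -s)}"
    dev="${2:-/dev/kvm}"
    # KVM is a Linux facility: on other platforms add nothing and stay silent.
    [ "$os" = "Linux" ] || return 0
    if [ -r "$dev" ] && [ -w "$dev" ]; then
        echo "-enable-kvm"
    else
        echo "warning: $dev is not read-writable, running without KVM (slower)" >&2
    fi
}

# Usage: show what would be appended to the QEMU command line here.
echo "extra QEMU arguments: $(kvm_args)"
```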
|
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/runnixostest-is-faster-when-interactive/69220/4 |
KVM should only be considered available if /dev/kvm exists and
is read-writable by the user that is trying to launch it.
The previous check, which tested only for existence, had the consequence that
on some Linux distributions running VMs with Nix's QEMU only worked
if KVM was NOT installed.
See issue: #124371
To test this change I
* x86_64 NixOS.
* chmod o-rw /dev/kvm
* result/bin/nixos-run-vms
Without this change that test failed right away with a log like this:
while with the change the test proceeded without hardware acceleration via KVM.
I have also tested that this fixes the same issue when running QEMU inside the build sandbox as part of a NixOS Test. The NixOS Test checkbox is unchecked though, because there are no automatic NixOS Tests for the new behavior.