Skip to content

xen: patch libxl to search for qemu-system-i386 properly#365890

Merged
SigmaSquadron merged 1 commit intoNixOS:masterfrom
hehongbo:xen-skip-qemu-check
Oct 10, 2025
Merged

xen: patch libxl to search for qemu-system-i386 properly#365890
SigmaSquadron merged 1 commit intoNixOS:masterfrom
hehongbo:xen-skip-qemu-check

Conversation

@hehongbo
Copy link
Contributor

@hehongbo hehongbo commented Dec 17, 2024

This fixes the error on HVM DomUs creation that we initially saw after switching from Fat Xen. It might be too ugly to merge, and I'd open it to discussions in @NixOS/xen-project .

What happened before, was we decided to build the hypervisor with --with-system-qemu but without a path to the QEMU binary as its string value. According to tools/configure.ac, that configures qemu_xen_path as qemu-system-i386 (with no clue where) if flagged as yes, or if a string value is specified, then configure the given value as is. But we can't wire that to pkgs.qemu_xen, since pkgs.xen is already a dependency of pkgs.qemu_xen, and doing so will make a circular dependency.

However, access() is used to check the existence of the QEMU binary. It's not tailored for binaries, and will not search anywhere in PATH, instead it searches within the current directory. If there is no executable named qemu-system-i386, then libxl__domain_build_info_setdefault() will receive a false positive and default to qemu-xen-traditional but we don't have one, as the hypervisor itself is built with --with-system-qemu. Skipping that check will maintain the value as qemu-system-i386, but we still have to take care of another one at libxl__spawn_local_dm(). Since we already have qemu-system-i386 available at PATH, I temporarily removed those checks.

That is sort of an edge case and I can imagine those regular distros might pass an FHS path to --with-system-qemu and carry on, but finding an executable in the current directory is not appropriate anyway. (Maybe worth upstreaming?)

Update

The patch is now upstreamed via f628129, let's pick it and use it here before it's shipped with future versions of Xen.

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 25.05 Release Notes (or backporting 24.11 and 25.05 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

Add a 👍 reaction to pull requests you find important.

@github-actions github-actions bot added the 6.topic: xen-project Issues and PRs related to the Xen Project Hypervisor. label Dec 17, 2024
@hehongbo
Copy link
Contributor Author

@github-actions github-actions bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. 10.rebuild-linux: 11-100 This PR causes between 11 and 100 packages to rebuild on Linux. labels Dec 17, 2024
Copy link
Contributor

@SigmaSquadron SigmaSquadron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, might be best to refactor the whole thing and upstream it to Xen. I'll take a look at the xl sources tomorrow.

@ofborg ofborg bot added 10.rebuild-darwin: 1-10 This PR causes between 1 and 10 packages to rebuild on Darwin. and removed 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. labels Dec 17, 2024
@hehongbo hehongbo force-pushed the xen-skip-qemu-check branch from b3db303 to ee8b441 Compare December 18, 2024 03:19
@github-actions github-actions bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. and removed 10.rebuild-darwin: 1-10 This PR causes between 1 and 10 packages to rebuild on Darwin. labels Dec 18, 2024
@SigmaSquadron
Copy link
Contributor

SigmaSquadron commented Dec 18, 2024

Also, might be best to refactor the whole thing and upstream it to Xen. I'll take a look at the xl sources tomorrow.

I tried my hand at a solution in C. Can you tell it has been years since I wrote C?

diff --git a/tools/libs/light/libxl_create.c b/tools/libs/light/libxl_create.c
index edeadd57ef..6cb0c68838 100644
--- a/tools/libs/light/libxl_create.c
+++ b/tools/libs/light/libxl_create.c
@@ -115,7 +115,32 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
             const char *dm;
 
             dm = libxl__domain_device_model(gc, b_info);
-            rc = access(dm, X_OK);
+            char* q_path;
+            if (strncmp(dm, "/", 1) == 0) {
+                q_path = dm;
+            } else {
+                    char* path = getenv("PATH");
+                    char* paths[];
+                    char* token;
+                    token = strtok(path, ":");
+                    paths[0] = token;
+                    i = 0;
+                    while (token != NULL) {
+                        token = strtok(NULL, ":");
+                        paths[i] = strcat(token, dm);
+                        i++;
+                    }
+                    for (i = 0;
+                        i < sizeof(paths) / sizeof(paths[0]);
+                        i++)
+                    {
+                        if (access(paths[i], F_OK) == 0) {
+                            q_path = paths[i];
+                            break;
+                        }
+                    }
+            }
+            rc = access(q_path, X_OK);
             if (rc < 0) {
                 /* qemu-xen unavailable, use qemu-xen-traditional */
                 if (errno == ENOENT) {

This is barely pseudocode, really. I'll clean it up by the end of the week and upstream it if it works.

@hehongbo
Copy link
Contributor Author

hehongbo commented Dec 18, 2024

@SigmaSquadron Oh, I'm also investigating the same thing with similar idea over the past ~2 hour. 🤦‍♂️ I should reply to tell you earlier.

And here is my version.

--- a/tools/libs/light/libxl_dm.c
+++ b/tools/libs/light/libxl_dm.c
@@ -332,7 +332,35 @@ const char *libxl__domain_device_model(libxl__gc *gc,
             dm = libxl__abs_path(gc, "qemu-dm", libxl__private_bindir_path());
             break;
         case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
-            dm = qemu_xen_path(gc);
+            const char *configured_dm = qemu_xen_path(gc);
+            if (strchr(configured_dm, '/')) {
+                dm = libxl__strdup(gc, configured_dm);
+            } else {
+                const char *path_env = getenv("PATH");
+                if (!path_env) {
+                    dm = libxl__strdup(gc, configured_dm);
+                } else {
+                    char *path_dup = libxl__strdup(gc, path_env);
+                    char *path = strtok(path_dup, ":");
+
+                    char fullpath[PATH_MAX];
+                    bool dm_found = false;
+                    while (path) {
+                        snprintf(fullpath, sizeof(fullpath), "%s/%s", path, configured_dm);
+                        if (access(fullpath, X_OK) == 0) {
+                            dm = libxl__strdup(gc, fullpath);
+                            dm_found = true;
+                            break;
+                        }
+                        path = strtok(NULL, ":");
+                    }
+
+                    if (!dm_found) {
+                        LOG(INFO, "%s is not found", configured_dm);
+                        dm = libxl__strdup(gc, configured_dm);
+                    }
+                }
+            }
             break;
         default:
             LOG(ERROR, "invalid device model version %d",

I'm also not familiar with C. But regarding your version, if you decide to search PATH in libxl__domain_build_info_setdefault(), it will only save you one failure from access() and prevent it from defaulting to qemu-xen-traditional (unless you alter it in place with b_info-> I guess?). Later at libxl__spawn_local_dm() Xen will access() the same thing again and cause another error. So I just tried modifying libxl__domain_device_model().

@SigmaSquadron
Copy link
Contributor

SigmaSquadron commented Dec 18, 2024

Hmm, are you certain that modifying dm won't introduce issues in other functions in libxl_{create,dm}.c? Does every dm call expect an absolute path?

@hehongbo
Copy link
Contributor Author

Hmm, are you certain that modifying dm won't introduce issues in other functions in libxl_{create,dm}.c?

I don't know if you're referring to the dm at libxl__domain_build_info_setdefault() or at libxl__domain_device_model() but I think it should fine in both cases.

In libxl__domain_device_model() it just declared dm at start as char*, and grab the value from qemu_xen_path (thus QEMU_XEN_PATH), so I guess everything happen within libxl__domain_device_model() are just informative and nothing get changed. Same thing for the dm at libxl__domain_build_info_setdefault().

However I'm not certain what will happen if we alter data via b_info->, so I prefer just change the libxl__domain_device_model() and it resolving to a proper qemu-system-i386 for the caller to check.

@hehongbo
Copy link
Contributor Author

And another fact I discovered is, Xen will just check the existence of qemu-system-i386 against current directory, but will execute it from PATH anyway.

@hehongbo
Copy link
Contributor Author

Does every dm call expect an absolute path?

At least to check file existence via access(), it should be an absolute path (or it will look for current directory otherwise).

As for executing QEMU after those long procedure of checking, it is done by libxl__spawn_local_dm(), and then maybe libxl__exec(), which will eventually use execvp() to execute QEMU, and execvp() can deal with executable in PATH.

@hehongbo hehongbo force-pushed the xen-skip-qemu-check branch from ee8b441 to 1539d91 Compare December 18, 2024 11:24
@hehongbo
Copy link
Contributor Author

Removed the message from the former patch file (contains error). Let me know if we decide to use the patch that enables xl to find qemu-system-i386 in PATH, or just remove the check at first and wait for the response of upstream.

@ofborg ofborg bot requested a review from SigmaSquadron December 18, 2024 20:59
@ofborg ofborg bot added 11.by: package-maintainer This PR was created by a maintainer of all the package it changes. 10.rebuild-darwin: 1-10 This PR causes between 1 and 10 packages to rebuild on Darwin. and removed 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. labels Dec 18, 2024
@hehongbo hehongbo force-pushed the xen-skip-qemu-check branch 2 times, most recently from 30ea078 to 159d807 Compare December 22, 2024 14:33
@hehongbo hehongbo changed the title xen: skip checking QEMU when creating HVM domains xen: patch libxl to search for qemu-system-i386 properly Dec 22, 2024
@hehongbo hehongbo marked this pull request as ready for review December 22, 2024 14:34
@hehongbo hehongbo force-pushed the xen-skip-qemu-check branch 2 times, most recently from b07ef81 to a752b4b Compare May 14, 2025 17:24
@hehongbo hehongbo requested a review from SigmaSquadron May 14, 2025 17:35
@nixpkgs-ci nixpkgs-ci bot added the 12.approved-by: package-maintainer This PR was reviewed and approved by a maintainer listed in any of the changed packages. label Jun 25, 2025
@nixpkgs-ci nixpkgs-ci bot added the 2.status: merge conflict This PR has merge conflicts with the target branch label Jul 8, 2025
@philiptaron
Copy link
Contributor

Rebase needed after landing #406638

@hehongbo hehongbo force-pushed the xen-skip-qemu-check branch 2 times, most recently from a150c76 to fbb095e Compare July 12, 2025 02:16
@ofborg ofborg bot removed the 2.status: merge conflict This PR has merge conflicts with the target branch label Jul 12, 2025
@hehongbo
Copy link
Contributor Author

Done retrieving the patch from the old location.

@nixpkgs-ci nixpkgs-ci bot added the 2.status: merge conflict This PR has merge conflicts with the target branch label Jul 24, 2025
@philiptaron
Copy link
Contributor

In the meantime, I think @SigmaSquadron has gained commit rights. Rebase and merge?

@hehongbo
Copy link
Contributor Author

hehongbo commented Oct 9, 2025

In the meantime, I think @SigmaSquadron has gained commit rights. Rebase and merge?

Sorry for not noticing the merge conflict, I'll revisit this later today.

@hehongbo hehongbo force-pushed the xen-skip-qemu-check branch from fbb095e to 3051301 Compare October 10, 2025 07:02
@nixpkgs-ci nixpkgs-ci bot removed the 2.status: merge conflict This PR has merge conflicts with the target branch label Oct 10, 2025
@hehongbo
Copy link
Contributor Author

Now it's ready to merge. @SigmaSquadron

The patch is already in 4.21, so we can just remove that when bumping the Xen version, like other XSA patches.

@hehongbo hehongbo force-pushed the xen-skip-qemu-check branch from 3051301 to f7beed3 Compare October 10, 2025 12:11
As upstreamed to Xen Project via `f6281291704aa356489f4bd927cc7348a920bd01`.

Signed-off-by: Hongbo <hehongbo@mail.com>
@SigmaSquadron SigmaSquadron added this pull request to the merge queue Oct 10, 2025
Merged via the queue into NixOS:master with commit 7e3649a Oct 10, 2025
27 of 31 checks passed
@nixpkgs-ci
Copy link
Contributor

nixpkgs-ci bot commented Oct 10, 2025

Backport failed for release-25.05, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin release-25.05
git worktree add -d .worktree/backport-365890-to-release-25.05 origin/release-25.05
cd .worktree/backport-365890-to-release-25.05
git switch --create backport-365890-to-release-25.05
git cherry-pick -x 70741d83804d784297a6bfb2de35fd10d319c381

@hehongbo hehongbo deleted the xen-skip-qemu-check branch October 10, 2025 13:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

6.topic: xen-project Issues and PRs related to the Xen Project Hypervisor. 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. 10.rebuild-linux: 11-100 This PR causes between 11 and 100 packages to rebuild on Linux. 11.by: package-maintainer This PR was created by a maintainer of all the package it changes. 12.approvals: 1 This PR was reviewed and approved by one person. 12.approved-by: package-maintainer This PR was reviewed and approved by a maintainer listed in any of the changed packages.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants