Add initial support for RISC-V architecture #963
mvo5
left a comment
Thank you! This looks nice! Ideally we would have a proper test for this, but given that it's not an official Fedora arch yet, it's fine.
What diff do I need to apply to reproduce the cross-arch build failure btw? Does the normal "aarch64" (arm64) cross arch build work for you? The error indicates it might be related to the environment this runs in but I would like to reproduce to be sure.
Thank you for taking the time to review this! My apologies, I should have tested Here is the command I used: And the error was: To reproduce the original The most important arguments are: Thanks again for your help!
Thanks for the extra info, that was most helpful! I can reproduce the issue now: (interestingly the cross-arch aarch64 test is working for f42)
With the above diff applied I can run the cross build in pytest and get:
$ sudo pytest -s -v ./test/test_build_cross.py::test_image_boots_cross[container_ref=docker.io/fedorariscv/bootc:42,image=raw,rootfs=ext4,target_arch=riscv64]
...
Installing image: docker://docker.io/fedorariscv/bootc:42
Unsupported syscall: 435
Unsupported syscall: 435
Unsupported syscall: 435
Unsupported syscall: 435
Unsupported syscall: 435
Initializing ostree layout
Unsupported syscall: 293
Unsupported ioctl: cmd=0xffffffffc0046686
ERROR Installing to filesystem: Creating ostree deployment: Creating imgstorage: Initializing images: Subprocess failed: ExitStatus(unix_wait_status(32000))
Unsupported syscall: 293
Unsupported target signal #61, ignored
Unsupported target signal #62, ignored
Unsupported target signal #63, ignored
Unsupported target signal #64, ignored
Unsupported syscall: 435
Unsupported syscall: 277
Error: failed to get new shm lock manager: failed to create 2048 locks in /libpod_lock: operation not supported
The ioctl failure cmd=0xffffffffc0046686 is FS_IOC_MEASURE_VERITY (https://docs.kernel.org/filesystems/fsverity.html#fs-ioc-measure-verity). It looks like the error happens in https://github.com/containers/podman/blob/main/libpod/lock/shm/shm_lock.c#L76 [edit: so that is the culprit.] [edit2: it looks like the riscv64 implementation of pthread_mutex_init needs clone3, and qemu-user does not implement clone3 (yet), only the plain old clone()]
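Two of the numbers in the log above can be decoded by hand. The following is a minimal sketch, not part of the PR: `decode_ioctl` and `exit_code` are hypothetical helper names, and the bit layout assumed is the generic Linux ioctl encoding (8-bit number, 8-bit type, 14-bit size, 2-bit direction), which a few other architectures do not share.

```python
import os

# Direction bits as defined by the generic (asm-generic) ioctl encoding.
_DIRECTIONS = {0: "none", 1: "write", 2: "read", 3: "read/write"}


def decode_ioctl(cmd: int) -> dict:
    """Split an ioctl request number into direction, size, type and number."""
    cmd &= 0xFFFFFFFF  # the log shows the 32-bit value sign-extended to 64 bits
    return {
        "dir": _DIRECTIONS[cmd >> 30],
        "size": (cmd >> 16) & 0x3FFF,
        "type": chr((cmd >> 8) & 0xFF),
        "nr": cmd & 0xFF,
    }


def exit_code(wait_status: int):
    """Return the exit code stored in a Unix wait status.

    Returns None if the process did not exit normally (e.g. it was killed).
    """
    if os.WIFEXITED(wait_status):
        return os.WEXITSTATUS(wait_status)
    return None


print(decode_ioctl(0xFFFFFFFFC0046686))
print(exit_code(32000))
```

The ioctl decodes to a read/write request of type `f`, number 134, with a 4-byte argument, which matches the `_IOWR('f', 134, struct fsverity_digest)` definition of FS_IOC_MEASURE_VERITY in the kernel documentation linked above. The wait status 32000 decodes to exit code 125, which is consistent with the failing subprocess being podman reporting an error of its own.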
However it still fails with in my test, it seems https://github.com/containers/podman/blob/main/libpod/lock/shm/shm_lock.c#L153 is causing this. I don't see anything from qemu-user in the "unimp" logs, so it's a bit unclear where this is unsupported. [edit: it seems it's the combination of PTHREAD_PROCESS_SHARED+PTHREAD_MUTEX_ROBUST that is not liked, which looks like https://github.com/bminor/glibc/blob/glibc-2.41/nptl/pthread_mutex_init.c#L96 which is because it the
So in summary, I am not sure if we can fix this with qemu-user (we could do what Colin suggested and do full-system emulation which would make this problem go away but it would be slower). The issue is that
Interestingly on aarch64 on Fedora __ASSUME_SET_ROBUST_LIST is set, no output from strings [edit: I will try to make a case on the qemu-user mailing list to just implement set_robust_list as a NOP instead of the current -ENOSYS, based on the fact that on x86_64 and aarch64 this is effectively the behavior, as the result of the syscall is never checked with __ASSUME_SET_ROBUST_LIST. The point of this patch is to avoid deadlocks, which on a real system require a reboot, but on qemu-user it just needs killing the qemu process. So I would argue that having the support is better than the (rare) case when it's not fully working (and then it can be dealt with with kill). But let's see how this goes :)]
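One way to check from inside a given environment whether set_robust_list() is available at all is to call it with a deliberately invalid length. This is only a sketch under stated assumptions and was not used in this thread: `probe_set_robust_list` and `classify` are hypothetical names, the syscall numbers (273 on x86_64, 99 in the generic table used by aarch64 and riscv64) come from the kernel syscall tables, and the probe relies on the kernel rejecting a length that does not match `sizeof(struct robust_list_head)` with EINVAL before changing any state.

```python
import ctypes
import errno
import platform

# Syscall numbers for set_robust_list; aarch64 and riscv64 share the generic table.
_SET_ROBUST_LIST_NR = {"x86_64": 273, "aarch64": 99, "riscv64": 99}


def classify(err: int) -> str:
    """Interpret the errno from a deliberately invalid set_robust_list() call."""
    if err == errno.EINVAL:
        return "implemented"  # the length argument was validated and rejected
    if err == errno.ENOSYS:
        return "not implemented"  # e.g. an emulator that lacks the syscall
    return "unknown (errno %d)" % err


def probe_set_robust_list() -> str:
    """Probe set_robust_list() without replacing the thread's robust list."""
    nr = _SET_ROBUST_LIST_NR.get(platform.machine())
    if platform.system() != "Linux" or nr is None:
        return "unknown (unsupported platform)"
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    ctypes.set_errno(0)
    # len=0 never equals sizeof(struct robust_list_head), so an implementation
    # that validates its arguments fails here instead of installing a new list.
    ret = libc.syscall(ctypes.c_long(nr), ctypes.c_void_p(None), ctypes.c_size_t(0))
    if ret == 0:
        return "returned 0 (arguments were not validated, e.g. a stub)"
    return classify(ctypes.get_errno())


print(probe_set_robust_list())
```

On a native Linux host this would be expected to report "implemented", while an emulator that answers the syscall with -ENOSYS, as described above for qemu-user, would report "not implemented".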
This commit drops the `QEMU_LOG=+unimp` and replaces it with `QEMU_LOG=unimp`. The `+` format does not work and we found this in osbuild/bootc-image-builder#963 (comment)
Fwiw, I submitted an RFC patch to the qemu-devel mailing list that returns "0" unconditionally for the set_robust_list() syscall. When I do this locally the podman locking works, so I'm 90% confident that it solves the problem. I have no sense of how likely it is that the patch gets accepted though (my guess is not very likely). It's arguably wrong, but in practice glibc does not care. It is a risk for non-glibc threading implementations though, which would be a good reason to reject it (otoh most languages like rust/java use glibc it seems, go does not use set_robust_list, and musl https://git.musl-libc.org/cgit/musl/tree/src/thread/pthread_mutex_trylock.c#n30 also does not check the return code). [edit: https://lists.nongnu.org/archive/html/qemu-devel/2025-07/msg01553.html]
Thank you so much for digging into this so deeply! To be honest, this level of system debugging is beyond my current knowledge, so I apologize that I can't be of more help with the QEMU patch discussion. I've pretty much reached the limit of my abilities after submitting the initial PR. Thank you again for moving this forward!
No worries, this is deep stuff! Your PR and help with the reproducer was super valuable, thanks for that!
This PR is stale because it had no activity for the past 30 days. Remove the "Stale" label or add a comment, otherwise this PR will be closed in 7 days.
This PR was closed because it has been stalled for 30+7 days with no activity.
mvo5
left a comment
I would still like to merge this. It will not work in the cross-arch case because of the discussed qemu-user limitations (it might be worth adding the link to the qemu-user patch in the commit message of the patch and/or in a comment in the code). But native building (or full-system emulation) should work, so let's include it.
Testing
Native Builds (Successful)
The build is successful with the following command on a riscv64 physical machine (Note: Since riscv64 is not an officially supported architecture in Fedora, `Containerfile` changes to use a community-maintained base image: `docker.io/fedorariscv/base:42`. In the future, once riscv64 is officially supported in Fedora, it can use `registry.fedoraproject.org/fedora:42` directly. More details can be found in the following commit: link):
bootupctl Limitations
Currently, bootupctl cannot be used for the installation. Although rust-bootupd has merged riscv64 support in its codebase, a new version has not been tagged and released. This results in the following error:
A manual workaround is required: manually place `grubriscv64.efi` and `grub.cfg` in the `/boot/efi/EFI/fedora` directory, place `startup.nsh` with content `FS0:\EFI\Fedora\grubriscv64.efi` in the `/boot/efi` directory, and place another `grub.cfg` in the `/boot` directory to make the image bootable.
Validation on RISC-V Hardware
A bootable bootc image was successfully created on a riscv64 physical machine. It can be launched using QEMU after installing the necessary packages (dnf install -y edk2-riscv64 qemu-system-riscv).
QEMU Launch Command:
For easy debugging, the default account and password are `root/riscv`.
Image Download Link: Fedora-Minimal-bootc-42-20250617000000.riscv64.QEMU.Generic.qcow2.gz
fastfetch information
Cross-Architecture Builds (Failed)
Attempting to build a riscv64 image from an x86_64 host currently fails (image builder's base image: `registry.fedoraproject.org/fedora:42`). The build command:
with an error message: