Latest x86_64 image is broken too #294

Closed
splate07 opened this issue Nov 16, 2022 · 14 comments

Comments

@splate07

splate07 commented Nov 16, 2022

I tried running the latest x86_64 live image (void-live-x86_64-20221001-xfce.iso) on real hardware via grub2 boot manager and it is broken. The root device cannot be found for some reason, and so the user is dropped into a debug shell.
You can't reproduce the same issue with the previous version of the image (void-live-x86_64-20210930-xfce.iso).

@Vaelatern
Member

Is this image burned to the usb drive? I'm trying to make sense of the report.

@splate07
Author

splate07 commented Nov 16, 2022

no, it is not
this iso file is simply placed in the root directory of the 5th partition of my hard drive.

I have this entry in my grub.cfg file

menuentry "void amd64 2022 xfce" {
set isofile="/void-live-x86_64-20221001-xfce.iso"
loopback loop (hd1,msdos5)$isofile
linux (loop)/boot/vmlinuz iso-scan/filename=$isofile root=live:CDLABEL=VOID_LIVE ro init=/sbin/init rd.luks=0 rd.md=0 rd.dm=0 rd.live.overlay.overlayfs=1
initrd (loop)/boot/initrd
}

That's it. It works for the 2021 image; it doesn't work for the 2022 image.
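
When an entry like this drops to the dracut shell, it can help to confirm which values actually reached the kernel. A minimal sketch, assuming a POSIX shell; `karg` is a hypothetical helper name (not part of dracut or GRUB), and the sample command line is copied from the menu entry above:

```shell
# Hypothetical helper: print the value of NAME from a kernel command line.
# Usage: karg NAME "CMDLINE"
# Returns 1 if NAME=... does not appear on the command line.
karg() {
    name=$1; shift
    # Unquoted $* splits the command line on whitespace. A word containing
    # glob characters would also be expanded; that is ignored in this sketch.
    for word in $*; do
        case $word in
            "$name"=*)
                # Strip the leading "NAME=" and print the rest.
                printf '%s\n' "${word#"$name"=}"
                return 0
                ;;
        esac
    done
    return 1
}

cmdline='iso-scan/filename=/void-live-x86_64-20221001-xfce.iso root=live:CDLABEL=VOID_LIVE ro init=/sbin/init'
karg root "$cmdline"
karg iso-scan/filename "$cmdline"
```

On a running system the second argument would typically be `"$(cat /proc/cmdline)"` instead of the sample string.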

@classabbyamp
Member

classabbyamp commented Nov 16, 2022

I boot the isos fine via grub (with https://github.com/classabbyamp/glim). does the latest iso boot fine directly? this may be an issue with your computer

@splate07
Author

splate07 commented Nov 16, 2022

> I boot the isos fine via grub (with https://github.com/classabbyamp/glim). does the latest iso boot fine directly? this may be an issue with your computer

good for you
now try to boot void-live-x86_64-20221001-xfce.iso using https://github.com/classabbyamp/glim and see for yourself
btw, my menuentry is based on https://github.com/classabbyamp/glim/blob/master/grub2/inc-void.cfg

@classabbyamp
Member

ok I am able to reproduce now, but it's very odd:

  • dracut is the same version as 2021
  • the iso file has the label VOID_LIVE in it, just like 2021
  • pretty much the only thing different is the kernel version and firmware

@repo-san

I'm also having troubles with this.
booting the latest ISO through a grub2 menu entry doesn't work on my machine.
I also tried thias' GLIM with no success.
Both on bare metal and with qemu.
I even tried qemu with an EFI file, no dice.
dding the image to the same usb boots fine on bare metal, qemu and qemu with an EFI file (it goes to grub instead of syslinux).
my uneducated guess is that it could be this dracut commit:
void-linux/void-packages@eef5529
dracutdevs/dracut@87c4c17
and a relevant issue in void-packages:
void-linux/void-packages#38367

20210930 iso (boots using a grub2 menu entry):
dracut 53_2
kmod 27_3

20221001 iso (doesn't boot using a grub2 menu entry):
dracut 53_4
kmod 30_1

@0x5c
Contributor

0x5c commented Jan 29, 2023

It's not dracut.

With rd.debug in the kcl (kernel command line), I took the rdsosreport.txt boot log generated by dracut for both the last working image and the first broken image (also adding rd.break to get a shell for the former).

In a diff between the two logs, 9 lines stand out in informational output before dracut starts searching for/mounting the root:
(in cat /proc/self/mountinfo output)

-29 26 7:0 / /run/initramfs/live ro,relatime - iso9660 /dev/loop0 ro,nojoliet,check=s,map=n,blocksize=2048,iocharset=utf8
-31 1 254:0 / /sysroot rw,relatime - ext3 /dev/mapper/live-rw rw

(in cat /proc/mounts output)

-/dev/loop0 /run/initramfs/live iso9660 ro,relatime,nojoliet,check=s,map=n,blocksize=2048,iocharset=utf8 0 0
-/dev/mapper/live-rw /sysroot ext3 rw,relatime 0 0

(in blkid output)

-/dev/loop0: BLOCK_SIZE="2048" UUID="2021-10-07-00-22-44-00" LABEL="VOID_LIVE" TYPE="iso9660" PTUUID="4e4d61a4" PTTYPE="dos"
-/dev/loop1: TYPE="squashfs"
-/dev/mapper/live-base: UUID="65732de4-1bfe-479b-8269-be87b1fb8c8e" SEC_TYPE="ext2" BLOCK_SIZE="4096" TYPE="ext3"
-/dev/loop2: UUID="65732de4-1bfe-479b-8269-be87b1fb8c8e" SEC_TYPE="ext2" BLOCK_SIZE="4096" TYPE="ext3"
-/dev/mapper/live-rw: UUID="65732de4-1bfe-479b-8269-be87b1fb8c8e" BLOCK_SIZE="4096" TYPE="ext3"

The loop devices are not even present when booting a newer image, which points to the kernel, which is also the only relevant package that had updates between the last working and first broken images (5.13.19_1 and 5.19.10_1 respectively).
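
The comparison described above can be approximated without reading a full diff. A rough sketch, assuming both logs were saved to files; `only_in_working` and the file names are hypothetical, and logs that carry timestamps would need those stripped first:

```shell
# Hypothetical helper: print every line of the working log that does not
# appear verbatim anywhere in the broken log.
# Usage: only_in_working WORKING_LOG BROKEN_LOG
only_in_working() {
    # -F: fixed strings, -x: whole-line match, -v: invert,
    # -f: read the patterns (the broken log's lines) from a file.
    grep -Fvx -f "$2" "$1"
}

# Tiny stand-in logs so the sketch runs on its own.
printf '%s\n' 'loop: module loaded' \
    '/dev/loop0: LABEL="VOID_LIVE" TYPE="iso9660"' \
    '/dev/loop1: TYPE="squashfs"' > working.log
printf '%s\n' 'loop: module loaded' > broken.log

only_in_working working.log broken.log
```

With real logs, the two arguments would be the rdsosreport.txt files from the working and the broken boot.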

This is confirmed when booting a freshly built image made with mklive's -v linux5.13 argument.

Now that I know it's the kernel, I'll try to narrow down what version between 5.13 and 5.19 broke this.
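
Narrowing down a version range like this is a bisection. A minimal sketch, assuming an ordered list where every version before the first bad one is good; `first_bad` and `boots` are hypothetical names, the version list is only illustrative, and in practice the check would mean building an image with that kernel and trying to boot it:

```shell
# Hypothetical helper: binary-search an ordered list for the first entry
# that fails CHECK_CMD.
# Usage: first_bad CHECK_CMD VERSION...
# CHECK_CMD receives one version and must return 0 if that version works.
first_bad() {
    check=$1; shift
    lo=1        # lowest candidate position (1-based)
    hi=$#       # highest candidate position
    result=
    while [ "$lo" -le "$hi" ]; do
        mid=$(( (lo + hi) / 2 ))
        eval "candidate=\${$mid}"      # fetch positional parameter number $mid
        if "$check" "$candidate"; then
            lo=$(( mid + 1 ))          # works: the break is later
        else
            result=$candidate          # fails: remember it, look earlier
            hi=$(( mid - 1 ))
        fi
    done
    [ -n "$result" ] && printf '%s\n' "$result"
}

# Stand-in check for illustration only: pretend everything before 5.19 boots.
boots() {
    case $1 in
        5.19) return 1 ;;
        *)    return 0 ;;
    esac
}

first_bad boots 5.13 5.14 5.15 5.16 5.17 5.18 5.19
```

With seven candidates this needs three checks rather than seven, which matters when each check is a full image build and boot attempt.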

@0x5c
Contributor

0x5c commented Jan 30, 2023

It's linux 5.19. The last version that boots properly from loopback is 5.18.

There don't seem to be any relevant changes in the dotconfigs of those two versions, nor in the patches.

@LaszloGombos
Copy link

Some wild guesses:

@0x5c
Contributor

0x5c commented Feb 2, 2023

> Could it be a missing kernel module for the storage - e.g. mmc - https://www.reddit.com/r/voidlinux/comments/y03b8b/baytrail_stopped_booting_after_updating_to_519/

In my case at least, tests were done on a standard desktop computer, and the storage holding both the GLIM setup and the ISO image is a plain FAT32 partition (part type ID 0c, fs created with mkfs.vfat) on a normal USB FLASH drive (/dev/sdX) with MBR partition table.

It seems to me like all modules possibly involved in that are already loaded, and furthermore, at the time dracut drops me to a shell, the partition on the USB drive is indeed already mounted at /run/initramfs/isoscan and its contents are present in that directory as expected.

I'll also be setting up a testbench in QEMU to do further tests.

> Is the loop module loaded/available ? Perhaps a missing "modprobe loop" somewhere ?

At the dracut debug shell, loop is indeed present in /proc/modules. However, none of cdrom, isofs, and squashfs are present in that list at that point.
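
A check like this can be scripted. A small sketch, assuming a POSIX shell with awk and grep available; `missing_modules` is a hypothetical helper, and drivers built into the kernel never show up in /proc/modules, so a name printed here is not necessarily unavailable:

```shell
# Hypothetical helper: print each named module that is absent from a
# /proc/modules-style listing (module name in the first column).
# Usage: missing_modules LISTING MODULE...
missing_modules() {
    list=$1; shift
    for mod in "$@"; do
        # Compare against the first column only, as a whole-line match.
        if ! awk '{ print $1 }' "$list" | grep -qx -- "$mod"; then
            printf '%s\n' "$mod"
        fi
    done
}

# Stand-in listing so the sketch runs anywhere; the size and address
# fields are made up.
printf '%s\n' 'loop 40960 0 - Live 0x0000000000000000' > modules.sample

missing_modules modules.sample loop cdrom isofs squashfs
```

At the debug shell the first argument would be `/proc/modules` itself.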

These lines are present in dmesg logs of both working (5.18 and before) and failing (5.19+) images:

loop: module loaded
dracut: root was live:CDLABEL=VOID_LIVE, is now live:/dev/disk/by-label/VOID_LIVE

Manually mounting the ISO image correctly loads both cdrom and isofs, and the contents of the image are present at the mountpoint as expected. Further mounting the squashfs image also properly loads squashfs (again, contents present as expected).

However, after both mount operations, there isn't any info on the mounted filesystems in lsblk -f (fstype, fsver, Label!, UUID) ...until udevadm trigger is manually run. This kernel commit seems potentially relevant as it would introduce delays before being able to see the label after mounting the ISO torvalds/linux@498ef5c. I will be testing if adding a delay after the mount in dracut fixes the issue.
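
The "delay after the mount" idea can be tried as a bounded poll rather than a fixed sleep. This is only a sketch of that experiment, not the change dracut eventually made; `wait_for_label`, its retry count, and its directory parameter are hypothetical, with the default directory taken from the by-label path in the dracut log line above:

```shell
# Hypothetical helper: wait until a by-label symlink for LABEL shows up.
# Usage: wait_for_label LABEL [TRIES] [DIR]
# Returns 0 as soon as DIR/LABEL exists, 1 after TRIES one-second waits.
wait_for_label() {
    label=$1
    tries=${2:-10}
    dir=${3:-/dev/disk/by-label}
    while [ "$tries" -gt 0 ]; do
        [ -e "$dir/$label" ] && return 0
        # At a real dracut shell, this is the point where one could run
        # `udevadm trigger` to make udev re-probe the loop device.
        sleep 1
        tries=$(( tries - 1 ))
    done
    return 1
}
```

For example, `wait_for_label VOID_LIVE 15` would give the label up to 15 seconds to appear before giving up.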

If you have further insights, I'll test those too

Note: if using a fedora image for testing, the kernel and initrd location in the ISO seem to have changed since the last time dracut.cmdline was modified. They are now present in /images/pxeboot/.

@0x5c
Contributor

0x5c commented Feb 2, 2023

@LaszloGombos

One thing I forgot to mention in the previous message, is that the ISO image does (sometimes?*) stay mounted when dropping in the shell, but in the mounted-with-no-label state.

Since last message, I've also found a kinda-fix:
From a boot attempt where the mount was already present while in the shell, simply running udevadm trigger and leaving the shell led to dracut successfully booting into the Void live image.

*kinda confused as to how it shows as mounted in some attempts while I recall other attempts not even having the loop module loaded in the shell (the issue is either a separate one in dracut/the images, or in my recollection of all of these attempts at booting)

@LaszloGombos

LaszloGombos commented Feb 2, 2023

@0x5c

Please try to autoload the modules that this use case needs from the bootloader command line arguments - e.g. "rd.driver.pre=loop,cdrom,isofs,squashfs" (no spaces, since the kernel command line is split on whitespace)
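
Since the value has to be one whitespace-free, comma-separated word, it may be easier to generate than to type. A small sketch, assuming a POSIX shell; `driver_pre_arg` is a hypothetical helper, while `rd.driver.pre` itself is the dracut option named above:

```shell
# Hypothetical helper: join module names into a single rd.driver.pre argument.
# Usage: driver_pre_arg MODULE...
driver_pre_arg() {
    old_ifs=$IFS
    IFS=,                      # "$*" joins arguments with the first IFS character
    arg="rd.driver.pre=$*"
    IFS=$old_ifs
    printf '%s\n' "$arg"
}

driver_pre_arg loop cdrom isofs squashfs
```

The resulting word would be appended to the `linux` line of the GRUB menu entry.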

You could also try this patch that debian carries: https://salsa.debian.org/debian/dracut/-/blob/master/debian/patches/udevsettle

@LaszloGombos

Fedora bug report - https://bugzilla.redhat.com/show_bug.cgi?id=2131852

@0x5c
Contributor

0x5c commented Feb 14, 2023

A fix for this has been merged upstream (dracutdevs/dracut#2196), and there's a backport of it to the package (void-linux/void-packages#42265).
Once that's merged any new image shouldn't have problems with iso-scan anymore.

paper42 pushed a commit to void-linux/void-packages that referenced this issue Feb 14, 2023
Luc-Saccoccio pushed a commit to Luc-Saccoccio/void-packages that referenced this issue Mar 26, 2023