Debugging kernel 5.x kexec issues add tools to ease kernel version bumping #1351
Some notes off the top of my head on next steps.
Targets added to modules/linux were used to generate oldconfigs for both 4.14.62 and 5.10.5, which were then renamed to orig and mod along the way. The goals of all this are to:
- Take the defconfig format from 4.14.62 and remove unneeded options that are compiled as modules but never packed (useless)
- Investigate comparable configs between librem_common and xx30
- Troubleshoot why, in some cases, i915 not being provided in the OS initrd is OK (xx30 on 4.14 = OK vs t440p on 5.10 based on librem_common and librems)
- That being outside coreboot: iommu=on iommu_igfx=off passed from coreboot to the Heads Linux payload, and with/without to the OS
As of now, the only really important differences observed between librem and xx30 are:
- Some DRM additions for AST in librem, not required for xx30
- NVMe addition in librem, not required for xx30
- Some additional processor support under librem that should not be there
- Some additional sched support where I do not see anything more useful than the PERFORMANCE-driven scheduler
- SECURITY as base for SECURITYFS being activated but not used
- Compression algos supported but unused
- Encryption compiled as modules but never packed
- Hardware randomization sources such as the TPM and the Intel processor should be enabled to augment entropy and reduce the delay before it is usable in the kernel (more work needed there)

This will be the base for testing. Changes are saved in oldconfig format. I am officially against the defconfig format for GitHub repo storage. Counterproductive.
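The kind of comparison between the librem and xx30 configs described above could be sketched roughly as follows. This is only an illustration, not the tooling added in this PR: `config_only_in` is a hypothetical helper name, and the file names in the usage line are placeholders.

```shell
#!/bin/sh
# Hypothetical helper (not part of Heads): print the options that are
# enabled (=y or =m) in config file A but not enabled the same way in B.
config_only_in() {
    a="$1"
    b="$2"
    tmp_a=$(mktemp)
    tmp_b=$(mktemp)
    # Keep only enabled options; "# CONFIG_X is not set" lines are ignored.
    # LC_ALL=C keeps sort and comm in agreement about ordering.
    grep -E '^CONFIG_[A-Za-z0-9_]+=(y|m)$' "$a" | LC_ALL=C sort -u > "$tmp_a"
    grep -E '^CONFIG_[A-Za-z0-9_]+=(y|m)$' "$b" | LC_ALL=C sort -u > "$tmp_b"
    # comm -23 prints the lines that appear only in the first file.
    LC_ALL=C comm -23 "$tmp_a" "$tmp_b"
    rm -f "$tmp_a" "$tmp_b"
}
```

Usage, with placeholder paths: `config_only_in librem.config xx30.config` lists what librem enables that xx30 does not; swapping the arguments gives the opposite direction. An option that is `=y` on one side and `=m` on the other shows up in both directions, which is useful here since modules that are compiled but never packed are part of what is being hunted.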
The kexec patch in the tree takes into consideration that the inteldrmfb hack added to kexec for 4.14 cannot apply anymore for 5.x.

Those changes modify linux-qemu.config to remove everything unneeded, and take into account general advice to have vgaarb, fbdev and simple fb, as well as the needed GPU driver (where qemu requires virtio-related drivers). In all cases, kexec should be able to be called either reusing the fb or resetting it. At this point, it is certain that the kexec patch won't do it. That patch comes from an older kexec and cannot be used as-is nowadays.

Some more tools were added to modules/linux. The ones I mostly use are linux.modify_defconfig (take the defconfig, call menuconfig on it and save it back in place) and linux.generate_and_save-oldconfig (save whatever defconfig or modified oldconfig file into the oldconfig file where the linux config is expected to be found per board config).

Also, kexec-boot was modified to add some additional kexec arguments in case of qemu usage, but nothing was successful; traces are left there to track what was attempted.

At this point, going back to x230 to attempt to tweak vga options, reusing the cleanup that was done on linux-qemu.conf.

TODO:
- Check advancements on kexec. Something better should be possible.
- Note that xen, multiboot and other use cases are differentiated in kexec-boot, where xen reuses vga.
- Situations other than xen don't pass any other kexec option to deal with vga, other than the inteldrmfb patch in kexec as of today, which works for a 4.14 kernel jumping into newer ones, but doesn't work for 5.x jumping to others as of today.
…l, but still not working
…ig_oldconfig_5.10.5_from_orig_4.14.62
Status

qemu coreboot config:
+CONFIG_GENERIC_LINEAR_FRAMEBUFFER=y momentarily

qemu linux config:
+CONFIG_X86_SYSFB=y seems to be requirement for non legacy VESA bringup, but seems to break Heads?!?
removal of vga 16
removal of vesa
leaving simplefb
NOTE: CONFIG_DRM_FBDEV_EMULATION is required and provides linux console support on top of modesetting driver

modules/linux/coreboot: add modify helpers to generate in place oldconfigs and defconfigs

WiP: kexec-boot: modifying kexec passed boot options on qemu for Tinycore.

A bit confused on why kexec call just freezes there. Next time is investigating how to gdb into qemu for firmware, since we should have output on host console even if qemu is not able to get vga, and vga should be able to be reused from kexec call since:

user@heads-tests:/media/boot$ zcat core.gz | cpio -ivt | grep -e vga -e fb -e drm -e virt
lrwxrwxrwx 1 root root 17 May 8 2022 usr/sbin/fbset -> ../../bin/busybox
drwxr-xr-x 3 root root 0 May 8 2022 lib/modules/5.15.10-tinycore/kernel/virt
drwxr-xr-x 2 root root 0 May 8 2022 lib/modules/5.15.10-tinycore/kernel/virt/lib
-rw-r--r-- 1 root root 1779 Dec 21 2021 lib/modules/5.15.10-tinycore/kernel/virt/lib/irqbypass.ko.gz
-rw-r--r-- 1 root root 4556 Dec 21 2021 lib/modules/5.15.10-tinycore/kernel/net/vmw_vsock/vmw_vsock_virtio_transport.ko.gz
-rw-r--r-- 1 root root 6898 Dec 21 2021 lib/modules/5.15.10-tinycore/kernel/net/vmw_vsock/vmw_vsock_virtio_transport_common.ko.gz
drwxr-xr-x 3 root root 0 May 8 2022 lib/modules/5.15.10-tinycore/kernel/drivers/video/fbdev
-rw-r--r-- 1 root root 9612 Dec 21 2021 lib/modules/5.15.10-tinycore/kernel/drivers/video/fbdev/udlfb.ko.gz
drwxr-xr-x 2 root root 0 May 8 2022 lib/modules/5.15.10-tinycore/kernel/drivers/video/fbdev/core
-rw-r--r-- 1 root root 2149 Dec 21 2021 lib/modules/5.15.10-tinycore/kernel/drivers/video/fbdev/core/sysfillrect.ko.gz
-rw-r--r-- 1 root root 1274 Dec 21 2021 lib/modules/5.15.10-tinycore/kernel/drivers/video/fbdev/core/fb_sys_fops.ko.gz
-rw-r--r-- 1 root root 1782 Dec 21 2021 lib/modules/5.15.10-tinycore/kernel/drivers/video/fbdev/core/sysimgblt.ko.gz
-rw-r--r-- 1 root root 2268 Dec 21 2021 lib/modules/5.15.10-tinycore/kernel/drivers/video/fbdev/core/syscopyarea.ko.gz
-rw-r--r-- 1 root root 7035 Dec 21 2021 lib/modules/5.15.10-tinycore/kernel/drivers/video/fbdev/hyperv_fb.ko.gz
-rw-r--r-- 1 root root 3082 Dec 21 2021 lib/modules/5.15.10-tinycore/kernel/drivers/gpio/gpio-virtio.ko.gz
drwxr-xr-x 3 root root 0 May 8 2022 lib/modules/5.15.10-tinycore/kernel/drivers/virt
drwxr-xr-x 2 root root 0 May 8 2022 lib/modules/5.15.10-tinycore/kernel/drivers/virt/vboxguest
-rw-r--r-- 1 root root 12506 Dec 21 2021 lib/modules/5.15.10-tinycore/kernel/drivers/virt/vboxguest/vboxguest.ko.gz
drwxr-xr-x 2 root root 0 May 8 2022 lib/modules/5.15.10-tinycore/kernel/drivers/virtio
-rw-r--r-- 1 root root 8029 Dec 21 2021 lib/modules/5.15.10-tinycore/kernel/drivers/virtio/virtio_ring.ko.gz
-rw-r--r-- 1 root root 3847 Dec 21 2021 lib/modules/5.15.10-tinycore/kernel/drivers/virtio/virtio_pci_modern_dev.ko.gz
-rw-r--r-- 1 root root 3718 Dec 21 2021 lib/modules/5.15.10-tinycore/kernel/drivers/virtio/virtio.ko.gz
-rw-r--r-- 1 root root 7084 Dec 21 2021 lib/modules/5.15.10-tinycore/kernel/drivers/virtio/virtio_pci.ko.gz
-rw-r--r-- 1 root root 3377 Dec 21 2021 lib/modules/5.15.10-tinycore/kernel/drivers/virtio/virtio_input.ko.gz
-rw-r--r-- 1 root root 6258 Dec 21 2021 lib/modules/5.15.10-tinycore/kernel/drivers/virtio/virtio_balloon.ko.gz
-rw-r--r-- 1 root root 1042 Dec 21 2021 lib/modules/5.15.10-tinycore/kernel/drivers/virtio/virtio_dma_buf.ko.gz
-rw-r--r-- 1 root root 3551 Dec 21 2021 lib/modules/5.15.10-tinycore/kernel/drivers/virtio/virtio_mmio.ko.gz
-rw-r--r-- 1 root root 6916 Dec 21 2021 lib/modules/5.15.10-tinycore/kernel/drivers/block/virtio_blk.ko.gz
-rw-r--r-- 1 root root 17961 Dec 21 2021 lib/modules/5.15.10-tinycore/kernel/drivers/net/virtio_net.ko.gz
drwxr-xr-x 2 root root 0 May 8 2022 lib/modules/5.15.10-tinycore/kernel/drivers/crypto/virtio
-rw-r--r-- 1 root root 7020 Dec 21 2021 lib/modules/5.15.10-tinycore/kernel/drivers/crypto/virtio/virtio_crypto.ko.gz
-rw-r--r-- 1 root root 5924 Dec 21 2021 lib/modules/5.15.10-tinycore/kernel/drivers/iommu/virtio-iommu.ko.gz
-rw-r--r-- 1 root root 10007 Dec 21 2021 lib/modules/5.15.10-tinycore/kernel/drivers/char/virtio_console.ko.gz

Last line is virtio_console that qemu is using.... So no clue at this point
a9ec050
to
6459416
Compare
As usual, the universe works in weird ways and other people are trying to understand how things work at the same time. |
Added debug capabilities inside KERNEL_ADD in the board config, also specifying a ttyS0 serial speed of 115200, which works now. Also added debugging through a kexec -d call when DEBUG output is activated in the board config. Now Tinycore stalls at ethernet init; loop is established and resized multiple times, so it seems that Tinycore has what it was searching for to prepare normal boot options in memory. Investigating vga options and will output some info on the PR.
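As a rough illustration of the debug setup described above, the sketch below only prints the kexec load command it would run, so it is safe to try anywhere. `kexec_load_cmd` and the `DEBUG_OUTPUT` variable are hypothetical names chosen for this example, not necessarily what kexec-boot or the board configs use; `-d`, `-l`, `--initrd=` and `--append=` are regular kexec-tools options.

```shell
#!/bin/sh
# Hypothetical sketch: compose (but do not execute) a kexec load command.
# DEBUG_OUTPUT is an illustrative variable name, assumed for this example.
kexec_load_cmd() {
    kernel="$1"
    initrd="$2"
    cmdline="$3"
    cmd="kexec"
    if [ "${DEBUG_OUTPUT:-n}" = "y" ]; then
        # -d makes kexec print debugging information while loading.
        cmd="$cmd -d"
        # Route the next kernel's console to the first serial port at 115200.
        cmdline="$cmdline console=ttyS0,115200"
    fi
    printf '%s -l %s --initrd=%s --append="%s"\n' \
        "$cmd" "$kernel" "$initrd" "$cmdline"
}
```

For example, `DEBUG_OUTPUT=y kexec_load_cmd /boot/vmlinuz /boot/core.gz quiet` prints the full command line so it can be reviewed, or copied into a recovery shell, before anything is actually loaded.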
79a8b0a
to
a39d71c
Compare
qemu-coreboot: readd linear framebuffer
qemu-linux:
- add firmware options to try to help kexec
- remove low and legacy fb options to only leave simple fb
kexec-boot:
- go back to only use --console-vga, but kexec -d debug is not showing anything interesting when booting tinycore
- TINYCORE SHOULD BOOT WITH SIMPLE FB BUT DOESN'T
modules/kexec: passed to latest (2.0.26)
patches/kexec 2.0.22 patch copied over for 2.0.26 without changes
LinuxBoot working Google doc to continue digging into the kexec issues and collaborate: https://docs.google.com/document/d/15D0xImlLwvqqHk8QoaVyl918-RkDojo3mxCeknWW5OE/edit?usp=sharing
It seems the solution expected to work might be to use the 32-bit entry point of the second kernel when graphics are involved (all Heads use cases, all providing DRM+GPU drivers in the kernel as of today). Hopefully, BIOS involvement here would make it work, but it is still unknown why this would work... (As opposed to LinuxBoot on x86, we do not rely on EFI, not even on libgfxinit in most cases... What BIOS calls are involved?)
Also.... Let's try to get rid of libgfxinit (and the gnat madness) as a test for all those willing to push this forward. Without the gnat dependency, NixOS would be within reach and consequently, reproducible builds could finally be possible and not so hard to accomplish and maintain.
Knowledge on related concerns seems to have made its way through webboot commits already in the past. One of them is the following (old, talking about kernel 4.x), touching kernel graphics requirements and kexec (the commit log is pretty instructive on how the kernel's 16 -> 32 -> 64 bit transitions work). THIS IS EXACTLY OUR ISSUE HERE: u-root/webboot@ac6873c

The other was about setting up kexec to enter the kernel through the 32-bit entry point, to respect the kernel asking the BIOS for its config: u-root/webboot@dfc1429
This could also be a reason why Tinycore Linux does not see the /dev/sd* disks (if the information is expected to come from the BIOS). Just theorizing here.
…blob/78c6cf99a32d82b106cc373280df9512d6e25131/config-5.6.14, minus deactivating all unneeded network drivers but the Intel ones for Q35.

Changes to kexec-boot to pass the 32-bit entry point + noefi are based on my understanding of the comments below. Keeping traces of the tests.

-- From linuxboot#1351 (comment)
Knowledge on related concerns seems to have made its way through webboot commits already in the past. One of them is the following (old, talking about kernel 4.x), touching kernel graphics requirements and kexec (the commit log is pretty instructive on how the kernel's 16 -> 32 -> 64 bit transitions work). THIS IS EXACTLY OUR ISSUE HERE: u-root/webboot@ac6873c
The other was about setting up kexec to enter the kernel through the 32-bit entry point, to respect the kernel asking the BIOS for its config: u-root/webboot@dfc1429
…blob/78c6cf99a32d82b106cc373280df9512d6e25131/config-5.6.14, minus deactivating all unneeded network drivers but the Intel ones for Q35.

This was accomplished by using the added linux module build statement, which was called through: make BOARD=qemu-coreboot-whiptail-tpm1 linux.modify_and_save_oldconfig_in_place

Changes to kexec-boot to pass the 32-bit entry point + noefi are based on my understanding of the comments below. Keeping traces of the tests.

-- From linuxboot#1351 (comment)
Knowledge on related concerns seems to have made its way through webboot commits already in the past. One of them is the following (old, talking about kernel 4.x), touching kernel graphics requirements and kexec (the commit log is pretty instructive on how the kernel's 16 -> 32 -> 64 bit transitions work). THIS IS EXACTLY OUR ISSUE HERE: u-root/webboot@ac6873c
The other was about setting up kexec to enter the kernel through the 32-bit entry point, to respect the kernel asking the BIOS for its config: u-root/webboot@dfc1429
514a67d
to
f1edf70
Compare
…ake exposed compiler linux-qemu.config: have oldconfig format output exposed used compiler
No, those are static files coming from the initrd, which you can see by listing the cpio content. I am not testing Tinycore on real hardware for the moment; I am trying to get qemu on par with the Heads 4.x kernel and am simply using Tinycore since what they pack there is really, really limited. As explained off channel:
Doing a savage comparison here with your expectation of having real /dev/sd* devices, but the point is that Tinycore exposed /dev/fb0 in the initrd, right? The initrd's decompressed ramfs, which then becomes /, "thinks" that a fb0 exists, but the truth is that, under qemu, none does. The same goes for the /dev/sd* drives: qemu exposes the drives through drivers but, for whatever reason, /dev is not updated. It seems Tinycore decided not to include devfs, or does not map it correctly through /etc/fstab or whatever; I have not dug into that. I am simply using Tinycore as an example of an initrd which is supposed to start from a kexec'ed, non-EFI, 32-bit entry point, discover vga from the tables exposed by the BIOS, initialize vgaarb on fbcon (not happening), and then have simple fb discover and use the found vga adapter (not happening) to take over vgaarb for the simple fb driver. But that is not happening. My next step is to ditch Tinycore for testing and build the kernel myself. Something weird is happening with Tinycore, on top of the weird stuff being investigated. It seems that using Tinycore as a test is not a good test plan.
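One way to tell a static /dev/fb0 node shipped in the initrd apart from a framebuffer that a driver actually registered is to look at /proc/fb, which lists registered framebuffer devices as "index name" lines. A minimal sketch, where `registered_fbs` is a hypothetical helper name and the optional path argument exists only so the function can be tried on a saved copy:

```shell
#!/bin/sh
# Hypothetical helper: print the names of framebuffer drivers that the
# running kernel has registered. Reads /proc/fb unless another path is given.
registered_fbs() {
    src="${1:-/proc/fb}"
    [ -r "$src" ] || return 1
    # Each line looks like "<index> <driver name>"; print only the name.
    sed -n 's/^[0-9][0-9]* //p' "$src"
}
```

If `registered_fbs` prints nothing while /dev/fb0 exists, the node is most likely just a leftover from the initrd image rather than a device backed by a driver.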
Savagely disabling stuff, not interested in fixing Kconfig dependencies, just want a working test case here...

Otherwise:
tail /home/user/heads/build/x86/log/linux.log
-----
CC net/mac80211/chan.o
CC net/mac80211/trace.o
CC net/mac80211/mlme.o
CC net/mac80211/tdls.o
CC net/mac80211/ocb.o
CC net/mac80211/airtime.o
CC net/mac80211/led.o
CC net/mac80211/debugfs.o
CC net/mac80211/debugfs_sta.o
CC net/mac80211/debugfs_netdev.o
CC net/mac80211/debugfs_key.o
CC net/mac80211/pm.o
CC net/mac80211/rc80211_minstrel.o
CC net/mac80211/rc80211_minstrel_ht.o
CC net/mac80211/rc80211_minstrel_debugfs.o
CC net/mac80211/rc80211_minstrel_ht_debugfs.o
AR net/mac80211/built-in.a
AR net/built-in.a
make[1]: *** [../Makefile:185: __sub-make] Error 2
make[1]: Leaving directory '/home/user/heads/build/x86/linux-5.10.5/linux-qemu'
make: *** [Makefile:419: /home/user/heads/build/x86/linux-5.10.5/linux-qemu/.build] Error 1
…i that are unneeded in our test case otherwise make[6]: *** No rule to make target '../../linux-firmware/iwlwifi-7265D-29.ucode', needed by 'drivers/base/firmware_loader/builtin/iwlwifi-7265D-29.ucode.gen.o'. Stop.
…ch Heads requires same for usb controller, keyboard and unrelated usb stuff in our use case. Interesting Kconfig dependency highlighted in that commit, all changes committed from menuconfig...
5f2b29a
to
b1d6e91
Compare
Of course. |
…-root/webboot@ac6873c. Doesn't change anything. I blame it on qemu at this point
And.... It was qemu....
The following 3 changes were required:
Gives access to Tinycore. EDIT: and importantly, calling |
To go back to basics here, since it's always important to remember the original problem scope in order to isolate the potential issues leading to a problem and the solution space.... As of master:
Now what was used as a test case:
As of now, this is where the tests went. What seems to have been part of the solution (while latest changes just attempted to reuse webboot qemu's linux config and adapted it to match Heads requirements, including some of the modules to be loadable, not built-in, where webboot reference config is practically packing mostly everything, and nothing compiled as modules):
Also to not be dismissed, calling |
- Tinycore does not support kexec2 (file loading, requires legacy syscall) fallback
- Booting debian works, need to remove i915 to see if fbsimple can be kicked in and tune basic config with fallbacks if kexec call doesn't work
@saper confirmed the issue was to not pass the iommu kernel_add additional kernel options from the board config to kexec from 4.14. Otherwise it works on master. It is still a different story from a 5.x kernel to whatever version.
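If the reading is that those iommu options should not reach the kexec'ed kernel, filtering them out of a command line could look like the sketch below. This is an assumption-laden illustration: `strip_cmdline_opts` is a hypothetical helper, and the option spellings in the usage line are the ones written earlier in this thread, which may not match the real kernel parameter names.

```shell
#!/bin/sh
# Hypothetical helper: print a kernel command line with every token that
# starts with one of the given prefixes removed.
# Usage: strip_cmdline_opts "<cmdline>" <prefix> [<prefix>...]
strip_cmdline_opts() {
    cmdline="$1"
    shift
    out=""
    # Unquoted on purpose: split the command line on whitespace.
    # (Tokens containing glob characters are not handled by this sketch.)
    for tok in $cmdline; do
        keep=y
        for prefix in "$@"; do
            case "$tok" in
                "$prefix"*) keep=n ;;
            esac
        done
        if [ "$keep" = y ]; then
            out="${out:+$out }$tok"
        fi
    done
    printf '%s\n' "$out"
}
```

For example, `strip_cmdline_opts "quiet iommu=on iommu_igfx=off root=/dev/sda1" iommu` prints `quiet root=/dev/sda1`.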
- qemu-coreboot-whiptail-tpm1 board config: remove vga-791 statement, which is dismissed and should not be specified
- kexec-boot: remove additional statements to elf.
- kexec-2.0.26.patch: add more debug stuff on vga setup.

Current state:
- VESA is not set up from host kernel detection.
- kexec call shows : 2012 and other debug codes which are linked to reset-vga codepath (non-working)
…ime and free resource for nothing on this PR
Under current PR state, booting latest TinyCore http://tinycorelinux.net/13.x/x86/release/CorePlus-current.iso
And then from within, on host's recovery shell (type enter on boot from host's console):
produces the above important debug traces:
Confirmed by added debug traces in kexec patches under heads: |
In current state of PR, even if we specify to reuse vga through
kexec codepath to reset vga is still called:
|
Read a lot more. Suggestions are to see things as incompatible to test from qemu (q35, latest) from real hardware. This is really really not helping. Some notes found on the internet are explaining the differences between kexec codepath with noefi (Heads on coreboot usecase). Other stakeholders want to use the GOP from FSP to provide common ground and dodge the problem. Understandable but not helpful. I searched my old boxes to try to find my old "screwdriver" EHCI USB debugger and try to debug on x230. But that might or might not be helping. So. Again, what we know:
This article shares a deep diving experience (but doesn't help): |
…core 14.0 released today, tests with debian shows that again things are ok when drm+gpu replaces fb but not before on both qemu and real hardware
A reminder that on qemu vga side, not all modes provide vesa/vga compatibility https://www.kraxel.org/blog/2019/09/display-devices-in-qemu/ |
Note: #1378 will implement the working bits and this PR will be closed. @JonathonHall-Purism is bisecting kernel code to see where things hang. It also seems that the way forward for Heads on the kexec path is to make sure we use
Just a note regarding kexec-tools newer than v2.0.22. |
Confirmed not needed per 4f88f35 at #1381. Will close this PR because it explored way too many directions.
This is just a tracing test in my attempt to narrow down the problem of why kexec'ing from 4.14 kernel works even though i915 driver is not provided in Tinycore boot of dd'ed iso (simplest test I found) where it just hangs when kexec'ing from 5.x.
As of now, this PR