Kernel BUG() on Linux 6.8 with Clang 17+ and LTO, rt5640 audio fails #2017
If it does not happen on 6.7 but it does with 6.8, are you able to bisect to see what change introduced this? The address it is faulting at seems rather suspect (00000000ffffffff), what is your CONFIG_INIT_STACK_ value? |
Instruction pointer is in strcmp, called from snd_byt_rt5640_mc_probe. Perhaps there's a bug somewhere near there? FWIW:
|
I'm using CONFIG_INIT_STACK_NONE, as part of getting what little
performance I can out of the Atom CPU of the tablet in question. I'll try
INIT_STACK_ALL_ZERO when I get the chance. I attached the kernel config in
the original post.
On Wed, Apr 17, 2024, 1:36 PM Nathan Chancellor wrote:
If it does not happen on 6.7 but it does with 6.8, are you able to bisect
to see what change introduced this? The address it is faulting at seems
rather suspect (00000000ffffffff), what is your CONFIG_INIT_STACK_ value?
|
@nathanchance same failure with INIT_STACK_ALL_ZERO:
|
I was not necessarily expecting the INIT_STACK configuration to really matter but thank you for checking! I noticed https://git.kernel.org/linus/7d99a70b65951108d82e1618c67abe69c3ed7720 in the list of changes from 6.7 to 6.8, which seems potentially relevant here since it mentions fixing a |
@nathanchance reverting that change did not help, sadly. The failed machine code reported from the BUG() is exactly the same, too, so the failure must be elsewhere. Where did you get the list of changes? kernelnewbies.org hasn't been updated for 6.8 yet and I can't find it anywhere else. (I'm new to kernel debugging.) |
EDIT: Nvm my tree was out of date. Perhaps time to break out ubsan? |
I enabled UBSAN but it didn't catch anything, only two probably unrelated array index out of bounds in net/wireless/nl80211.c. I'll put them here anyways, though, just in case:
It probably has to do with this, which is output right before:
I did not enable any other debugging features and left UBSAN at its defaults. What else would you recommend I turn on? |
Probably worth notifying the maintainers of those drivers, but sounds orthogonal to the issue being tracked here. (If you use triple backticks in GitHub markdown to open and close your trace, it will retain the original line wrapping). Perhaps worth testing ASAN, too. Though IIRC ASAN is incompatible with LTO. Did you verify you don't observe this without LTO? |
I think KASAN is now allowed with LTO: https://git.kernel.org/linus/349fde599db65d4827820ef6553e3f9ee75b8c7c |
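Since the thread keeps coming back to which of LTO, KASAN, UBSAN and the INIT_STACK choices are enabled, a small helper that summarizes those options from a kernel config file may save some back-and-forth. This is only a sketch: the Kconfig symbol names are real, but the selection of symbols and the function names `parse_config` and `summarize` are assumptions made for illustration.

```python
import re

# Kconfig symbols relevant to this report. The symbols exist upstream; which
# ones to list here is this sketch's own choice.
OPTIONS = [
    "CONFIG_LTO_NONE",
    "CONFIG_LTO_CLANG_FULL",
    "CONFIG_LTO_CLANG_THIN",
    "CONFIG_KASAN",
    "CONFIG_UBSAN",
    "CONFIG_INIT_STACK_NONE",
    "CONFIG_INIT_STACK_ALL_ZERO",
    "CONFIG_INIT_STACK_ALL_PATTERN",
]

_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Map each CONFIG_ symbol in .config-style text to its value.

    Symbols written as '# CONFIG_FOO is not set' map to 'n'.
    """
    values = {}
    for line in text.splitlines():
        line = line.strip()
        m = _SET.match(line)
        if m:
            values[m.group(1)] = m.group(2)
            continue
        m = _UNSET.match(line)
        if m:
            values[m.group(1)] = "n"
    return values


def summarize(text, options=OPTIONS):
    """Return {option: value} for the listed options ('n' if absent)."""
    values = parse_config(text)
    return {opt: values.get(opt, "n") for opt in options}
```

To use it, read the attached config (or the `.config` in the build tree) into a string and print the result of `summarize()`. Comparing the summaries of a working and a failing build shows at a glance whether LTO is the only difference.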
Something that occurred to me is LTO may have inlined some other function that calls Can you try running your stack trace through
and see if that gives us any other idea what is going on here? |
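The name of the tool suggested above is missing from this copy of the thread, so the following is not that tool. It is a minimal sketch, assuming the usual `symbol+0xoffset/0xsize [module]` frame format of kernel call traces, that pulls the frames out of a pasted trace so each one can be looked up against `vmlinux` or the module. The function name `extract_frames` and the sample trace in the usage note are made up for illustration.

```python
import re

# Matches frames such as "strcmp+0x10/0x30" or
# "? some_func+0x1a/0x40 [some_module]" in a kernel call trace.
FRAME = re.compile(
    r"(?P<unreliable>\? )?"
    r"(?P<symbol>[A-Za-z_][\w.]*)"
    r"\+0x(?P<offset>[0-9a-fA-F]+)"
    r"/0x(?P<size>[0-9a-fA-F]+)"
    r"(?: \[(?P<module>[\w-]+)\])?"
)


def extract_frames(trace):
    """Return one dict per frame found in the trace text, in order."""
    frames = []
    for m in FRAME.finditer(trace):
        frames.append({
            "symbol": m.group("symbol"),
            "offset": int(m.group("offset"), 16),
            "size": int(m.group("size"), 16),
            "module": m.group("module"),  # None for built-in code
            # A leading "? " marks a frame the unwinder is unsure about.
            "reliable": m.group("unreliable") is None,
        })
    return frames
```

For example, calling `extract_frames()` on the invented text `"RIP: 0010:strcmp+0x10/0x30\n ? example_probe+0x1a2/0x9c0 [example_mod]"` yields two frames, the second one flagged as unreliable and attributed to `example_mod`. With LTO, several source functions may be inlined into one symbol, so the offset within the symbol matters more than the symbol name alone.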
On a Dell Venue 8 Pro 5830, Linux 6.8 built with Clang and LTO triggers a BUG() and causes the rt5640 audio driver to fail. This happens with the older driver and with the newer SOF driver forced with
snd-intel-dspcfg.dsp_driver=3
on the kernel command line. It was fine on 6.7, but there were changes to the rt5640 driver in 6.8 that seem to have broken things. This happens with both Clang 17 and 18, on kernels built from Arch Linux and from my Gentoo box, so that eliminates quite a few variables.
Additionally, the system hangs in the final stages of reboot or poweroff. I don't know if it is related to this or a separate issue.
Here is my kernel configuration:
config-6.8.6-lto.txt
Being that this is a pretty uncommon piece of hardware, I am happy to test patches.
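When testing with and without the forced SOF driver, it can help to confirm that the running kernel actually received the `snd-intel-dspcfg.dsp_driver` parameter. A minimal sketch, assuming the `module.param=value` command-line form used above; the helper name `module_params` is hypothetical, and quoted values containing spaces are not handled.

```python
def module_params(cmdline, module):
    """Collect 'module.param=value' entries for one module from a command line.

    '-' and '_' are treated as equivalent in the module name, so both
    spellings are accepted.
    """
    wanted = module.replace("-", "_")
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        mod, dot, param = key.partition(".")
        if sep and dot and mod.replace("-", "_") == wanted:
            params[param] = value
    return params
```

On the tablet, the input would be the contents of `/proc/cmdline`; a result of `{"dsp_driver": "3"}` for `snd_intel_dspcfg` means the parameter reached the kernel as intended.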