bt: cannot transition from exception stack to current process stack #43
----- Original Message -----
When attempting to debug a kernel panic on Ubuntu 16.04 LTS with HWE, the
stack is unavailable.
Having manually built crash 7.2.7 I got a little further, but still hit a dead end,
with it failing with:
`bt: cannot transition from exception stack to current process stack:`
```
./crash /usr/lib/debug/boot/vmlinux-4.15.0-1041-gcp /var/crash/201911112306/dump.201911112306
crash 7.2.7
Copyright (C) 2002-2019 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...
WARNING: kernel relocated [918MB]: patching 99648 gdb minimal_symbol values
KERNEL: /usr/lib/debug/boot/vmlinux-4.15.0-1041-gcp
DUMPFILE: /var/crash/201911112306/dump.201911112306 [PARTIAL DUMP]
CPUS: 8
DATE: Mon Nov 11 23:06:47 2019
UPTIME: 02:06:18
LOAD AVERAGE: 1.49, 1.69, 1.72
TASKS: 396
NODENAME: XXXXX
RELEASE: 4.15.0-1041-gcp
VERSION: #43-Ubuntu SMP Wed Aug 21 09:04:51 UTC 2019
MACHINE: x86_64 (2300 Mhz)
MEMORY: 30 GB
PANIC: "BUG: unable to handle kernel paging request at
ffffffffba678770"
PID: 13869
COMMAND: "PoolThread 6"
TASK: ffffa0245a690000 [THREAD_INFO: ffffa0245a690000]
CPU: 1
STATE: TASK_RUNNING (PANIC)
crash> bt 13869
PID: 13869 TASK: ffffa0245a690000 CPU: 1 COMMAND: "PoolThread 6"
#0 [fffffe0000033d60] machine_kexec at ffffffffba6669ce
#1 [fffffe0000033dc0] __crash_kexec at ffffffffba732bd9
#2 [fffffe0000033e88] panic at ffffffffba691a45
#3 [fffffe0000033f10] df_debug at ffffffffba66ae0d
#4 [fffffe0000033f28] do_double_fault at ffffffffba62f49a
#5 [fffffe0000033f50] double_fault at ffffffffbb000fe3
[exception RIP: __sprint_symbol+69]
RIP: ffffffffba731165 RSP: fffffe0000032fe8 RFLAGS: 00010046
RAX: 0000000000000000 RBX: ffffffffba678770 RCX: fffffe0000032fe8
RDX: fffffe0000032ff0 RSI: fffffe0000032ff8 RDI: ffffffffba678770
RBP: fffffe0000033030 R8: fffffe0000033051 R9: fffffe0000033320
R10: fffffe0000033388 R11: ffffffffbbd5e80d R12: fffffe0000033051
R13: 0000000000000000 R14: 0000000000000001 R15: ffffffffbb6a51b0
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
--- <DOUBLEFAULT exception stack> ---
#6 [fffffe0000032fe8] __sprint_symbol at ffffffffba731165
bt: cannot transition from exception stack to current process stack:
exception stack pointer: fffffe0000033d60
process stack pointer: fffffe0000033038
current stack base: ffffb2d48cfdc000
```
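The numbers in that error can be checked by hand. The sketch below is only an illustration of why `bt` gives up, not crash's actual logic: it tests whether the reported process stack pointer falls inside the task's kernel stack. The 16 KiB stack size is an assumption (the usual x86_64 value without KASAN), and `in_stack` is a hypothetical helper name.

```python
# Values copied from the "cannot transition" message above.
PROCESS_SP = 0xfffffe0000033038   # "process stack pointer"
STACK_BASE = 0xffffb2d48cfdc000   # "current stack base"

# Assumption: 16 KiB kernel stacks (x86_64 without KASAN).
THREAD_SIZE = 16 * 1024


def in_stack(sp, base, size=THREAD_SIZE):
    """Return True if sp lies within [base, base + size)."""
    return base <= sp < base + size


# The pointer is in the fffffe... region used by the exception stacks,
# nowhere near the task stack at ffffb2d4..., so this prints False.
print(in_stack(PROCESS_SP, STACK_BASE))
```

Under that assumption the pointer taken from the exception frame does not belong to the process stack at all, which matches what the error message reports.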
Last two entries from the log which may be relevant:
```
[29413.763776] unable to execute userspace code (SMEP?) (uid: 2000)
[29413.769982] BUG: unable to handle kernel paging request at ffffffffba678770
```
Any ideas on how to identify which user-space stack caused the kernel to panic?
This happens on a semi-regular basis with this app.
It's not a user space stack issue that caused the double-fault, but rather
a kernel exception handler was running, and it generated another exception
that couldn't be handled/recovered-from. In this case, it looks like it
was trying to translate a kernel address and print its associated kernel
symbol name -- which apparently was ffffffffba678770. And that address
looks like it is a kernel symbol, which from the backtrace you can see is
located somewhere between machine_kexec and __crash_kexec (in crash,
enter "sym ffffffffba678770" to see exactly what it is). I'm not sure
I've ever seen a kernel paging request failure on a text address. What
does "vtop ffffffffba678770" show? It should translate to a 2MB mapped
page. Could the page table have been corrupted somehow?
Dave
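The address arithmetic behind the reply above can be sketched as follows. This is only a rough check using values copied from the backtrace. The two frame addresses are return addresses inside `machine_kexec` and `__crash_kexec`, not the symbols' start addresses, so the comparison only shows that the faulting address sits in the same stretch of kernel text. The constant names are made up for illustration.

```python
# Addresses copied from the backtrace and panic message in this thread.
MACHINE_KEXEC_FRAME = 0xffffffffba6669ce   # frame #0 text address
CRASH_KEXEC_FRAME = 0xffffffffba732bd9     # frame #1 text address
FAULT_ADDR = 0xffffffffba678770            # "unable to handle kernel paging request"

# Does the faulting address lie between the two text addresses?
between = MACHINE_KEXEC_FRAME < FAULT_ADDR < CRASH_KEXEC_FRAME

# If kernel text is mapped with 2MB pages, this is the page that
# "vtop" would be expected to resolve, plus the offset within it.
PAGE_2M = 2 * 1024 * 1024
page_base = FAULT_ADDR & ~(PAGE_2M - 1)
offset = FAULT_ADDR - page_base

print(between)          # True
print(hex(page_base))   # 0xffffffffba600000
print(hex(offset))      # 0x78770
```

If `vtop` reports a different page size or no mapping for that 2MB region, that would point toward the page-table corruption raised above.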
Thanks for the feedback, here's the output from the requested commands:
I cannot make sense of how this scenario evolved. It appears that it
As you surmised, it's presumably related to that X86_CR4_SMEP message.