Support swapper process stack_trace? #462
There is a PID 0 for each CPU, so you need to use the |
See the CPU mask helpers for iterating over CPUs: https://drgn.readthedocs.io/en/latest/helpers.html#cpu-masks |
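As a rough sketch of that suggestion, the helper below walks the online CPUs and asks for the stack trace of each CPU's idle task. It assumes the `for_each_online_cpu()` and `idle_task()` helpers from `drgn.helpers.linux`; the function name `swapper_stack_traces` is made up for this example.

```python
def swapper_stack_traces(prog):
    """Yield (cpu, stack trace) for the idle (swapper) task of each online CPU."""
    # Imported inside the function so this sketch can be loaded as a module
    # without requiring drgn at import time.
    from drgn.helpers.linux.cpumask import for_each_online_cpu
    from drgn.helpers.linux.sched import idle_task

    for cpu in for_each_online_cpu(prog):
        # idle_task() returns the PID 0 task (struct task_struct *) of this CPU.
        yield cpu, prog.stack_trace(idle_task(prog, cpu))
```

From the drgn REPL this could be used as `for cpu, trace in swapper_stack_traces(prog): print(f"CPU {cpu}:", trace, sep="\n")`.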
I still encounter the following problem when using idle_task, which may be related to bad_pc. Is there a way to trace back to the stack before bad_pc? `
What I actually need is to look at the input parameters and local variables of each frame in the stack. Thanks. |
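For a trace that does unwind, one possible way to collect the parameters and local variables of every frame is sketched below. It relies on `StackFrame.locals()`, `StackFrame.name` and frame indexing by variable name from drgn's stack trace API; `describe_frames` is a hypothetical name, and variables that were optimized out may come back as absent objects.

```python
def describe_frames(trace):
    """Return a list of (frame name, {variable name: value}) for a stack trace."""
    frames = []
    for frame in trace:
        variables = {}
        try:
            # Names of the parameters and local variables in scope in this frame.
            names = frame.locals()
        except LookupError:
            # Defensive: treat a frame without usable debugging info as empty.
            names = []
        for name in names:
            try:
                variables[name] = frame[name]
            except LookupError:  # KeyError is a subclass of LookupError
                variables[name] = None
        frames.append((frame.name, variables))
    return frames
```

Printing the result of `describe_frames(prog.stack_trace(task))` frame by frame gives a quick overview before looking at individual objects.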
Ah, it looks like we need an ARM64 version of 412ce95. I can give you a workaround tomorrow once I refresh my memory on the ARM64 calling convention. |
I just pushed a commit, 4a3ae32, that should fix this. If you're able to build drgn from source, please try it out. If you can't build drgn from source, you can use this script as a workaround: https://github.com/osandov/drgn/blob/main/contrib/stack_trace_call_fault.py. Save it somewhere, then from the drgn REPL, call Please let me know if either of these solutions work for you, and feel free to reopen if not. |
PID 0 is not unique in the Linux kernel; there is a task with PID 0 for each CPU. stack_trace(0) currently fails with a generic "task not found" error message, which can be confusing; see #462. Add a hint to use idle_task() to the error message when the given PID is 0. Signed-off-by: Omar Sandoval <[email protected]>
I rebuilt drgn from source (drgn-4a3ae326f4855462d38b0c30c1c1f77f7ce4342e) and installed it. Running 'python3 -m drgn -c vmcore3 -s vmlinux' hits the following problem. I tried master and had the same problem. Before this, drgn-0.0.30 worked. `$ python3 -m drgn -c vmcore -s vmlinux
` Use https://github.com/osandov/drgn/blob/main/contrib/stack_trace_call_fault.py directly.
` Another vmcore. `>>> execscript('stack_trace_call_fault.py')
` |
Since AArch64 uses a link register rather than storing the return address on the stack, this is a bit easier than on x86-64. Fixes #462. Signed-off-by: Omar Sandoval <[email protected]>
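To illustrate the difference the commit message refers to, here is a conceptual sketch of recovering the caller's registers after a call to a bad address. This is not drgn's implementation; the function names, the register dictionaries and the `read_u64` callback are assumptions made for the example.

```python
def caller_regs_aarch64(regs):
    """Registers of the caller after a call to a bad address on AArch64."""
    caller = dict(regs)
    # BL/BLR store the return address in x30 (the link register) and leave
    # the stack pointer untouched, so only the PC needs to be replaced.
    caller["pc"] = regs["x30"]
    return caller


def caller_regs_x86_64(regs, read_u64):
    """Same idea on x86-64, where CALL pushes the return address."""
    caller = dict(regs)
    # The return address sits at the top of the stack; pop it.
    caller["rip"] = read_u64(regs["rsp"])
    caller["rsp"] = regs["rsp"] + 8
    return caller
```

Either variant only helps if the saved value is intact: a garbage link register leaves nothing to recover the caller from.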
Thanks, a couple more questions/requests:
|
Add --log-level=debug
Create directories and cp vmlinux files:
It works.
Is there a parameter to specify this path /usr/lib/debug/lib/modules/xx.aarch64/vmlinux ? Thanks. |
Ok, this line explains it:
In the
And let me know if there's a line beginning with Finally, how was this vmcore collected? Was it with kdump, makedumpfile, etc? |
OSRELEASE=4.19.90.xxx.aarch64. In fact, this version is almost identical to longterm 4.19-y.
makedumpfile. |
Okay, that kernel version was before the build ID was added to the VMCOREINFO, so that's my oversight not handling it. I'll fix that. In the meantime, were you able to test the Thank you for your patience! |
How to display in hexadecimal format? |
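One generic option, independent of drgn itself, is Python's own integer formatting. The helper below is a small sketch (the name `to_hex` is made up here) that accepts anything `int()` can convert.

```python
def to_hex(value, digits=16):
    """Format an integer-like value as zero-padded hexadecimal."""
    # The width passed to format() includes the two characters of the "0x" prefix.
    return f"{int(value):#0{digits + 2}x}"
```

For example, `to_hex(0x1000000010101)` gives `0x0001000000010101`. For a drgn object, passing `obj.value_()` should work for integer and pointer objects.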
python3 -m drgn -c vmcore3 -s vmlinux
|
The interesting thing here is that the link register is garbage in both dumps (0 in the first, 0x1000000010101 in the second). This means we have no idea who called the bad PC. (The dmesg you pasted earlier also showed this, and crash apparently also failed to get a stack trace below the bad PC.) It's possible that something was too badly corrupted, or there is a bug in how 4.19 captures the registers. This makes it impossible to get a reliable trace automatically. One thing I've done in similar cases before is manually inspecting the contents of stack memory and guessing where the caller was. If you're able to share the vmcore and vmlinux (feel free to email me privately), I could take a look. If not, tomorrow I can try to come up with a script to get you started. |
One thing you can check quickly is
That will dump the contents of the stack below the stack pointer, annotating symbols, slab allocations, and more. (Tweak the second argument to read more or less if needed.) Please share that if you are able to. |
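A manual pass over the stack along those lines could look roughly like the sketch below, which reads words starting at the stack pointer and keeps the ones that resolve to a symbol. It assumes `Program.read_word()` and `Program.symbol()`; `find_symbol_words` is a hypothetical name, and the hits are only candidates for return addresses, since stale values and data symbols will show up too.

```python
def find_symbol_words(prog, sp, size=512, word_size=8):
    """Return (stack address, value, "symbol+0xoffset") for words that hit a symbol."""
    hits = []
    for address in range(sp, sp + size, word_size):
        value = prog.read_word(address)
        try:
            symbol = prog.symbol(value)
        except LookupError:
            continue  # the value does not fall inside any known symbol
        hits.append((address, value, f"{symbol.name}+{value - symbol.address:#x}"))
    return hits
```

Increase `size` to look further up the stack; reads outside the memory captured in the dump are not handled here.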
KERNEL STACK SIZE: 16384; stack: 0xffff8003a1d74000 ~ 0xffff8003a1d78000. Judging by the data before and after sp, there is no obvious sign of a stack out-of-bounds access.
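The bounds quoted above follow directly from the stack base and size, and the same arithmetic can be used to check whether a given sp lies inside the stack. A minimal sketch, with hypothetical helper names:

```python
KERNEL_STACK_SIZE = 16384  # value reported above


def stack_range(base, size=KERNEL_STACK_SIZE):
    """Return the (lowest address, end address) of a kernel stack."""
    return base, base + size


def sp_in_stack(sp, base, size=KERNEL_STACK_SIZE):
    """True if sp points inside the stack that starts at base."""
    low, high = stack_range(base, size)
    return low <= sp < high
```

With the base above, `stack_range(0xffff8003a1d74000)` ends at `0xffff8003a1d78000`, matching the reported range.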
|
Currently, using stack_trace to get the stack of the swapper thread fails with the following error:
`>>> trace = stack_trace(0)
Traceback (most recent call last):
  File "/usr/lib/python3.8/code.py", line 90, in runcode
    exec(code, self.locals)
  File "<console>", line 1, in <module>
  File "/usr/local/lib/python3.8/dist-packages/drgn-0.0.30+unknown-py3.8-linux-x86_64.egg/drgn/__init__.py", line 288, in stack_trace
    return get_default_prog().stack_trace(thread)
LookupError: task not found
`
Is there a way to get a stack trace of the swapper task on each CPU?
Thanks.