forked from crash-utility/crash
xen: adjust to new scheduler structures #17
Merged
The help/man page of the "net" command suggests that the "-n" option accepts two kinds of argument: a PID or a task_struct pointer. However, "net -n" silently accepts an invalid argument and shows the namespace of the current context. For example:

  crash> net -n 1000000000
     NET_DEVICE     NAME   IP ADDRESS(ES)
  ffff949dc11d7000  lo     127.0.0.1
  ffff949dcc01c000  eno49  192.168.122.17

With the patch, an error is emitted as expected:

  crash> net -n 1000000000
  net: invalid task or pid value: 1000000000

Reported-by: Buland Kumar Singh <[email protected]>
Signed-off-by: Lianbo Jiang <[email protected]>
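The validation described above can be illustrated with a small sketch. This is not crash's C implementation: the function name `resolve_task_or_pid` and the two lookup tables are hypothetical stand-ins for crash's task table, and the only detail taken from the commit message is the wording of the error.

```python
def resolve_task_or_pid(arg, pid_to_task, known_tasks):
    """Return the task address named by arg, or raise ValueError.

    pid_to_task and known_tasks are hypothetical stand-ins for the
    task table that crash keeps internally.
    """
    # First interpretation: a decimal PID.
    try:
        pid = int(arg, 10)
    except ValueError:
        pid = None
    if pid is not None and pid in pid_to_task:
        return pid_to_task[pid]

    # Second interpretation: a hexadecimal task_struct address.
    try:
        addr = int(arg, 16)
    except ValueError:
        addr = None
    if addr is not None and addr in known_tasks:
        return addr

    # Neither interpretation matched: report it instead of silently
    # falling back to the current context.
    raise ValueError(f"invalid task or pid value: {arg}")
```

The point of the sketch is the final `raise`: an argument that matches neither interpretation is rejected rather than ignored.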
This is a partial backport patch from gdb commit 834eaf9201c1 ("Fix crash in new DWARF indexer").

Without the patch, the "dis -rl" option may abort due to an assertion failure in gdb's dw2_find_pc_sect_compunit_symtab():

  crash> dis -rl ffffffff96ad716c
  dwarf2/read.c:4928: internal-error: compunit_symtab* dw2_find_pc_sect_compunit_symtab(objfile*, bound_minimal_symbol, CORE_ADDR, obj_section*, int): Assertion `result != NULL' failed.
  A problem internal to GDB has been detected,
  further debugging may prove unreliable.
  Quit this debugging session? (y or n) dwarf2/read.c:4928: internal-error: compunit_symtab* dw2_find_pc_sect_compunit_symtab(objfile*, bound_minimal_symbol, CORE_ADDR, obj_section*, int): Assertion `result != NULL' failed.
  A problem internal to GDB has been detected,
  further debugging may prove unreliable.
  Aborted (core dumped)

Reported-by: Buland Kumar Singh <[email protected]>
Signed-off-by: Lianbo Jiang <[email protected]>
Kernel commit d2bf38c088e0 ("driver core: remove private pointer from struct bus_type") removed the bus_type.p member. As a result, on Linux 6.3-rc1 and later kernels the "kmem -n" option fails with the following error before displaying memory block information:

  kmem: invalid structure member offset: bus_type_p
        FILE: memory.c  LINE: 17852  FUNCTION: init_memory_block()

Search bus_kset.list instead for the subsys_private of the memory subsystem.

Signed-off-by: Kazuhito Hagio <[email protected]>
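As a rough illustration of the replacement lookup, the sketch below walks a list standing in for bus_kset.list and picks the entry whose name is "memory". The `SubsysPrivate` class and `find_subsys_private` function are hypothetical simplifications; the real code walks kernel list_head links in the dump and reads structure members by offset.

```python
from dataclasses import dataclass


@dataclass
class SubsysPrivate:
    # Stand-in for the kobject name embedded in struct subsys_private.
    name: str
    # Stand-in for the address of the structure in the dump.
    address: int


def find_subsys_private(bus_kset_list, name):
    """Return the first entry on the list whose name matches, else None."""
    for entry in bus_kset_list:
        if entry.name == name:
            return entry
    return None
```

Because the lookup is by name on the global list, it does not depend on a per-bus private pointer existing in struct bus_type.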
The size of the percpu stack area of Xen on x86_64 is 8 pages, not 2. This has been the case since Xen commit 0b630aa340ec in 2007. While the wrong value is not really critical in its current usage, it should be corrected nevertheless.

Signed-off-by: Juergen Gross <[email protected]>
For many years now the address of each percpu stack has been available via the stack_base[] array (Xen commit 3cb68d2b59ab made it visible). Use that instead of the indirect method via the percpu variables tss_init or tss_page, especially as the layout of tss_page changed in Xen 4.16 (Xen commit 91d26ed304ff5), with the result that the stack can no longer be found.

Signed-off-by: Juergen Gross <[email protected]>
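A minimal sketch of how a direct stack_base[] lookup could be used, combined with the 8-page stack size from the previous commit. It assumes that each stack_base[] entry holds the lowest address of that CPU's stack area and that pages are 4 KiB; the function name and the sample addresses in any usage are hypothetical, and crash itself does this in C.

```python
PAGE_SIZE = 4096                        # assumed x86_64 page size
STACK_PAGES = 8                         # per the commit above: 8 pages, not 2
STACK_SIZE = PAGE_SIZE * STACK_PAGES    # 32768 bytes


def cpu_for_stack_address(addr, stack_base):
    """Return the CPU whose percpu stack contains addr, or None.

    stack_base is a list indexed by CPU number; a zero entry is
    treated as "no stack recorded for this CPU" (an assumption).
    """
    for cpu, base in enumerate(stack_base):
        if base and base <= addr < base + STACK_SIZE:
            return cpu
    return None
```

With a stack size of only 2 pages, addresses in the upper 6 pages of each stack would fall outside the range and not be attributed to any CPU.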
There has been a significant modification regarding scheduler data in the Xen hypervisor (Xen commit d62fefa4d459). Adapt to new structures and removed fields.

Note that this is only the bare minimum to not let crash error out when opening a vmcore in Xen mode with a recent Xen version.

Signed-off-by: Juergen Gross <[email protected]>
fengjixuchui pushed a commit that referenced this pull request on Sep 5, 2023
This patch adds KASLR support so that crash can analyze a KASLR-ed vmcore, since RISC-V Linux is already sufficiently prepared for KASLR [1].

With this patch, even if the crash '--kaslr' option is not set or Linux CONFIG_RANDOMIZE_BASE is not configured, the 'derive_kaslr_offset()' function always runs to calculate 'kt->relocate', which is used to update the kernel virtual address.

Testing in QEMU rv64 virt, the kernel log printed the kernel offset:

  [ 121.214447] SMP: stopping secondary CPUs
  [ 121.215445] Kernel Offset: 0x37c00000 from 0xffffffff80000000
  [ 121.216312] Starting crashdump kernel...
  [ 121.216585] Will call new kernel at 94800000 from hart id 0
  [ 121.216834] FDT image at 9c7fd000
  [ 121.216982] Bye...

Running crash with the '-d 1' option and without the '--kaslr' option, we get the right 'kt->relocate' and kernel link addr:

  $ ../crash/crash -d 1 vmlinux vmcore_kaslr_0815
  ...
  KASLR:
    _stext from vmlinux: ffffffff80002000
    _stext from vmcoreinfo: ffffffffb7c02000
    relocate: 37c00000 (892MB)
  vmemmap : 0xff1c000000000000 - 0xff20000000000000
  vmalloc : 0xff20000000000000 - 0xff60000000000000
  mudules : 0xffffffff3952f000 - 0xffffffffb7c00000
  lowmem : 0xff60000000000000 -
  kernel link addr : 0xffffffffb7c00000
  ...
        KERNEL: /home/song/9_linux/linux/00_rv_kaslr/vmlinux
      DUMPFILE: /tmp/hello/vmcore_kaslr_0815
          CPUS: 2
          DATE: Tue Aug 15 16:36:15 CST 2023
        UPTIME: 00:02:01
  LOAD AVERAGE: 0.40, 0.23, 0.09
         TASKS: 63
      NODENAME: stage4.fedoraproject.org
       RELEASE: 6.5.0-rc3-00008-gad18dee423ac
       VERSION: #17 SMP Tue Aug 15 14:41:12 CST 2023
       MACHINE: riscv64 (unknown Mhz)
        MEMORY: 511.8 MB
         PANIC: "Kernel panic - not syncing: sysrq triggered crash"
           PID: 160
       COMMAND: "bash"
          TASK: ff6000000152bac0 [THREAD_INFO: ff6000000152bac0]
           CPU: 1
         STATE: TASK_RUNNING (PANIC)
  crash>

[1]: https://lore.kernel.org/linux-riscv/[email protected]/

Signed-off-by: Song Shuai <[email protected]>
Reviewed-by: Guo Ren <[email protected]>
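The numbers in the debug output above are consistent with a simple subtraction of the two _stext values. The sketch below reproduces that arithmetic only; the function name `kaslr_offset_from_stext` is hypothetical and the code is not crash's derive_kaslr_offset() implementation.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def kaslr_offset_from_stext(stext_vmlinux, stext_vmcoreinfo):
    """Offset between the link-time and run-time address of _stext."""
    return (stext_vmcoreinfo - stext_vmlinux) & MASK64


# Values copied from the "-d 1" output and the kernel log above.
relocate = kaslr_offset_from_stext(0xFFFFFFFF80002000, 0xFFFFFFFFB7C02000)
link_addr = (0xFFFFFFFF80000000 + relocate) & MASK64

print(f"relocate: {relocate:x} ({relocate >> 20}MB)")
print(f"kernel link addr: {link_addr:#x}")
```

The result matches both the "Kernel Offset: 0x37c00000" line in the kernel log and the relocated "kernel link addr" reported by crash.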
fengjixuchui pushed a commit that referenced this pull request on Mar 5, 2024
…usly

For kernel modules, "dis -rl" fails to display the module's code line number data after the "bt" command has been executed in crash.

Without the patch:

  crash> mod -S
  crash> bt
  PID: 1500 TASK: ff2bd8b093524000 CPU: 16 COMMAND: "lpfc_worker_0"
   #0 [ff2c9f725c39f9e0] machine_kexec at ffffffff8e0686d3
   ...snip...
   #8 [ff2c9f725c39fcc0] __lpfc_sli_release_iocbq_s4 at ffffffffc0f2f425 [lpfc]
   ...snip...
  crash> dis -rl ffffffffc0f60f82
  0xffffffffc0f60eb0 <lpfc_nlp_get>: nopl 0x0(%rax,%rax,1) [FTRACE NOP]
  0xffffffffc0f60eb5 <lpfc_nlp_get+5>: push %rbp
  0xffffffffc0f60eb6 <lpfc_nlp_get+6>: push %rbx
  0xffffffffc0f60eb7 <lpfc_nlp_get+7>: test %rdi,%rdi

With the patch:

  crash> mod -S
  crash> bt
  PID: 1500 TASK: ff2bd8b093524000 CPU: 16 COMMAND: "lpfc_worker_0"
   #0 [ff2c9f725c39f9e0] machine_kexec at ffffffff8e0686d3
   ...snip...
   #8 [ff2c9f725c39fcc0] __lpfc_sli_release_iocbq_s4 at ffffffffc0f2f425 [lpfc]
   ...snip...
  crash> dis -rl ffffffffc0f60f82
  /usr/src/debug/kernel-4.18.0-425.13.1.el8_7/linux-4.18.0-425.13.1.el8_7.x86_64/drivers/scsi/lpfc/lpfc_hbadisc.c: 6756
  0xffffffffc0f60eb0 <lpfc_nlp_get>: nopl 0x0(%rax,%rax,1) [FTRACE NOP]
  /usr/src/debug/kernel-4.18.0-425.13.1.el8_7/linux-4.18.0-425.13.1.el8_7.x86_64/drivers/scsi/lpfc/lpfc_hbadisc.c: 6759
  0xffffffffc0f60eb5 <lpfc_nlp_get+5>: push %rbp

The root cause is that, after a kernel module has been loaded by the mod command, the symtable is not expanded on the gdb side. The crash bt or dis command triggers such an expansion. However, the symtable expansion is different for the 2 commands.

The stack trace of "dis -rl" for symtable expanding:

  #0 0x00000000008d8d9f in add_compunit_symtab_to_objfile ...
  #1 0x00000000006d3293 in buildsym_compunit::end_symtab_with_blockvector ...
  #2 0x00000000006d336a in buildsym_compunit::end_symtab_from_static_block ...
  #3 0x000000000077e8e9 in process_full_comp_unit ...
  #4 process_queue ...
  #5 dw2_do_instantiate_symtab ...
  #6 0x000000000077ed67 in dw2_instantiate_symtab ...
  #7 0x000000000077f75e in dw2_expand_all_symtabs ...
  #8 0x00000000008f254d in gdb_get_line_number ...
  #9 0x00000000008f22af in gdb_command_funnel_1 ...
  #10 0x00000000008f2003 in gdb_command_funnel ...
  #11 0x00000000005b7f02 in gdb_interface ...
  #12 0x00000000005f5bd8 in get_line_number ...
  #13 0x000000000059e574 in cmd_dis ...

The stack trace of "bt" for symtable expanding:

  #0 0x00000000008d8d9f in add_compunit_symtab_to_objfile ...
  #1 0x00000000006d3293 in buildsym_compunit::end_symtab_with_blockvector ...
  #2 0x00000000006d336a in buildsym_compunit::end_symtab_from_static_block ...
  #3 0x000000000077e8e9 in process_full_comp_unit ...
  #4 process_queue ...
  #5 dw2_do_instantiate_symtab ...
  #6 0x000000000077ed67 in dw2_instantiate_symtab ...
  #7 0x000000000077f8ed in dw2_lookup_symbol ...
  #8 0x00000000008e6d03 in lookup_symbol_via_quick_fns ...
  #9 0x00000000008e7153 in lookup_symbol_in_objfile ...
  #10 0x00000000008e73c6 in lookup_symbol_global_or_static_iterator_cb ...
  #11 0x00000000008b99c4 in svr4_iterate_over_objfiles_in_search_order ...
  #12 0x00000000008e754e in lookup_global_or_static_symbol ...
  #13 0x00000000008e75da in lookup_static_symbol ...
  #14 0x00000000008e632c in lookup_symbol_aux ...
  #15 0x00000000008e5a7a in lookup_symbol_in_language ...
  #16 0x00000000008e5b30 in lookup_symbol ...
  #17 0x00000000008f2a4a in gdb_get_datatype ...
  #18 0x00000000008f22c0 in gdb_command_funnel_1 ...
  #19 0x00000000008f2003 in gdb_command_funnel ...
  #20 0x00000000005b7f02 in gdb_interface ...
  #21 0x00000000005f8a9f in datatype_info ...
  #22 0x0000000000599947 in cpu_map_size ...
  #23 0x00000000005a975d in get_cpus_online ...
  #24 0x0000000000637a8b in diskdump_get_prstatus_percpu ...
  #25 0x000000000062f0e4 in get_netdump_regs_x86_64 ...
  #26 0x000000000059fe68 in back_trace ...
  #27 0x00000000005ab1cb in cmd_bt ...

In the stack trace of "dis -rl", dw2_expand_all_symtabs() is called to expand all symtables of the objfile, or "*.ko.debug" in our case. In the stack trace of "bt", however, not everything is expanded, only the subset of symtables that is enough for dw2_lookup_symbol() to find a symbol. As a result, objfile->compunit_symtabs, which is the head of a singly linked list of struct compunit_symtab, is not NULL but does not contain all symtables. It will not be reinitialized in gdb_get_line_number() by "dis -rl" because the !objfile_has_full_symbols(objfile) check will fail, so "dis -rl" cannot display the proper code line number data.

Since the objfile_has_full_symbols(objfile) check cannot ensure that all symbols have been expanded, this patch adds a new member to struct objfile as a flag that records whether all symbols have been expanded. The flag is set only after expand_all_symtabs has been called.

Signed-off-by: Tao Liu <[email protected]>
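The difference between the two checks can be modelled with a toy objfile. Everything in this sketch is a simplification for illustration: the `Objfile` class is not gdb's struct objfile, `all_symtabs_expanded` is a hypothetical name for the new flag, and "other.c" is a made-up compilation unit.

```python
class Objfile:
    """Toy stand-in for gdb's struct objfile (not the real layout)."""

    def __init__(self, units):
        # units: mapping of compilation-unit name -> set of symbols it defines
        self.units = dict(units)
        self.compunit_symtabs = []          # units expanded so far
        self.all_symtabs_expanded = False   # hypothetical name for the new flag

    def _expand(self, unit):
        if unit not in self.compunit_symtabs:
            self.compunit_symtabs.append(unit)

    def lookup_symbol(self, symbol):
        # Like the "bt" path: expand only the unit that defines the symbol.
        for unit, symbols in self.units.items():
            if symbol in symbols:
                self._expand(unit)
                return unit
        return None

    def expand_all_symtabs(self):
        # Like the "dis -rl" path: expand everything, then record that fact.
        for unit in self.units:
            self._expand(unit)
        self.all_symtabs_expanded = True


def line_info_available(objfile, unit, use_flag):
    """Return True if line data for unit is reachable after the check."""
    if use_flag:
        needs_expansion = not objfile.all_symtabs_expanded
    else:
        # Old check: any expanded unit makes the objfile look complete.
        needs_expansion = not objfile.compunit_symtabs
    if needs_expansion:
        objfile.expand_all_symtabs()
    return unit in objfile.compunit_symtabs
```

In this model, a partial expansion triggered by a symbol lookup leaves the list non-empty, so only the flag-based check goes on to expand the remaining units.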
fengjixuchui pushed a commit that referenced this pull request on Mar 5, 2024
The patch introduces per-cpu overflow stacks for RISCV64 so that "bt" can backtrace on them, and the 'help -m' command displays the address of each per-cpu overflow stack.

TEST: a lkdtm DIRECT EXHAUST_STACK vmcore

  crash> bt
  PID: 1 TASK: ff600000000d8000 CPU: 1 COMMAND: "sh"
   #0 [ff6000001fc501c0] riscv_crash_save_regs at ffffffff8000a1dc
   #1 [ff6000001fc50320] panic at ffffffff808773ec
   #2 [ff6000001fc50380] walk_stackframe at ffffffff800056da
      PC: ffffffff80876a34 [memset+96]
      RA: ffffffff80563dc0 [recursive_loop+68]
      SP: ff2000000000fd50 CAUSE: 000000000000000f
  epc : ffffffff80876a34 ra : ffffffff80563dc0 sp : ff2000000000fd50
   gp : ffffffff81515d38 tp : 0000000000000000 t0 : ff2000000000fd58
   t1 : ff600000000d88c8 t2 : 6143203a6d74646b s0 : ff20000000010190
   s1 : 0000000000000012 a0 : ff2000000000fd58 a1 : 1212121212121212
   a2 : 0000000000000400 a3 : ff20000000010158 a4 : 0000000000000000
   a5 : 725bedba92260900 a6 : 000000000130e0f0 a7 : 0000000000000000
   s2 : ff2000000000fd58 s3 : ffffffff815170d8 s4 : ff20000000013e60
   s5 : 000000000000000e s6 : ff20000000013e60 s7 : 0000000000000000
   s8 : ff60000000861000 s9 : 00007fffc3641694 s10: 00007fffc3641690
   s11: 00005555796ed240 t3 : 0000000000010297 t4 : ffffffff80c17810
   t5 : ffffffff8195e7b8 t6 : ff20000000013b18
  status: 0000000200000120 badaddr: ff2000000000fd58 cause: 000000000000000f
  orig_a0: 0000000000000000
  --- <OVERFLOW stack> ---
   #3 [ff2000000000fd50] memset at ffffffff80876a34
   #4 [ff20000000010190] recursive_loop at ffffffff80563e16
   #5 [ff200000000105d0] recursive_loop at ffffffff80563e16
   < recursive_loop ...>
  #16 [ff20000000013490] recursive_loop at ffffffff80563e16
  #17 [ff200000000138d0] recursive_loop at ffffffff80563e16
  #18 [ff20000000013d10] lkdtm_EXHAUST_STACK at ffffffff8088005e
  #19 [ff20000000013d30] lkdtm_do_action at ffffffff80563292
  #20 [ff20000000013d40] direct_entry at ffffffff80563474
  #21 [ff20000000013d70] full_proxy_write at ffffffff8032fb3a
  #22 [ff20000000013db0] vfs_write at ffffffff801d6414
  #23 [ff20000000013e60] ksys_write at ffffffff801d67b8
  #24 [ff20000000013eb0] __riscv_sys_write at ffffffff801d6832
  #25 [ff20000000013ec0] do_trap_ecall_u at ffffffff80884a20
  crash>

  crash> help -m
  <snip>
  irq_stack_size: 16384
  irq_stacks[0]: ff20000000000000
  irq_stacks[1]: ff20000000008000
  overflow_stack_size: 4096
  overflow_stacks[0]: ff6000001fa7a510
  overflow_stacks[1]: ff6000001fc4f510
  crash>

Signed-off-by: Song Shuai <[email protected]>
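The 'help -m' values above are enough to check by hand which frames sit on the overflow stack. The sketch below does that range check; the function name `on_overflow_stack` is hypothetical, and the assumption that each overflow_stacks[] entry is the lowest address of a 4096-byte region is inferred from the output, not taken from the patch.

```python
OVERFLOW_STACK_SIZE = 4096  # overflow_stack_size from "help -m"

# overflow_stacks[] values from "help -m", indexed by CPU.
OVERFLOW_STACKS = [0xFF6000001FA7A510, 0xFF6000001FC4F510]


def on_overflow_stack(sp, cpu):
    """True if sp lies inside the given CPU's overflow stack region."""
    base = OVERFLOW_STACKS[cpu]
    return base <= sp < base + OVERFLOW_STACK_SIZE


# Frame #0 of the "bt" output (CPU 1) versus the SP saved at the exception.
print(on_overflow_stack(0xFF6000001FC501C0, 1))
print(on_overflow_stack(0xFF2000000000FD50, 1))
```

Under that assumption, the frame addresses of #0 to #2 fall inside overflow_stacks[1], while the saved SP and frames #3 onward lie on the task's regular stack.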
No description provided.