Dump maple tree offset variables by "help -o" #13

fengjixuchui · 2023-01-12T06:14:03Z

No description provided.

Kernel commit f1a7941243c1 ("mm: convert mm's rss stats into percpu_counter"), which is contained in Linux 6.2-rc1 and later kernels, changed mm_struct.rss_stat from struct mm_rss_stat into an array of struct percpu_counter. Without the patch, "ps" and several commands fail with the following error message: ps: invalid structure member offset: mm_rss_stat_count FILE: memory.c LINE: 4724 FUNCTION: get_task_mem_usage() Signed-off-by: Kazuhito Hagio <[email protected]>

Recently the following failure has been observed on some vmcores when using the mount command: crash> mount MOUNT SUPERBLK TYPE DEVNAME DIRNAME ffff97a4818a3480 ffff979500013800 rootfs none / ffff97e4846ca700 ffff97e484653000 sysfs sysfs /sys ... ffff97b484753420 0 mount: invalid kernel virtual address: 0 type: "super_block buffer" The kernel virtual address of the super_block is zero when the mount command fails with the vfsmnt address 0xffff97b484753420. And the remaining mount information will be discarded. That is not expected. Check the address and skip it with a warning, if this is an invalid kernel virtual address, that can avoid truncating the remaining mount dumps. Reported-by: Dave Wysochanski <[email protected]> Signed-off-by: Lianbo Jiang <[email protected]>

This patch mainly adds the environment configuration, macro definitions, architecture-specific structures, and function declarations needed to support the RISCV64 architecture. A minimal version of the crash tool can be built with:
make target=RISCV64 -j2
Co-developed-by: Lifang Xia <[email protected]>
Signed-off-by: Xianting Tian <[email protected]>

1. Add the riscv64_init() implementation, which does all necessary machine-specific setup and is called multiple times during initialization.
2. Add the riscv64 sv39/48/57 page table macro definitions and the functions that convert a virtual address to a physical address via the 4K page table. 2M and 1G page sizes are currently not supported and will be implemented in the future.
3. Add the implementation of the vtop command, which converts a virtual address to a physical address by calling the functions defined in 2.
4. Add the implementation to get the virtual memory layout, va_bits, and phys_ram_base from vmcoreinfo. As these configurations change from time to time, we sent a Linux kernel patch to export them, which simplifies the development of the crash tool. The kernel commit: 649d6b1019a2 ("RISC-V: Add arch_crash_save_vmcoreinfo")
5. Add the riscv64_get_smp_cpus() implementation to get the number of CPUs.
6. Add the riscv64_get_page_size() implementation to get the page size.
And so on.
With this patch, we can enter the crash command line and run "vtop", "mod", "rd", "*", "p", "kmem" ...
Tested on QEMU RISCV64 and on an SoC platform with the T-head Xuantie 910 CPU.
KERNEL: vmlinux
DUMPFILE: vmcore
CPUS: 1
DATE: Fri Jul 15 10:24:25 CST 2022
UPTIME: 00:00:33
LOAD AVERAGE: 0.05, 0.01, 0.00
TASKS: 41
NODENAME: buildroot
RELEASE: 5.18.9
VERSION: #30 SMP Fri Jul 15 09:47:03 CST 2022
MACHINE: riscv64 (unknown Mhz)
MEMORY: 1 GB
PANIC: "Kernel panic - not syncing: sysrq triggered crash"
PID: 113
COMMAND: "sh"
TASK: ff60000002269600 [THREAD_INFO: ff60000002269600]
CPU: 0
STATE: TASK_RUNNING (PANIC)
crash> p mem_map
mem_map = $1 = (struct page *) 0xff6000003effbf00
crash> p /x *(struct page *) 0xff6000003effbf00
$5 = {
  flags = 0x1000,
  {
    {
      {
        lru = {
          next = 0xff6000003effbf08,
          prev = 0xff6000003effbf08
        },
        {
          __filler = 0xff6000003effbf08,
          mlock_count = 0x3effbf08
        }
      },
      mapping = 0x0,
      index = 0x0,
      private = 0x0
    },
crash> mod
MODULE NAME BASE SIZE OBJECT FILE
ffffffff0113e740 nvme_core ffffffff01133000 98304 (not loaded) [CONFIG_KALLSYMS]
ffffffff011542c0 nvme ffffffff0114c000 61440 (not loaded) [CONFIG_KALLSYMS]
crash> rd ffffffff0113e740 8
ffffffff0113e740: 0000000000000000 ffffffff810874f8 .........t......
ffffffff0113e750: ffffffff011542c8 726f635f656d766e .B......nvme_cor
ffffffff0113e760: 0000000000000065 0000000000000000 e...............
ffffffff0113e770: 0000000000000000 0000000000000000 ................
crash> vtop ffffffff0113e740
VIRTUAL PHYSICAL
ffffffff0113e740 8254d740
PGD: ffffffff810e9ff8 => 2ffff001
P4D: 0000000000000000 => 000000002fffec01
PUD: 00005605c2957470 => 0000000020949801
PMD: 00007fff7f1750c0 => 0000000020947401
PTE: 0 => 209534e7
PAGE: 000000008254d000
PTE PHYSICAL FLAGS
209534e7 8254d000 (PRESENT|READ|WRITE|GLOBAL|ACCESSED|DIRTY)
PAGE PHYSICAL MAPPING INDEX CNT FLAGS
ff6000003f0777d8 8254d000 0 0 1 0
Tested-by: Yixun Lan <[email protected]>
Signed-off-by: Xianting Tian <[email protected]>
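The last step of the 4K translation shown in the vtop output can be sketched as follows. This is a minimal illustration and not the code from the patch: the helper names are hypothetical, and the constants assume the standard RISC-V PTE layout (physical page number in bits 53:10) with 9 index bits per page table level.

```c
#include <stdint.h>

/* Assumptions: 4K pages, 9 index bits per level, PPN stored in PTE bits 53:10. */
#define PAGE_SHIFT        12
#define PAGE_OFFSET_MASK  ((1ULL << PAGE_SHIFT) - 1)
#define PTE_PPN_SHIFT     10
#define PTE_PPN_MASK      ((1ULL << 44) - 1)

/* Hypothetical helper: table index used at a given level (0 is the PTE level). */
static inline unsigned int vpn_index(uint64_t vaddr, int level)
{
    return (unsigned int)((vaddr >> (PAGE_SHIFT + 9 * level)) & 0x1ff);
}

/* Hypothetical helper: combine the page frame from a leaf PTE with the page offset. */
static inline uint64_t pte_to_paddr(uint64_t pte, uint64_t vaddr)
{
    uint64_t ppn = (pte >> PTE_PPN_SHIFT) & PTE_PPN_MASK;

    return (ppn << PAGE_SHIFT) | (vaddr & PAGE_OFFSET_MASK);
}
```

With the values from the session above, the PTE 209534e7 and the virtual address ffffffff0113e740 yield the physical address 8254d740 that vtop reports.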

Use generic_dis_filter() function to support dis command implementation. With this patch, we can get the disassembled code, crash> dis __crash_kexec 0xffffffff80088580 <__crash_kexec>: addi sp,sp,-352 0xffffffff80088582 <__crash_kexec+2>: sd s0,336(sp) 0xffffffff80088584 <__crash_kexec+4>: sd s1,328(sp) 0xffffffff80088586 <__crash_kexec+6>: sd s2,320(sp) 0xffffffff80088588 <__crash_kexec+8>: addi s0,sp,352 0xffffffff8008858a <__crash_kexec+10>: sd ra,344(sp) 0xffffffff8008858c <__crash_kexec+12>: sd s3,312(sp) 0xffffffff8008858e <__crash_kexec+14>: sd s4,304(sp) 0xffffffff80088590 <__crash_kexec+16>: auipc s2,0x1057 0xffffffff80088594 <__crash_kexec+20>: addi s2,s2,-1256 0xffffffff80088598 <__crash_kexec+24>: ld a5,0(s2) 0xffffffff8008859c <__crash_kexec+28>: mv s1,a0 0xffffffff8008859e <__crash_kexec+30>: auipc a0,0xfff Signed-off-by: Xianting Tian <[email protected]>

With the patch, we can get the irq info, crash> irq IRQ IRQ_DESC/_DATA IRQACTION NAME 0 (unused) (unused) 1 ff60000001329600 ff60000001d17180 "101000.rtc" 2 ff60000001329800 ff60000001d17680 "ttyS0" 3 ff60000001329a00 ff60000001c33c00 "virtio0" 4 ff60000001329c00 ff60000001c33f80 "virtio1" 5 ff6000000120f400 ff60000001216000 "riscv-timer" Signed-off-by: Xianting Tian <[email protected]>

1. Add the implementation to get stack frames from the stacks of active and inactive tasks.
2. Add 'bt -l' support to get the line number associated with the current pc address.
3. Add 'bt -f' support to display all stack data contained in a frame.
With the patch, we can get the backtrace:
crash> bt
PID: 113 TASK: ff6000000226c200 CPU: 0 COMMAND: "sh"
#0 [ff20000010333b90] riscv_crash_save_regs at ffffffff800078f8
#1 [ff20000010333cf0] panic at ffffffff806578c6
#2 [ff20000010333d50] sysrq_reset_seq_param_set at ffffffff8038c03c
#3 [ff20000010333da0] __handle_sysrq at ffffffff8038c604
#4 [ff20000010333e00] write_sysrq_trigger at ffffffff8038cae4
#5 [ff20000010333e20] proc_reg_write at ffffffff801b7ee8
#6 [ff20000010333e40] vfs_write at ffffffff80152bb2
#7 [ff20000010333e80] ksys_write at ffffffff80152eda
#8 [ff20000010333ed0] sys_write at ffffffff80152f52
crash> bt -l
PID: 113 TASK: ff6000000226c200 CPU: 0 COMMAND: "sh"
#0 [ff20000010333b90] riscv_crash_save_regs at ffffffff800078f8
/buildroot/qemu_riscv64_virt_defconfig/build/linux-custom/arch/riscv/kernel/crash_save_regs.S: 47
#1 [ff20000010333cf0] panic at ffffffff806578c6
/buildroot/qemu_riscv64_virt_defconfig/build/linux-custom/kernel/panic.c: 276
... ...
crash> bt -f
PID: 113 TASK: ff6000000226c200 CPU: 0 COMMAND: "sh"
#0 [ff20000010333b90] riscv_crash_save_regs at ffffffff800078f8
[PC: ffffffff800078f8 RA: ffffffff806578c6 SP: ff20000010333b90 SIZE: 352]
ff20000010333b90: ff20000010333bb0 ffffffff800078f8
ff20000010333ba0: ffffffff8008862c ff20000010333b90
ff20000010333bb0: ffffffff810dde38 ff6000000226c200
ff20000010333bc0: ffffffff8032be68 0720072007200720
... ...
Signed-off-by: Xianting Tian <[email protected]>

Add support for printing out the registers from the dump file. With the patch, we can get the regs:
crash> help -r
CPU 0:
epc : 00ffffffa5537400 ra : ffffffff80088620 sp : ff2000001039bb90
gp : ffffffff810dde38 tp : ff60000002269600 t0 : ffffffff8032be5c
t1 : 0720072007200720 t2 : 666666666666663c s0 : ff2000001039bcf0
s1 : 0000000000000000 a0 : ff2000001039bb98 a1 : 0000000000000001
a2 : 0000000000000010 a3 : 0000000000000000 a4 : 0000000000000000
a5 : ff60000001c7d000 a6 : 000000000000003c a7 : ffffffff8035c998
s2 : ffffffff810df0a8 s3 : ffffffff810df718 s4 : ff2000001039bb98
s5 : 0000000000000000 s6 : 0000000000000007 s7 : ffffffff80c4a468
s8 : 00fffffffde45410 s9 : 0000000000000007 s10: 00aaaaaad1640700
s11: 0000000000000001 t3 : ff60000001218f00 t4 : ff60000001218f00
t5 : ff60000001218000 t6 : ff2000001039b988
Signed-off-by: Xianting Tian <[email protected]>

Add riscv64_dump_machdep_table() implementation, display machdep_table. crash> help -m flags: 80 () kvbase: ff60000000000000 identity_map_base: ff60000000000000 pagesize: 4096 pageshift: 12 pagemask: fffffffffffff000 pageoffset: fff pgdir_shift: 48 ptrs_per_pgd: 512 ptrs_per_pte: 512 stacksize: 16384 hz: 250 memsize: 1071644672 (0x3fe00000) bits: 64 back_trace: riscv64_back_trace_cmd() processor_speed: riscv64_processor_speed() uvtop: riscv64_uvtop() kvtop: riscv64_kvtop() get_stack_frame: riscv64_get_stack_frame() get_stackbase: generic_get_stackbase() get_stacktop: generic_get_stacktop() translate_pte: riscv64_translate_pte() memory_size: generic_memory_size() vmalloc_start: riscv64_vmalloc_start() is_task_addr: riscv64_is_task_addr() verify_symbol: riscv64_verify_symbol() dis_filter: generic_dis_filter() dump_irq: generic_dump_irq() show_interrupts: generic_show_interrupts() get_irq_affinity: generic_get_irq_affinity() cmd_mach: riscv64_cmd_mach() get_smp_cpus: riscv64_get_smp_cpus() is_kvaddr: riscv64_is_kvaddr() is_uvaddr: riscv64_is_uvaddr() verify_paddr: generic_verify_paddr() init_kernel_pgd: NULL value_to_symbol: generic_machdep_value_to_symbol() line_number_hooks: NULL last_pgd_read: ffffffff810e9000 last_p4d_read: 81410000 last_pud_read: 81411000 last_pmd_read: 81412000 last_ptbl_read: 81415000 pgd: 560d586f3ab0 p4d: 560d586f4ac0 pud: 560d586f5ad0 pmd: 560d586f6ae0 ptbl: 560d586f7af0 section_size_bits: 27 max_physmem_bits: 56 sections_per_root: 0 machspec: 560d57d204a0 Signed-off-by: Xianting Tian <[email protected]>

With the patch we can get some basic machine state information, crash> mach MACHINE TYPE: riscv64 MEMORY SIZE: 1 GB CPUS: 1 PROCESSOR SPEED: (unknown) HZ: 250 PAGE SIZE: 4096 KERNEL STACK SIZE: 16384 Signed-off-by: Xianting Tian <[email protected]>

Verify the symbol to accept or reject a symbol from the kernel namelist. Signed-off-by: Xianting Tian <[email protected]>

The following kernel commits split slab info from struct page into struct slab in Linux 5.17. d122019bf061 ("mm: Split slab into its own type") 07f910f9b729 ("mm: Remove slab from struct page") Crash commit 5f390ed followed the change for SLUB, but crash still uses the offset of page.lru inappropriately. Luckily, it could work because it was the same value as the offset of slab.slab_list until Linux 6.1. However, kernel commit 130d4df57390 ("mm/sl[au]b: rearrange struct slab fields to allow larger rcu_head") in Linux 6.2-rc1 changed the offset of slab.slab_list. As a result, without the patch, "kmem -s|-S" options print the following errors and fail to print values correctly for kernels configured with CONFIG_SLUB. crash> kmem -S filp CACHE OBJSIZE ALLOCATED TOTAL SLABS SSIZE NAME kmem: filp: partial list slab: ffffcc650405ab88 invalid page.inuse: -1 ffff8fa0401eca00 232 1267 1792 56 8k filp ... KMEM_CACHE_NODE NODE SLABS PARTIAL PER-CPU ffff8fa0401cb8c0 0 56 24 8 NODE 0 PARTIAL: SLAB MEMORY NODE TOTAL ALLOCATED FREE kmem: filp: invalid partial list slab pointer: ffffcc650405ab88 Signed-off-by: Kazuhito Hagio <[email protected]>

… later Kernel commit d42f3245c7e2 ("mm: memcg: convert vmstat slab counters to bytes"), which is contained in Linux v5.9-rc1 and later kernels, renamed NR_SLAB_{RECLAIMABLE,UNRECLAIMABLE} to NR_SLAB_{RECLAIMABLE,UNRECLAIMABLE}_B. Without the patch, "kmem -i" command will display incorrect SLAB statistics: crash> kmem -i | grep -e PAGES -e SLAB PAGES TOTAL PERCENTAGE SLAB 89458 349.4 MB 0% of TOTAL MEM ^^^^^ ^^^^^ With the patch, the actual result is: crash> kmem -i | grep -e PAGES -e SLAB PAGES TOTAL PERCENTAGE SLAB 261953 1023.3 MB 0% of TOTAL MEM Reported-by: Buland Kumar Singh <[email protected]> Signed-off-by: Lianbo Jiang <[email protected]> Signed-off-by: Kazuhito Hagio <[email protected]>

With glibc-2.23 and earlier (e.g. RHEL7), crash build fails with errors like this due to EM_RISCV undeclared: $ make -j 24 warn TARGET: X86_64 CRASH: 8.0.2++ GDB: 10.2 ... symbols.c: In function 'is_kernel': symbols.c:3746:8: error: 'EM_RISCV' undeclared (first use in this function) case EM_RISCV: ^ ... Define EM_RISCV as 243 [1][2] if not defined. [1] https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=94e73c95d9b5 [2] http://www.sco.com/developers/gabi/latest/ch4.eheader.html Signed-off-by: Kazuhito Hagio <[email protected]>
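The fallback described above amounts to a guarded definition. The sketch below shows the pattern; the value 243 comes from the commit message, while the helper function is hypothetical and only there to show the constant in use.

```c
#include <elf.h>

/* Older glibc <elf.h> headers do not provide EM_RISCV; 243 is the value
 * given in the commit message and its references. */
#ifndef EM_RISCV
#define EM_RISCV 243
#endif

/* Hypothetical helper for illustration: test an ELF header's e_machine field. */
static inline int is_riscv_machine(unsigned int e_machine)
{
    return e_machine == EM_RISCV;
}
```

Because the definition is guarded by #ifndef, newer headers that already define EM_RISCV are left untouched.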

This is a backported patch from gdb. Without the patch, the following crash command may abort due to an assertion failure in the gdb's copy_type(): crash> px __per_cpu_start:0 gdbtypes.c:5505: internal-error: type* copy_type(const type*): Assertion `TYPE_OBJFILE_OWNED (type)' failed. A problem internal to GDB has been detected, further debugging may prove unreliable. Quit this debugging session? (y or n) The gdb commit 8e2da1651879 ("Fix assertion failure in copy_type") solved the current issue. Reported-by: Buland Kumar Singh <[email protected]> Signed-off-by: Lianbo Jiang <[email protected]>

Kernel commit e36ce448a08d ("mm/slab: use kmalloc_node() for off slab freelist_idx_t array allocation"), which is contained in Linux 6.1 and later kernels, removed kmem_cache.freelist_cache member on kernels configured with CONFIG_SLAB=y. Without the patch, crash does not set SLAB_OVERLOAD_PAGE and "kmem -s|-S" options fail with the following error: kmem: invalid structure member offset: slab_list FILE: memory.c LINE: 12156 FUNCTION: verify_slab_v2() Use kmem_cache.freelist_size instead, which was introduced together with kmem_cache.freelist_cache by kernel commit 8456a648cf44. Signed-off-by: Kazuhito Hagio <[email protected]>

Kernel commit 130d4df57390 ("mm/sl[au]b: rearrange struct slab fields to allow larger rcu_head"), which is contained in Linux 6.2-rc1 and later kernels, changed the offset of slab.slab_list and now it's not equal to the offset of page.lru. Without the patch, "kmem -s|-S" options print errors and zeros for slab counters like this for kernels configured with CONFIG_SLAB=y. crash> kmem -s CACHE OBJSIZE ALLOCATED TOTAL SLABS SSIZE NAME kmem: rpc_inode_cache: partial list: page/slab: fffff31ac4125190 bad active counter: 99476865 kmem: rpc_inode_cache: partial list: page/slab: fffff31ac4125190 bad s_mem pointer: 100000003 kmem: rpc_inode_cache: full list: page/slab: fffff31ac4125150 bad active counter: 99476225 kmem: rpc_inode_cache: full list: page/slab: fffff31ac4125150 bad active counter: 99476225 kmem: rpc_inode_cache: full list: page/slab: fffff31ac4125150 bad s_mem pointer: 100000005 ffff930202adfb40 704 0 0 0 4k rpc_inode_cache ... Signed-off-by: Kazuhito Hagio <[email protected]>

Until Linux 6.0, there were two ways to iterate over vm_area_struct: 1) by rbtree, aka vma.vm_rb; 2) by linked list, aka vma.vm_{next,prev}. However, with the maple tree patches[1][2] in Linux 6.1, vm_rb and vm_{next,prev} were removed from vm_area_struct. The vm_area_dump() in crash mainly uses the linked list for vma iteration, which does not work in this case, so maple tree iteration needs to be ported to crash. Crash only reads the maple tree iteratively; it needs neither the RCU-safe features nor the maple tree modification features, so only a subset of the kernel maple tree features is ported. In addition, the ported kernel source code has to be modified to make it compatible with crash. This patch deals with the two issues: 1) porting the mt_dump() function and all its dependencies from the kernel source to crash, to enable maple tree iteration in crash; 2) adapting the ported code to crash.
[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=524e00b36e8c547f5582eef3fb645a8d9fc5e3df
[2]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=763ecb035029f500d7e6dc99acd1ad299b7726a1
Signed-off-by: Tao Liu <[email protected]>

The maple tree is a new data structure for crash, so "tree" command needs to support it for users to dump and view the content of maple trees. This patch achieves this by using ported mt_dump() and its related functions from kernel and adapting them with "tree" command. Also introduce a new -v arg specifically for dumping the complete content of a maple tree: crash> tree -t maple 0xffff9034c006aec0 -v maple_tree(ffff9034c006aec0) flags 309, height 2 root 0xffff9034de70041e 0-18446744073709551615: node 0xffff9034de700400 depth 0 type 3 parent 0xffff9034c006aec1 contents:... 0-140112331583487: node 0xffff9034c01e8800 depth 1 type 1 parent 0xffff9034de700406 contents:... 0-94643156942847: (nil) 94643156942848-94643158024191: 0xffff9035131754c0 94643158024192-94643160117247: (nil) ... The existing options of "tree" command can work as well: crash> tree -t maple -r mm_struct.mm_mt 0xffff9034c006aec0 -p ffff9035131754c0 index: 1 position: root/0/1 ffff9035131751c8 index: 2 position: root/0/3 ffff9035131757b8 index: 3 position: root/0/4 ... crash> tree -t maple 0xffff9034c006aec0 -p -x -s vm_area_struct.vm_start,vm_end ffff9035131754c0 index: 1 position: root/0/1 vm_start = 0x5613d3c00000, vm_end = 0x5613d3d08000, ffff9035131751c8 index: 2 position: root/0/3 vm_start = 0x5613d3f07000, vm_end = 0x5613d3f0b000, ffff9035131757b8 index: 3 position: root/0/4 vm_start = 0x5613d3f0b000, vm_end = 0x5613d3f14000, .... Signed-off-by: Tao Liu <[email protected]>
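The "root", "node", and "type" fields in the dump above are related by the maple tree's pointer encoding, where the low bits of an encoded node pointer carry the node type. The sketch below illustrates the decoding; the mask and shift values are quoted from memory of the kernel's lib/maple_tree.c and should be treated as assumptions, and the function names are hypothetical rather than those used in the ported code.

```c
#include <stdint.h>

/* Assumed to match the kernel's lib/maple_tree.c; verify before relying on them. */
#define MAPLE_NODE_MASK        255ULL
#define MAPLE_NODE_TYPE_MASK   0x0FULL
#define MAPLE_NODE_TYPE_SHIFT  3

/* Hypothetical helper: strip the encoding bits to get the node address. */
static inline uint64_t enode_to_node_addr(uint64_t enode)
{
    return enode & ~MAPLE_NODE_MASK;
}

/* Hypothetical helper: extract the node type from the encoded pointer. */
static inline unsigned int enode_to_type(uint64_t enode)
{
    return (unsigned int)((enode >> MAPLE_NODE_TYPE_SHIFT) & MAPLE_NODE_TYPE_MASK);
}
```

Applied to the root value 0xffff9034de70041e from the example, this gives the node address 0xffff9034de700400 and type 3, which agrees with the first node line of the dump.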

do_maple_tree() is similar to do_radix_tree() and do_xarray(), which takes the same do_maple_tree_traverse entry as tree command. Signed-off-by: Tao Liu <[email protected]>

Since memory.c:vm_area_dump() iterates over all vmas, this patch mainly introduces maple tree vma iteration to it. The code that handles each vma is extracted into a function. If mm_struct_mmap exists, i.e. the linked-list vma iteration is available, the original path is used; if it does not exist and mm_struct_mm_mt exists, i.e. the maple tree is available, the maple tree vma iteration is used. Signed-off-by: Tao Liu <[email protected]>

Signed-off-by: Tao Liu <[email protected]>

The previous patches added some variables to offset_table and size_table; print them out with the "help -o" command. Signed-off-by: Tao Liu <[email protected]>

Kernel commit 7d65f4a65532 ("irq: Consolidate do_softirq() arch overriden implementations") renamed call_softirq to do_softirq_own_stack, and there is likewise no exception frame when coming from do_softirq_own_stack. Without the patch, crash may unnecessarily output an exception frame with a warning as below:
crash> foreach bt
...
PID: 0 TASK: ffff914f820a8000 CPU: 25 COMMAND: "swapper/25"
#0 [fffffe0000504e48] crash_nmi_callback at ffffffffa665d763
#1 [fffffe0000504e50] nmi_handle at ffffffffa662a423
#2 [fffffe0000504ea8] default_do_nmi at ffffffffa6fe7dc9
#3 [fffffe0000504ec8] do_nmi at ffffffffa662a97f
#4 [fffffe0000504ef0] end_repeat_nmi at ffffffffa70015e8
[exception RIP: clone_endio+172]
RIP: ffffffffc005c1ec RSP: ffffa1d403d08e98 RFLAGS: 00000246
RAX: 0000000000000000 RBX: ffff915326fba230 RCX: 0000000000000018
RDX: ffffffffc0075400 RSI: 0000000000000000 RDI: ffff915326fba230
RBP: ffff915326fba1c0 R8: 0000000000001000 R9: ffff915308d6d2a0
R10: 000000a97dfe5e10 R11: ffffa1d40038fe98 R12: ffff915302babc40
R13: ffff914f94360000 R14: 0000000000000000 R15: 0000000000000000
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
--- <NMI exception stack> ---
#5 [ffffa1d403d08e98] clone_endio at ffffffffc005c1ec [dm_mod]
#6 [ffffa1d403d08ed0] blk_update_request at ffffffffa6a96954
#7 [ffffa1d403d08f10] scsi_end_request at ffffffffa6c9b968
#8 [ffffa1d403d08f48] scsi_io_completion at ffffffffa6c9bb3e
#9 [ffffa1d403d08f90] blk_complete_reqs at ffffffffa6aa0e95
#10 [ffffa1d403d08fa0] __softirqentry_text_start at ffffffffa72000dc
#11 [ffffa1d403d08ff0] do_softirq_own_stack at ffffffffa7000f9a
--- <IRQ stack> ---
#12 [ffffa1d40038fe70] do_softirq_own_stack at ffffffffa7000f9a
[exception RIP: unknown or invalid address]
RIP: 0000000000000000 RSP: 0000000000000000 RFLAGS: 00000000
RAX: ffffffffa672eae5 RBX: ffffffffa83b34e0 RCX: ffffffffa672eb12
RDX: 0000000000000010 RSI: 8b7d6c8869010c00 RDI: 0000000000000085
RBP: 0000000000000286 R8: ffff914f820a8000 R9: ffffffffa67a94e0
R10: 0000000000000286 R11: ffffffffa66fb4c5 R12: ffffffffa67a898b
R13: 0000000000000000 R14: fffffffffffffff8 R15: ffffffffa67a1e68
ORIG_RAX: 0000000000000000 CS: 0000 SS: ffffffffa672edff
bt: WARNING: possibly bogus exception frame
#13 [ffffa1d40038ff30] start_secondary at ffffffffa665fa2c
#14 [ffffa1d40038ff50] secondary_startup_64_no_verify at ffffffffa6600116
...
Reported-by: Marco Patalano <[email protected]>
Signed-off-by: Lianbo Jiang <[email protected]>

…usly There is an issue that, for kernel modules, "dis -rl" fails to display the module's code line number data after the "bt" command has been executed in crash.
Without the patch:
crash> mod -S
crash> bt
PID: 1500 TASK: ff2bd8b093524000 CPU: 16 COMMAND: "lpfc_worker_0"
#0 [ff2c9f725c39f9e0] machine_kexec at ffffffff8e0686d3
...snip...
#8 [ff2c9f725c39fcc0] __lpfc_sli_release_iocbq_s4 at ffffffffc0f2f425 [lpfc]
...snip...
crash> dis -rl ffffffffc0f60f82
0xffffffffc0f60eb0 <lpfc_nlp_get>: nopl 0x0(%rax,%rax,1) [FTRACE NOP]
0xffffffffc0f60eb5 <lpfc_nlp_get+5>: push %rbp
0xffffffffc0f60eb6 <lpfc_nlp_get+6>: push %rbx
0xffffffffc0f60eb7 <lpfc_nlp_get+7>: test %rdi,%rdi
With the patch:
crash> mod -S
crash> bt
PID: 1500 TASK: ff2bd8b093524000 CPU: 16 COMMAND: "lpfc_worker_0"
#0 [ff2c9f725c39f9e0] machine_kexec at ffffffff8e0686d3
...snip...
#8 [ff2c9f725c39fcc0] __lpfc_sli_release_iocbq_s4 at ffffffffc0f2f425 [lpfc]
...snip...
crash> dis -rl ffffffffc0f60f82
/usr/src/debug/kernel-4.18.0-425.13.1.el8_7/linux-4.18.0-425.13.1.el8_7.x86_64/drivers/scsi/lpfc/lpfc_hbadisc.c: 6756
0xffffffffc0f60eb0 <lpfc_nlp_get>: nopl 0x0(%rax,%rax,1) [FTRACE NOP]
/usr/src/debug/kernel-4.18.0-425.13.1.el8_7/linux-4.18.0-425.13.1.el8_7.x86_64/drivers/scsi/lpfc/lpfc_hbadisc.c: 6759
0xffffffffc0f60eb5 <lpfc_nlp_get+5>: push %rbp
The root cause is that, after a kernel module has been loaded by the mod command, its symtable is not expanded on the gdb side. The crash bt or dis command triggers such an expansion. However, the symtable expansion differs between the two commands.
The stack trace of "dis -rl" for symtable expanding:
#0 0x00000000008d8d9f in add_compunit_symtab_to_objfile ...
#1 0x00000000006d3293 in buildsym_compunit::end_symtab_with_blockvector ...
#2 0x00000000006d336a in buildsym_compunit::end_symtab_from_static_block ...
#3 0x000000000077e8e9 in process_full_comp_unit ...
#4 process_queue ...
#5 dw2_do_instantiate_symtab ...
#6 0x000000000077ed67 in dw2_instantiate_symtab ...
#7 0x000000000077f75e in dw2_expand_all_symtabs ...
#8 0x00000000008f254d in gdb_get_line_number ...
#9 0x00000000008f22af in gdb_command_funnel_1 ...
#10 0x00000000008f2003 in gdb_command_funnel ...
#11 0x00000000005b7f02 in gdb_interface ...
#12 0x00000000005f5bd8 in get_line_number ...
#13 0x000000000059e574 in cmd_dis ...
The stack trace of "bt" for symtable expanding:
#0 0x00000000008d8d9f in add_compunit_symtab_to_objfile ...
#1 0x00000000006d3293 in buildsym_compunit::end_symtab_with_blockvector ...
#2 0x00000000006d336a in buildsym_compunit::end_symtab_from_static_block ...
#3 0x000000000077e8e9 in process_full_comp_unit ...
#4 process_queue ...
#5 dw2_do_instantiate_symtab ...
#6 0x000000000077ed67 in dw2_instantiate_symtab ...
#7 0x000000000077f8ed in dw2_lookup_symbol ...
#8 0x00000000008e6d03 in lookup_symbol_via_quick_fns ...
#9 0x00000000008e7153 in lookup_symbol_in_objfile ...
#10 0x00000000008e73c6 in lookup_symbol_global_or_static_iterator_cb ...
#11 0x00000000008b99c4 in svr4_iterate_over_objfiles_in_search_order ...
#12 0x00000000008e754e in lookup_global_or_static_symbol ...
#13 0x00000000008e75da in lookup_static_symbol ...
#14 0x00000000008e632c in lookup_symbol_aux ...
#15 0x00000000008e5a7a in lookup_symbol_in_language ...
#16 0x00000000008e5b30 in lookup_symbol ...
#17 0x00000000008f2a4a in gdb_get_datatype ...
#18 0x00000000008f22c0 in gdb_command_funnel_1 ...
#19 0x00000000008f2003 in gdb_command_funnel ...
#20 0x00000000005b7f02 in gdb_interface ...
#21 0x00000000005f8a9f in datatype_info ...
#22 0x0000000000599947 in cpu_map_size ...
#23 0x00000000005a975d in get_cpus_online ...
#24 0x0000000000637a8b in diskdump_get_prstatus_percpu ...
#25 0x000000000062f0e4 in get_netdump_regs_x86_64 ...
#26 0x000000000059fe68 in back_trace ...
#27 0x00000000005ab1cb in cmd_bt ...
In the stack trace of "dis -rl", dw2_expand_all_symtabs() is called to expand all symtables of the objfile, "*.ko.debug" in our case. In the stack trace of "bt", however, not everything is expanded, only the subset of symtables that is enough for dw2_lookup_symbol() to find a symbol. As a result, objfile->compunit_symtabs, which is the head of a singly linked list of struct compunit_symtab, is not NULL but does not contain all symtables. It is not reinitialized in gdb_get_line_number() by "dis -rl" because the !objfile_has_full_symbols(objfile) check fails, so the proper code line number data cannot be displayed. Since the objfile_has_full_symbols(objfile) check cannot ensure that all symbols have been expanded, this patch adds a new member to struct objfile as a flag that records whether all symbols have been expanded. The flag is set only after expand_all_symtabs has been called.
Signed-off-by: Tao Liu <[email protected]>

This patch introduces per-cpu IRQ stacks for RISCV64 so that "bt" can backtrace through them and 'bt -E' can search them for exception frames; the 'help -m' command displays the address of each per-cpu IRQ stack.
TEST: a vmcore dumped by hacking handle_irq_event_percpu(). (Why not use lkdtm INT_HW_IRQ_EN EXCEPTION? There is a deadlock[1] in the crash_kexec path when it is used.)
crash> bt
PID: 0 TASK: ffffffff8140db00 CPU: 0 COMMAND: "swapper/0"
#0 [ff20000000003e60] __handle_irq_event_percpu at ffffffff8006462e
#1 [ff20000000003ed0] handle_irq_event_percpu at ffffffff80064702
#2 [ff20000000003ef0] handle_irq_event at ffffffff8006477c
#3 [ff20000000003f20] handle_fasteoi_irq at ffffffff80068664
#4 [ff20000000003f50] generic_handle_domain_irq at ffffffff80063988
#5 [ff20000000003f60] plic_handle_irq at ffffffff8046633e
#6 [ff20000000003fb0] generic_handle_domain_irq at ffffffff80063988
#7 [ff20000000003fc0] riscv_intc_irq at ffffffff80465f8e
#8 [ff20000000003fd0] handle_riscv_irq at ffffffff808361e8
PC: ffffffff80837314 [default_idle_call+50]
RA: ffffffff80837310 [default_idle_call+46]
SP: ffffffff81403da0 CAUSE: 8000000000000009
epc : ffffffff80837314 ra : ffffffff80837310 sp : ffffffff81403da0
gp : ffffffff814ef848 tp : ffffffff8140db00 t0 : ff2000000004bb18
t1 : 0000000000032c73 t2 : ffffffff81200a48 s0 : ffffffff81403db0
s1 : 0000000000000000 a0 : 0000000000000004 a1 : 0000000000000000
a2 : ff6000009f1e7000 a3 : 0000000000002304 a4 : ffffffff80c1c2d8
a5 : 0000000000000000 a6 : ff6000001fe01958 a7 : 00002496ea89dbf1
s2 : ffffffff814f0220 s3 : 0000000000000001 s4 : 000000000000003f
s5 : ffffffff814f03d8 s6 : 0000000000000000 s7 : ffffffff814f00d0
s8 : ffffffff81526f10 s9 : ffffffff80c1d880 s10: 0000000000000000
s11: 0000000000000001 t3 : 0000000000003392 t4 : 0000000000000000
t5 : 0000000000000000 t6 : 0000000000000040
status: 0000000200000120 badaddr: 0000000000000000 cause: 8000000000000009 orig_a0: ffffffff80837310
--- <IRQ stack> ---
#9 [ffffffff81403da0] default_idle_call at ffffffff80837314
#10 [ffffffff81403db0] do_idle at ffffffff8004d0a0
#11 [ffffffff81403e40] cpu_startup_entry at ffffffff8004d21e
#12 [ffffffff81403e60] kernel_init at ffffffff8083746a
#13 [ffffffff81403e70] arch_post_acpi_subsys_init at ffffffff80a006d8
#14 [ffffffff81403e80] console_on_rootfs at ffffffff80a00c92
crash>
crash> bt -E
CPU 0 IRQ STACK:
KERNEL-MODE EXCEPTION FRAME AT: ff20000000003a48
PC: ffffffff8006462e [__handle_irq_event_percpu+30]
RA: ffffffff80064702 [handle_irq_event_percpu+18]
SP: ff20000000003e60 CAUSE: 000000000000000d
epc : ffffffff8006462e ra : ffffffff80064702 sp : ff20000000003e60
gp : ffffffff814ef848 tp : ffffffff8140db00 t0 : 0000000000046600
t1 : ffffffff80836464 t2 : ffffffff81200a48 s0 : ff20000000003ed0
s1 : 0000000000000000 a0 : 0000000000000000 a1 : 0000000000000118
a2 : 0000000000000052 a3 : 0000000000000000 a4 : 0000000000000000
a5 : 0000000000010001 a6 : ff6000001fe01958 a7 : 00002496ea89dbf1
s2 : ff60000000941ab0 s3 : ffffffff814a0658 s4 : ff60000000089230
s5 : ffffffff814a0518 s6 : ffffffff814a0620 s7 : ffffffff80e5f0f8
s8 : ffffffff80fc50b0 s9 : ffffffff80c1d880 s10: 0000000000000000
s11: 0000000000000001 t3 : 0000000000003392 t4 : 0000000000000000
t5 : 0000000000000000 t6 : 0000000000000040
status: 0000000200000100 badaddr: 0000000000000078 cause: 000000000000000d orig_a0: ff20000000003ea0
CPU 1 IRQ STACK: (none found)
crash>
crash> help -m
<snip>
machspec: ced1e0
irq_stack_size: 16384
irq_stacks[0]: ff20000000000000
irq_stacks[1]: ff20000000008000
crash>
[1]: https://lore.kernel.org/linux-riscv/[email protected]/
Signed-off-by: Song Shuai <[email protected]>
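Deciding whether a stack pointer belongs to a given per-cpu IRQ stack is a range check against the base address and size that 'help -m' prints. The following is a minimal sketch under that assumption; the function name is hypothetical and is not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if sp lies in [base, base + size).
 * Written as a subtraction so that base + size cannot overflow. */
static inline bool on_irq_stack(uint64_t sp, uint64_t base, uint64_t size)
{
    return sp >= base && sp - base < size;
}
```

With irq_stacks[0] at ff20000000000000 and irq_stack_size 16384, frame #0 above (ff20000000003e60) falls inside the CPU 0 IRQ stack, while frame #9 (ffffffff81403da0) does not, which is where the backtrace switches to the task stack.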

…ss range Previously, to find a module symbol and its offset for an arbitrary address, all symbols within the module were iterated in ascending address order until the last symbol with a smaller address was found. However, if the address is not within the module's address range, e.g. it is higher than the address of the module's last symbol, the module can safely be skipped, because iterating its symbols is unnecessary. This speeds up kernel module symbol lookup and improves overall performance.
Without the patch:
$ time echo "bt 8993" | ~/crash-dev/crash vmcore vmlinux
crash> bt 8993
PID: 8993 TASK: ffff927569cc2100 CPU: 2 COMMAND: "WriterPool0"
#0 [ffff927569cd76f0] __schedule at ffffffffb3db78d8
#1 [ffff927569cd7758] schedule_preempt_disabled at ffffffffb3db8bf9
#2 [ffff927569cd7768] __mutex_lock_slowpath at ffffffffb3db6ca7
#3 [ffff927569cd77c0] mutex_lock at ffffffffb3db602f
#4 [ffff927569cd77d8] ucache_retrieve at ffffffffc0cf4409 [secfs2]
...snip the stacktrace of the same module...
#11 [ffff927569cd7ba0] cskal_path_vfs_getattr_nosec at ffffffffc05cae76 [falcon_kal]
...snip...
#13 [ffff927569cd7c40] _ZdlPv at ffffffffc086e751 [falcon_lsm_serviceable]
...snip...
#20 [ffff927569cd7ef8] unload_network_ops_symbols at ffffffffc06f11c0 [falcon_lsm_pinned_14713]
#21 [ffff927569cd7f50] system_call_fastpath at ffffffffb3dc539a
RIP: 00007f2b28ed4023 RSP: 00007f2a45fe7f80 RFLAGS: 00000206
RAX: 0000000000000012 RBX: 00007f2a68302e00 RCX: 00007f2a682546d8
RDX: 0000000000000826 RSI: 00007eb57ea6a000 RDI: 00000000000000e3
RBP: 00007eb57ea6a000 R8: 0000000000000826 R9: 00000002670bdfd2
R10: 00000002670bdfd2 R11: 0000000000000293 R12: 00000002670bdfd2
R13: 00007f29d501a480 R14: 0000000000000826 R15: 00000002670bdfd2
ORIG_RAX: 0000000000000012 CS: 0033 SS: 002b
crash>
real 7m14.826s
user 7m12.502s
sys 0m1.091s
With the patch:
$ time echo "bt 8993" | ~/crash-dev/crash vmcore vmlinux
crash> bt 8993
PID: 8993 TASK: ffff927569cc2100 CPU: 2 COMMAND: "WriterPool0"
#0 [ffff927569cd76f0] __schedule at ffffffffb3db78d8
#1 [ffff927569cd7758] schedule_preempt_disabled at ffffffffb3db8bf9
...snip the same output...
crash>
real 0m8.827s
user 0m7.896s
sys 0m0.938s
Signed-off-by: Tao Liu <[email protected]>
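The early-skip idea can be sketched as a range test performed before any per-symbol iteration. This is an illustration only: the structure and function names are hypothetical, and it assumes the range is described by a module base and size, whereas the patch may derive it from the module's first and last symbols.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified module descriptor. */
struct mod_range {
    const char *name;
    uint64_t base;   /* lowest address covered by the module */
    uint64_t size;   /* number of bytes covered by the module */
};

/* Return the module whose range contains addr, or NULL if there is none.
 * Modules that cannot contain addr are skipped without touching their symbols. */
static inline const struct mod_range *
find_module_for_addr(const struct mod_range *mods, size_t count, uint64_t addr)
{
    for (size_t i = 0; i < count; i++) {
        if (addr < mods[i].base || addr - mods[i].base >= mods[i].size)
            continue;   /* out of range: no symbol iteration needed */
        return &mods[i];
    }
    return NULL;
}
```

Only the module returned here would then need its symbols walked, which is the source of the speedup when many large modules are loaded.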

k-hagio and others added 23 commits December 16, 2022 13:24

RISCV64: Add 'mach' command support

3f47149

RISCV64: Add the implementation of symbol verify

0d5ad12

Add do_maple_tree() for maple tree operations

222176a

Update the help text of "tree" command for maple tree

49f6c20

Dump maple tree offset variables by "help -o"

46344aa

fengjixuchui merged commit 710d80b into fengjixuchui:master Jan 12, 2023
