Extend field length of task attributes #1

fengjixuchui · 2022-07-03T08:23:24Z

No description provided.

read the per-cpu NT_PRSTATUS note contents if an invalid note is encountered. Without the patch, if an invalid note is found, all other notes were ignored, and subsequent "bt" attempts on the active tasks would fail. ([email protected], [email protected])

to read the per-cpu NT_PRSTATUS note contents if an invalid note is encountered. Without the patch, if an invalid note is found, all other notes were ignored, and subsequent "bt" attempts on the active tasks would fail. ([email protected], [email protected])

32-bit unsigned int, but crash was reading its 32-bit value into a 64-bit unsigned long stack variable. All extra bits that pre-existed in the upper 32 bits of the stack variable were passed along as part of a buffer size request; if the upper 32 bits were non-zero, then the command would fail with a dump of the internal buffer allocation stats followed by the message "log: cannot allocate any more memory!". ([email protected])
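The failure mode described in this entry can be illustrated with a minimal sketch. The function names and the `stale` parameter are hypothetical and are not taken from crash; `stale` only models whatever happened to be in the stack slot before the read. The point is that copying 4 bytes into an 8-byte variable leaves the other 4 bytes untouched.

```c
#include <stdint.h>
#include <string.h>

/* Buggy pattern (hypothetical): a 32-bit field is copied into a 64-bit
 * variable, so half of the variable keeps its previous contents.
 * "stale" stands in for the leftover stack contents. */
static uint64_t read_len_buggy(const void *src, uint64_t stale)
{
    uint64_t len = stale;
    memcpy(&len, src, sizeof(uint32_t));   /* only 4 of 8 bytes written */
    return len;
}

/* Fixed pattern: read into a correctly sized variable, then widen. */
static uint64_t read_len_fixed(const void *src)
{
    uint32_t len32;
    memcpy(&len32, src, sizeof(len32));
    return (uint64_t)len32;
}
```

With a stale value whose bits are all set, `read_len_buggy()` returns something far larger than the real field, which is the kind of value that would then be passed on as a buffer size request.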

the new TCR_EL1.T1SZ vmcoreinfo entry, display its value during session initialization only when invoking crash with "-d1" or larger -d debug value. ([email protected])

that contain commit b6e43c0e3129ffe87e65c85f20fcbdf0eb86fba0, titled "arm64: remove __exception annotations". Without the patch, the ARM64 crash session fails during initialization with the error message "crash: cannot resolve __exception_text_start". ([email protected])

Without the patch, the crash session fails during initialization with the error message "crash: vmlinux and vmcore do not match!". ([email protected])

were taken from S390X KASLR kernels. ([email protected])

and LKCD dumpfiles that were taken from S390X KASLR kernels to avoid calling an s390x-specific function from generic code. ([email protected])

the crash library fails with a stream of error messages indicating "multiple definition of 'diskdump_flags'" ([email protected])

build of the embedded gdb module fails with an error message that indicates "multiple definition of 'tdesc_aarch64'". ([email protected])

may be truncated, ending with the error message "log: invalid log_buf entry encountered". ([email protected])

the virtual memory region between the end of the vmalloc region and the beginning of the vmemmap region. Without the patch, reads of virtual addresses within that region are not recognized properly and will fail. ([email protected])

shared object extension modules that are located in the directories that are part of the normal search path that is used when a shared object is loaded without a fully-qualified pathname. ([email protected])

contain commit 3539b96e041c06e4317082816d90ec09160aeb11, titled "bpf: group memory related fields in struct bpf_map_memory". Without the patch, the option prints "(unknown)" for MEMLOCK and UID. ([email protected])

name string. ([email protected])

memory located at extraordinarily high addresses. In a system with a physical address range from 0x602770ecf000 to 0x6027ffffffff, the crash utility fails during session initialization due to an integer overflow, ending with the error message "crash: vmlinux and vmcore do not match!". ([email protected])

display of a single data structure member. Without the patch, the option only supported the raw display of a complete data structure. ([email protected])

the minimum display size from the size of a per-architecture long (32-bits or 64-bits) down to 8-bits, 16-bits or 32-bits when the requested size is equal to one of the smaller sizes. ([email protected])

line option for Linux 5.4 and later dumpfiles, which require the kernel's dynamically-determined "vabits_actual" value for virtual address translation. Without the patch, the crash session fails during initialization with the error message "crash: cannot determine VA_BITS_ACTUAL". This option will become unnecessary when the proposed TCR_EL1.T1SZ vmcoreinfo entry is incorporated into the kernel. ([email protected])

with CONFIG_SLAB_FREELIST_HARDENED enabled. Without the patch, there will be error messages of the type "kmem: <cache name> slab: <address> invalid freepointer: <obfuscated address>" for caches created during SLUB bootstrap, as they are likely to have s->random == 0. ([email protected])

swapped to the zswap compressed swap cache, an attempt will be made to find and decompress the page. ([email protected])

system. Without the patch, if the [pid|task] has been created since the last internal task table refresh, the command fails with the error message "mount: invalid task or pid value: <value>". ([email protected])

timestamp value of each message into human readable format. ([email protected])

appended with an ".llvm.<number>" string. As a result, commands such as "irq" fail with the error message "irq: neither irq_desc, _irq_desc, irq_desc_ptrs or irq_desc_tree symbols exist". This patch adds the LLVM-generated string to the other strings that are stripped from symbols before they are stored. ([email protected])
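As a rough illustration of the stripping step, the sketch below truncates a symbol name at a trailing ".llvm.<number>" suffix. The helper name `strip_llvm_suffix` is hypothetical and this is not crash's actual symbol-handling code; it only shows the idea under the assumption that the suffix is ".llvm." followed by decimal digits up to the end of the name.

```c
#include <ctype.h>
#include <string.h>

/* Truncate "name" in place at ".llvm." when everything after it is
 * one or more decimal digits. Returns 1 if the suffix was removed,
 * 0 if the name was left unchanged. (Hypothetical helper.) */
static int strip_llvm_suffix(char *name)
{
    char *p = strstr(name, ".llvm.");
    const char *q;

    if (p == NULL)
        return 0;
    q = p + strlen(".llvm.");
    if (*q == '\0')
        return 0;                     /* no number after ".llvm." */
    for (; *q != '\0'; q++)
        if (!isdigit((unsigned char)*q))
            return 0;                 /* not purely numeric */
    *p = '\0';
    return 1;
}
```

After stripping, a lookup for a plain name such as "irq_desc" can succeed even when the compiler emitted the symbol with the extra suffix.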

as in-kernel feature. The value of CONFIG_ARM64_KERNELPACMASK will be exported as a vmcoreinfo entry, and will be used with text return addresses on the kernel stack. ([email protected])

(1) Linux kernel patch "arm64: mm: Introduce vabits_actual" introduced "physvirt_offset", which is not equal to (PHYS_OFFSET - PAGE_OFFSET) when KASLR is enabled. physvirt_offset is calculated in arch/arm64/mm/init.c before memstart_addr (PHYS_OFFSET) is randomized. Let arm64_VTOP() and arm64_PTOV() use physvirt_offset instead, whose default value is set to (phys_offset - page_offset).
(2) For ARM64 RAM dumps without any vmcoreinfo and with KASLR passed as an argument, "_stext_vmlinux" is not set. This causes incorrect calculation of vmalloc_start with VA_BITS_ACTUAL.
(3) For ARM64 RAM dumps without vmcoreinfo, get CONFIG_ARM64_VA_BITS from the in-kernel config. Without this, the vmemmap size is calculated incorrectly.
(4) Fix the vmemmap_start to match what the kernel uses. ([email protected])
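Item (1) can be sketched as follows. This is a simplified model, not crash's arm64_VTOP() or arm64_PTOV(): the struct and function names are hypothetical, only linear-map addresses are considered, and the direction of the arithmetic (physical = virtual + physvirt_offset) is an assumption based on the default value given above.

```c
#include <stdint.h>

/* Hypothetical container for the three values named in the text. */
struct arm64_offsets {
    uint64_t phys_offset;
    uint64_t page_offset;
    uint64_t physvirt_offset;
};

/* Default when the dump does not provide physvirt_offset. */
static void set_default_physvirt(struct arm64_offsets *o)
{
    o->physvirt_offset = o->phys_offset - o->page_offset;
}

/* Linear-map translations; unsigned arithmetic wraps modulo 2^64. */
static uint64_t vtop_linear(const struct arm64_offsets *o, uint64_t vaddr)
{
    return vaddr + o->physvirt_offset;
}

static uint64_t ptov_linear(const struct arm64_offsets *o, uint64_t paddr)
{
    return paddr - o->physvirt_offset;
}
```

Keeping the offset as a separate field means a value read from the dump can replace the default without touching the translation functions.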

([email protected])

Currently gdb's "ptype" command does not print the details of unnamed structure and union deeper than second level in a structure, it prints only "{...}" instead. And crash's "struct" and similar commands also inherit this behavior, so we cannot get the full information of them. To print the details of them, change the show variable when it is an unnamed one like crash-7.x. Without the patch: crash> struct -o page struct page { [0] unsigned long flags; union { struct {...}; struct {...}; ... With the patch: crash> struct -o page struct page { [0] unsigned long flags; union { struct { [8] struct list_head lru; [24] struct address_space *mapping; [32] unsigned long index; [40] unsigned long private; }; struct { [8] dma_addr_t dma_addr; }; ... Signed-off-by: Kazuhito Hagio <[email protected]>

Since Linux 5.16-rc1, which kernel commit 9a14d6ce4135 ("block: remove debugfs blk_mq_ctx dispatched/merged/completed attributes") removed the members from struct blk_mq_ctx, crash has not displayed disk I/O statistics for multiqueue (blk-mq) devices. Let's parse the sbitmap in blk-mq layer to support it. Signed-off-by: Lianbo Jiang <[email protected]> Signed-off-by: Kazuhito Hagio <[email protected]>

Kernel commit 4e5cc99e1e48 ("blk-mq: manage hctx map via xarray") removed the "queue_hw_ctx" member from struct request_queue at Linux v5.18-rc1, and replaced it with a struct xarray "hctx_table". Without the patch, the "dev -d|-D" options will print an error: crash> dev -d MAJOR GENDISK NAME REQUEST_QUEUE TOTAL READ WRITE dev: invalid structure member offset: request_queue_queue_hw_ctx With the patch: crash> dev -d MAJOR GENDISK NAME REQUEST_QUEUE TOTAL READ WRITE 8 ffff8e99d0a1ae00 sda ffff8e9c14c59980 10 6 4 Signed-off-by: Lianbo Jiang <[email protected]>

The information of the "bpf" and "sbitmapq" commands is missing in the man page of the crash utility. Let's add it to the man page. Signed-off-by: Lianbo Jiang <[email protected]>

The sbitmap_queue.ws_active member was added by kernel commit 5d2ee7122c73 ("sbitmap: optimize wakeup check") at Linux 5.0. Without the patch, on earlier kernels the "sbitmapq" command fails with the following error: crash> sbitmapq ffff8f1a3611cf10 sbitmapq: invalid structure member offset: sbitmap_queue_ws_active FILE: sbitmap.c LINE: 393 FUNCTION: sbitmap_queue_context_load() Signed-off-by: Kazuhito Hagio <[email protected]>

The sbitmap_word.cleared member was added by kernel commit ea86ea2cdced ("sbitmap: ammortize cost of clearing bits") at Linux 5.0. Without the patch, on earlier kernels the "sbitmapq" command fails with the following error: crash> sbitmapq ffff8f1a3611cf10 sbitmapq: invalid structure member offset: sbitmap_word_cleared FILE: sbitmap.c LINE: 92 FUNCTION: __sbitmap_weight() Signed-off-by: Kazuhito Hagio <[email protected]>
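A minimal sketch of how a bit count can tolerate the missing member is shown below. It is not the code in sbitmap.c: the function names are hypothetical, and the rule that a bit is busy when it is set in the word and not marked in "cleared" is an assumption made for illustration. Passing NULL for `cleared` models kernels whose sbitmap_word has no such member.

```c
#include <stddef.h>

/* Count set bits in one word without compiler builtins. */
static unsigned int popcount_ul(unsigned long w)
{
    unsigned int n = 0;

    while (w) {
        w &= w - 1;     /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Sum the busy bits over nr words. "cleared" may be NULL, in which
 * case every set bit counts as busy. (Hypothetical helper.) */
static unsigned int sbitmap_busy(const unsigned long *word,
                                 const unsigned long *cleared, size_t nr)
{
    unsigned int busy = 0;
    size_t i;

    for (i = 0; i < nr; i++) {
        unsigned long c = cleared ? cleared[i] : 0;
        busy += popcount_ul(word[i] & ~c);
    }
    return busy;
}
```

The same pattern applies to the other optional members mentioned in the neighboring commits: check whether the member exists before using its offset, and fall back to a neutral value when it does not.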

The sbitmap_queue.min_shallow_depth member was added by kernel commit a327553965de ("sbitmap: fix missed wakeups caused by sbitmap_queue_get_shallow()") at Linux 4.18. Without the patch, on earlier kernels the "sbitmapq" command fails with the following error: crash> sbitmapq ffff89bb7638ee50 sbitmapq: invalid structure member offset: sbitmap_queue_min_shallow_depth FILE: sbitmap.c LINE: 398 FUNCTION: sbitmap_queue_context_load() Signed-off-by: Kazuhito Hagio <[email protected]>

There have been a few reports that the "dev -d|-D" options displayed incorrect I/O stats due to racy blk_mq_ctx.rq_* counters. To fix it, make the options parse sbitmap to count I/O stats on Linux 4.18 and later kernels, which include RHEL8 ones. To do this, adjust to the blk_mq_tags structure of Linux 5.10 through 5.15 kernels, which contain kernel commit 222a5ae03cdd ("blk-mq: Use pointers for blk_mq_tags bitmap tags") and do not contain ae0f1a732f4a ("blk-mq: Stop using pointers for blk_mq_tags bitmap tags"). Signed-off-by: Kazuhito Hagio <[email protected]>

The current struct wait_queue_head was renamed by kernel commit 9d9d676f595b ("sched/wait: Standardize internal naming of wait-queue heads") at Linux 4.13. Without the patch, on earlier kernels the "sbitmapq" command fails with the following error: crash> sbitmapq ffff8801790b3b50 depth = 128 busy = 0 bits_per_word = 32 ... sbitmapq: invalid structure member offset: wait_queue_head_head FILE: sbitmap.c LINE: 344 FUNCTION: sbitmap_queue_show() Signed-off-by: Kazuhito Hagio <[email protected]>

commit 364b2e4 ("sbitmapq: remove struct and member validation in sbitmapq_init()") allowed the use of the "sbitmapq" command unconditionally. Without the patch, the command fails with the following error on kernels without sbitmap: crash> sbitmapq ffff88015796e550 sbitmapq: invalid structure member offset: sbitmap_queue_sb FILE: sbitmap.c LINE: 385 FUNCTION: sbitmap_queue_context_load() Now the command supports Linux 4.9 and later kernels since it was abstracted out, so it can be limited by the non-existence of the sbitmap structure. Signed-off-by: Kazuhito Hagio <[email protected]>

The following kernel commits eventually removed the bdev_map array in Linux v5.11 kernel: e418de3abcda ("block: switch gendisk lookup to a simple xarray") 22ae8ce8b892 ("block: simplify bdev/disk lookup in blkdev_get") Without the patch, the "dev" command fails to dump block device data with the following error: crash> dev ... dev: blkdevs or all_bdevs: symbols do not exist To get block device's gendisk, search blockdev_superblock.s_inodes instead of bdev_map. Signed-off-by: Kazuhito Hagio <[email protected]>

Nowadays, some machines have many CPU cores and memory, and some distributions have a larger kernel.pid_max parameter, e.g. 7 digits. This impairs the readability of a few commands, especially "ps" and "ps -l|-m" options. Let's extend the field length of the task attributes, PID, CPU, VSZ, and RSS to improve the readability. Without the patch: crash> ps PID PPID CPU TASK ST %MEM VSZ RSS COMM ... 2802197 2699997 2 ffff916f63c40000 IN 0.0 307212 10688 timer 2802277 1 0 ffff9161a25bb080 IN 0.0 169040 2744 gpg-agent 2806711 3167854 10 ffff9167fc498000 IN 0.0 127208 6508 su 2806719 2806711 1 ffff91633c3a48c0 IN 0.0 29452 6416 bash 2988346 1 5 ffff916f7c629840 IN 2.8 9342476 1917384 qemu-kvm With the patch: crash> ps PID PPID CPU TASK ST %MEM VSZ RSS COMM ... 2802197 2699997 2 ffff916f63c40000 IN 0.0 307212 10688 timer 2802277 1 0 ffff9161a25bb080 IN 0.0 169040 2744 gpg-agent 2806711 3167854 10 ffff9167fc498000 IN 0.0 127208 6508 su 2806719 2806711 1 ffff91633c3a48c0 IN 0.0 29452 6416 bash 2988346 1 5 ffff916f7c629840 IN 2.8 9342476 1917384 qemu-kvm Signed-off-by: Kazuhito Hagio <[email protected]>

The previous implementation located the call instruction by running strstr for "call" and then checking whether the preceding character is ' ' or '\t'. This is problematic. For example, it cannot resolve the following disassembly string:

  "0xffffffffc0995378 <nfs41_callback_svc+344>:\tcall 0xffffffff8ecfa4c0 <schedule>\n"

strstr locates the "call" inside "nfs41_callback_svc", where the preceding character is '_', so the character check fails. As a result, extract_hex fails to get the calling address.

NOTE: the issue is more likely to be reproduced when patch[1] is applied. Without patch[1], the disassembly string is as follows, so the issue is not reproducible:

  "0xffffffffc0995378:\tcall 0xffffffff8ecfa4c0 <schedule>\n"

Before the patch:

  crash> bt 1472
  PID: 1472 TASK: ffff8c121fa72f70 CPU: 18 COMMAND: "nfsv4.1-svc"
  #0 [ffff8c16231a3db8] __schedule at ffffffff8ecf9ef3
  #1 [ffff8c16231a3e40] schedule at ffffffff8ecfa4e9

After the patch:

  crash> bt 1472
  PID: 1472 TASK: ffff8c121fa72f70 CPU: 18 COMMAND: "nfsv4.1-svc"
  #0 [ffff8c16231a3db8] __schedule at ffffffff8ecf9ef3
  #1 [ffff8c16231a3e40] schedule at ffffffff8ecfa4e9
  #2 [ffff8c16231a3e50] nfs41_callback_svc at ffffffffc099537d [nfsv4]
  #3 [ffff8c16231a3ec8] kthread at ffffffff8e6b966f
  #4 [ffff8c16231a3f50] ret_from_fork at ffffffff8ed07898

This patch fixes the issue by using strstr with "\tcall" and " call" to locate the correct call instruction.

[1]: https://listman.redhat.com/archives/crash-utility/2022-August/010085.html

Signed-off-by: Tao Liu <[email protected]>
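The difference between the two search strategies can be reproduced with a small sketch. The function names `find_call_old` and `find_call_new` are hypothetical and the code is a simplified model of the behavior described in the commit message, not the patch itself.

```c
#include <stddef.h>
#include <string.h>

/* Old approach: take the first "call" and require a blank before it.
 * Fails when "call" first appears inside a symbol name. */
static const char *find_call_old(const char *line)
{
    const char *p = strstr(line, "call");

    if (p != NULL && p > line && (p[-1] == ' ' || p[-1] == '\t'))
        return p;
    return NULL;
}

/* New approach: search for the mnemonic together with its leading
 * blank, trying "\tcall" first and " call" second. Returns a pointer
 * to the 'c' of "call", or NULL if neither pattern is present. */
static const char *find_call_new(const char *line)
{
    const char *p = strstr(line, "\tcall");

    if (p == NULL)
        p = strstr(line, " call");
    return p != NULL ? p + 1 : NULL;
}
```

For the disassembly string quoted in the commit message, `find_call_old()` returns NULL because the first match sits inside "nfs41_callback_svc", while `find_call_new()` returns a pointer to the real instruction, from which the target address can then be parsed.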

1, Add the implementation to get stack frame from active & inactive task's stack. 2, Add 'bt -l' command support get a line number associated with a current pc address. 3, Add 'bt -f' command support to display all stack data contained in a frame With the patch, we can get the backtrace, crash> bt PID: 113 TASK: ff6000000226c200 CPU: 0 COMMAND: "sh" #0 [ff20000010333b90] riscv_crash_save_regs at ffffffff800078f8 #1 [ff20000010333cf0] panic at ffffffff806578c6 #2 [ff20000010333d50] sysrq_reset_seq_param_set at ffffffff8038c03c #3 [ff20000010333da0] __handle_sysrq at ffffffff8038c604 #4 [ff20000010333e00] write_sysrq_trigger at ffffffff8038cae4 #5 [ff20000010333e20] proc_reg_write at ffffffff801b7ee8 #6 [ff20000010333e40] vfs_write at ffffffff80152bb2 #7 [ff20000010333e80] ksys_write at ffffffff80152eda #8 [ff20000010333ed0] sys_write at ffffffff80152f52 crash> bt -l PID: 113 TASK: ff6000000226c200 CPU: 0 COMMAND: "sh" #0 [ff20000010333b90] riscv_crash_save_regs at ffffffff800078f8 /buildroot/qemu_riscv64_virt_defconfig/build/linux-custom/arch/riscv/kernel/crash_save_regs.S: 47 #1 [ff20000010333cf0] panic at ffffffff806578c6 /buildroot/qemu_riscv64_virt_defconfig/build/linux-custom/kernel/panic.c: 276 ... ... crash> bt -f PID: 113 TASK: ff6000000226c200 CPU: 0 COMMAND: "sh" #0 [ff20000010333b90] riscv_crash_save_regs at ffffffff800078f8 [PC: ffffffff800078f8 RA: ffffffff806578c6 SP: ff20000010333b90 SIZE: 352] ff20000010333b90: ff20000010333bb0 ffffffff800078f8 ff20000010333ba0: ffffffff8008862c ff20000010333b90 ff20000010333bb0: ffffffff810dde38 ff6000000226c200 ff20000010333bc0: ffffffff8032be68 0720072007200720 ... ... Signed-off-by: Xianting Tian <[email protected]>

Currently, the "bt" command may print a bogus exception frame and the remaining frame will be truncated on x86_64 when using the "virsh send-key <kvm guest> KEY_LEFTALT KEY_SYSRQ KEY_C" command to trigger a panic from the KVM host. For example: crash> bt PID: 0 TASK: ffff9e7a47e32f00 CPU: 3 COMMAND: "swapper/3" #0 [ffffba7900118bb8] machine_kexec at ffffffff87e5c2c7 #1 [ffffba7900118c08] __crash_kexec at ffffffff87f9500d #2 [ffffba7900118cd0] panic at ffffffff87edfff9 #3 [ffffba7900118d50] sysrq_handle_crash at ffffffff883ce2c1 ... #16 [ffffba7900118fd8] handle_edge_irq at ffffffff87f559f2 #17 [ffffba7900118ff0] asm_call_on_stack at ffffffff88800fa2 --- <IRQ stack> --- #18 [ffffba790008bda0] asm_call_on_stack at ffffffff88800fa2 RIP: ffffffffffffffff RSP: 0000000000000124 RFLAGS: 00000003 RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000 RDX: ffffffff88800c1e RSI: 0000000000000000 RDI: 0000000000000000 RBP: 0000000000000001 R8: 0000000000000000 R9: 0000000000000000 R10: 0000000000000000 R11: ffffffff88760555 R12: ffffba790008be08 R13: ffffffff87f18002 R14: ffff9e7a47e32f00 R15: ffff9e7bb6198e00 ORIG_RAX: 0000000000000000 CS: 0003 SS: 0000 bt: WARNING: possibly bogus exception frame crash> The following related kernel commits cause the current issue, crash needs to adjust the value of irq_eframe_link. Related kernel commits: [1] v5.8: 931b94145981 ("x86/entry: Provide helpers for executing on the irqstack") [2] v5.8: fa5e5c409213 ("x86/entry: Use idtentry for interrupts") [3] v5.12: 52d743f3b712 ("x86/softirq: Remove indirection in do_softirq_own_stack()") Signed-off-by: Lianbo Jiang <[email protected]> Signed-off-by: Kazuhito Hagio <[email protected]>

Kernel commit 7d65f4a65532 ("irq: Consolidate do_softirq() arch overriden implementations") renamed the call_softirq to do_softirq_own_stack, and there is no exception frame also when coming from do_softirq_own_stack. Without the patch, crash may unnecessarily output an exception frame with a warning as below: crash> foreach bt ... PID: 0 TASK: ffff914f820a8000 CPU: 25 COMMAND: "swapper/25" #0 [fffffe0000504e48] crash_nmi_callback at ffffffffa665d763 #1 [fffffe0000504e50] nmi_handle at ffffffffa662a423 #2 [fffffe0000504ea8] default_do_nmi at ffffffffa6fe7dc9 #3 [fffffe0000504ec8] do_nmi at ffffffffa662a97f #4 [fffffe0000504ef0] end_repeat_nmi at ffffffffa70015e8 [exception RIP: clone_endio+172] RIP: ffffffffc005c1ec RSP: ffffa1d403d08e98 RFLAGS: 00000246 RAX: 0000000000000000 RBX: ffff915326fba230 RCX: 0000000000000018 RDX: ffffffffc0075400 RSI: 0000000000000000 RDI: ffff915326fba230 RBP: ffff915326fba1c0 R8: 0000000000001000 R9: ffff915308d6d2a0 R10: 000000a97dfe5e10 R11: ffffa1d40038fe98 R12: ffff915302babc40 R13: ffff914f94360000 R14: 0000000000000000 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 --- <NMI exception stack> --- #5 [ffffa1d403d08e98] clone_endio at ffffffffc005c1ec [dm_mod] #6 [ffffa1d403d08ed0] blk_update_request at ffffffffa6a96954 #7 [ffffa1d403d08f10] scsi_end_request at ffffffffa6c9b968 #8 [ffffa1d403d08f48] scsi_io_completion at ffffffffa6c9bb3e #9 [ffffa1d403d08f90] blk_complete_reqs at ffffffffa6aa0e95 #10 [ffffa1d403d08fa0] __softirqentry_text_start at ffffffffa72000dc #11 [ffffa1d403d08ff0] do_softirq_own_stack at ffffffffa7000f9a --- <IRQ stack> --- #12 [ffffa1d40038fe70] do_softirq_own_stack at ffffffffa7000f9a [exception RIP: unknown or invalid address] RIP: 0000000000000000 RSP: 0000000000000000 RFLAGS: 00000000 RAX: ffffffffa672eae5 RBX: ffffffffa83b34e0 RCX: ffffffffa672eb12 RDX: 0000000000000010 RSI: 8b7d6c8869010c00 RDI: 0000000000000085 RBP: 0000000000000286 R8: ffff914f820a8000 R9: ffffffffa67a94e0 R10: 
0000000000000286 R11: ffffffffa66fb4c5 R12: ffffffffa67a898b R13: 0000000000000000 R14: fffffffffffffff8 R15: ffffffffa67a1e68 ORIG_RAX: 0000000000000000 CS: 0000 SS: ffffffffa672edff bt: WARNING: possibly bogus exception frame #13 [ffffa1d40038ff30] start_secondary at ffffffffa665fa2c #14 [ffffa1d40038ff50] secondary_startup_64_no_verify at ffffffffa6600116 ... Reported-by: Marco Patalano <[email protected]> Signed-off-by: Lianbo Jiang <[email protected]>

On kernels configured with CONFIG_RANDOMIZE_KSTACK_OFFSET=y and random_kstack_offset=on, a random offset is added to task stacks with __kstack_alloca() at the beginning of do_syscall_64() and other syscall entry functions. This eventually executes the following instruction: <do_syscall_64+32>: sub %rax,%rsp On the other hand, crash uses only part of the ORC unwinder data to unwind stacks, and if an ip value does not have usable ORC data, it calculates the frame size by parsing the assembly of the function. However, crash cannot calculate the frame size correctly with the instruction above, and prints stale return addresses like this: crash> bt 1 PID: 1 TASK: ffff9c250023b880 CPU: 0 COMMAND: "systemd" #0 [ffffb7e5c001fc80] __schedule at ffffffff91ae2b16 #1 [ffffb7e5c001fd00] schedule at ffffffff91ae2ed3 #2 [ffffb7e5c001fd18] schedule_hrtimeout_range_clock at ffffffff91ae7ed8 #3 [ffffb7e5c001fda8] ep_poll at ffffffff913ef828 #4 [ffffb7e5c001fe48] do_epoll_wait at ffffffff913ef943 #5 [ffffb7e5c001fe80] __x64_sys_epoll_wait at ffffffff913f0130 #6 [ffffb7e5c001fed0] do_syscall_64 at ffffffff91ad7169 #7 [ffffb7e5c001fef0] do_syscall_64 at ffffffff91ad7179 << #8 [ffffb7e5c001ff10] syscall_exit_to_user_mode at ffffffff91adaab2 << stale entries #9 [ffffb7e5c001ff20] do_syscall_64 at ffffffff91ad7179 << #10 [ffffb7e5c001ff50] entry_SYSCALL_64_after_hwframe at ffffffff91c0009b RIP: 00007f258d9427ae RSP: 00007fffda631d60 RFLAGS: 00000293 ... To fix this, enhance the use of ORC data. The ORC unwinder often uses the %rbp value, so keep it from exception frames and inactive task stacks. Signed-off-by: Kazuhito Hagio <[email protected]>

Kernel commit fb799447ae29 ("x86,objtool: Split UNWIND_HINT_EMPTY in two"), which is contained in Linux 6.4 and later kernels, changed the ORC_TYPE_CALL macro from 0 to 2. As a result, the "bt" command cannot use ORC entries, and can display stale entries in a call trace. crash> bt 1 PID: 1 TASK: ffff93cd06294180 CPU: 51 COMMAND: "systemd" #0 [ffffb72bc00cbc98] __schedule at ffffffff86e52aae #1 [ffffb72bc00cbd00] schedule at ffffffff86e52f6a #2 [ffffb72bc00cbd18] schedule_hrtimeout_range_clock at ffffffff86e58ef5 #3 [ffffb72bc00cbd88] ep_poll at ffffffff8669624d #4 [ffffb72bc00cbe28] do_epoll_wait at ffffffff86696371 #5 [ffffb72bc00cbe30] do_timerfd_settime at ffffffff8669902b << #6 [ffffb72bc00cbe60] __x64_sys_epoll_wait at ffffffff86696bf0 #7 [ffffb72bc00cbeb0] do_syscall_64 at ffffffff86e3feb9 #8 [ffffb72bc00cbee0] __task_pid_nr_ns at ffffffff863330d7 << #9 [ffffb72bc00cbf08] syscall_exit_to_user_mode at ffffffff86e466b2 << stale entries #10 [ffffb72bc00cbf18] do_syscall_64 at ffffffff86e3fec9 << #11 [ffffb72bc00cbf50] entry_SYSCALL_64_after_hwframe at ffffffff870000aa Also, kernel commit ffb1b4a41016 added a member to struct orc_entry. Although this does not affect crash's unwinder, its debugging information can be displayed incorrectly. To fix these, (1) introduce a "kernel_orc_entry_6_4" structure corresponding to 6.4 and an abstraction layer "orc_entry" structure in crash, (2) switch ORC_TYPE_CALL to 2 or 0 according to the kernel's orc_entry structure. Related orc_entry history: v4.14 39358a033b2e introduced struct orc_entry v4.19 d31a580266ee added orc_entry.end member v6.3 ffb1b4a41016 added orc_entry.signal member v6.4 fb799447ae29 removed end member and changed type member to 3 bits Signed-off-by: Kazuhito Hagio <[email protected]>
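Step (2) can be pictured with a short sketch. The enum and function names below are hypothetical and do not come from the patch; the only facts taken from the commit message are that ORC_TYPE_CALL is 2 on kernels with the 6.4 orc_entry layout and 0 on earlier ones.

```c
#include <stdbool.h>

/* Hypothetical tag for the orc_entry layout detected in the kernel. */
enum orc_layout {
    ORC_LAYOUT_PRE_6_4,   /* layouts before kernel commit fb799447ae29 */
    ORC_LAYOUT_6_4        /* layout from Linux 6.4 onward */
};

/* Value of ORC_TYPE_CALL for the detected layout. */
static int orc_type_call(enum orc_layout layout)
{
    return layout == ORC_LAYOUT_6_4 ? 2 : 0;
}

/* True when an entry's type field marks an ordinary call frame. */
static bool orc_entry_is_call(enum orc_layout layout, int type)
{
    return type == orc_type_call(layout);
}
```

Comparing against a value chosen at runtime, rather than a compile-time constant, is what lets one crash binary handle dumps from both older and newer kernels.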

Without the patch, do_mt_entry() can call dump_struct_members_for_tree() with a NULL entry, and parse_for_member_extended() will cause a segmentation fault during strncpy(). This is caused by "tree -t maple -s struct.member.member" style multiple level member access: crash> tree -t maple -s irq_desc.irq_data.irq sparse_irqs ffff936980188400 irq_data.irq = 0, ffff93698018be00 irq_data.irq = 1, ... ffff936980f38e00 irq_data.irq = 19, Segmentation fault (core dumped) (gdb) bt #0 0x00007faaf8e51635 in __strncpy_avx2 () from /lib64/libc.so.6 #1 0x00000000005e5927 in parse_for_member_extended (dm=dm@entry=0x7ffcb9e6d860, ... #2 0x0000000000603c45 in dump_struct_member (s=s@entry=0x128cde0 <shared_bufs+1024> ... #3 0x0000000000513cf5 in dump_struct_members_for_tree (td=td@entry=0x7ffcb9e6eeb0, ... #4 0x0000000000651f15 in do_mt_entry (entry=0, min=min@entry=20, max=max@entry=119, ... ... Signed-off-by: Kazuhito Hagio <[email protected]>

Currently, the symbol ".rodata" may not be found in some vmlinux files, and the strings command will still be used to get the Linux banner string, but this gets two strings as below: # strings vmlinux | grep "Linux version" Linux version 6.5.0-0.rc2.17.fc39.x86_64 ... GNU ld version 2.40-9.fc39) # SMP PREEMPT_DYNAMIC Linux version 6.5.0-0.rc2.17.fc39.x86_64 ... GNU ld version 2.40-9.fc39) #1 SMP PREEMPT_DYNAMIC Mon Jul 17 14:57:35 UTC 2023 In verify_namelist(), the while-loop only checks whether the first Linux banner string above matches and then breaks out of the loop. But the second string above is actually the correct one. Eventually, crash starts up with the following warning: # ./crash -s vmlinux vmcore WARNING: kernel version inconsistency between vmlinux and dumpfile # ./crash -s WARNING: kernel version inconsistency between vmlinux and live memory Let's always try to match the correct one; if none matches, still print a warning as before. Signed-off-by: Lianbo Jiang <[email protected]>
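The change in loop behavior can be sketched as below. This is a simplified model and not the code in verify_namelist(): the function names are hypothetical, and the candidate banners are assumed to have been collected into an array already.

```c
#include <string.h>

/* Old behavior (modeled): only the first candidate is compared. */
static int first_banner_matches(const char *const *candidates, int n,
                                const char *running_banner)
{
    if (n <= 0)
        return 0;
    return strcmp(candidates[0], running_banner) == 0;
}

/* New behavior (modeled): keep going until a candidate matches, and
 * report failure only after all candidates have been tried. */
static int any_banner_matches(const char *const *candidates, int n,
                              const char *running_banner)
{
    int i;

    for (i = 0; i < n; i++)
        if (strcmp(candidates[i], running_banner) == 0)
            return 1;
    return 0;
}
```

With two candidates where only the second is the complete banner, the first function reports a mismatch and the second reports a match, which corresponds to the warning disappearing once every candidate is considered.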

…usly There is an issue that, for kernel modules, "dis -rl" fails to display module code line number data after executing the "bt" command in crash. Without the patch: crash> mod -S crash> bt PID: 1500 TASK: ff2bd8b093524000 CPU: 16 COMMAND: "lpfc_worker_0" #0 [ff2c9f725c39f9e0] machine_kexec at ffffffff8e0686d3 ...snip... #8 [ff2c9f725c39fcc0] __lpfc_sli_release_iocbq_s4 at ffffffffc0f2f425 [lpfc] ...snip... crash> dis -rl ffffffffc0f60f82 0xffffffffc0f60eb0 <lpfc_nlp_get>: nopl 0x0(%rax,%rax,1) [FTRACE NOP] 0xffffffffc0f60eb5 <lpfc_nlp_get+5>: push %rbp 0xffffffffc0f60eb6 <lpfc_nlp_get+6>: push %rbx 0xffffffffc0f60eb7 <lpfc_nlp_get+7>: test %rdi,%rdi With the patch: crash> mod -S crash> bt PID: 1500 TASK: ff2bd8b093524000 CPU: 16 COMMAND: "lpfc_worker_0" #0 [ff2c9f725c39f9e0] machine_kexec at ffffffff8e0686d3 ...snip... #8 [ff2c9f725c39fcc0] __lpfc_sli_release_iocbq_s4 at ffffffffc0f2f425 [lpfc] ...snip... crash> dis -rl ffffffffc0f60f82 /usr/src/debug/kernel-4.18.0-425.13.1.el8_7/linux-4.18.0-425.13.1.el8_7.x86_64/drivers/scsi/lpfc/lpfc_hbadisc.c: 6756 0xffffffffc0f60eb0 <lpfc_nlp_get>: nopl 0x0(%rax,%rax,1) [FTRACE NOP] /usr/src/debug/kernel-4.18.0-425.13.1.el8_7/linux-4.18.0-425.13.1.el8_7.x86_64/drivers/scsi/lpfc/lpfc_hbadisc.c: 6759 0xffffffffc0f60eb5 <lpfc_nlp_get+5>: push %rbp The root cause is that, after a kernel module has been loaded by the mod command, the symtable is not expanded on the gdb side. The crash bt or dis command will trigger such an expansion. However, the symtable expansion is different for the 2 commands: The stack trace of "dis -rl" for symtable expanding: #0 0x00000000008d8d9f in add_compunit_symtab_to_objfile ... #1 0x00000000006d3293 in buildsym_compunit::end_symtab_with_blockvector ... #2 0x00000000006d336a in buildsym_compunit::end_symtab_from_static_block ... #3 0x000000000077e8e9 in process_full_comp_unit ... #4 process_queue ... #5 dw2_do_instantiate_symtab ... #6 0x000000000077ed67 in dw2_instantiate_symtab ... 
#7 0x000000000077f75e in dw2_expand_all_symtabs ... #8 0x00000000008f254d in gdb_get_line_number ... #9 0x00000000008f22af in gdb_command_funnel_1 ... #10 0x00000000008f2003 in gdb_command_funnel ... #11 0x00000000005b7f02 in gdb_interface ... #12 0x00000000005f5bd8 in get_line_number ... #13 0x000000000059e574 in cmd_dis ... The stack trace of "bt" for symtable expanding: #0 0x00000000008d8d9f in add_compunit_symtab_to_objfile ... #1 0x00000000006d3293 in buildsym_compunit::end_symtab_with_blockvector ... #2 0x00000000006d336a in buildsym_compunit::end_symtab_from_static_block ... #3 0x000000000077e8e9 in process_full_comp_unit ... #4 process_queue ... #5 dw2_do_instantiate_symtab ... #6 0x000000000077ed67 in dw2_instantiate_symtab ... #7 0x000000000077f8ed in dw2_lookup_symbol ... #8 0x00000000008e6d03 in lookup_symbol_via_quick_fns ... #9 0x00000000008e7153 in lookup_symbol_in_objfile ... #10 0x00000000008e73c6 in lookup_symbol_global_or_static_iterator_cb ... #11 0x00000000008b99c4 in svr4_iterate_over_objfiles_in_search_order ... #12 0x00000000008e754e in lookup_global_or_static_symbol ... #13 0x00000000008e75da in lookup_static_symbol ... #14 0x00000000008e632c in lookup_symbol_aux ... #15 0x00000000008e5a7a in lookup_symbol_in_language ... #16 0x00000000008e5b30 in lookup_symbol ... #17 0x00000000008f2a4a in gdb_get_datatype ... #18 0x00000000008f22c0 in gdb_command_funnel_1 ... #19 0x00000000008f2003 in gdb_command_funnel ... #20 0x00000000005b7f02 in gdb_interface ... #21 0x00000000005f8a9f in datatype_info ... #22 0x0000000000599947 in cpu_map_size ... #23 0x00000000005a975d in get_cpus_online ... #24 0x0000000000637a8b in diskdump_get_prstatus_percpu ... #25 0x000000000062f0e4 in get_netdump_regs_x86_64 ... #26 0x000000000059fe68 in back_trace ... #27 0x00000000005ab1cb in cmd_bt ... 
For the stack trace of "dis -rl", it calls dw2_expand_all_symtabs() to expand all symtables of the objfile, or "*.ko.debug" in our case. However, for the stack trace of "bt", it does not expand all of them, but only a subset of symtables that is enough to find a symbol by dw2_lookup_symbol(). As a result, objfile->compunit_symtabs, which is the head of a singly linked list of struct compunit_symtab, is not NULL but does not contain all symtables. It will not be reinitialized in gdb_get_line_number() by "dis -rl" because the !objfile_has_full_symbols(objfile) check will fail, so it cannot display the proper code line number data. Since the objfile_has_full_symbols(objfile) check cannot ensure that all symbols have been expanded, this patch adds a new member as a flag in struct objfile to record whether all symbols have been expanded. The flag will be set only after expand_all_symtabs has been called. Signed-off-by: Tao Liu <[email protected]>

Same as the Linux commit f766f77a74f5 ("riscv/stacktrace: Fix stack output without ra on the stack top"). When a function doesn't have a callee, then it will not push ra into the stack, such as lkdtm functions, so correct the FP of the second frame and use pt_regs to get the right PC of the second frame. Before this patch, the `bt -f` outputs only the first frame with the wrong PC and FP of next frame: ``` crash> bt -f PID: 1 TASK: ff600000000e0000 CPU: 1 COMMAND: "sh" #0 [ff20000000013cf0] lkdtm_EXCEPTION at ffffffff805303c0 [PC: ffffffff805303c0 RA: ff20000000013d10 SP: ff20000000013cf0 SIZE: 16] <- wrong next PC ff20000000013cf0: 0000000000000001 ff20000000013d10 <- next FP ff20000000013d00: ff20000000013d40 crash> ``` After this patch, the `bt` outputs the full frames: ``` crash> bt PID: 1 TASK: ff600000000e0000 CPU: 1 COMMAND: "sh" #0 [ff20000000013cf0] lkdtm_EXCEPTION at ffffffff805303c0 #1 [ff20000000013d00] lkdtm_do_action at ffffffff8052fe36 #2 [ff20000000013d10] direct_entry at ffffffff80530018 #3 [ff20000000013d40] full_proxy_write at ffffffff80305044 #4 [ff20000000013d80] vfs_write at ffffffff801b68b4 #5 [ff20000000013e30] ksys_write at ffffffff801b6c4a #6 [ff20000000013e80] __riscv_sys_write at ffffffff801b6cc4 #7 [ff20000000013e90] do_trap_ecall_u at ffffffff80836798 crash> ``` Acked-by: Kazuhito Hagio <[email protected]> Signed-off-by: Song Shuai <[email protected]>

This patch introduces per-cpu IRQ stacks for RISCV64 so that "bt" can backtrace through them and 'bt -E' can search them for exception frames. The 'help -m' command displays the address of each per-cpu IRQ stack.

TEST: a vmcore dumped by hacking handle_irq_event_percpu(). (Why not use lkdtm INT_HW_IRQ_EN EXCEPTION? There is a deadlock[1] in the crash_kexec path if that is used.)

    crash> bt
    PID: 0  TASK: ffffffff8140db00  CPU: 0  COMMAND: "swapper/0"
     #0 [ff20000000003e60] __handle_irq_event_percpu at ffffffff8006462e
     #1 [ff20000000003ed0] handle_irq_event_percpu at ffffffff80064702
     #2 [ff20000000003ef0] handle_irq_event at ffffffff8006477c
     #3 [ff20000000003f20] handle_fasteoi_irq at ffffffff80068664
     #4 [ff20000000003f50] generic_handle_domain_irq at ffffffff80063988
     #5 [ff20000000003f60] plic_handle_irq at ffffffff8046633e
     #6 [ff20000000003fb0] generic_handle_domain_irq at ffffffff80063988
     #7 [ff20000000003fc0] riscv_intc_irq at ffffffff80465f8e
     #8 [ff20000000003fd0] handle_riscv_irq at ffffffff808361e8
        PC: ffffffff80837314 [default_idle_call+50]
        RA: ffffffff80837310 [default_idle_call+46]
        SP: ffffffff81403da0 CAUSE: 8000000000000009
        epc : ffffffff80837314 ra : ffffffff80837310 sp : ffffffff81403da0
        gp : ffffffff814ef848 tp : ffffffff8140db00 t0 : ff2000000004bb18
        t1 : 0000000000032c73 t2 : ffffffff81200a48 s0 : ffffffff81403db0
        s1 : 0000000000000000 a0 : 0000000000000004 a1 : 0000000000000000
        a2 : ff6000009f1e7000 a3 : 0000000000002304 a4 : ffffffff80c1c2d8
        a5 : 0000000000000000 a6 : ff6000001fe01958 a7 : 00002496ea89dbf1
        s2 : ffffffff814f0220 s3 : 0000000000000001 s4 : 000000000000003f
        s5 : ffffffff814f03d8 s6 : 0000000000000000 s7 : ffffffff814f00d0
        s8 : ffffffff81526f10 s9 : ffffffff80c1d880 s10: 0000000000000000
        s11: 0000000000000001 t3 : 0000000000003392 t4 : 0000000000000000
        t5 : 0000000000000000 t6 : 0000000000000040
        status: 0000000200000120 badaddr: 0000000000000000
        cause: 8000000000000009 orig_a0: ffffffff80837310
    --- <IRQ stack> ---
     #9 [ffffffff81403da0] default_idle_call at ffffffff80837314
    #10 [ffffffff81403db0] do_idle at ffffffff8004d0a0
    #11 [ffffffff81403e40] cpu_startup_entry at ffffffff8004d21e
    #12 [ffffffff81403e60] kernel_init at ffffffff8083746a
    #13 [ffffffff81403e70] arch_post_acpi_subsys_init at ffffffff80a006d8
    #14 [ffffffff81403e80] console_on_rootfs at ffffffff80a00c92
    crash>

    crash> bt -E
    CPU 0 IRQ STACK:
    KERNEL-MODE EXCEPTION FRAME AT: ff20000000003a48
        PC: ffffffff8006462e [__handle_irq_event_percpu+30]
        RA: ffffffff80064702 [handle_irq_event_percpu+18]
        SP: ff20000000003e60 CAUSE: 000000000000000d
        epc : ffffffff8006462e ra : ffffffff80064702 sp : ff20000000003e60
        gp : ffffffff814ef848 tp : ffffffff8140db00 t0 : 0000000000046600
        t1 : ffffffff80836464 t2 : ffffffff81200a48 s0 : ff20000000003ed0
        s1 : 0000000000000000 a0 : 0000000000000000 a1 : 0000000000000118
        a2 : 0000000000000052 a3 : 0000000000000000 a4 : 0000000000000000
        a5 : 0000000000010001 a6 : ff6000001fe01958 a7 : 00002496ea89dbf1
        s2 : ff60000000941ab0 s3 : ffffffff814a0658 s4 : ff60000000089230
        s5 : ffffffff814a0518 s6 : ffffffff814a0620 s7 : ffffffff80e5f0f8
        s8 : ffffffff80fc50b0 s9 : ffffffff80c1d880 s10: 0000000000000000
        s11: 0000000000000001 t3 : 0000000000003392 t4 : 0000000000000000
        t5 : 0000000000000000 t6 : 0000000000000040
        status: 0000000200000100 badaddr: 0000000000000078
        cause: 000000000000000d orig_a0: ff20000000003ea0
    CPU 1 IRQ STACK: (none found)
    crash>

    crash> help -m
    <snip>
    machspec: ced1e0
    irq_stack_size: 16384
    irq_stacks[0]: ff20000000000000
    irq_stacks[1]: ff20000000008000
    crash>

[1]: https://lore.kernel.org/linux-riscv/[email protected]/

Signed-off-by: Song Shuai <[email protected]>

The patch introduces per-cpu overflow stacks for RISCV64 so that "bt" can backtrace through them. The 'help -m' command displays the address of each per-cpu overflow stack.

TEST: a lkdtm DIRECT EXHAUST_STACK vmcore

    crash> bt
    PID: 1  TASK: ff600000000d8000  CPU: 1  COMMAND: "sh"
     #0 [ff6000001fc501c0] riscv_crash_save_regs at ffffffff8000a1dc
     #1 [ff6000001fc50320] panic at ffffffff808773ec
     #2 [ff6000001fc50380] walk_stackframe at ffffffff800056da
        PC: ffffffff80876a34 [memset+96]
        RA: ffffffff80563dc0 [recursive_loop+68]
        SP: ff2000000000fd50 CAUSE: 000000000000000f
        epc : ffffffff80876a34 ra : ffffffff80563dc0 sp : ff2000000000fd50
        gp : ffffffff81515d38 tp : 0000000000000000 t0 : ff2000000000fd58
        t1 : ff600000000d88c8 t2 : 6143203a6d74646b s0 : ff20000000010190
        s1 : 0000000000000012 a0 : ff2000000000fd58 a1 : 1212121212121212
        a2 : 0000000000000400 a3 : ff20000000010158 a4 : 0000000000000000
        a5 : 725bedba92260900 a6 : 000000000130e0f0 a7 : 0000000000000000
        s2 : ff2000000000fd58 s3 : ffffffff815170d8 s4 : ff20000000013e60
        s5 : 000000000000000e s6 : ff20000000013e60 s7 : 0000000000000000
        s8 : ff60000000861000 s9 : 00007fffc3641694 s10: 00007fffc3641690
        s11: 00005555796ed240 t3 : 0000000000010297 t4 : ffffffff80c17810
        t5 : ffffffff8195e7b8 t6 : ff20000000013b18
        status: 0000000200000120 badaddr: ff2000000000fd58
        cause: 000000000000000f orig_a0: 0000000000000000
    --- <OVERFLOW stack> ---
     #3 [ff2000000000fd50] memset at ffffffff80876a34
     #4 [ff20000000010190] recursive_loop at ffffffff80563e16
     #5 [ff200000000105d0] recursive_loop at ffffffff80563e16
    < recursive_loop ...>
    #16 [ff20000000013490] recursive_loop at ffffffff80563e16
    #17 [ff200000000138d0] recursive_loop at ffffffff80563e16
    #18 [ff20000000013d10] lkdtm_EXHAUST_STACK at ffffffff8088005e
    #19 [ff20000000013d30] lkdtm_do_action at ffffffff80563292
    #20 [ff20000000013d40] direct_entry at ffffffff80563474
    #21 [ff20000000013d70] full_proxy_write at ffffffff8032fb3a
    #22 [ff20000000013db0] vfs_write at ffffffff801d6414
    #23 [ff20000000013e60] ksys_write at ffffffff801d67b8
    #24 [ff20000000013eb0] __riscv_sys_write at ffffffff801d6832
    #25 [ff20000000013ec0] do_trap_ecall_u at ffffffff80884a20
    crash>

    crash> help -m
    <snip>
    irq_stack_size: 16384
    irq_stacks[0]: ff20000000000000
    irq_stacks[1]: ff20000000008000
    overflow_stack_size: 4096
    overflow_stacks[0]: ff6000001fa7a510
    overflow_stacks[1]: ff6000001fc4f510
    crash>

Signed-off-by: Song Shuai <[email protected]>

On recent x86_64 kernels, the check of caller function (BT_CHECK_CALLER) does not work correctly due to inappropriate direct_call_targets. As a result, the correct frame is ignored and the remaining frames will be truncated.

Skip the caller check if ORC unwinder is available, as the check is not necessary with it.

Without the patch:

    crash> bt 493113
    PID: 493113  TASK: ff2e34ecbd3ca2c0  CPU: 27  COMMAND: "sriov_fec_daemo"
     #0 [ff77abc4e81cfb08] __schedule at ffffffff81b239cb
     #1 [ff77abc4e81cfb70] schedule at ffffffff81b23e2d
     #2 [ff77abc4e81cfb88] schedule_timeout at ffffffff81b2c9e8
        RIP: 000000000047cdbb RSP: 000000c0000975a8 RFLAGS: 00000216
    ...

With the patch:

    crash> bt 493113
    PID: 493113  TASK: ff2e34ecbd3ca2c0  CPU: 27  COMMAND: "sriov_fec_daemo"
     #0 [ff77abc4e81cfb08] __schedule at ffffffff81b239cb
     #1 [ff77abc4e81cfb70] schedule at ffffffff81b23e2d
     #2 [ff77abc4e81cfb88] schedule_timeout at ffffffff81b2c9e8
     #3 [ff77abc4e81cfbf0] __wait_for_common at ffffffff81b24abb
     #4 [ff77abc4e81cfc68] vfio_unregister_group_dev at ffffffffc10e76ae [vfio]
     #5 [ff77abc4e81cfca8] vfio_pci_core_unregister_device at ffffffffc11bb599 [vfio_pci_core]
     #6 [ff77abc4e81cfcc0] vfio_pci_remove at ffffffffc103e045 [vfio_pci]
     #7 [ff77abc4e81cfcd0] pci_device_remove at ffffffff815d7513
    ...

Reported-by: Crystal Wood <[email protected]>
Signed-off-by: Kazuhito Hagio <[email protected]>

…ss range

Previously, to find a module symbol and its offset for an arbitrary address, all symbols within each module were iterated in ascending address order until the last symbol with a smaller address was found. However, if the address is not within the module's address range, e.g. it is higher than the address of the module's last symbol, the module can safely be skipped, because iterating its symbols is unnecessary. This speeds up module symbol lookup and improves overall performance.

Without the patch:

    $ time echo "bt 8993" | ~/crash-dev/crash vmcore vmlinux
    crash> bt 8993
    PID: 8993  TASK: ffff927569cc2100  CPU: 2  COMMAND: "WriterPool0"
     #0 [ffff927569cd76f0] __schedule at ffffffffb3db78d8
     #1 [ffff927569cd7758] schedule_preempt_disabled at ffffffffb3db8bf9
     #2 [ffff927569cd7768] __mutex_lock_slowpath at ffffffffb3db6ca7
     #3 [ffff927569cd77c0] mutex_lock at ffffffffb3db602f
     #4 [ffff927569cd77d8] ucache_retrieve at ffffffffc0cf4409 [secfs2]
    ...snip the stacktrace of the same module...
    #11 [ffff927569cd7ba0] cskal_path_vfs_getattr_nosec at ffffffffc05cae76 [falcon_kal]
    ...snip...
    #13 [ffff927569cd7c40] _ZdlPv at ffffffffc086e751 [falcon_lsm_serviceable]
    ...snip...
    #20 [ffff927569cd7ef8] unload_network_ops_symbols at ffffffffc06f11c0 [falcon_lsm_pinned_14713]
    #21 [ffff927569cd7f50] system_call_fastpath at ffffffffb3dc539a
        RIP: 00007f2b28ed4023 RSP: 00007f2a45fe7f80 RFLAGS: 00000206
        RAX: 0000000000000012 RBX: 00007f2a68302e00 RCX: 00007f2a682546d8
        RDX: 0000000000000826 RSI: 00007eb57ea6a000 RDI: 00000000000000e3
        RBP: 00007eb57ea6a000 R8: 0000000000000826 R9: 00000002670bdfd2
        R10: 00000002670bdfd2 R11: 0000000000000293 R12: 00000002670bdfd2
        R13: 00007f29d501a480 R14: 0000000000000826 R15: 00000002670bdfd2
        ORIG_RAX: 0000000000000012 CS: 0033 SS: 002b
    crash>

    real 7m14.826s
    user 7m12.502s
    sys 0m1.091s

With the patch:

    $ time echo "bt 8993" | ~/crash-dev/crash vmcore vmlinux
    crash> bt 8993
    PID: 8993  TASK: ffff927569cc2100  CPU: 2  COMMAND: "WriterPool0"
     #0 [ffff927569cd76f0] __schedule at ffffffffb3db78d8
     #1 [ffff927569cd7758] schedule_preempt_disabled at ffffffffb3db8bf9
    ...snip the same output...
    crash>

    real 0m8.827s
    user 0m7.896s
    sys 0m0.938s

Signed-off-by: Tao Liu <[email protected]>
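The range check described in this commit message can be illustrated with a small sketch. It is not the crash utility's C code: the `Module` class, its fields, and the sample addresses are hypothetical, and the only assumption is that each module knows its start and end address and keeps its symbols sorted by address.

```python
from dataclasses import dataclass


@dataclass
class Module:
    # Hypothetical stand-in for a loaded-module record.
    name: str
    base: int      # first address covered by the module
    end: int       # one past the last address covered
    symbols: list  # (address, name) pairs, sorted by ascending address


def find_module_symbol(modules, addr):
    """Return (module name, symbol name, offset) for addr, or None."""
    for mod in modules:
        # Cheap range check: an address outside [base, end) cannot belong
        # to this module, so its symbol list is never walked.
        if not (mod.base <= addr < mod.end):
            continue
        best = None
        for sym_addr, sym_name in mod.symbols:
            if sym_addr > addr:
                break
            best = (sym_addr, sym_name)  # last symbol at or below addr
        if best is not None:
            return mod.name, best[1], addr - best[0]
    return None


modules = [
    Module("mod_a", 0x1000, 0x2000, [(0x1000, "func_a"), (0x1800, "func_b")]),
    Module("mod_b", 0x3000, 0x4000, [(0x3000, "func_c")]),
]
print(find_module_symbol(modules, 0x1809))
```

With many modules and large symbol tables, the early `continue` is what removes most of the work: only the one module that contains the address has its symbols iterated.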

… line

1. Add loongarch64_init() implementation, do all necessary machine-specific setup, which will be called multiple times during initialization.
2. Add the implementation of the vtop command, which is used to convert a virtual address to a physical address. When entering the crash command line, the corresponding symbols in the kernel will be read, and at the same time, the conversion of virtual and real addresses will also be used, so the vtop command is a prerequisite for entering the crash command line.
3. Add loongarch64_get_smp_cpus() implementation, get the number of online cpus.
4. Add loongarch64_get_page_size() implementation, get page size.
5. Add to get processor speed. Obtain the processor speed from the kernel symbol "cpu_clock_freq".
6. Add loongarch64_verify_symbol() implementation, accept or reject a symbol from the kernel namelist.

With this patch, we can enter crash command line. Tested on Loongson-3C5000 platform.

    For help, type "help".
    Type "apropos word" to search for commands related to "word"...

          KERNEL: /usr/lib/debug/lib/modules/5.10.0-60.103.0.130.oe2203.loongarch64/vmlinux
        DUMPFILE: /proc/kcore
            CPUS: 16
            DATE: Mon Aug 21 14:33:19 CST 2023
          UPTIME: 05:01:34
    LOAD AVERAGE: 0.43, 0.11, 0.17
           TASKS: 265
        NODENAME: localhost.localdomain
         RELEASE: 5.10.0-60.103.0.130.oe2203.loongarch64
         VERSION: #1 SMP Fri Jul 21 12:48:08 UTC 2023
         MACHINE: loongarch64 (2200 Mhz)
          MEMORY: 64 GB
             PID: 114499
         COMMAND: "crash"
            TASK: 900000009676ff00 [THREAD_INFO: 90000000981a8000]
             CPU: 12
           STATE: TASK_RUNNING (ACTIVE)

Co-developed-by: Youling Tang <[email protected]>
Signed-off-by: Youling Tang <[email protected]>
Signed-off-by: Ming Wang <[email protected]>

- Add basic support for the 'bt' command.
- LoongArch64: Add 'bt -f' command support
- LoongArch64: Add 'bt -l' command support

E.g. With this patch:

    crash> bt
    PID: 1832  TASK: 900000009a552100  CPU: 11  COMMAND: "bash"
     #0 [900000009beffb60] __cpu_possible_mask at 90000000014168f0
     #1 [900000009beffb60] __crash_kexec at 90000000002e7660
     #2 [900000009beffcd0] panic at 9000000000f0ec28
     #3 [900000009beffd60] sysrq_handle_crash at 9000000000a2c188
     #4 [900000009beffd70] __handle_sysrq at 9000000000a2c85c
     #5 [900000009beffdc0] write_sysrq_trigger at 9000000000a2ce10
     #6 [900000009beffde0] proc_reg_write at 90000000004ce454
     #7 [900000009beffe00] vfs_write at 900000000043e838
     #8 [900000009beffe40] ksys_write at 900000000043eb58
     #9 [900000009beffe80] do_syscall at 9000000000f2da54
    #10 [900000009beffea0] handle_syscall at 9000000000221440
    crash>
    ...

Co-developed-by: Youling Tang <[email protected]>
Signed-off-by: Youling Tang <[email protected]>
Signed-off-by: Ming Wang <[email protected]>

Dave Anderson added 30 commits December 15, 2019 12:24

When determining the ARM64 kernel's "vabits_actual" value by reading

5e975dd

the new TCR_EL1.T1SZ vmcoreinfo entry, display its value during session initialization only when invoking crash with "-d1" or larger -d debug value. ([email protected])

Fix for support of ELF format kdump vmcores from S390X KASLR kernels.

7c2d41e

Without the patch, the crash session fails during initialization with the error message "crash: vmlinux and vmcore do not match!". ([email protected])

Fix for support of S390X standalone dumpfiles and LKCD dumpfiles that

6e033fe

were taken from S390X KASLR kernels. ([email protected])

Rework the previous patch for support of S390X standalone dumpfiles

c6b1971

and LKCD dumpfiles that were taken from S390X KASLR kernels to avoid calling an s390x-specific function from generic code. ([email protected])

Fix for a gcc-10 compilation error. Without the patch, the build of

6c1c8ac

the crash library fails with a stream of error messages indicating "multiple definition of 'diskdump_flags'" ([email protected])

crash-7.2.7 -> crash-7.2.8

24f4801

Mark start of 7.2.8 development phase with version 7.2.7++

47b457c

Fix for an ARM64 gcc-10 compilation error. Without the patch, the

e770735

build of the embedded gdb module fails with an error message that indicates "multiple definition of 'tdesc_aarch64'". ([email protected])

Fix for the "log" command. Without the patch, the command's output

dcd6e6b

may be truncated, ending with the error message "log: invalid log_buf entry encountered". ([email protected])

Introduction of a new "extend -s" option, which shows all available

5dfbc7a

shared object extension modules that are located in the directories that are part of the normal search path that is used when a shared object is loaded without a fully-qualified pathname. ([email protected])

Fix for the "bpf -m|-M" options on Linux 5.3 and later kernels that

af71d71

contain commit 3539b96e041c06e4317082816d90ec09160aeb11, titled "bpf: group memory related fields in struct bpf_map_memory". Without the patch, the options print "(unknown)" for MEMLOCK and UID. ([email protected])

Enhancement to the "bpf -p|-P" options to display the eBPF program

007f844

name string. ([email protected])

Enhancement of the "struct -r" option to support the raw memory

42fba65

display of a single data structure member. Without the patch, the option only supported the raw display of a complete data structure. ([email protected])

Modify the display behavior of the "struct -r" option so as to scale

8c28b56

the minimum display size from the size of a per-architecture long (32-bits or 64-bits) down to 8-bits, 16-bits or 32-bits when the requested size is equal to one of the smaller sizes. ([email protected])
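The scaling rule in this entry can be sketched as follows. This is an illustration only, not the crash utility's implementation: `raw_display_unit` and its parameters are hypothetical names, and sizes are expressed in bytes rather than bits.

```python
def raw_display_unit(requested_bytes, long_bytes=8):
    """Pick the unit size (in bytes) used to print raw memory.

    long_bytes is the size of a long on the target architecture
    (4 on 32-bit, 8 on 64-bit). The unit drops to 1, 2 or 4 bytes
    only when the requested size is exactly one of those sizes.
    """
    if requested_bytes in (1, 2, 4) and requested_bytes < long_bytes:
        return requested_bytes
    return long_bytes


for size in (1, 2, 4, 8, 24):
    print(size, "->", raw_display_unit(size))
```

Under this sketch, a 2-byte member is printed as one 16-bit value, while a 24-byte structure on a 64-bit target is still printed as long-sized words.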

If readmem() receives a user-space address in a page that has been

b12bdd3

swapped to the zswap compressed swap cache, an attempt will be made to find and decompress the page. ([email protected])

Fix for the "mount -n [pid|task]" option when running on a live

601bcce

system. Without the patch, if the [pid|task] has been created since the last internal task table refresh, the command fails with the error message "mount: invalid task or pid value: <value>". ([email protected])

Introduction of the "log -T" option, which translates the leading

c86250b

timestamp value of each message into a human-readable format. ([email protected])
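A minimal sketch of what such a translation involves, assuming the leading timestamp is a number of seconds since boot and that the boot time is known. The function name, the boot time and the output format are all hypothetical; the actual "log -T" output format may differ.

```python
from datetime import datetime, timedelta, timezone


def translate_timestamp(boot_time, seconds_since_boot):
    """Convert a seconds-since-boot timestamp into a wall-clock string."""
    when = boot_time + timedelta(seconds=seconds_since_boot)
    return when.strftime("%Y-%m-%d %H:%M:%S")


# Hypothetical boot time and message timestamp, for illustration only.
boot = datetime(2020, 1, 1, 0, 0, 0, tzinfo=timezone.utc)
print(translate_timestamp(boot, 3661.5))
```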

Prepare for the introduction of ARM64 8.3 Pointer Authentication

41d6118

as an in-kernel feature. The value of CONFIG_ARM64_KERNELPACMASK will be exported as a vmcoreinfo entry, and will be used with text return addresses on the kernel stack. ([email protected])
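To illustrate how a PAC mask can be applied to a return address read from the kernel stack, here is a small sketch. It assumes the mask has a 1 in every bit position that the pointer authentication code may occupy, and that a genuine kernel text address has all of those bits set, so OR-ing the mask back in recovers the address. The function name and the sample values are hypothetical, not taken from a real dump.

```python
U64 = 0xFFFFFFFFFFFFFFFF


def strip_kernel_pac(addr, pac_mask):
    """Restore a kernel return address by forcing the PAC bits to ones."""
    return (addr | pac_mask) & U64


# Hypothetical mask for a 48-bit kernel address space: the top 16 bits.
mask = 0xFFFF000000000000
signed_ra = 0x0025800010081234  # return address with PAC bits in the top
print(hex(strip_kernel_pac(signed_ra, mask)))
```

An address that carries no PAC is unchanged by the operation, so the mask can be applied to every candidate return address before symbol lookup.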

Replace people.redhat.com references with github equivalents.

0f29a8a

([email protected])

k-hagio and others added 12 commits June 1, 2022 08:48

Doc: update man page for the "bpf" and "sbitmapq" commands

c672d7a

The information of the "bpf" and "sbitmapq" commands is missing in the man page of the crash utility. Let's add it to the man page. Signed-off-by: Lianbo Jiang <[email protected]>

fengjixuchui merged commit c05d86c into fengjixuchui:master Jul 3, 2022

fengjixuchui mentioned this pull request Feb 14, 2023

Fix for "bt" command printing "bogus exception frame" warning #14

Merged
