Investigate orphan sections #61

ojeda · 2020-12-16T23:56:32Z

Since synchronizing with v5.10 (#60) we are getting:

ld: warning: orphan section `.text.__rust_probestack' from `drivers/char/rust_example/rust_example.o' being placed in section `.text.__rust_probestack'
ld: warning: orphan section `.eh_frame' from `drivers/char/rust_example/rust_example.o' being placed in section `.eh_frame'

See e.g. https://github.com/Rust-for-Linux/linux/runs/1561327749

The text was updated successfully, but these errors were encountered:

ojeda · 2020-12-17T00:04:03Z

A guess is that some kernel structures changed in v5.10 and now, when using the bindings, we have crossed the stack size threshold in some function, thus calling the probe.

adamrk · 2021-01-17T16:16:50Z

I think this might be related to some actual errors I'm seeing:

ERROR: modpost: "__rust_probestack" [drivers/char/rust_example_4.ko] undefined!

(see the failed runs here: https://github.com/adamrk/linux/pull/1/checks?check_run_id=1711541791).
The symbol is defined in rust/complier_builtins.o and vmlinux:

$ objdump --syms  rust/vmlinux | rg probestack
ffffffff81801393 l    d  .text.__rust_probestack	0000000000000000 .text.__rust_probestack
ffffffff81801393 g     F .text.__rust_probestack	0000000000000039 .hidden __rust_probestack

Could this issue be the .hidden attribute? That seems to be explicitly set in the definition.

I'm happy to dig into this some more, but I'm kind of stuck at this point so any suggestions would be helpful.

adamrk · 2021-01-20T08:36:46Z

Looks like the modpost error isn't directly related. Opened #73 instead.

adamrk · 2021-01-20T09:01:05Z

Maybe the issue here is just that .text.__rust_probestack is a non-standard section? Making this change to vmlinux.lds.h takes care of the warning and seems to work fine, but it doesn't seem correct to add logic there just for this special case:

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index b2b3d81b1535..42d46683f188 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -590,7 +590,7 @@
 #define TEXT_TEXT                                                      \
                ALIGN_FUNCTION();                                       \
                *(.text.hot .text.hot.*)                                \
-               *(TEXT_MAIN .text.fixup)                                \
+               *(TEXT_MAIN .text.fixup .text.__rust_probestack)        \
                *(.text.unlikely .text.unlikely.*)                      \
                *(.text.unknown .text.unknown.*)                        \
                NOINSTR_TEXT                                            \

Looks like __rust_probstack is intentionally put in a separate section (source), but I'm not sure why. Maybe if we end up copying needed parts compiler_builtins into the tree (one of the options in #69) we can change it to store __rust_probestack in the regular .text section?

Or is there a way to add a separate linker script file that specifically handles this?

bjorn3 · 2021-01-20T09:10:28Z

The separate section was added as part of moving from a naked function to global_asm! in rust-lang/compiler-builtins#328 to add CFI information to help unwinding inside a debugger. The separate section matches how rustc compiles with -Zfunction-sections=yes by default.

Does Linux have it's own probestack function? You may be able to alias __rust_probestack to it and completely avoid linking to compiler-builtins. Instead using whatever Linux uses for providing gcc/LLVM compiler intrinsics. Alternatively you could wait for rust-lang/rust#80838 to land and switch to inline asm stack probes.

ojeda · 2021-01-20T12:03:09Z

AFAICS both GCC and Clang generate the probing inline by default, so we should likely wait for rust-lang/rust#80838.

Having said that, I don't see stack clash protection used nor proposed for the kernel yet, in fact I see a related patch that required disabling it explicitly -- @nickdesaulniers @kees?

nickdesaulniers · 2021-01-20T21:04:32Z

Having said that, I don't see stack clash protection used nor proposed for the kernel yet, in fact I see a related patch that required disabling it explicitly

Maybe a userspace centric feature that can be disabled?

ld: warning: orphan section .text.__rust_probestack' from drivers/char/rust_example/rust_example.o' being placed in section `.text.__rust_probestack'
ERROR: modpost: "__rust_probestack" [drivers/char/rust_example_4.ko] undefined!

So looks like a missing function (symbol) AND missplaced section? Adding that section makes sense if __rust_probestack is required; not sure what that would look like in the kernel.

ld: warning: orphan section .eh_frame' from drivers/char/rust_example/rust_example.o' being placed in section `.eh_frame'

In C compilers, that's controlled by -fno-asynchronous-unwind-info which we generally use in the kernel, except for one unwinder that 32b ARM uses (by default). There should be a corresponding flag for rustc (or, if not, go make it!).

EDIT: sorry for closing/reopened. "Fat fingers"

bjorn3 · 2021-01-20T21:58:09Z

In C compilers, that's controlled by -fno-asynchronous-unwind-info which we generally use in the kernel, except for one unwinder that 32b ARM uses (by default). There should be a corresponding flag for rustc (or, if not, go make it!).

This unwind info is not generated by the compiler. It is part of the assembly that implements __rust_probestack. The compiler doesn't have any control over this.

So looks like a missing function (symbol) AND missplaced section? Adding that section makes sense if __rust_probestack is required; not sure what that would look like in the kernel.

Same applies here.

Maybe a userspace centric feature that can be disabled?

I believe the target spec has an option to disable stackprobes.

adamrk · 2021-02-12T09:33:24Z

I think this is resolved with the latest rustc version changes.

ojeda · 2021-02-12T10:09:40Z

Yeah, when compiled with a high enough LLVM (11.0.1), we should now be getting inlined stack probes now since #85. The CI compiles with 11.1.0 at the moment.

However, the warnings went away before, since #80 was merged, since there was no more compiler_builtins.

The histogram logic was allowing events with char * pointers to be used as normal strings. But it was easy to crash the kernel with: # echo 'hist:keys=filename' > events/syscalls/sys_enter_openat/trigger And open some files, and boom! BUG: unable to handle page fault for address: 00007f2ced0c3280 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 1173fa067 P4D 1173fa067 PUD 1171b6067 PMD 1171dd067 PTE 0 Oops: 0000 [#1] PREEMPT SMP CPU: 6 PID: 1810 Comm: cat Not tainted 5.13.0-rc5-test+ #61 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03 07/14/2016 RIP: 0010:strlen+0x0/0x20 Code: f6 82 80 2a 0b a9 20 74 11 0f b6 50 01 48 83 c0 01 f6 82 80 2a 0b a9 20 75 ef c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 <80> 3f 00 74 10 48 89 f8 48 83 c0 01 80 38 00 75 f7 48 29 f8 c3 RSP: 0018:ffffbdbf81567b50 EFLAGS: 00010246 RAX: 0000000000000003 RBX: ffff93815cdb3800 RCX: ffff9382401a22d0 RDX: 0000000000000100 RSI: 0000000000000000 RDI: 00007f2ced0c3280 RBP: 0000000000000100 R08: ffff9382409ff074 R09: ffffbdbf81567c98 R10: ffff9382409ff074 R11: 0000000000000000 R12: ffff9382409ff074 R13: 0000000000000001 R14: ffff93815a744f00 R15: 00007f2ced0c3280 FS: 00007f2ced0f8580(0000) GS:ffff93825a800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f2ced0c3280 CR3: 0000000107069005 CR4: 00000000001706e0 Call Trace: event_hist_trigger+0x463/0x5f0 ? find_held_lock+0x32/0x90 ? sched_clock_cpu+0xe/0xd0 ? lock_release+0x155/0x440 ? kernel_init_free_pages+0x6d/0x90 ? preempt_count_sub+0x9b/0xd0 ? kernel_init_free_pages+0x6d/0x90 ? get_page_from_freelist+0x12c4/0x1680 ? __rb_reserve_next+0xe5/0x460 ? ring_buffer_lock_reserve+0x12a/0x3f0 event_triggers_call+0x52/0xe0 ftrace_syscall_enter+0x264/0x2c0 syscall_trace_enter.constprop.0+0x1ee/0x210 do_syscall_64+0x1c/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae Where it triggered a fault on strlen(key) where key was the filename. The reason is that filename is a char * to user space, and the histogram code just blindly dereferenced it, with obvious bad results. I originally tried to use strncpy_from_user/kernel_nofault() but found that there's other places that its dereferenced and not worth the effort. Just do not allow "char *" to act like strings. Link: https://lkml.kernel.org/r/[email protected] Cc: Ingo Molnar <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Tzvetomir Stoyanov <[email protected]> Cc: [email protected] Acked-by: Namhyung Kim <[email protected]> Acked-by: Tom Zanussi <[email protected]> Fixes: 79e577c ("tracing: Support string type key properly") Fixes: 5967bd5 ("tracing: Let filter_assign_type() detect FILTER_PTR_STRING") Signed-off-by: Steven Rostedt (VMware) <[email protected]>

Current imc-pmu code triggers a WARNING with CONFIG_DEBUG_ATOMIC_SLEEP and CONFIG_PROVE_LOCKING enabled, while running a thread_imc event. Command to trigger the warning: # perf stat -e thread_imc/CPM_CS_FROM_L4_MEM_X_DPTEG/ sleep 5 Performance counter stats for 'sleep 5': 0 thread_imc/CPM_CS_FROM_L4_MEM_X_DPTEG/ 5.002117947 seconds time elapsed 0.000131000 seconds user 0.001063000 seconds sys Below is snippet of the warning in dmesg: BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580 in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 2869, name: perf-exec preempt_count: 2, expected: 0 4 locks held by perf-exec/2869: #0: c00000004325c540 (&sig->cred_guard_mutex){+.+.}-{3:3}, at: bprm_execve+0x64/0xa90 #1: c00000004325c5d8 (&sig->exec_update_lock){++++}-{3:3}, at: begin_new_exec+0x460/0xef0 #2: c0000003fa99d4e0 (&cpuctx_lock){-...}-{2:2}, at: perf_event_exec+0x290/0x510 #3: c000000017ab8418 (&ctx->lock){....}-{2:2}, at: perf_event_exec+0x29c/0x510 irq event stamp: 4806 hardirqs last enabled at (4805): [<c000000000f65b94>] _raw_spin_unlock_irqrestore+0x94/0xd0 hardirqs last disabled at (4806): [<c0000000003fae44>] perf_event_exec+0x394/0x510 softirqs last enabled at (0): [<c00000000013c404>] copy_process+0xc34/0x1ff0 softirqs last disabled at (0): [<0000000000000000>] 0x0 CPU: 36 PID: 2869 Comm: perf-exec Not tainted 6.2.0-rc2-00011-g1247637727f2 #61 Hardware name: 8375-42A POWER9 0x4e1202 opal:v7.0-16-g9b85f7d961 PowerNV Call Trace: dump_stack_lvl+0x98/0xe0 (unreliable) __might_resched+0x2f8/0x310 __mutex_lock+0x6c/0x13f0 thread_imc_event_add+0xf4/0x1b0 event_sched_in+0xe0/0x210 merge_sched_in+0x1f0/0x600 visit_groups_merge.isra.92.constprop.166+0x2bc/0x6c0 ctx_flexible_sched_in+0xcc/0x140 ctx_sched_in+0x20c/0x2a0 ctx_resched+0x104/0x1c0 perf_event_exec+0x340/0x510 begin_new_exec+0x730/0xef0 load_elf_binary+0x3f8/0x1e10 ... do not call blocking ops when !TASK_RUNNING; state=2001 set at [<00000000fd63e7cf>] do_nanosleep+0x60/0x1a0 WARNING: CPU: 36 PID: 2869 at kernel/sched/core.c:9912 __might_sleep+0x9c/0xb0 CPU: 36 PID: 2869 Comm: sleep Tainted: G W 6.2.0-rc2-00011-g1247637727f2 #61 Hardware name: 8375-42A POWER9 0x4e1202 opal:v7.0-16-g9b85f7d961 PowerNV NIP: c000000000194a1c LR: c000000000194a18 CTR: c000000000a78670 REGS: c00000004d2134e0 TRAP: 0700 Tainted: G W (6.2.0-rc2-00011-g1247637727f2) MSR: 9000000000021033 <SF,HV,ME,IR,DR,RI,LE> CR: 48002824 XER: 00000000 CFAR: c00000000013fb64 IRQMASK: 1 The above warning triggered because the current imc-pmu code uses mutex lock in interrupt disabled sections. The function mutex_lock() internally calls __might_resched(), which will check if IRQs are disabled and in case IRQs are disabled, it will trigger the warning. Fix the issue by changing the mutex lock to spinlock. Fixes: 8f95faa ("powerpc/powernv: Detect and create IMC device") Reported-by: Michael Petlan <[email protected]> Reported-by: Peter Zijlstra <[email protected]> Signed-off-by: Kajol Jain <[email protected]> [mpe: Fix comments, trim oops in change log, add reported-by tags] Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]

ojeda added • kbuild Related to building the kernel, `make`, `Kbuild`, `Kconfig` options... - optional labels Dec 16, 2020

ojeda mentioned this issue Dec 16, 2020

Use rustc directly instead of cargo #52

Merged

ojeda mentioned this issue Jan 20, 2021

Test that __rust_probestack can be linked #74

Closed

nickdesaulniers closed this as completed Jan 20, 2021

nickdesaulniers reopened this Jan 20, 2021

ojeda closed this as completed Feb 12, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Investigate orphan sections #61

Investigate orphan sections #61

ojeda commented Dec 16, 2020

ojeda commented Dec 17, 2020 •

edited

Loading

adamrk commented Jan 17, 2021

adamrk commented Jan 20, 2021

adamrk commented Jan 20, 2021

bjorn3 commented Jan 20, 2021

ojeda commented Jan 20, 2021 •

edited

Loading

nickdesaulniers commented Jan 20, 2021 •

edited

Loading

bjorn3 commented Jan 20, 2021

adamrk commented Feb 12, 2021

ojeda commented Feb 12, 2021

Investigate orphan sections #61

Investigate orphan sections #61

Comments

ojeda commented Dec 16, 2020

ojeda commented Dec 17, 2020 • edited Loading

adamrk commented Jan 17, 2021

adamrk commented Jan 20, 2021

adamrk commented Jan 20, 2021

bjorn3 commented Jan 20, 2021

ojeda commented Jan 20, 2021 • edited Loading

nickdesaulniers commented Jan 20, 2021 • edited Loading

bjorn3 commented Jan 20, 2021

adamrk commented Feb 12, 2021

ojeda commented Feb 12, 2021

ojeda commented Dec 17, 2020 •

edited

Loading

ojeda commented Jan 20, 2021 •

edited

Loading

nickdesaulniers commented Jan 20, 2021 •

edited

Loading