forked from freebsd/freebsd-src
-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
resolve fetch https #7
Open
ScottieY
wants to merge
12
commits into
emaste:wipbsd-12
Choose a base branch
from
ScottieY:dev-fetch
base: wipbsd-12
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
updated documents
off and quiet mode does not seem to work anymore
duration pitch works need to redo some of kbd logic
need to be refined
emaste
pushed a commit
that referenced
this pull request
May 24, 2018
There's no need to perform the interrupt unbind while holding the blkback lock, and doing so leads to the following LOR: lock order reversal: (sleepable after non-sleepable) 1st 0xfffff8000802fe90 xbbd1 (xbbd1) @ /usr/src/sys/dev/xen/blkback/blkback.c:3423 2nd 0xffffffff81fdf890 intrsrc (intrsrc) @ /usr/src/sys/x86/x86/intr_machdep.c:224 stack backtrace: #0 0xffffffff80bdd993 at witness_debugger+0x73 #1 0xffffffff80bdd814 at witness_checkorder+0xe34 #2 0xffffffff80b7d798 at _sx_xlock+0x68 #3 0xffffffff811b3913 at intr_remove_handler+0x43 #4 0xffffffff811c63ef at xen_intr_unbind+0x10f #5 0xffffffff80a12ecf at xbb_disconnect+0x2f #6 0xffffffff80a12e54 at xbb_shutdown+0x1e4 #7 0xffffffff80a10be4 at xbb_frontend_changed+0x54 #8 0xffffffff80ed66a4 at xenbusb_back_otherend_changed+0x14 #9 0xffffffff80a2a382 at xenwatch_thread+0x182 #10 0xffffffff80b34164 at fork_exit+0x84 freebsd#11 0xffffffff8101ec9e at fork_trampoline+0xe Reported by: Nathan Friess <[email protected]> Sponsored by: Citrix Systems R&D
emaste
pushed a commit
that referenced
this pull request
May 29, 2019
It appeared that using NET_EPOCH_WAIT() while holding global BPF lock can lead to another panic: spin lock 0xfffff800183c9840 (turnstile lock) held by 0xfffff80018e2c5a0 (tid 100325) too long panic: spin lock held too long ... #0 sched_switch (td=0xfffff80018e2c5a0, newtd=0xfffff8000389e000, flags=<optimized out>) at /usr/src/sys/kern/sched_ule.c:2133 #1 0xffffffff80bf9912 in mi_switch (flags=256, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:439 #2 0xffffffff80c21db7 in sched_bind (td=<optimized out>, cpu=<optimized out>) at /usr/src/sys/kern/sched_ule.c:2704 #3 0xffffffff80c34c33 in epoch_block_handler_preempt (global=<optimized out>, cr=0xfffffe00005a1a00, arg=<optimized out>) at /usr/src/sys/kern/subr_epoch.c:394 #4 0xffffffff803c741b in epoch_block (global=<optimized out>, cr=<optimized out>, cb=<optimized out>, ct=<optimized out>) at /usr/src/sys/contrib/ck/src/ck_epoch.c:416 #5 ck_epoch_synchronize_wait (global=0xfffff8000380cd80, cb=<optimized out>, ct=<optimized out>) at /usr/src/sys/contrib/ck/src/ck_epoch.c:465 #6 0xffffffff80c3475e in epoch_wait_preempt (epoch=0xfffff8000380cd80) at /usr/src/sys/kern/subr_epoch.c:513 #7 0xffffffff80ce970b in bpf_detachd_locked (d=0xfffff801d309cc00, detached_ifp=<optimized out>) at /usr/src/sys/net/bpf.c:856 #8 0xffffffff80ced166 in bpf_detachd (d=<optimized out>) at /usr/src/sys/net/bpf.c:836 #9 bpf_dtor (data=0xfffff801d309cc00) at /usr/src/sys/net/bpf.c:914 To fix this add the check to the catchpacket() that BPF descriptor was not detached just before we acquired BPFD_LOCK(). Reported by: slavash Tested by: slavash MFC after: 1 week
emaste
pushed a commit
that referenced
this pull request
Dec 30, 2019
With WITNESS enabled we see the following warning: lock order reversal: (sleepable after non-sleepable) 1st 0xffffffd0847c7210 fu540spi0 (fu540spi0) @ /usr/home/kp/axiado/hornet-freebsd/src/sys/riscv/sifive/fu540_spi.c:297 2nd 0xffffffc00372bb30 Clock topology lock (Clock topology lock) @ /usr/home/kp/axiado/hornet-freebsd/src/sys/dev/extres/clk/clk.c:1137 stack backtrace: #0 0xffffffc0002a579e at witness_checkorder+0xb72 #1 0xffffffc0002a5556 at witness_checkorder+0x92a #2 0xffffffc000254c7a at _sx_slock_int+0x66 #3 0xffffffc00025537a at _sx_slock+0x8 #4 0xffffffc000123022 at clk_get_freq+0x38 #5 0xffffffc0005463e4 at __clzdi2+0x2bb8 #6 0xffffffc00014af58 at randomdev_getkey+0x76e #7 0xffffffc0001278b0 at simplebus_add_device+0x7ee #8 0xffffffc00027c9a8 at device_attach+0x2e6 #9 0xffffffc00027c634 at device_probe_and_attach+0x7a #10 0xffffffc00027d76a at bus_generic_attach+0x10 freebsd#11 0xffffffc00014aab0 at randomdev_getkey+0x2c6 freebsd#12 0xffffffc00027c9a8 at device_attach+0x2e6 freebsd#13 0xffffffc00027c634 at device_probe_and_attach+0x7a freebsd#14 0xffffffc00027d76a at bus_generic_attach+0x10 freebsd#15 0xffffffc000278bd2 at config_intrhook_oneshot+0x52 freebsd#16 0xffffffc000278b3e at config_intrhook_establish+0x146 freebsd#17 0xffffffc000278cf2 at config_intrhook_disestablish+0xfe The clock topology lock can sleep, which means we cannot attempt to acquire it while holding the non-sleepable mutex. Fix that by retrieving the clock speed once, during attach and not every time during SPI transaction setup. Submitted by: kp Sponsored by: Axiado
emaste
pushed a commit
that referenced
this pull request
Jul 6, 2020
Add a more compact display format for kern.tty_info_kstacks inspired by procstat -kk. Set it as a default one. # sysctl kern.tty_info_kstacks=1 kern.tty_info_kstacks: 0 -> 1 # sleep 2 ^T load: 0.17 cmd: sleep 623 [nanslp] 0.72r 0.00u 0.00s 0% 2124k #0 0xffffffff80c4443e at mi_switch+0xbe #1 0xffffffff80c98044 at sleepq_catch_signals+0x494 #2 0xffffffff80c982c2 at sleepq_timedwait_sig+0x12 #3 0xffffffff80c43af3 at _sleep+0x193 #4 0xffffffff80c50e31 at kern_clock_nanosleep+0x1a1 #5 0xffffffff80c5119b at sys_nanosleep+0x3b #6 0xffffffff810ffc69 at amd64_syscall+0x119 #7 0xffffffff810d5520 at fast_syscall_common+0x101 sleep: about 1 second(s) left out of the original 2 ^C # sysctl kern.tty_info_kstacks=2 kern.tty_info_kstacks: 1 -> 2 # sleep 2 ^T load: 0.24 cmd: sleep 625 [nanslp] 0.81r 0.00u 0.00s 0% 2124k mi_switch+0xbe sleepq_catch_signals+0x494 sleepq_timedwait_sig+0x12 sleep+0x193 kern_clock_nanosleep+0x1a1 sys_nanosleep+0x3b amd64_syscall+0x119 fast_syscall_common+0x101 sleep: about 1 second(s) left out of the original 2 ^C Suggested by: avg Reviewed by: mjg Relnotes: yes Sponsored by: Mysterious Code Ltd. Differential Revision: https://reviews.freebsd.org/D25487
emaste
pushed a commit
that referenced
this pull request
Aug 22, 2020
Instantiate Error in Target::GetEntryPointAddress() only when necessary When Target::GetEntryPointAddress() calls exe_module->GetObjectFile()->GetEntryPointAddress(), and the returned entry_addr is valid, it can immediately be returned. However, just before that, an llvm::Error value has been setup, but in this case it is not consumed before returning, like is done further below in the function. In https://bugs.freebsd.org/248745 we got a bug report for this, where a very simple test case aborts and dumps core: * thread #1, name = 'testcase', stop reason = breakpoint 1.1 frame #0: 0x00000000002018d4 testcase`main(argc=1, argv=0x00007fffffffea18) at testcase.c:3:5 1 int main(int argc, char *argv[]) 2 { -> 3 return 0; 4 } (lldb) p argc Program aborted due to an unhandled Error: Error value was Success. (Note: Success values must still be checked prior to being destroyed). Thread 1 received signal SIGABRT, Aborted. thr_kill () at thr_kill.S:3 3 thr_kill.S: No such file or directory. (gdb) bt #0 thr_kill () at thr_kill.S:3 #1 0x00000008049a0004 in __raise (s=6) at /usr/src/lib/libc/gen/raise.c:52 #2 0x0000000804916229 in abort () at /usr/src/lib/libc/stdlib/abort.c:67 #3 0x000000000451b5f5 in fatalUncheckedError () at /usr/src/contrib/llvm-project/llvm/lib/Support/Error.cpp:112 #4 0x00000000019cf008 in GetEntryPointAddress () at /usr/src/contrib/llvm-project/llvm/include/llvm/Support/Error.h:267 #5 0x0000000001bccbd8 in ConstructorSetup () at /usr/src/contrib/llvm-project/lldb/source/Target/ThreadPlanCallFunction.cpp:67 #6 0x0000000001bcd2c0 in ThreadPlanCallFunction () at /usr/src/contrib/llvm-project/lldb/source/Target/ThreadPlanCallFunction.cpp:114 #7 0x00000000020076d4 in InferiorCallMmap () at /usr/src/contrib/llvm-project/lldb/source/Plugins/Process/Utility/InferiorCallPOSIX.cpp:97 #8 0x0000000001f4be33 in DoAllocateMemory () at /usr/src/contrib/llvm-project/lldb/source/Plugins/Process/FreeBSD/ProcessFreeBSD.cpp:604 #9 0x0000000001fe51b9 in AllocatePage () at /usr/src/contrib/llvm-project/lldb/source/Target/Memory.cpp:347 #10 0x0000000001fe5385 in AllocateMemory () at /usr/src/contrib/llvm-project/lldb/source/Target/Memory.cpp:383 freebsd#11 0x0000000001974da2 in AllocateMemory () at /usr/src/contrib/llvm-project/lldb/source/Target/Process.cpp:2301 freebsd#12 CanJIT () at /usr/src/contrib/llvm-project/lldb/source/Target/Process.cpp:2331 freebsd#13 0x0000000001a1bf3d in Evaluate () at /usr/src/contrib/llvm-project/lldb/source/Expression/UserExpression.cpp:190 freebsd#14 0x00000000019ce7a2 in EvaluateExpression () at /usr/src/contrib/llvm-project/lldb/source/Target/Target.cpp:2372 freebsd#15 0x0000000001ad784c in EvaluateExpression () at /usr/src/contrib/llvm-project/lldb/source/Commands/CommandObjectExpression.cpp:414 freebsd#16 0x0000000001ad86ae in DoExecute () at /usr/src/contrib/llvm-project/lldb/source/Commands/CommandObjectExpression.cpp:646 freebsd#17 0x0000000001a5e3ed in Execute () at /usr/src/contrib/llvm-project/lldb/source/Interpreter/CommandObject.cpp:1003 freebsd#18 0x0000000001a6c4a3 in HandleCommand () at /usr/src/contrib/llvm-project/lldb/source/Interpreter/CommandInterpreter.cpp:1762 freebsd#19 0x0000000001a6f98c in IOHandlerInputComplete () at /usr/src/contrib/llvm-project/lldb/source/Interpreter/CommandInterpreter.cpp:2760 freebsd#20 0x0000000001a90b08 in Run () at /usr/src/contrib/llvm-project/lldb/source/Core/IOHandler.cpp:548 freebsd#21 0x00000000019a6c6a in ExecuteIOHandlers () at /usr/src/contrib/llvm-project/lldb/source/Core/Debugger.cpp:903 freebsd#22 0x0000000001a70337 in RunCommandInterpreter () at /usr/src/contrib/llvm-project/lldb/source/Interpreter/CommandInterpreter.cpp:2946 freebsd#23 0x0000000001d9d812 in RunCommandInterpreter () at /usr/src/contrib/llvm-project/lldb/source/API/SBDebugger.cpp:1169 freebsd#24 0x0000000001918be8 in MainLoop () at /usr/src/contrib/llvm-project/lldb/tools/driver/Driver.cpp:675 freebsd#25 0x000000000191a114 in main () at /usr/src/contrib/llvm-project/lldb/tools/driver/Driver.cpp:890 Fix the incorrect error catch by only instantiating an Error object if it is necessary. Reviewed By: JDevlieghere Differential Revision: https://reviews.llvm.org/D86355 This should fix lldb aborting as described in the scenario above. Reported by: dmgk PR: 248745
emaste
pushed a commit
that referenced
this pull request
Nov 10, 2021
Merge commit 1ce07cd614be from llvm git (by me): Instantiate Error in Target::GetEntryPointAddress() only when necessary When Target::GetEntryPointAddress() calls exe_module->GetObjectFile()->GetEntryPointAddress(), and the returned entry_addr is valid, it can immediately be returned. However, just before that, an llvm::Error value has been setup, but in this case it is not consumed before returning, like is done further below in the function. In https://bugs.freebsd.org/248745 we got a bug report for this, where a very simple test case aborts and dumps core: * thread #1, name = 'testcase', stop reason = breakpoint 1.1 frame #0: 0x00000000002018d4 testcase`main(argc=1, argv=0x00007fffffffea18) at testcase.c:3:5 1 int main(int argc, char *argv[]) 2 { -> 3 return 0; 4 } (lldb) p argc Program aborted due to an unhandled Error: Error value was Success. (Note: Success values must still be checked prior to being destroyed). Thread 1 received signal SIGABRT, Aborted. thr_kill () at thr_kill.S:3 3 thr_kill.S: No such file or directory. (gdb) bt #0 thr_kill () at thr_kill.S:3 #1 0x00000008049a0004 in __raise (s=6) at /usr/src/lib/libc/gen/raise.c:52 #2 0x0000000804916229 in abort () at /usr/src/lib/libc/stdlib/abort.c:67 #3 0x000000000451b5f5 in fatalUncheckedError () at /usr/src/contrib/llvm-project/llvm/lib/Support/Error.cpp:112 #4 0x00000000019cf008 in GetEntryPointAddress () at /usr/src/contrib/llvm-project/llvm/include/llvm/Support/Error.h:267 #5 0x0000000001bccbd8 in ConstructorSetup () at /usr/src/contrib/llvm-project/lldb/source/Target/ThreadPlanCallFunction.cpp:67 #6 0x0000000001bcd2c0 in ThreadPlanCallFunction () at /usr/src/contrib/llvm-project/lldb/source/Target/ThreadPlanCallFunction.cpp:114 #7 0x00000000020076d4 in InferiorCallMmap () at /usr/src/contrib/llvm-project/lldb/source/Plugins/Process/Utility/InferiorCallPOSIX.cpp:97 #8 0x0000000001f4be33 in DoAllocateMemory () at /usr/src/contrib/llvm-project/lldb/source/Plugins/Process/FreeBSD/ProcessFreeBSD.cpp:604 #9 0x0000000001fe51b9 in AllocatePage () at /usr/src/contrib/llvm-project/lldb/source/Target/Memory.cpp:347 #10 0x0000000001fe5385 in AllocateMemory () at /usr/src/contrib/llvm-project/lldb/source/Target/Memory.cpp:383 freebsd#11 0x0000000001974da2 in AllocateMemory () at /usr/src/contrib/llvm-project/lldb/source/Target/Process.cpp:2301 freebsd#12 CanJIT () at /usr/src/contrib/llvm-project/lldb/source/Target/Process.cpp:2331 freebsd#13 0x0000000001a1bf3d in Evaluate () at /usr/src/contrib/llvm-project/lldb/source/Expression/UserExpression.cpp:190 freebsd#14 0x00000000019ce7a2 in EvaluateExpression () at /usr/src/contrib/llvm-project/lldb/source/Target/Target.cpp:2372 freebsd#15 0x0000000001ad784c in EvaluateExpression () at /usr/src/contrib/llvm-project/lldb/source/Commands/CommandObjectExpression.cpp:414 freebsd#16 0x0000000001ad86ae in DoExecute () at /usr/src/contrib/llvm-project/lldb/source/Commands/CommandObjectExpression.cpp:646 freebsd#17 0x0000000001a5e3ed in Execute () at /usr/src/contrib/llvm-project/lldb/source/Interpreter/CommandObject.cpp:1003 freebsd#18 0x0000000001a6c4a3 in HandleCommand () at /usr/src/contrib/llvm-project/lldb/source/Interpreter/CommandInterpreter.cpp:1762 freebsd#19 0x0000000001a6f98c in IOHandlerInputComplete () at /usr/src/contrib/llvm-project/lldb/source/Interpreter/CommandInterpreter.cpp:2760 freebsd#20 0x0000000001a90b08 in Run () at /usr/src/contrib/llvm-project/lldb/source/Core/IOHandler.cpp:548 freebsd#21 0x00000000019a6c6a in ExecuteIOHandlers () at /usr/src/contrib/llvm-project/lldb/source/Core/Debugger.cpp:903 freebsd#22 0x0000000001a70337 in RunCommandInterpreter () at /usr/src/contrib/llvm-project/lldb/source/Interpreter/CommandInterpreter.cpp:2946 freebsd#23 0x0000000001d9d812 in RunCommandInterpreter () at /usr/src/contrib/llvm-project/lldb/source/API/SBDebugger.cpp:1169 freebsd#24 0x0000000001918be8 in MainLoop () at /usr/src/contrib/llvm-project/lldb/tools/driver/Driver.cpp:675 freebsd#25 0x000000000191a114 in main () at /usr/src/contrib/llvm-project/lldb/tools/driver/Driver.cpp:890 Fix the incorrect error catch by only instantiating an Error object if it is necessary. Reviewed By: JDevlieghere Differential Revision: https://reviews.llvm.org/D86355 This should fix lldb aborting as described in the scenario above. Reported by: dmgk PR: 248745 Approved by: so Security: FreeBSD-EN-21:07.lldb (cherry picked from commit eb41eed)
emaste
pushed a commit
that referenced
this pull request
Feb 8, 2022
This LOR happens when reading from a file backed MD device: lock order reversal: 1st 0xfffffe00431eaac0 pbufwait (pbufwait, lockmgr) @ /cobra/src/sys/vm/vm_pager.c:471 2nd 0xfffff80003f17930 ufs (ufs, lockmgr) @ /cobra/src/sys/dev/md/md.c:977 lock order pbufwait -> ufs attempted at: #0 0xffffffff80c78ead at witness_checkorder+0xbdd #1 0xffffffff80bd6a52 at lockmgr_lock_flags+0x182 #2 0xffffffff80f52d5c at ffs_lock+0x6c #3 0xffffffff80d0f3f4 at _vn_lock+0x54 #4 0xffffffff80708629 at mdstart_vnode+0x499 #5 0xffffffff807060ec at md_kthread+0x20c #6 0xffffffff80bbfcd0 at fork_exit+0x80 #7 0xffffffff810b809e at fork_trampoline+0xe This LOR was previously blessed by witness before commit 531f8cf ("Use dedicated lock name for pbufs"). Instead of blessing ufs and pbufwait, use LK_NOWAIT to prevent recording the lock order. LK_NOWAIT will be a nop here as the lock is dropped in pbuf_dtor(). The takes the same approach as 5875b94 ("buf_alloc(): lock the buffer with LK_NOWAIT"). Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D34183
emaste
pushed a commit
that referenced
this pull request
Apr 3, 2023
Under certain loads, the following panic is hit: panic: page fault KDB: stack backtrace: #0 0xffffffff805db025 at kdb_backtrace+0x65 #1 0xffffffff8058e86f at vpanic+0x17f #2 0xffffffff8058e6e3 at panic+0x43 #3 0xffffffff808adc15 at trap_fatal+0x385 #4 0xffffffff808adc6f at trap_pfault+0x4f #5 0xffffffff80886da8 at calltrap+0x8 #6 0xffffffff80669186 at vgonel+0x186 #7 0xffffffff80669841 at vgone+0x31 #8 0xffffffff8065806d at vfs_hash_insert+0x26d #9 0xffffffff81a39069 at sfs_vgetx+0x149 #10 0xffffffff81a39c54 at zfsctl_snapdir_lookup+0x1e4 freebsd#11 0xffffffff8065a28c at lookup+0x45c freebsd#12 0xffffffff806594b9 at namei+0x259 freebsd#13 0xffffffff80676a33 at kern_statat+0xf3 freebsd#14 0xffffffff8067712f at sys_fstatat+0x2f freebsd#15 0xffffffff808ae50c at amd64_syscall+0x10c freebsd#16 0xffffffff808876bb at fast_syscall_common+0xf8 The page fault occurs because vgonel() will call VOP_CLOSE() for active vnodes. For this reason, define vop_close for zfsctl_ops_snapshot. While here, define vop_open for consistency. After adding the necessary vop, the bug progresses to the following panic: panic: VERIFY3(vrecycle(vp) == 1) failed (0 == 1) cpuid = 17 KDB: stack backtrace: #0 0xffffffff805e29c5 at kdb_backtrace+0x65 #1 0xffffffff8059620f at vpanic+0x17f #2 0xffffffff81a27f4a at spl_panic+0x3a #3 0xffffffff81a3a4d0 at zfsctl_snapshot_inactive+0x40 #4 0xffffffff8066fdee at vinactivef+0xde #5 0xffffffff80670b8a at vgonel+0x1ea #6 0xffffffff806711e1 at vgone+0x31 #7 0xffffffff8065fa0d at vfs_hash_insert+0x26d #8 0xffffffff81a39069 at sfs_vgetx+0x149 #9 0xffffffff81a39c54 at zfsctl_snapdir_lookup+0x1e4 #10 0xffffffff80661c2c at lookup+0x45c freebsd#11 0xffffffff80660e59 at namei+0x259 freebsd#12 0xffffffff8067e3d3 at kern_statat+0xf3 freebsd#13 0xffffffff8067eacf at sys_fstatat+0x2f freebsd#14 0xffffffff808b5ecc at amd64_syscall+0x10c freebsd#15 0xffffffff8088f07b at fast_syscall_common+0xf8 This is caused by a race condition that can occur when allocating a new vnode and adding that vnode to the vfs hash. If the newly created vnode loses the race when being inserted into the vfs hash, it will not be recycled as its usecount is greater than zero, hitting the above assertion. Fix this by dropping the assertion. FreeBSD-issue: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=252700 Reviewed-by: Andriy Gapon <[email protected]> Reviewed-by: Mateusz Guzik <[email protected]> Reviewed-by: Alek Pinchuk <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Rob Wing <[email protected]> Co-authored-by: Rob Wing <[email protected]> Submitted-by: Klara, Inc. Sponsored-by: rsync.net Closes #14501
emaste
pushed a commit
that referenced
this pull request
Jul 17, 2023
Under certain loads, the following panic is hit: panic: page fault KDB: stack backtrace: #0 0xffffffff805db025 at kdb_backtrace+0x65 #1 0xffffffff8058e86f at vpanic+0x17f #2 0xffffffff8058e6e3 at panic+0x43 #3 0xffffffff808adc15 at trap_fatal+0x385 #4 0xffffffff808adc6f at trap_pfault+0x4f #5 0xffffffff80886da8 at calltrap+0x8 #6 0xffffffff80669186 at vgonel+0x186 #7 0xffffffff80669841 at vgone+0x31 #8 0xffffffff8065806d at vfs_hash_insert+0x26d #9 0xffffffff81a39069 at sfs_vgetx+0x149 #10 0xffffffff81a39c54 at zfsctl_snapdir_lookup+0x1e4 freebsd#11 0xffffffff8065a28c at lookup+0x45c freebsd#12 0xffffffff806594b9 at namei+0x259 freebsd#13 0xffffffff80676a33 at kern_statat+0xf3 freebsd#14 0xffffffff8067712f at sys_fstatat+0x2f freebsd#15 0xffffffff808ae50c at amd64_syscall+0x10c freebsd#16 0xffffffff808876bb at fast_syscall_common+0xf8 The page fault occurs because vgonel() will call VOP_CLOSE() for active vnodes. For this reason, define vop_close for zfsctl_ops_snapshot. While here, define vop_open for consistency. After adding the necessary vop, the bug progresses to the following panic: panic: VERIFY3(vrecycle(vp) == 1) failed (0 == 1) cpuid = 17 KDB: stack backtrace: #0 0xffffffff805e29c5 at kdb_backtrace+0x65 #1 0xffffffff8059620f at vpanic+0x17f #2 0xffffffff81a27f4a at spl_panic+0x3a #3 0xffffffff81a3a4d0 at zfsctl_snapshot_inactive+0x40 #4 0xffffffff8066fdee at vinactivef+0xde #5 0xffffffff80670b8a at vgonel+0x1ea #6 0xffffffff806711e1 at vgone+0x31 #7 0xffffffff8065fa0d at vfs_hash_insert+0x26d #8 0xffffffff81a39069 at sfs_vgetx+0x149 #9 0xffffffff81a39c54 at zfsctl_snapdir_lookup+0x1e4 #10 0xffffffff80661c2c at lookup+0x45c freebsd#11 0xffffffff80660e59 at namei+0x259 freebsd#12 0xffffffff8067e3d3 at kern_statat+0xf3 freebsd#13 0xffffffff8067eacf at sys_fstatat+0x2f freebsd#14 0xffffffff808b5ecc at amd64_syscall+0x10c freebsd#15 0xffffffff8088f07b at fast_syscall_common+0xf8 This is caused by a race condition that can occur when allocating a new vnode and adding that vnode to the vfs hash. If the newly created vnode loses the race when being inserted into the vfs hash, it will not be recycled as its usecount is greater than zero, hitting the above assertion. Fix this by dropping the assertion. FreeBSD-issue: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=252700 Reviewed-by: Andriy Gapon <[email protected]> Reviewed-by: Mateusz Guzik <[email protected]> Reviewed-by: Alek Pinchuk <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Rob Wing <[email protected]> Co-authored-by: Rob Wing <[email protected]> Submitted-by: Klara, Inc. Sponsored-by: rsync.net Closes #14501
emaste
pushed a commit
that referenced
this pull request
Jul 31, 2023
Avoid locking issues when if_allmulti() calls the driver's if_ioctl, because that may acquire sleepable locks (while we hold a non-sleepable rwlock). Fortunately there's no pressing need to hold the mroute lock while we do this, so we can postpone the call slightly, until after we've released the lock. This avoids the following WITNESS warning (with iflib drivers): lock order reversal: (sleepable after non-sleepable) 1st 0xffffffff82f64960 IPv4 multicast forwarding (IPv4 multicast forwarding, rw) @ /usr/src/sys/netinet/ip_mroute.c:1050 2nd 0xfffff8000480f180 iflib ctx lock (iflib ctx lock, sx) @ /usr/src/sys/net/iflib.c:4525 lock order IPv4 multicast forwarding -> iflib ctx lock attempted at: #0 0xffffffff80bbd6ce at witness_checkorder+0xbbe #1 0xffffffff80b56d10 at _sx_xlock+0x60 #2 0xffffffff80c9ce5c at iflib_if_ioctl+0x2dc #3 0xffffffff80c7c395 at if_setflag+0xe5 #4 0xffffffff82f60a0e at del_vif_locked+0x9e #5 0xffffffff82f5f0d5 at X_ip_mrouter_set+0x265 #6 0xffffffff80bfd402 at sosetopt+0xc2 #7 0xffffffff80c02105 at kern_setsockopt+0xa5 #8 0xffffffff80c02054 at sys_setsockopt+0x24 #9 0xffffffff81046be8 at amd64_syscall+0x138 #10 0xffffffff8101930b at fast_syscall_common+0xf8 See also: https://redmine.pfsense.org/issues/12079 Reviewed by: mjg Sponsored by: Rubicon Communications, LLC ("Netgate") Differential Revision: https://reviews.freebsd.org/D41209
emaste
pushed a commit
that referenced
this pull request
Sep 16, 2023
netlink(4) calls back into the driver during detach and it attempts to start an internal synchronized op recursively, causing an interruptible hang. Fix it by failing the ioctl if the VI has been marked as DOOMED by cxgbe_detach. Here's the stack for the hang for reference. #6 begin_synchronized_op #7 cxgbe_media_status #8 ifmedia_ioctl #9 cxgbe_ioctl #10 if_ioctl freebsd#11 get_operstate_ether freebsd#12 get_operstate freebsd#13 dump_iface freebsd#14 rtnl_handle_ifevent freebsd#15 rtnl_handle_ifnet_event freebsd#16 rt_ifmsg freebsd#17 if_unroute freebsd#18 if_down freebsd#19 if_detach_internal freebsd#20 if_detach freebsd#21 ether_ifdetach freebsd#22 cxgbe_vi_detach freebsd#23 cxgbe_detach freebsd#24 DEVICE_DETACH MFC after: 3 days Sponsored by: Chelsio Communications
emaste
pushed a commit
that referenced
this pull request
Oct 13, 2023
netlink(4) calls back into the driver during detach and it attempts to start an internal synchronized op recursively, causing an interruptible hang. Fix it by failing the ioctl if the VI has been marked as DOOMED by cxgbe_detach. Here's the stack for the hang for reference. #6 begin_synchronized_op #7 cxgbe_media_status #8 ifmedia_ioctl #9 cxgbe_ioctl #10 if_ioctl freebsd#11 get_operstate_ether freebsd#12 get_operstate freebsd#13 dump_iface freebsd#14 rtnl_handle_ifevent freebsd#15 rtnl_handle_ifnet_event freebsd#16 rt_ifmsg freebsd#17 if_unroute freebsd#18 if_down freebsd#19 if_detach_internal freebsd#20 if_detach freebsd#21 ether_ifdetach freebsd#22 cxgbe_vi_detach freebsd#23 cxgbe_detach freebsd#24 DEVICE_DETACH MFC after: 3 days Sponsored by: Chelsio Communications (cherry picked from commit 3814249)
emaste
pushed a commit
that referenced
this pull request
Oct 25, 2023
netlink(4) calls back into the driver during detach and it attempts to start an internal synchronized op recursively, causing an interruptible hang. Fix it by failing the ioctl if the VI has been marked as DOOMED by cxgbe_detach. Here's the stack for the hang for reference. #6 begin_synchronized_op #7 cxgbe_media_status #8 ifmedia_ioctl #9 cxgbe_ioctl #10 if_ioctl freebsd#11 get_operstate_ether freebsd#12 get_operstate freebsd#13 dump_iface freebsd#14 rtnl_handle_ifevent freebsd#15 rtnl_handle_ifnet_event freebsd#16 rt_ifmsg freebsd#17 if_unroute freebsd#18 if_down freebsd#19 if_detach_internal freebsd#20 if_detach freebsd#21 ether_ifdetach freebsd#22 cxgbe_vi_detach freebsd#23 cxgbe_detach freebsd#24 DEVICE_DETACH Sponsored by: Chelsio Communications Approved by: re (kib) (cherry picked from commit 3814249) (cherry picked from commit 3287f64)
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
No description provided.