Use an AtomicPtr for PulseStream's drain_timer #72
There is a race condition between `drained_cb` and `PulseStream::stop` that happens reliably on Firefox CI with Rust 1.56 beta (LLVM 13) and PGO instrumentation. Here's how it goes:
- In the Firefox AudioIPC Server RPC thread, `PulseStream::stop` is called.
- `PulseStream::stop` enters the loop waiting for drain, and blocks on `mainloop.wait`.
- Later, some other thread calls `drained_cb`, which resets `drain_timer`, and signals the mainloop.
- Back on the AudioIPC Server RPC thread, `mainloop.wait` returns, looping back to the test for `drain_timer`... which this thread doesn't know had been updated yet, so it blocks on `mainloop.wait` again.
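The sequence above, with the `AtomicPtr` approach named in the title, can be sketched roughly as follows. This is a minimal model, not the cubeb-pulse code: `Mainloop` is a hypothetical stand-in for the PulseAudio threaded mainloop built on a `Mutex` and `Condvar`, the `u8` timer payload is a placeholder for the real timer handle, and `run_demo` is an illustrative driver.

```rust
use std::ptr;
use std::sync::atomic::{AtomicPtr, Ordering};
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

// Hypothetical stand-in for the threaded mainloop: `wait` blocks until
// another thread has called `signal`.
struct Mainloop {
    signalled: Mutex<bool>,
    cond: Condvar,
}

impl Mainloop {
    fn wait(&self) {
        let mut signalled = self.signalled.lock().unwrap();
        while !*signalled {
            signalled = self.cond.wait(signalled).unwrap();
        }
        *signalled = false; // consume the signal
    }

    fn signal(&self) {
        *self.signalled.lock().unwrap() = true;
        self.cond.notify_all();
    }
}

// Hypothetical stream: only the fields needed to model the race.
struct Stream {
    mainloop: Mainloop,
    drain_timer: AtomicPtr<u8>, // placeholder for the real timer handle type
}

impl Stream {
    // Models the wait loop in `stop`: the atomic load is performed on
    // every iteration and cannot be hoisted out of the loop.
    fn stop(&self) {
        while !self.drain_timer.load(Ordering::Acquire).is_null() {
            self.mainloop.wait();
        }
    }

    // Models `drained_cb`: reset the timer, then wake the waiting thread.
    fn drained_cb(&self) {
        let old = self.drain_timer.swap(ptr::null_mut(), Ordering::AcqRel);
        if !old.is_null() {
            // The pointer came from Box::into_raw in `run_demo`.
            drop(unsafe { Box::from_raw(old) });
        }
        self.mainloop.signal();
    }
}

// Illustrative driver: one thread waits in `stop`, another drains.
fn run_demo() -> bool {
    let timer = Box::into_raw(Box::new(0u8));
    let stream = Arc::new(Stream {
        mainloop: Mainloop { signalled: Mutex::new(false), cond: Condvar::new() },
        drain_timer: AtomicPtr::new(timer),
    });
    let cb_stream = Arc::clone(&stream);
    let cb_thread = thread::spawn(move || {
        thread::sleep(Duration::from_millis(20));
        cb_stream.drained_cb();
    });
    stream.stop(); // returns once the callback has cleared the timer
    cb_thread.join().unwrap();
    stream.drain_timer.load(Ordering::Acquire).is_null()
}
```

Both methods take `&self` here, so no thread ever needs a `&mut` to the stream; the shared field is only touched through atomic operations.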
Note that the real problem here is that there are two live mutable references to the |
Nice find, thanks for fixing this! |
Why does Both
|
The deeper underlying problem is outlined in #72 (comment) and allows the compiler to make the assumption that drain_timer doesn't change in the loop. This is not a miscompilation, it's a rust code safety violation. The atomic papers over it. |
You're right - I wasn't aware that Rust makes that assumption.
/firefox-94.0/third_party/rust/cubeb-pulse/src/backend/stream.rs:
640 while !self.drain_timer.is_null() {
0x0000000007191796 <+598>: cmpq $0x0,0x40(%r14)
0x000000000719179b <+603>: je 0x71917b7 <_ZN94_$LT$cubeb_pulse..backend..stream..PulseStream$u20$as$u20$cubeb_backend..traits..StreamOps$GT$4stop17h888579641f57ae07E+631>
0x000000000719179d <+605>: lea 0x979c(%rip),%rbx # 0x719af40 <_ZN5pulse17threaded_mainloop16ThreadedMainloop4wait17hf5e4f1c3711458a0E>
0x00000000071917a4 <+612>: nopw %cs:0x0(%rax,%rax,1)
0x00000000071917ae <+622>: xchg %ax,%ax
641 self.context.mainloop.wait();
0x00000000071917b0 <+624>: mov %r13,%rdi
0x00000000071917b3 <+627>: call *%rbx
640 while !self.drain_timer.is_null() {
0x00000000071917b5 <+629>: jmp 0x71917b0 <_ZN94_$LT$cubeb_pulse..backend..stream..PulseStream$u20$as$u20$cubeb_backend..traits..StreamOps$GT$4stop17h888579641f57ae07E+624> |
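Reading the listing above: the `cmpq` at `<+598>` tests `drain_timer` once, and the `jmp` at `<+629>` goes back to `<+624>` (the call to `wait`), not to the test. A rough single-threaded model of the difference is sketched below. The function names, the `Cell<bool>` standing in for the timer pointer, and the `max_waits` bound (added only so the model terminates) are all assumptions for illustration.

```rust
use std::cell::Cell;

// The loop as written in the source: the flag is re-read before every wait.
fn loop_as_written(timer_set: &Cell<bool>, mut wait: impl FnMut(), max_waits: u32) -> u32 {
    let mut waits = 0;
    while timer_set.get() && waits < max_waits {
        wait();
        waits += 1;
    }
    waits
}

// The loop as the disassembly shows it: the flag is tested once, then
// `wait` is called repeatedly with no further test.
fn loop_as_compiled(timer_set: &Cell<bool>, mut wait: impl FnMut(), max_waits: u32) -> u32 {
    let mut waits = 0;
    if timer_set.get() {
        while waits < max_waits {
            wait();
            waits += 1;
        }
    }
    waits
}
```

If the `wait` closure clears the flag, modelling `drained_cb` running while the thread is blocked, the first version exits after one wait while the second keeps waiting until the artificial bound is hit.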
How did you create the disassembly? |