
fix: use shared_ptr in ThreadedAsyncOperation to prevent SIGBUS on macOS #21625

Open
ludamad wants to merge 1 commit into next from fix/threaded-async-op-sigbus-next

@ludamad ludamad commented Mar 16, 2026

Fixes a use-after-free in ThreadedAsyncOperation (introduced in #21138) that causes SIGBUS on macOS and silent memory corruption on Linux. On v4 the change is reverted instead: #21630.

Root cause: TSFN BlockingCall (napi_tsfn_blocking) only blocks on queue insertion, NOT on callback completion. The callback runs asynchronously on the JS main thread, so `delete this` on the worker thread raced with the callback reading member fields. macOS's magazine malloc aggressively unmaps freed pages, which turns the race into a consistent SIGBUS. Linux glibc keeps freed pages mapped, so the same race reads stale memory silently.
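
To illustrate the "blocks on insertion, not completion" point, here is a minimal single-threaded model. `BoundedCallQueue`, `blocking_call`, and `drain` are hypothetical names invented for this sketch; they are not the N-API types, and the real TSFN is drained by the event loop rather than by an explicit call. The sketch only shows that the enqueue returns while the callback is still pending, which is the window in which `delete this` could run.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <utility>

// Toy stand-in for a thread-safe function queue (hypothetical, NOT the N-API type).
class BoundedCallQueue {
public:
    explicit BoundedCallQueue(std::size_t capacity) : capacity_(capacity) {}

    // Blocks only while the queue is full; returns as soon as the job is queued.
    // It does not wait for the job to execute.
    void blocking_call(std::function<void()> job) {
        std::unique_lock<std::mutex> lock(mutex_);
        not_full_.wait(lock, [this] { return jobs_.size() < capacity_; });
        jobs_.push_back(std::move(job));
    }

    // Plays the role of the JS main thread: runs every job queued so far
    // and returns how many jobs were executed.
    std::size_t drain() {
        std::deque<std::function<void()>> pending;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            pending.swap(jobs_);
        }
        not_full_.notify_all();
        for (auto& job : pending) {
            job();
        }
        return pending.size();
    }

private:
    std::size_t capacity_;
    std::mutex mutex_;
    std::condition_variable not_full_;
    std::deque<std::function<void()>> jobs_;
};

// Returns true if blocking_call came back before the callback had run.
bool call_returns_before_callback_runs() {
    BoundedCallQueue queue(4);
    bool callback_ran = false;

    queue.blocking_call([&callback_ran] { callback_ran = true; });
    // The call has returned, but nothing has drained the queue yet.
    const bool returned_first = !callback_ran;

    queue.drain();  // only now does the callback execute
    return returned_first && callback_ran;
}
```

Anything the caller frees between `blocking_call` returning and `drain` running is gone by the time the callback reads it.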

Fix: manage ThreadedAsyncOperation via shared_ptr (enable_shared_from_this). Both the worker thread lambda and the TSFN callback capture a shared_ptr, so the object lives until both are done. Verified clean under ASAN with 1000+ concurrent operations (heap-use-after-free confirmed on buggy code, clean on fix).
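
The ownership pattern described above can be sketched with the standard library alone. This is not the code from the PR: `OperationSketch`, `start`, and the `slot` parameter (standing in for the TSFN queue) are assumptions made for illustration, and the worker releases its reference explicitly so the lifetime is easy to observe.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <thread>

// Hypothetical stand-in for ThreadedAsyncOperation.
class OperationSketch : public std::enable_shared_from_this<OperationSketch> {
public:
    // Spawns the worker thread. `slot` stands in for the TSFN queue: the
    // worker stores the completion callback there for the "main thread".
    std::thread start(std::function<int()>& slot) {
        std::shared_ptr<OperationSketch> self = shared_from_this();
        return std::thread([self, &slot]() mutable {
            self->result_ = 42;                       // do the work
            slot = [self] { return self->result_; };  // callback holds its own reference
            self.reset();                             // worker no longer needs the object
        });
    }

private:
    int result_ = 0;
};

// Returns the callback's result, or -1 if the object died too early or too late.
int run_lifetime_demo() {
    std::function<int()> slot;
    std::weak_ptr<OperationSketch> observer;
    {
        auto op = std::make_shared<OperationSketch>();
        observer = op;
        std::thread worker = op->start(slot);
        op.reset();     // the caller drops its reference immediately
        worker.join();  // the worker has finished and released its reference
    }
    if (observer.expired()) {
        return -1;  // the pending callback should still keep the object alive
    }
    const int value = slot();  // the "JS thread" reads member fields safely
    slot = nullptr;            // last reference goes away with the callback
    if (!observer.expired()) {
        return -1;
    }
    return value;
}
```

The object is destroyed by whichever holder finishes last, so neither the worker nor the callback has to know about the other's timing.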

Full post mortem

@ludamad ludamad added the ci-barretenberg Run all barretenberg/cpp checks. label Mar 16, 2026
@ludamad ludamad force-pushed the fix/threaded-async-op-sigbus-next branch 2 times, most recently from 310a821 to 1f5c0c4 Compare March 16, 2026 19:03
TSFN BlockingCall (napi_tsfn_blocking) only blocks on queue insertion,
NOT on callback completion. The callback runs asynchronously on the JS
thread, so `delete this` on the worker thread raced with the callback
reading member fields — confirmed via ASAN.

Fix: manage ThreadedAsyncOperation via shared_ptr (enable_shared_from_this).
Both the worker thread lambda and the TSFN callback capture a shared_ptr,
so the object lives until both are done. Verified clean under ASAN with
1000 concurrent operations.
@ludamad ludamad force-pushed the fix/threaded-async-op-sigbus-next branch from 1f5c0c4 to 8e2d14f Compare March 16, 2026 19:32
@ludamad ludamad changed the title from "fix: use NonBlockingCall in ThreadedAsyncOperation to prevent SIGBUS on macOS" to "fix: use shared_ptr in ThreadedAsyncOperation to prevent SIGBUS on macOS" Mar 16, 2026
ludamad added a commit that referenced this pull request Mar 16, 2026
Reverts #21138 on v4. ThreadedAsyncOperation has a use-after-free that
causes SIGBUS on macOS and silent memory corruption on Linux. Restoring
AsyncOperation (libuv pool) with the original deadlock-prevention
semaphore (UV_THREADPOOL_SIZE / 2) until a proper fix lands on next
(#21625).
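
The commit message does not show how the semaphore works. One plausible reading is that it caps in-flight operations at half the libuv thread pool, so pool threads stay free for other work. The sketch below assumes exactly that; `SlotLimiter` and its methods are hypothetical names, and the pool size is passed in rather than read from the environment.

```cpp
#include <algorithm>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>

// Hypothetical counting semaphore sized to half of a thread pool.
class SlotLimiter {
public:
    // Assumption: at most pool_size / 2 operations may run at once (minimum 1).
    explicit SlotLimiter(std::size_t pool_size)
        : free_slots_(std::max<std::size_t>(1, pool_size / 2)) {}

    // Blocks until a slot is free, then takes it.
    void acquire() {
        std::unique_lock<std::mutex> lock(mutex_);
        slot_freed_.wait(lock, [this] { return free_slots_ > 0; });
        --free_slots_;
    }

    // Takes a slot if one is free; never blocks.
    bool try_acquire() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (free_slots_ == 0) {
            return false;
        }
        --free_slots_;
        return true;
    }

    // Returns a slot and wakes one waiter, if any.
    void release() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            ++free_slots_;
        }
        slot_freed_.notify_one();
    }

private:
    std::size_t free_slots_;
    std::mutex mutex_;
    std::condition_variable slot_freed_;
};
```

With libuv's default pool of 4 threads, `SlotLimiter limiter(4)` would admit two operations at a time and make a third wait until one calls `release()`.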

[Post mortem](https://gist.github.com/ludamad/443afe321853389a08693c4ff73676f7)