Reduce allocations per lock request #52
Merged
sasha-s merged 1 commit into sasha-s:main on Mar 17, 2026
sasha-s
reviewed
Mar 17, 2026
	if stopped && !shouldDisableTimerPool() {
		e.stack = nil
		e.ptr = nil
		pendingPool.Put(e)
Owner
Maybe clear gid as well for consistency?
	}
	return e
}
Owner
maybe add something like
// deregister marks the lock as acquired and cancels the deadlock timer.
// Must be called exactly once per register call. The entry pointer is
// stack-local in lock(), so concurrent or duplicate calls cannot occur.
6923d91 to bdb618c
TLDR: this change refactors the deadlock detection mechanism in a Go deadlock-detection library, replacing a goroutine-per-lock design with a timer-callback + object pool design.
Old Design (goroutine + channel per lock)
Every time a lock is requested, the old code did this:
1. make(chan struct{}) - allocate a new channel
2. go checkDeadlock(stack, ptr, currentID, ch) - spawn a new goroutine
3. select loop, waiting for either a timer tick (potential deadlock) or the channel to close (lock acquired)
4. close(ch) - signal the goroutine to exit once the lock is acquired
This means every lock acquisition allocates a channel and spawns a goroutine.
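The steps above can be sketched roughly as follows. This is a simplified illustration, not the library's code: lockOld, its report callback, and blockedReportsOld are hypothetical names, and the real checkDeadlock takes the stack, lock pointer, and goroutine ID shown in step 2.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// lockOld is a hypothetical stand-in for the old lock path. Every call
// allocates one channel and starts one watcher goroutine.
func lockOld(lockFn func(), timeout time.Duration, report func()) {
	ch := make(chan struct{}) // one channel allocation per request
	go func() {               // one goroutine per request
		t := time.NewTimer(timeout)
		defer t.Stop()
		for {
			select {
			case <-t.C:
				report() // lock still not acquired after timeout
				t.Reset(timeout)
			case <-ch:
				return // lock acquired, watcher exits
			}
		}
	}()
	lockFn()
	close(ch) // tell the watcher to stop
}

// blockedReportsOld requests a mutex that is already held and counts how
// often the watcher reports. The first report releases the mutex so the
// demo terminates.
func blockedReportsOld() int32 {
	var mu sync.Mutex
	var reports int32
	var once sync.Once
	mu.Lock() // simulate another holder
	lockOld(mu.Lock, 10*time.Millisecond, func() {
		atomic.AddInt32(&reports, 1)
		once.Do(mu.Unlock)
	})
	return atomic.LoadInt32(&reports)
}

func main() {
	fmt.Println("reports while blocked:", blockedReportsOld())
}
```

Even when the lock is free and lockFn returns immediately, the channel and the goroutine are still created and torn down.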
go test -bench=. -benchmem -count=3 ./...
New Design (AfterFunc + pooled entries)
The new code introduces a pendingEntry struct and a deadlockWatcher:
1. register() - grabs a pendingEntry from a sync.Pool (or creates one), populates it, and calls time.AfterFunc(timeout, e.checkFn) to schedule a callback
2. lockFn() - acquires the actual lock
3. deregister() - atomically marks e.done = 1 and stops the timer; if the timer was successfully stopped, the entry goes back to the pool
The timer callback (checkFn) checks atomic.LoadInt32(&e.done): if the lock was already acquired, it returns immediately (no-op). Otherwise it reports the deadlock.
go test -bench=. -benchmem -count=3 ./...
We see an improvement in speed and a reduction in allocations.
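A minimal sketch of the register / lockFn / deregister flow described above, under stated assumptions: the field set of pendingEntry is reduced, and the report hook, the lock wrapper, and the two demo helpers are hypothetical names introduced only for this example. The merged code may differ in detail (for instance, it also consults shouldDisableTimerPool() before pooling).

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// pendingEntry is a reduced, hypothetical version of the pooled entry.
type pendingEntry struct {
	ptr     interface{}       // the lock being waited on
	done    int32             // set to 1 once the lock is acquired
	timer   *time.Timer       // created once, re-armed on reuse
	checkFn func()            // built once per entry
	report  func(interface{}) // hypothetical reporting hook
}

var pendingPool = sync.Pool{New: func() interface{} {
	e := &pendingEntry{}
	e.checkFn = func() {
		if atomic.LoadInt32(&e.done) == 1 {
			return // lock already acquired: no-op
		}
		e.report(e.ptr)
	}
	return e
}}

func register(ptr interface{}, timeout time.Duration, report func(interface{})) *pendingEntry {
	e := pendingPool.Get().(*pendingEntry)
	e.ptr, e.report = ptr, report
	atomic.StoreInt32(&e.done, 0)
	if e.timer == nil {
		e.timer = time.AfterFunc(timeout, e.checkFn)
	} else {
		e.timer.Reset(timeout) // reuse the pooled timer
	}
	return e
}

func deregister(e *pendingEntry) {
	atomic.StoreInt32(&e.done, 1)
	if e.timer.Stop() { // callback will not run: safe to recycle
		e.ptr, e.report = nil, nil
		pendingPool.Put(e)
	}
}

func lock(lockFn func(), ptr interface{}, timeout time.Duration, report func(interface{})) {
	e := register(ptr, timeout, report)
	lockFn()
	deregister(e)
}

// uncontendedReports takes a free lock; the timer is stopped before firing.
func uncontendedReports() int32 {
	var reports int32
	lock(func() {}, nil, time.Hour, func(interface{}) { atomic.AddInt32(&reports, 1) })
	return atomic.LoadInt32(&reports)
}

// blockedReports requests a held mutex; the report releases it so the demo ends.
func blockedReports() int32 {
	var mu sync.Mutex
	var reports int32
	mu.Lock()
	lock(mu.Lock, &mu, 10*time.Millisecond, func(interface{}) {
		atomic.AddInt32(&reports, 1)
		mu.Unlock()
	})
	return atomic.LoadInt32(&reports)
}

func main() {
	fmt.Println("uncontended:", uncontendedReports(), "blocked:", blockedReports())
}
```

In this sketch an entry whose timer has already fired is never returned to the pool, because the callback may still be reading its fields; it is left to the garbage collector instead.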
How it reduces allocations
No goroutine per lock (biggest win)
The old code spawned go checkDeadlock(...) for every lock. Each goroutine allocates a stack (starts at ~2-8KB) and adds scheduler overhead. The new code uses time.AfterFunc, which registers a callback with the Go runtime's timer heap: no dedicated goroutine sits around waiting, and the callback runs in its own goroutine only if the timer actually fires.
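One way to observe this difference is to count goroutines while many watchers are pending. The helper below is a hypothetical illustration (goroutineDeltas is not part of the library) and assumes nothing else in the program starts or stops goroutines while it runs.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"time"
)

// goroutineDeltas reports how many extra goroutines exist while n watchers
// are pending: first with time.AfterFunc, then with one goroutine each.
func goroutineDeltas(n int) (afterFunc, perGoroutine int) {
	base := runtime.NumGoroutine()

	// Pending AfterFunc timers live in the runtime's timer heap.
	timers := make([]*time.Timer, n)
	for i := range timers {
		timers[i] = time.AfterFunc(time.Hour, func() {})
	}
	afterFunc = runtime.NumGoroutine() - base
	for _, t := range timers {
		t.Stop()
	}

	// One parked goroutine per watcher, as in the old design.
	stop := make(chan struct{})
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			<-stop
		}()
	}
	perGoroutine = runtime.NumGoroutine() - base
	close(stop)
	wg.Wait()
	return afterFunc, perGoroutine
}

func main() {
	a, g := goroutineDeltas(1000)
	fmt.Println("extra goroutines with AfterFunc:", a)
	fmt.Println("extra goroutines with go statements:", g)
}
```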
No channel per lock
The old code created make(chan struct{}) on every lock. Channels are heap-allocated structs with internal queues. The new code replaces this synchronization with a single atomic.StoreInt32(&e.done, 1) - a zero-allocation atomic write.
Pooled pendingEntry (including the timer)
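The old code only pooled time.Timer objects. The new code pools the entire pendingEntry, which bundles the timer with the per-request state (the fields visible in this PR include stack, ptr, gid, done, and checkFn).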
When an entry comes back from the pool, e.timer.Reset(...) reuses the existing timer rather than allocating a new one.
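The reuse relies on a timer created by time.AfterFunc being re-armable with Reset once it has been stopped or has fired. A small sketch of that behavior in isolation (reuseTimer is a hypothetical helper, unrelated to the library's own functions):

```go
package main

import (
	"fmt"
	"time"
)

// reuseTimer stops a pending AfterFunc timer, then re-arms the same timer
// twice with Reset. It returns whether Stop prevented the first run and
// how many times the callback ran in total.
func reuseTimer() (stopped bool, runs int) {
	fired := make(chan struct{})
	t := time.AfterFunc(time.Hour, func() {
		runs++
		fired <- struct{}{} // hand control back to the caller
	})
	stopped = t.Stop() // the callback has not run yet
	for i := 0; i < 2; i++ {
		t.Reset(time.Millisecond) // same timer, no new allocation
		<-fired
	}
	return stopped, runs
}

func main() {
	stopped, runs := reuseTimer()
	fmt.Println("stopped before firing:", stopped, "callback runs:", runs)
}
```

Because the callback is bound when the timer is created, a pooled entry has to keep its callback generic and read the per-request state from the entry's fields.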