Fix deadlock in DrainAsync causing Redis scheduling test failure by jeremydmiller · Pull Request #2294 · JasperFx/wolverine

jeremydmiller · 2026-03-12T11:19:34Z

Summary

Fixes a deadlock in DurableReceiver.DrainAsync() and BufferedReceiver.DrainAsync() introduced by #2288 (Latch receivers on ApplicationStopping for immediate graceful shutdown)
The WaitForCompletionAsync call now only executes during actual shutdown (when Latch() was called separately via OnApplicationStopping), not during pipeline-triggered pauses

Root Cause

PR #2288 added WaitForCompletionAsync to DrainAsync() to wait for in-flight messages during graceful shutdown. However, DrainAsync() is also called from within the handler pipeline when a rate-limited message triggers PauseListenerContinuation:

_receiver block execute(message B)
  → pipeline.InvokeAsync()
    → RateLimitContinuation → ReScheduleAsync (stores in Redis sorted set)
    → PauseListenerContinuation → agent.PauseAsync()
      → StopAndDrainAsync()
        → receiver.DrainAsync()
          → _receiver.WaitForCompletionAsync()  ← waits for message B to finish
                                                  but message B is waiting for THIS to return

This circular dependency stalls the listener for 30 seconds (the wait is bounded by DrainTimeout). By the time the wait gives up, the rate-limited message's retry window has long passed, and the test's 20-second polling timeout has already expired.
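The self-wait in the trace above can be reproduced in miniature. The sketch below is a Python asyncio model, not Wolverine's C# code: `ReceiverModel`, `execute`, `drain`, and `pausing_handler` are hypothetical names standing in for the receiver block, DrainAsync(), and the pause continuation, and the timeout is shortened to 0.2 seconds in place of the 30-second DrainTimeout.

```python
import asyncio
import time


class ReceiverModel:
    """Hypothetical stand-in for a receiver; not Wolverine's implementation."""

    def __init__(self, drain_timeout: float) -> None:
        self.drain_timeout = drain_timeout
        self.in_flight = 0
        self._idle = asyncio.Event()
        self._idle.set()  # nothing is in flight yet

    async def execute(self, handler):
        # The message counts as in flight until its handler returns.
        self.in_flight += 1
        self._idle.clear()
        try:
            return await handler(self)
        finally:
            self.in_flight -= 1
            if self.in_flight == 0:
                self._idle.set()

    async def drain(self) -> bool:
        # Models the unconditional wait: block until nothing is in flight,
        # giving up after drain_timeout. Returns False on timeout.
        try:
            await asyncio.wait_for(self._idle.wait(), self.drain_timeout)
            return True
        except asyncio.TimeoutError:
            return False


async def pausing_handler(receiver: ReceiverModel) -> bool:
    # Stands in for the pause continuation calling drain from inside the
    # handler: the message being handled is itself still in flight.
    return await receiver.drain()


async def demo():
    receiver = ReceiverModel(drain_timeout=0.2)
    start = time.monotonic()
    drained = await receiver.execute(pausing_handler)
    return drained, time.monotonic() - start


def run_demo():
    return asyncio.run(demo())


if __name__ == "__main__":
    drained, elapsed = run_demo()
    print(f"drained={drained} after {elapsed:.2f}s")
```

In this model the drain can never succeed, because the only in-flight item is the caller itself; the call returns only once the timeout elapses.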

Fix

Use the _latched flag to distinguish the two call paths:

Shutdown: OnApplicationStopping calls Latch() first → _latched is already true when DrainAsync() runs → safe to wait
Pipeline pause: no prior Latch() call → _latched is false when DrainAsync() runs → skip the wait to avoid deadlock
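The two call paths can be sketched with the same kind of Python asyncio model. This illustrates the idea only and is not the actual C# change: `ReceiverModel`, `latch`, `drain`, and the string results are hypothetical, and the model assumes (as the commit message describes) that the drain call sets the latch itself when nobody latched before it.

```python
import asyncio


class ReceiverModel:
    """Hypothetical stand-in for a receiver with a latched flag."""

    def __init__(self, drain_timeout: float) -> None:
        self.drain_timeout = drain_timeout
        self.latched = False
        self.in_flight = 0
        self._idle = asyncio.Event()
        self._idle.set()

    def latch(self) -> None:
        # Shutdown path: called before drain.
        self.latched = True

    async def execute(self, handler):
        self.in_flight += 1
        self._idle.clear()
        try:
            return await handler(self)
        finally:
            self.in_flight -= 1
            if self.in_flight == 0:
                self._idle.set()

    async def drain(self) -> str:
        was_latched = self.latched  # did someone latch before this call?
        self.latched = True
        if not was_latched:
            # Pipeline-triggered pause: the caller may itself be in flight,
            # so waiting here could never complete.
            return "skipped"
        try:
            await asyncio.wait_for(self._idle.wait(), self.drain_timeout)
            return "drained"
        except asyncio.TimeoutError:
            return "timed_out"


async def demo():
    # Pause path: drain is called from inside the handler, no prior latch.
    paused = ReceiverModel(drain_timeout=1.0)

    async def pausing_handler(receiver):
        return await receiver.drain()

    pause_result = await paused.execute(pausing_handler)

    # Shutdown path: a message is in flight, then latch() and drain() are
    # called from outside the handler.
    stopping = ReceiverModel(drain_timeout=1.0)
    finished = []

    async def slow_handler(receiver):
        await asyncio.sleep(0.05)
        finished.append("B")

    task = asyncio.create_task(stopping.execute(slow_handler))
    await asyncio.sleep(0)  # let the handler start
    stopping.latch()
    shutdown_result = await stopping.drain()
    await task
    return pause_result, shutdown_result, list(finished)


def run_demo():
    return asyncio.run(demo())


if __name__ == "__main__":
    print(run_demo())
```

Reading the flag before setting it is what lets a single drain method tell the two callers apart without an extra parameter.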

Test plan

rate_limited_messages_are_delayed_with_native_scheduling passes consistently (was failing every run)
All 87 Wolverine.Redis.Tests pass
All 1160 CoreTests pass

Closes #2291

🤖 Generated with Claude Code

When a rate-limited message triggers PauseListenerContinuation, the pause calls StopAndDrainAsync → DrainAsync from within the receiver block's execute function. The WaitForCompletionAsync added in #2288 waits for in-flight items to finish, but the current message is itself an in-flight item, which creates a deadlock that times out after DrainTimeout (30s) and causes the Redis rate limiting test to fail.

Fix: only wait for completion when Latch() was previously called (indicating shutdown via OnApplicationStopping), not when DrainAsync is the first to set _latched (indicating a pipeline-triggered pause).

Closes #2291

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

jeremydmiller merged commit b000f3e into main Mar 12, 2026
5 of 11 checks passed

jeremydmiller merged 1 commit into main from fix/2291-redis-scheduling-graceful-shutdown
