
Latch receivers on ApplicationStopping for immediate graceful shutdown #2288

Merged
jeremydmiller merged 1 commit into main from fix/2282-graceful-shutdown
Mar 11, 2026

@jeremydmiller
Member

Summary

  • Hook into IHostApplicationLifetime.ApplicationStopping to immediately latch all message receivers the moment SIGTERM fires, preventing queued messages from being processed after the shutdown signal
  • Reorder StopAsync to drain endpoints before releasing ownership and tearing down agents, so in-flight handlers complete before messages are released back to the inbox
  • Add bounded WaitForCompletionAsync in DurableReceiver and BufferedReceiver with configurable DrainTimeout (default 30s) to prevent indefinite hangs during shutdown
  • Add Latch() methods to BufferedReceiver, ListeningAgent, and DurableLocalQueue for immediate latching without full drain
  • Add GracefulShutdown and RollingRestart chaos test scripts covering 6 test methods
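
The latch-then-bounded-drain behavior described in these bullets could look roughly like the sketch below. It is not Wolverine's code: `SketchReceiver`, `TryReceive`, and the use of `Task.WaitAsync` (.NET 6+) are assumptions made for illustration. Only the names `Latch`, `DrainTimeout`, and the 30s default come from the summary.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical receiver, for illustration only; not Wolverine's BufferedReceiver.
public sealed class SketchReceiver
{
    private readonly ConcurrentDictionary<Task, byte> _inFlight = new();
    private volatile bool _latched;

    // Mirrors the DrainTimeout setting named in the summary (default 30s).
    public TimeSpan DrainTimeout { get; set; } = TimeSpan.FromSeconds(30);

    // Stop accepting new work immediately, without waiting for anything.
    public void Latch() => _latched = true;

    // Returns false when the receiver is latched and the message was refused.
    public bool TryReceive(Func<Task> handler)
    {
        if (_latched) return false;

        var task = handler();
        _inFlight.TryAdd(task, 0);

        // Forget the handler once it has finished.
        task.ContinueWith(t => _inFlight.TryRemove(t, out _), TaskScheduler.Default);
        return true;
    }

    // Wait for in-flight handlers, but never longer than DrainTimeout.
    // Returns true if everything finished in time. A faulted handler would
    // surface its exception here; a real implementation would handle that.
    public async Task<bool> DrainAsync()
    {
        Latch();
        try
        {
            await Task.WhenAll(_inFlight.Keys).WaitAsync(DrainTimeout);
            return true;
        }
        catch (TimeoutException)
        {
            return false; // give up rather than hang the shutdown
        }
    }
}
```

Separating `Latch()` from `DrainAsync()` is the point of the design: latching is cheap and can run synchronously from a shutdown callback, while the potentially slow wait stays inside the normal stop path.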

Context

Addresses the scenario from #2282 where messages already in Wolverine's internal processing queues continue to be executed between the SIGTERM signal and Wolverine's own IHostedService.StopAsync being called. Since .NET stops hosted services in reverse registration order, other services may stop first, during which time Wolverine keeps processing. The ApplicationStopping hook fires immediately on SIGTERM, closing this timing gap.
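
To illustrate why the hook closes the gap: `IHostApplicationLifetime.ApplicationStopping` is a `CancellationToken`, and callbacks registered on it run when the host begins shutting down. In the minimal sketch below, a plain `CancellationTokenSource` stands in for the host so the example stays self-contained, and the `latched` flag is a placeholder for latching real receivers.

```csharp
using System;
using System.Threading;

// Stand-in for IHostApplicationLifetime.ApplicationStopping, which is a
// CancellationToken that the host cancels when shutdown begins.
using var stopping = new CancellationTokenSource();

var latched = false;

// In a real application this would be something like
// lifetime.ApplicationStopping.Register(...). The callback runs as soon as
// the token is cancelled, not when a hosted service's StopAsync is reached.
using var registration = stopping.Token.Register(() => latched = true);

Console.WriteLine(latched);   // False

stopping.Cancel();            // simulates the shutdown signal reaching the host
Console.WriteLine(latched);   // True
```

Because the callback does not depend on the order in which hosted services are stopped, it should only do fast, non-blocking work such as setting a latch.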

Test plan

  • CoreTests: 1160 passed, 0 failed
  • PostgreSQL: 330 passed, 5 failed (pre-existing compliance test flakiness)
  • SQL Server: 295 passed, 0 failed

Closes #2282

🤖 Generated with Claude Code

…ing during shutdown

Hook into IHostApplicationLifetime.ApplicationStopping so all message receivers
are latched the moment SIGTERM fires, rather than waiting for IHostedService.StopAsync
which may be delayed by other hosted services stopping first. This prevents messages
already in internal queues from being picked up after the shutdown signal.

Also reorders StopAsync to drain endpoints before releasing ownership, adds bounded
WaitForCompletionAsync calls in DurableReceiver and BufferedReceiver with configurable
DrainTimeout (default 30s), and includes GracefulShutdown/RollingRestart chaos tests.

Closes #2282

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@jeremydmiller jeremydmiller merged commit 79d769c into main Mar 11, 2026
4 of 11 checks passed
jeremydmiller added a commit that referenced this pull request Mar 12, 2026
When a rate-limited message triggers PauseListenerContinuation, the
pause calls StopAndDrainAsync → DrainAsync from within the receiver
block's execute function. The WaitForCompletionAsync added in #2288
waits for in-flight items to finish, but the current message IS an
in-flight item — creating a deadlock that times out after DrainTimeout
(30s), causing the Redis rate limiting test to fail.

Fix: only wait for completion when Latch() was previously called
(indicating shutdown via OnApplicationStopping), not when DrainAsync
is the first to set _latched (indicating a pipeline-triggered pause).

Closes #2291

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
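
The fix described in that commit message can be sketched as follows. This is a hedged illustration, not the actual change: `SketchDrainingReceiver` and the injected `waitForCompletion` delegate are hypothetical, and only `Latch`, `DrainAsync`, and the idea of checking who set `_latched` first come from the message.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical receiver, for illustration only; names other than
// Latch/DrainAsync are assumptions, not Wolverine's actual members.
public sealed class SketchDrainingReceiver
{
    private int _latched; // 0 = open, 1 = latched
    private readonly Func<Task> _waitForCompletion;

    public SketchDrainingReceiver(Func<Task> waitForCompletion) =>
        _waitForCompletion = waitForCompletion;

    // Called from the shutdown hook.
    public void Latch() => Interlocked.Exchange(ref _latched, 1);

    public async Task DrainAsync()
    {
        // Exchange returns the previous value, so this tells us whether
        // the receiver was already latched before this call.
        var latchedEarlier = Interlocked.Exchange(ref _latched, 1) == 1;

        // Only wait when something latched first (the shutdown path). If this
        // call is the first to latch, it may be running inside a handler, and
        // waiting for in-flight work would mean waiting on itself.
        if (latchedEarlier)
        {
            await _waitForCompletion();
        }
    }
}
```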
Development

Successfully merging this pull request may close these issues.

Wolverine instance graceful shutdown
