Skip to content

Fix #2588: durable outbox policy ignored for conventionally-routed senders#2596

Merged
jeremydmiller merged 3 commits intomainfrom
investigation/2588-durable-outbox-not-applied
Apr 27, 2026
Merged

Fix #2588: durable outbox policy ignored for conventionally-routed senders#2596
jeremydmiller merged 3 commits intomainfrom
investigation/2588-durable-outbox-not-applied

Conversation

@jeremydmiller
Copy link
Copy Markdown
Member

Summary

  • Closes UseDurableOutboxOnAllSendingEndpoints does not work for EF Core #2588
  • UseDurableOutboxOnAllSendingEndpoints() (and any AllSenders policy that gates on endpoint.Subscriptions.Any()) was silently ignored for conventionally-routed sender endpoints whenever a handler was also registered for the same message type. Cascading messages bypassed the EF Core / Marten outbox transaction and went out inline.
  • Add IMessageRoutingConvention.PreregisterSenders (default no-op). Wolverine now calls it from discoverListenersFromConventions so sender subscription metadata + delayed configuration are applied BEFORE BrokerTransport.InitializeAsync calls Compile() on the endpoint. Fixes the race where listener-discovery side-effects (e.g. RabbitMqMessageRoutingConvention.ApplyListenerRoutingDefaults) pre-create the sender exchange and the _hasCompiled flag locks out the policy.
  • Bonus: split the describe Sending Endpoints table to actually show only senders (via EndpointCollection.ActiveSendingAgents). Listener queues and sender exchanges/topics/queues now appear in their respective tables; an endpoint that plays both roles can show in both.
  • Per-transport regression coverage: RabbitMQ, Azure Service Bus (queue + topic broadcasting), Amazon SQS, GCP Pub/Sub. Plus the original EF Core repro.

Notes

  • Three pre-existing disable_listener_by_lambda tests (SQS / GCP / ASB) had over-specified assertions (endpoint.ShouldBeNull()) that broke once we eagerly pre-register senders for handled types. Tightened to IsListener.ShouldBeFalse(), which is the actual semantic the test was checking.
  • All shipped routing conventions (RabbitMqMessageRoutingConvention, AzureServiceBusMessageRoutingConvention, AzureServiceBusTopicBroadcastingRoutingConvention, AmazonSqsMessageRoutingConvention, PubsubMessageRoutingConvention) inherit from MessageRoutingConvention<,,,> so they pick up the fix automatically. The interface default keeps custom user implementations source-compatible.

Test plan

  • dotnet test src/Persistence/EfCoreTests/EfCoreTests.csproj --filter Bug_2588 (the original repro) — passes
  • dotnet test src/Testing/CoreTests — 1360 passed, 0 regressed
  • dotnet test src/Transports/RabbitMQ/Wolverine.RabbitMQ.Tests --filter ConventionalRouting — all 28 pass
  • dotnet test src/Transports/AWS/Wolverine.AmazonSqs.Tests --filter ConventionalRouting (excluding the pre-existing _with_prefix flaky AWS-credential test) — 19/19 pass
  • dotnet test src/Transports/GCP/Wolverine.Pubsub.Tests --filter ConventionalRouting — 20/20 pass
  • New per-transport Bug_2588 tests pass: RabbitMQ, Amazon SQS, GCP Pub/Sub, Azure Service Bus (queue + topic broadcasting)

🤖 Generated with Claude Code

jeremydmiller and others added 3 commits April 26, 2026 16:05
Mirrors the reporter's setup: EF Core DbContext with manual envelope
mapping + Postgres persistence + RabbitMQ conventional routing + the
durable-outbox-on-all-senders policy + a registered handler. Asserts
the conventionally-routed exchange ends up with EndpointMode.Durable.

Currently FAILS with Mode=BufferedInMemory.

Root cause (traced via diagnostic added/removed to Endpoint.Compile):

- Registering Bug2588Handler for Bug2588Message means
  WolverineRuntime.HostService.discoverListenersFromConventions runs
  RabbitMqMessageRoutingConvention.DiscoverListeners which calls
  ApplyListenerRoutingDefaults — this PRE-CREATES the sender exchange
  via transport.Exchanges[name] (RabbitMqMessageRoutingConvention.cs:33)
  with no Subscriptions on it.
- BrokerTransport.InitializeAsync then enumerates explicitEndpoints()
  (which now includes that exchange) and calls Compile() on each. The
  AllSenders policy gates on `e.Subscriptions.Any()`, so it
  short-circuits and the policy never applies.
- Endpoint._hasCompiled is set true.
- Later, on first publish (or the test's RoutingFor call),
  MessageRoutingConvention.DiscoverSenders runs and adds the
  Subscription via the GH-2304 fix — but Compile is now no-op.

Why Bug_2304 (Marten + RabbitMQ) passes: Bug_2304 calls
DisableConventionalDiscovery() WITHOUT IncludeType<Bug2304Handler>(),
so the handler is never discovered, no listener gets created, no early
exchange creation, the GH-2304 fix path runs cleanly. Bug_2304 passes
"by accident" — it doesn't cover the realistic case the reporter hit.

Investigation only — no production fix included. See PR description
for recommendations.

Adds Wolverine.RabbitMQ project ref to EfCoreTests for the test
(no broker connection needed).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fixes #2588. When `UseConventionalRouting()` was combined with
`UseDurableOutboxOnAllSendingEndpoints()` (or any AllSenders policy that
gates on `endpoint.Subscriptions.Any()`) the policy never applied to
conventionally-routed sender endpoints created as a side-effect of
listener discovery. The reporter saw cascaded events sent inline,
bypassing the EF outbox transaction.

Root cause: `RabbitMqMessageRoutingConvention.ApplyListenerRoutingDefaults`
(and the analogous code in other transports) creates the sender
exchange/queue/topic during listener discovery, BEFORE
`BrokerTransport.InitializeAsync` calls `Compile()` on the endpoint.
At that point the endpoint has zero subscriptions, so the AllSenders
policy lambda short-circuits, and `Endpoint._hasCompiled` later prevents
re-application when `DiscoverSenders` lazily adds the subscription on
first publish.

Fix: add `IMessageRoutingConvention.PreregisterSenders` (default no-op).
Wolverine calls it from `discoverListenersFromConventions` after
`DiscoverListeners` so that subscription metadata + sender configuration
are registered on the endpoint BEFORE any `Compile()` runs. The base
`MessageRoutingConvention<,,,>` overrides it to register subscriptions
without building sending agents (the broker isn't connected yet at this
phase). All shipped routing conventions (RabbitMQ, ASB queue + topic
broadcasting, AmazonSQS, GCP Pub/Sub) inherit the fix automatically.

Bonus: split the `describe` Sending Endpoints table so it shows actual
senders (via `EndpointCollection.ActiveSendingAgents`) instead of every
endpoint registered in any transport. Listeners and senders may overlap
when an endpoint plays both roles. Previously a Durable listener queue
showed up in the Subscriptions table looking like a BufferedInMemory
sender.

Per-transport regression tests cover RabbitMQ, ASB (queue + topic
broadcasting), AmazonSQS, and GCP Pub/Sub. The pre-existing
`disable_listener_by_lambda` tests for SQS / GCP / ASB asserted "no
endpoint at this URI", which was over-specified — with eager sender
pre-registration an endpoint may exist as a sender even when the
listener was disabled. Tightened the assertion to `IsListener.ShouldBeFalse()`
which is the actual semantic the test is checking.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Bug_2588 EF Core reproducer combines EF Core, Postgres, and RabbitMQ
to mirror the reporter's real-world setup. CIEfCore was only starting
postgresql + sqlserver, so the test's RabbitMQ connection timed out
(20 retries x 5s = ~100s) and the test failed in CI even though it
passes locally where docker-compose runs the rabbitmq service.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@jeremydmiller jeremydmiller merged commit 113628d into main Apr 27, 2026
18 of 21 checks passed
jeremydmiller added a commit that referenced this pull request May 1, 2026
…rs into ExplicitRouting

PR #2596 added MessageRoutingConvention.PreregisterSenders to make endpoint
policies like UseDurableOutboxOnAllSendingEndpoints see Subscriptions.Any()
on conventionally-routed sender endpoints at Compile() time. The fix worked
for #2588 but introduced a routing-precedence regression: ExplicitRouting
walks Transports.AllEndpoints().Where(ShouldSendMessage), and
ShouldSendMessage just calls Subscriptions.Any(s => s.Matches(messageType)).
Pre-registering subscriptions on the conventional sender now makes
ExplicitRouting fire for handled messages — and since ExplicitRouting is
non-additive, it short-circuits past LocalRouting, so handled messages stop
routing to their local handlers. Explicit publish rules also got duplicated
against the conventional broker exchange.

Eight MessageRoutingTests caught it:
- explicit_rules_always_win.locally_handled_messages_get_overridden_by_routing
- using_additive_local_routing_and_external_conventions.{local_routes_and_broker_conventional_routing_for_handled_messages,explicit_rules_win}
- using_messaging_conventions_for_both_external_and_local.{local_routes_for_handled_messages,multiple_handlers_are_still_routed_to_one_place_in_default_mode,respect_sticky_attributes_but_default_is_still_there_too,respect_sticky_attributes_no_default}
- using_local_routing_disabled_and_external_routing_conventions.explicit_routing_wins_no_matter_what

Fix: tag convention-pre-registered subscriptions with a new internal
Subscription.IsFromConvention flag (via Subscription.ForConventionalType),
and have Endpoint.ShouldSendMessage skip those when answering "is this an
explicit publish rule?". The AllSenders policy still gates on
Subscriptions.Any() (count, not provenance), so #2588's outbox fix keeps
working — verified by re-running Bug_2588 and Bug_2304 reproducers across
RabbitMQ / ASB / SQS / GCP Pub/Sub. The Bug_2588 tests previously
asserted endpoint.Mode via routes.Single() — that worked only because of
this leak. Reworked them to look up the conventional broker sender
directly via IMessageRoutingConvention.DiscoverSenders, which tests the
actual #2588 invariant ("AllSenders policy applied to conventional
sender") without depending on routing precedence.

CI: add CIMessageRouting target (boots RabbitMQ via docker compose, runs
MessageRoutingTests one class at a time) and wire it into the .NET
workflow's CI dependency chain plus a docker compose down step. Without
this, the regression wouldn't have been caught on PRs — MessageRoutingTests
isn't part of any existing per-transport workflow even though the bug
lives in core routing precedence.

Closes the regression introduced by #2596. Refs #2588.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
jeremydmiller added a commit that referenced this pull request May 1, 2026
Minor bump for the 5.36 release. Cuts:
- Routing-precedence regression fix introduced by #2596 (PreregisterSenders
  was leaking conventional senders into ExplicitRouting and breaking local
  handler routing for handled message types).
- MetricsTests now wired to the IWolverineObserver pipeline so the
  end-to-end CritterWatch / Hybrid mode tests actually exercise the
  metrics-export path again after d999a2e.
- SQL Server docker image pinned to 2022-latest (2025-latest is broken
  on Linux/ARM hosts).
- New CIMessageRouting target in the .NET workflow so future routing-
  precedence regressions fail fast on PRs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

UseDurableOutboxOnAllSendingEndpoints does not work for EF Core

1 participant