Skip to content

Materialize EF domain-event scraper before publishing (#2585)#2594

Merged
jeremydmiller merged 2 commits intomainfrom
bugfix/2585-domain-event-scraper-lazy-enumeration
Apr 26, 2026
Merged

Materialize EF domain-event scraper before publishing (#2585)#2594
jeremydmiller merged 2 commits intomainfrom
bugfix/2585-domain-event-scraper-lazy-enumeration

Conversation

@jeremydmiller
Copy link
Copy Markdown
Member

Fixes #2585.

The bug

DomainEventScraper<T,TEvent>.ScrapeEvents enumerated this LINQ pipeline lazily and then await bus.PublishAsync(...) per event inside the foreach:

var eventMessages = dbContext.ChangeTracker.Entries().Select(x => x.Entity)
    .OfType<T>().SelectMany(_source);

foreach (var eventMessage in eventMessages)
{
    await bus.PublishAsync(eventMessage);
}

With the EF-backed outbox, PublishAsync flows through EfCoreEnvelopeTransaction.PersistIncomingAsync, which adds an IncomingMessage entity to the same DbContext. That mutates the ChangeTracker mid-enumeration and throws:

InvalidOperationException: Collection was modified; enumeration operation may not execute.
   at Wolverine.EntityFrameworkCore.DomainEventScraper`2.ScrapeEvents

The fix

.ToArray() at the end of the LINQ pipeline. Single end-of-pipeline materialization is sufficient because SelectMany flattens all per-entity event lists during the ToArray evaluation — every list is enumerated to completion before any PublishAsync call runs. Cost: one small array allocation per scrape.

Tests

DomainEventScraperStateFilterTests.domain_event_scraper_materializes_events_before_publishing adapts the approach from @jf2s's closed PR #2586. Uses an ISendMyself domain event that mutates the DbContext during ApplyAsync to deterministically trigger the bug without needing live SQL Server / PostgreSQL. Verified the test fails on main and passes with this change.

All 10 tests in EfCoreTests/DomainEvents/ still pass.

Aggressive sweep — Wolverine.EntityFrameworkCore only

Per discussion in the issue thread, scope is Wolverine.EntityFrameworkCore. Audited every foreach in the package; no other risky lazy-enumeration patterns found:

  • DbContextOutbox._scrapersIDomainEventScraper[], materialized in constructor.
  • EfCoreEnvelopeTransaction._scrapers — same, materialized.
  • TenantedDbContextBuilder* — iterates _store.Source.AllActiveByTenant(), which returns IReadOnlyList<>. The BuildAllAsync body adds DbContexts to a local List<T> only — does not mutate the source.
  • Codegen/* — runs at code-generation time, not on hot paths; no publishing side effects.

This was the only site that exhibited the publish-while-iterating-the-source pattern.

Credit

Same approach as the closed PR #2586 by @jf2s — credited in the test docstring and this PR.

Test plan

  • New regression test fails on main
  • New regression test passes with the fix
  • All other DomainEvents tests pass (10/10)
  • Audit of all foreach in Wolverine.EntityFrameworkCore — no other lazy-enumeration risks

🤖 Generated with Claude Code

jeremydmiller and others added 2 commits April 26, 2026 12:14
DomainEventScraper<T,TEvent>.ScrapeEvents enumerated this LINQ pipeline
lazily:

    dbContext.ChangeTracker.Entries()
        .Select(x => x.Entity)
        .OfType<T>()
        .SelectMany(_source);

then `await bus.PublishAsync(...)` per event inside the foreach. With the
EF-backed outbox, PublishAsync flows through
EfCoreEnvelopeTransaction.PersistIncomingAsync, which adds an
IncomingMessage entity to the SAME DbContext. That mutates the
ChangeTracker mid-enumeration and throws:

    InvalidOperationException: Collection was modified;
    enumeration operation may not execute.

Fix: materialize with .ToArray() before the publish loop. Single
end-of-pipeline materialization is sufficient because all per-entity
event lists get enumerated by SelectMany during ToArray, before any
PublishAsync runs. Cost: one small array allocation per scrape.

Tests: regression test in DomainEventScraperStateFilterTests adapts the
approach from @jf2s's closed PR #2586 — uses an ISendMyself domain event
that mutates the DbContext during ApplyAsync to deterministically
trigger the bug without needing live SQL Server / PostgreSQL. Verified
the test fails on main and passes with this change. All other 9
DomainEvents tests in the folder still pass.

Sweep of remaining foreach loops across Wolverine.EntityFrameworkCore
turned up no other risky patterns — every other foreach iterates an
already-materialized array (_scrapers), an IReadOnlyList<>
(_store.Source.AllActiveByTenant()), or codegen-time data with no
side-effecting body.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
pop_off_buffered (and the other tests in this fixture that assert
queue/scheduled counts after a manual TryPop / DeleteExpired /
MoveScheduled call) were flaky on slow CI: the host's auto-started
PostgreSQL queue listener polls every 5s by default, and tests that
take longer than that under load can have the host listener consume
messages out from under the test's manual pop. Observed on PR #2594's
CIPersistence run (test wallclock 9.28s) where pop_off_buffered failed
the `(await theQueue.CountAsync()).ShouldBe(5)` assertion after
TryPopAsync(5). Locally the test takes ~50ms and the issue never
surfaces.

None of the tests in this fixture rely on the host's auto-started
listener — every test that exercises receiving creates its own
PostgresqlQueueListener and calls TryPopAsync /
TryPopDurablyAsync / DeleteExpiredAsync directly. Setting the host
listener's PollingInterval to 1.Hours() keeps the queue registered
(theTransport.Queues["one"] still resolves) while ensuring the host
listener won't fire during any test in this fixture.

Verified: full basic_functionality class passes (15/15) under the fix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

DomainEventScraper<T,TEvent> throws "Collection was modified" when EF-backed persistence shares the DbContext

1 participant