Foundation for binary event serialization (#4515 — Phase 1, Rich-mode only) #4578

Merged: jeremydmiller merged 2 commits into master from feature/4515-binary-event-serializers on May 28, 2026
@jeremydmiller jeremydmiller commented May 28, 2026

Closes part of #4515. Phase 1 ships the foundation and a working vertical slice on the Rich append path; the Quick-mode write paths, BulkEventAppender, and binary upcasters are deferred as explicit, documented scope (see §Limitations).

TL;DR

  • Per-event-type binary serialization, coexisting with JSON on the same mt_events table.
  • New optional package: Marten.MemoryPack.
  • Additive schema change only: one new bdata bytea NULL column. No migration of existing data required.
  • 5/5 new MemoryPack integration tests pass; existing EventSourcing tests unaffected.
  • Limitation: Rich append mode only in this cut — the Quick modes throw at store-build time with a clear remediation recipe if a binary event type is registered without AppendMode = Rich.

Coexistence design

Schema is purely additive:

| column | type | populated when |
| --- | --- | --- |
| data | jsonb NOT NULL | JSON event: full payload. Binary event: literal '{}'::jsonb placeholder. |
| bdata | bytea NULL (new) | Binary event: serialized bytes. JSON event: NULL. |

Per-row discriminator: bdata IS NULL selects the JSON path; a non-null bdata selects the binary path. The data NOT NULL constraint stays intact, so existing rows in an upgraded store have bdata = NULL and read through the JSON path unchanged.
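As a rough illustration of the per-row discriminator, the sketch below routes a row by checking whether its bdata value is null. `EventRow` and `RowDispatch` are hypothetical stand-ins written for this example; the actual dispatch lives in `EventDocumentStorage.Resolve` / `ResolveAsync` and is not reproduced here.

```csharp
#nullable enable
using System;
using System.Text;

// Hypothetical stand-in for one mt_events row; only the two payload columns matter here.
public sealed record EventRow(string Data, byte[]? Bdata);

public static class RowDispatch
{
    // Mirrors the discriminator: bdata IS NULL means the JSON path, otherwise binary.
    public static string DescribePath(EventRow row) =>
        row.Bdata is null ? "json" : "binary";

    // Picks the bytes a deserializer would receive for this row:
    // the bdata bytes for binary rows, the UTF-8 JSON text otherwise.
    public static byte[] PayloadFor(EventRow row) =>
        row.Bdata ?? Encoding.UTF8.GetBytes(row.Data);
}
```

A binary row carries the `{}` placeholder in `Data`, so a reader that ignored `Bdata` would deserialize an empty object rather than fail; checking the column first avoids that.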

Registration API — both forms, per the maintainer's recommended option

// 1. Attribute-driven (uses opts.Events.DefaultBinarySerializer as resolver)
[BinaryEvent]
[MemoryPackable]
public partial record TripStarted(Guid TripId, string DriverName, DateTimeOffset StartedAt);

// 2. Fluent (explicit per-type serializer; wins over the default)
opts.Events.UseBinarySerializer<TripStarted>(new MemoryPackEventSerializer());

Resolution order on EventMapping construction:

  1. Explicit UseBinarySerializer<T>(...) for that type
  2. [BinaryEvent] + opts.Events.DefaultBinarySerializer
  3. Otherwise, plain JSON (existing path)

If [BinaryEvent] is set but neither a per-type nor a default serializer is wired, the EventMapping constructor throws with a remediation message naming both registration entry points.
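The resolution order above can be sketched as a small lookup. This is a minimal model, not Marten's code: `BinarySerializerRegistry`, `NoopSerializer`, and the local `BinaryEventAttribute` are hypothetical stand-ins, and only the `IEventBinarySerializer` shape is taken from this PR.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Interface shape as listed in this PR.
public interface IEventBinarySerializer
{
    byte[] Serialize(Type type, object data);
    object Deserialize(Type type, byte[] data);
}

// Local stand-in for the [BinaryEvent] marker attribute.
[AttributeUsage(AttributeTargets.Class)]
public sealed class BinaryEventAttribute : Attribute { }

// Hypothetical registry modelling the three-step resolution order.
public sealed class BinarySerializerRegistry
{
    private readonly Dictionary<Type, IEventBinarySerializer> _perType = new();

    public IEventBinarySerializer? Default { get; set; }

    public void Use<T>(IEventBinarySerializer serializer) => _perType[typeof(T)] = serializer;

    // Returns null when the event type stays on the plain JSON path.
    public IEventBinarySerializer? Resolve(Type eventType)
    {
        // 1. An explicit per-type registration wins.
        if (_perType.TryGetValue(eventType, out var explicitSerializer)) return explicitSerializer;

        // 3. No attribute and no explicit registration: plain JSON.
        if (!Attribute.IsDefined(eventType, typeof(BinaryEventAttribute))) return null;

        // 2. Attribute present: fall back to the default, or fail with a remediation hint.
        return Default ?? throw new InvalidOperationException(
            $"{eventType.Name} is marked [BinaryEvent] but no serializer is registered. " +
            "Register one per type or set a default binary serializer.");
    }
}

// Sample types used only to exercise the registry.
[BinaryEvent] public sealed record TripStarted(Guid TripId);
public sealed record TripEnded(Guid TripId);

public sealed class NoopSerializer : IEventBinarySerializer
{
    public byte[] Serialize(Type type, object data) => Array.Empty<byte>();
    public object Deserialize(Type type, byte[] data) => new object();
}
```

Failing when the attribute is present but nothing is wired keeps a misconfigured type from silently falling back to JSON.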

Marten.MemoryPack package

opts.Events.AppendMode = EventAppendMode.Rich;   // Phase 1 limitation
opts.Events.UseMemoryPackSerializer();           // sets DefaultBinarySerializer

Then [BinaryEvent] [MemoryPackable] types just work. The serializer itself is ~5 lines over MemoryPackSerializer.
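To show how thin such an adapter can be, here is a sketch of an `IEventBinarySerializer` implementation. It is not the Marten.MemoryPack source: to stay runnable without extra packages it uses System.Text.Json's UTF-8 byte output as a stand-in for the binary format, and `Utf8JsonEventSerializer` is a name invented for this example. A MemoryPack-backed version would presumably delegate to `MemoryPackSerializer` in the same two places.

```csharp
#nullable enable
using System;
using System.Text.Json;

// Interface shape as listed in this PR.
public interface IEventBinarySerializer
{
    byte[] Serialize(Type type, object data);
    object Deserialize(Type type, byte[] data);
}

// Stand-in adapter: UTF-8 JSON bytes play the role of the binary format.
public sealed class Utf8JsonEventSerializer : IEventBinarySerializer
{
    // Serialize using the runtime event type supplied by the caller.
    public byte[] Serialize(Type type, object data) =>
        JsonSerializer.SerializeToUtf8Bytes(data, type);

    // Deserialize back to the requested event type; a null result is treated as an error.
    public object Deserialize(Type type, byte[] data) =>
        JsonSerializer.Deserialize((ReadOnlySpan<byte>)data, type)
        ?? throw new InvalidOperationException($"Payload for {type.Name} deserialized to null.");
}

// Sample event used only to exercise the adapter.
public sealed record TripStarted(Guid TripId, string DriverName, DateTimeOffset StartedAt);
```

The interface takes the `Type` explicitly on both calls, so one serializer instance can serve every registered event type.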

Surface area touched

| Area | What |
| --- | --- |
| Marten/Events/{IEventBinarySerializer,BinaryEventAttribute}.cs | New interface + attribute |
| Marten/Events/Schema/EventBdataColumn.cs | New nullable bytea column |
| Marten/Events/Schema/EventsTable.cs | Add column; pin at SELECT position 3 |
| Marten/Events/IEventStoreOptions.cs | DefaultBinarySerializer + UseBinarySerializer<T> |
| Marten/Events/EventGraph.cs | Registry + ResolveBinarySerializerFor |
| Marten/Events/EventMapping.cs | BinarySerializer + IsBinary |
| Marten/Events/EventDocumentStorage.cs | Per-row JSON-vs-binary dispatch in Resolve / ResolveAsync |
| Marten/EventStorage/Rich/RichAppendEventOperation.cs | Bind bdata at column slot 4 |
| Marten/EventStorage/Rich/RichEventStorageDescriptor.cs | SerializeEventBdata closure |
| Marten/EventStorage/Dialects/PostgresEventStoreDialect.cs | bdata in IsCoreColumn; binary-aware closures; Quick-mode guard |
| Marten/EventStorage/ClosedShapeEventDocumentStorage.cs | Skip(3) → Skip(4); +4 ordinal offset |
| Marten.MemoryPack/ | New library: MemoryPackEventSerializer + sugar extensions |
| Marten.MemoryPack.Tests/ | 5 integration tests |
| Marten.slnx | New MemoryPack/ folder |
| build/build.cs | TestMemoryPack target hung off TestExtensions |
| Directory.Packages.props | MemoryPack 1.21.4 |

Tests

Marten.MemoryPack.Tests — all 5 pass against local Postgres:

| Test | What it pins |
| --- | --- |
| can_round_trip_a_single_binary_event | Basic MemoryPack write/read |
| multiple_binary_events_replay_in_order | Multi-event stream round-trip |
| json_and_binary_events_coexist_on_one_stream | Mixed stream: JSON and binary events interleaved; both read back correctly |
| binary_events_land_in_bdata_column_jsoned_events_land_in_data_column | On-disk shape verification: binary rows have data = '{}' and bdata non-null; JSON rows have real JSON in data and bdata = NULL |
| pre_existing_json_rows_still_read_after_feature_is_in_place | Upgrade-backfill: rows written before the feature still read correctly |

Regression check: EventSourcingTests.end_to_end_event_capture_and_fetching 83/83 passing — JSON-only events still flow correctly through the dispatched read path.

Phase 1 limitations (explicit, documented)

These are written up in the new docs/events/binary-serialization.md page; calling them out here so the review knows what's intentionally deferred:

  1. EventAppendMode.Rich only. The default QuickWithServerTimestamps mode and the Quick mode both go through the mt_quick_append_events PostgreSQL function, whose signature needs a parallel bdata bytea[] parameter to carry binary payloads. BuildQuickDescriptor / BuildQuickWithServerTimestampsDescriptor throw at store-build time if a binary event type is registered while AppendMode is non-Rich, with the remediation recipe in the exception message. Quick-mode support is Phase 2.
  2. No BulkEventAppender support. Same root cause: the COPY column list needs the bdata column added. Follow-up.
  3. No binary upcaster support. Marten's JSON upcasters operate on JSON payloads; a separate typed transform shape for binary is needed. Filing as a separate follow-up issue.
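The first limitation amounts to a fail-fast check at store-build time. A minimal sketch of such a guard follows, assuming a local copy of the `EventAppendMode` enum and a hypothetical `QuickModeGuard` helper; the real check sits in the Quick descriptor builders and its exact message is not reproduced here.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Local stand-in for Marten's append-mode enum.
public enum EventAppendMode { Rich, Quick, QuickWithServerTimestamps }

public static class QuickModeGuard
{
    // Throws when binary event types are registered under a non-Rich append mode.
    public static void AssertNoBinaryEvents(EventAppendMode mode, IEnumerable<string> binaryEventTypeNames)
    {
        if (mode == EventAppendMode.Rich) return;

        var offenders = binaryEventTypeNames.ToList();
        if (offenders.Count == 0) return;

        // The message carries the remediation so the failure is actionable on first read.
        throw new InvalidOperationException(
            $"Binary event types ({string.Join(", ", offenders)}) require EventAppendMode.Rich, " +
            $"but the store is configured with {mode}. Set AppendMode = EventAppendMode.Rich.");
    }
}
```

Checking once while the store is built surfaces the misconfiguration before any append is attempted.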

Docs

  • New page: docs/events/binary-serialization.md (coexistence design, registration, on-disk shape, migration story, full constraints list)
  • Wired into the Events sidebar in docs/.vitepress/config.mts
  • Link from docs/events/optimizing.md (Performance & Scalability) pointing to the new page

Both Markdown pages pass markdownlint --disable MD009 and cspell locally.

Out of scope (deferred to follow-ups)

  • Quick-mode write paths (the PG function mt_quick_append_events needs a bdata bytea[] parameter and matching operation binds)
  • BulkEventAppender COPY-path binary support
  • Binary event upcasters / downcasters (filed as a separate follow-up issue — see [#4579](https://github.com/JasperFx/marten/issues/4579))
  • Integration tests for projections / FetchForWriting / async daemon against binary events — the same Resolve/ResolveAsync dispatch is used everywhere, so the existing per-row tests cover the core read contract; richer integration tests can land in Phase 2

🤖 Generated with Claude Code

… only)

Adds per-event-type binary serialization on the event store side, with
JSON-serialized and binary-serialized events coexisting in the same
mt_events table. Designed so the feature can be turned on in an
existing store with no migration of existing event data.

## Coexistence design

Schema is purely additive — one new column on mt_events:

  data  jsonb NOT NULL   -- existing; for binary events, holds {} placeholder
  bdata bytea NULL       -- new; the serialized bytes for binary events,
                          --      NULL for JSON-serialized events

Per-row discriminator: bdata IS NULL ⇒ JSON path, non-null ⇒ binary.
The existing `data NOT NULL` constraint stays intact, so the migration
is one nullable column — safe for in-place upgrade. Existing rows in
an upgraded store have `bdata = NULL` and continue to read through the
JSON path unchanged.

## Public API

  public interface IEventBinarySerializer
  {
      byte[] Serialize(Type type, object data);
      object Deserialize(Type type, byte[] data);
  }

Two equivalent registration paths:

  // 1. Attribute-driven (uses opts.Events.DefaultBinarySerializer as resolver)
  [BinaryEvent]
  [MemoryPackable]
  public partial record TripStarted(...);

  // 2. Fluent (explicit per-type serializer)
  opts.Events.UseBinarySerializer<TripStarted>(new MemoryPackEventSerializer());

Resolution order on EventMapping construction:
  1. Explicit UseBinarySerializer<T>(...) for that type
  2. [BinaryEvent] + opts.Events.DefaultBinarySerializer
  3. Otherwise, plain JSON (existing path).

If `[BinaryEvent]` is set but neither a per-type nor a default
serializer is wired, EventMapping throws at construction with a
remediation message naming both entry points.

## Surface area touched

- src/Marten/Events/{IEventBinarySerializer,BinaryEventAttribute}.cs       (new)
- src/Marten/Events/Schema/EventBdataColumn.cs                              (new — bytea nullable column)
- src/Marten/Events/Schema/EventsTable.cs                                   (add column; pin at SELECT position 3)
- src/Marten/Events/IEventStoreOptions.cs                                   (DefaultBinarySerializer + UseBinarySerializer<T>)
- src/Marten/Events/EventGraph.cs                                          (registry + ResolveBinarySerializerFor)
- src/Marten/Events/EventMapping.cs                                        (BinarySerializer + IsBinary)
- src/Marten/Events/EventDocumentStorage.cs                                (per-row JSON-vs-binary dispatch in Resolve/ResolveAsync)
- src/Marten/EventStorage/Rich/RichAppendEventOperation.cs                 (bind bdata at column slot 4)
- src/Marten/EventStorage/Rich/RichEventStorageDescriptor.cs               (SerializeEventBdata closure)
- src/Marten/EventStorage/Dialects/PostgresEventStoreDialect.cs            (bdata in IsCoreColumn; SerializeEventBdata wiring; Quick-mode guard)
- src/Marten/EventStorage/ClosedShapeEventDocumentStorage.cs               (Skip(3)→Skip(4); +4 ordinal offset)

## Marten.MemoryPack package

  - src/Marten.MemoryPack/                — IEventBinarySerializer impl over MemoryPack 1.21.4
  - src/Marten.MemoryPack.Tests/          — 5 integration tests (round-trip,
                                            multi-event replay, mixed JSON+binary
                                            stream, on-disk shape verification,
                                            upgrade-backfill against a pre-existing
                                            JSON row)

Added to Marten.slnx under a `/MemoryPack/` folder; TestMemoryPack
target wired into build.cs and hung off TestExtensions; MemoryPack
1.21.4 pinned in Directory.Packages.props.

## Phase 1 scope — explicit limitations (documented in docs/events/binary-serialization.md)

- **EventAppendMode.Rich only.** The default `QuickWithServerTimestamps`
  and `Quick` modes go through the `mt_quick_append_events` PostgreSQL
  function whose signature would need a parallel `bdata bytea[]`
  parameter. BuildQuickDescriptor / BuildQuickWithServerTimestampsDescriptor
  fail loud at store-build time if a binary event type is registered
  with a non-Rich AppendMode, with the remediation recipe in the
  exception message. Quick-mode support is the Phase 2 scope.
- **No BulkEventAppender support.** Same root cause — the COPY column
  shape needs the bdata column adding. Follow-up.
- **No binary upcaster support.** Marten's JSON upcasters
  (Marten.Services.Json.Transformations) operate on JSON payloads and
  don't generalize to byte[]. Binary upcasters need their own typed
  transform shape; tracked as a separate follow-up issue.

## Docs

  - docs/events/binary-serialization.md      (new page — coexistence design, registration,
                                              on-disk shape, migration story, constraints)
  - docs/events/optimizing.md                (link from the scalability/optimization page)
  - docs/.vitepress/config.mts               (sidebar entry under Events)

Both Markdown pages pass `markdownlint --disable MD009` and `cspell` locally.

## Test results

  - Marten.MemoryPack.Tests: 5/5 passing
  - EventSourcingTests.end_to_end_event_capture_and_fetching: 83/83 passing
    (regression check — JSON-only events through the dispatched read path)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ths)

CI failure on PR #4578: every test that exercises an event append on
Quick / QuickWithServerTimestamps modes — including the Daemon tests
and the PgVector projection tests on the prior PR #4576 import —
failed with:

    Npgsql.PostgresException : 42601: INSERT has more target columns
    than expressions

Root cause: the previous commit added `bdata` to `EventsTable.SelectColumns`
(SELECT position 3) and consequently to the SQL prefix built by
`PostgresEventStoreDialect.BuildAppendEventFullColumnsAndPrefix` (used by
*every* full-shape INSERT — Rich AND the per-event Quick path). I only
wired the bind in `RichAppendEventOperation`. The siblings —
`QuickAppendEventWithVersionOperation` (used by RichEventStorage,
QuickEventStorage, and QuickWithServerTimestampsEventStorage for
per-event INSERTs: tombstone batches, new-stream appends, optimistic-
concurrency appends, side-effect replay through EventSlice.BuildOperations)
— still emitted the old N parameters against the now-N+1 column list.

Fix: thread a `Func<IEvent, byte[]?> SerializeEventBdata` closure
through both Quick descriptors and `QuickAppendEventWithVersionOperation`,
mirroring the Rich descriptor's existing field. The closure binds
`bdata` immediately after `mt_dotnet_type` to match the column-list
position. In Quick modes the dialect installs `_ => null` since binary
events are rejected at descriptor-build time anyway.

This is what was missing from PR #4578's initial commit — the Rich-mode-only
constraint covers the *binary opt-in*, but `bdata` is still a column on
every row (NULL for JSON events), so every full-shape INSERT has to bind
it regardless of mode.
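The failure mode is easy to reproduce in miniature: once the shared column list grows by one, every statement built from it needs one more bound value, even if that value is NULL. The sketch below is an illustration only. `InsertShape` is a hypothetical helper, the column names are an illustrative subset rather than the full mt_events list, and the count check is done client-side here, whereas in the CI run PostgreSQL raised the error.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

public static class InsertShape
{
    // Builds a positional-parameter INSERT and rejects a column/parameter count mismatch.
    public static string BuildInsert(IReadOnlyList<string> columns, int parameterCount)
    {
        if (columns.Count != parameterCount)
        {
            throw new InvalidOperationException(
                $"INSERT has {columns.Count} target columns but {parameterCount} parameters.");
        }

        // $1, $2, ... one placeholder per column.
        var placeholders = string.Join(", ", Enumerable.Range(1, parameterCount).Select(i => $"${i}"));
        return $"insert into mt_events ({string.Join(", ", columns)}) values ({placeholders})";
    }
}
```

With an illustrative list of `data, type, mt_dotnet_type, bdata`, binding three values fails while binding four (the last one null for JSON events) succeeds, which is the shape of the fix described above.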

Local repro now passes:
- DaemonTests.Bug_3059_double_application: 1/1 ✅ (was the canary failure)
- Marten.MemoryPack.Tests: 5/5 ✅ (unaffected — binary path was already correct)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@jeremydmiller jeremydmiller merged commit b4763cf into master May 28, 2026
8 of 9 checks passed
@jeremydmiller jeremydmiller deleted the feature/4515-binary-event-serializers branch May 28, 2026 19:30
jeremydmiller added a commit that referenced this pull request May 28, 2026
…arget (#4582)

The Nuke `Pack` target in `build/build.cs` lists every project that
gets packed and pushed by the `on-manual-do-nuget-publish.yml`
workflow. The three new optional companion packages added in PR #4576
(PostGIS / PgVector) and PR #4578 (MemoryPack — binary event
serialization) were never added to the list, so the next NuGet release
would silently leave them off NuGet.

Repacking locally now produces all 9 .nupkgs (was 6):

    Marten.9.2.1.nupkg
    Marten.AspNetCore.9.2.1.nupkg
    Marten.EntityFrameworkCore.9.2.1.nupkg
    Marten.MemoryPack.9.2.1.nupkg          ← new
    Marten.Newtonsoft.9.2.1.nupkg
    Marten.NodaTime.9.2.1.nupkg
    Marten.PgVector.9.2.1.nupkg            ← new
    Marten.PostGIS.9.2.1.nupkg             ← new
    Marten.SourceGenerator.9.2.1.nupkg

Gating fix before the 9.3.0 release.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
jeremydmiller added a commit that referenced this pull request May 28, 2026
…xed-format aggregation (#4583)

Follow-up to #4578: covers the read-side surface Marten consumers
actually use end-to-end against binary events, not just the round-trip
path. The per-row JSON-vs-binary dispatch in
EventDocumentStorage.Resolve/ResolveAsync is the load-bearing piece
under all of these — if these tests pass for binary events the
dispatch is sound across the full Marten surface.

Five new tests in BinaryEventIntegrationTests, all on a self-aggregating
`Trip` registered as `opts.Projections.Snapshot<Trip>(SnapshotLifecycle.Inline)`:

- aggregate_stream_async_replays_binary_events
    Live aggregation (AggregateStreamAsync) over a binary-only stream
    of 4 events. Resolve dispatches each row to the binary
    deserializer; the aggregator sees the typed event Data instances.

- aggregate_stream_async_replays_mixed_binary_and_json_events
    Same shape but the stream has 5 events alternating binary +
    JSON-serialized. Each row goes through its own deserialization
    path; the aggregator stays agnostic.

- inline_projection_applies_binary_events
    Snapshot lifecycle = Inline: the projection runs in the same
    transaction as the event append, so LoadAsync<Trip>(streamId)
    immediately after SaveChangesAsync returns the projected document
    built from binary events.

- inline_projection_updates_across_two_appends_with_binary_events
    Two separate save transactions appending binary events; the
    inline projection updates correctly across both.

- fetch_for_writing_round_trips_binary_events
    DCB / read-modify-write: FetchForWriting<Trip>(streamId) hydrates
    the aggregate from binary-serialized events, we examine state,
    AppendOne a new binary event, save, then re-aggregate. The
    optimistic-concurrency path uses the same per-row dispatch.

Source generator wiring: added
`<PackageReference Include="JasperFx.Events.SourceGenerator" OutputItemType="Analyzer" ReferenceOutputAssembly="false" />`
to the test csproj. Required in any test assembly that defines its
own aggregate types — the conventional Apply/Create methods are
dispatched by the compile-time generator with no runtime fallback.

Full Marten.MemoryPack.Tests suite: 10/10 passing locally.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
jeremydmiller added a commit that referenced this pull request May 28, 2026
… fix

Weasel 9.0.2 (JasperFx/weasel#299) fixes
PostgresqlMigrator.executeWithConcurrencyRetryAsync so it reopens a
Closed/Broken connection before the retry attempt. That eliminates
the recurring "Connection is not open" failure on the conjoined
`EventSourcingTests.end_to_end_event_capture_and_fetching_the_stream.
query_before_saving` test that hit this PR + #4576 + #4578 + #4582.

Bumps Weasel.Postgresql + Weasel.EntityFrameworkCore 9.0.1 → 9.0.2 in
Directory.Packages.props (CPM). Marten.MemoryPack.Tests still 8/8
locally on top of the new Weasel.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
jeremydmiller added a commit that referenced this pull request May 28, 2026
#4584)

* #4515 Phase 2: binary event serialization on Quick + BulkEventAppender

#4578 shipped the foundation with `Rich` mode only; the Quick paths
(QuickWithServerTimestamps default + Quick + bulk COPY) all guarded
against binary events at store-build time. This lifts that constraint —
binary serialization now works on every EventAppendMode and through the
BulkEventAppender.

## Wire-format change: `mt_quick_append_events` grows a `bdatas bytea[]` param

The PostgreSQL function used by both Quick variants now accepts a
parallel `bdatas bytea[]` parameter right after `bodies jsonb[]`. The
INSERT writes `bdatas[index]` into `mt_events.bdata`. For JSON events
the array slot is NULL; for binary events `bodies[index]` is the `{}`
placeholder and `bdatas[index]` carries the real payload. Same
on-disk row shape as the Rich path: `bdata IS NULL` remains the
discriminator that the existing read path keys off.
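The parallel-array contract can be sketched as follows. `PendingEvent` and `QuickAppendArrays` are hypothetical names for this example; the sketch only models how `bodies` and `bdatas` line up by index and does not call the PostgreSQL function.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical pending event: a JSON event sets Json, a binary event sets Binary.
public sealed record PendingEvent(string? Json, byte[]? Binary);

public static class QuickAppendArrays
{
    // Produces two lists of equal length, aligned by index.
    public static (List<string> Bodies, List<byte[]?> Bdatas) Build(IEnumerable<PendingEvent> events)
    {
        var bodies = new List<string>();
        var bdatas = new List<byte[]?>();

        foreach (var e in events)
        {
            if (e.Binary is not null)
            {
                // Binary event: placeholder body, real payload in the bdatas slot.
                bodies.Add("{}");
                bdatas.Add(e.Binary);
            }
            else
            {
                // JSON event: real body, NULL in the bdatas slot.
                bodies.Add(e.Json ?? throw new ArgumentException("Event has neither a JSON nor a binary payload."));
                bdatas.Add(null);
            }
        }

        return (bodies, bdatas);
    }
}
```

Keeping both arrays the same length means the function can index them with a single loop counter, and a NULL slot yields the same row shape the read path already expects.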

Weasel's standard function-diff migration handles the signature
change as DROP + CREATE on existing installations; existing JSON rows
are untouched.

## Call-site dispatch — same shape as Rich

`PostgresEventStoreDialect.BuildQuickDescriptor` and
`BuildQuickWithServerTimestampsDescriptor` install the same
`serializeEventData` / `serializeEventBdata` closures the Rich
descriptor uses (look up `EventMapping`, branch on `IsBinary`).
`QuickAppendEventsOperationBase.writeBasicParameters` now accepts an
optional `Func<IEvent, byte[]?> serializeEventBdata` and binds the
parallel `bdatas bytea[]` array.

## BulkEventAppender — bdata in the COPY column list

`buildEventColumns()` adds `bdata` right after `data`; `writeEventRow`
looks up the EventMapping per event and writes either the binary
payload (for `[BinaryEvent]` types) or NULL (for JSON). The COPY
format already supports NULL values per column, so no schema
relaxation is needed.

## Removed: AssertNoBinaryEventsForQuickMode

The Phase 1 guard in `PostgresEventStoreDialect` that threw at
store-build time if a binary event type was registered with a
non-Rich AppendMode is gone — no longer needed.

## Tests

Three new tests in `QuickModeBinaryEventTests` (separate fixture so
each test can dial in its own AppendMode):

- `quick_with_server_timestamps_round_trips_binary_events` — mixed
  binary + JSON stream on the default mode, round-trip via the PG
  function.
- `quick_mode_round_trips_binary_events` — explicit `Quick` mode.
- `quick_mode_binary_events_land_in_bdata_column` — on-disk shape
  verification: binary rows have `data = '{}'` + `bdata != NULL`;
  JSON rows have `data = real JSON` + `bdata = NULL`.

Regression checks:
- Full Marten.MemoryPack.Tests suite: 8/8 ✅
- EventSourcingTests.end_to_end_event_capture_and_fetching: 83/83 ✅
- DaemonTests.Bug_3059_double_application: 1/1 ✅ (re-running the
  test that flushed out the column-count bug in PR #4578's first CI run)

## Docs

`docs/events/binary-serialization.md` updated:
- Removed the "EventAppendMode.Rich only" + "No bulk appender support"
  constraints from the Constraints section.
- Added a new "Append modes" section explaining the feature works
  across all three modes + BulkEventAppender.
- Quick-start example no longer forces `AppendMode = Rich`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Consume Weasel 9.0.2 — picks up the migration-retry connection-reopen fix

Weasel 9.0.2 (JasperFx/weasel#299) fixes
PostgresqlMigrator.executeWithConcurrencyRetryAsync so it reopens a
Closed/Broken connection before the retry attempt. That eliminates
the recurring "Connection is not open" failure on the conjoined
`EventSourcingTests.end_to_end_event_capture_and_fetching_the_stream.
query_before_saving` test that hit this PR + #4576 + #4578 + #4582.

Bumps Weasel.Postgresql + Weasel.EntityFrameworkCore 9.0.1 → 9.0.2 in
Directory.Packages.props (CPM). Marten.MemoryPack.Tests still 8/8
locally on top of the new Weasel.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Document the versioned-event-types pattern for binary schema evolution

Closes #4579 with a docs-only answer rather than building a binary-side
upcaster framework. The JSON upcasters
(Marten.Services.Json.Transformations) operate on the JSON wire form and
don't generalize to a byte[] payload; designing a typed transform shape
for binary events is non-trivial and the use case can be addressed
end-to-end today by leaning on Marten's existing per-event-type
registry.

The recommendation: introduce a new event type for each schema change
(e.g. TripStarted -> TripStartedV2), have the aggregate handle both
versions, and let the coexistence design carry old rows + new rows on
the same stream without migration. One caveat: MemoryPack's in-place,
backward-compatible field evolution also covers additive-only changes,
but it stops at the serializer's tolerance rules and does not cover
renames, type changes, or splits. Versioned event types work for every
shape of change and stay explicit.
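A minimal sketch of the versioned-event-types pattern, assuming a hypothetical `VehicleId` field as the schema change and a hand-written `Replay` helper in place of Marten's aggregation machinery:

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Original event shape, plus a V2 that adds a (hypothetical) VehicleId field.
public sealed record TripStarted(Guid TripId, string DriverName, DateTimeOffset StartedAt);
public sealed record TripStartedV2(Guid TripId, string DriverName, string VehicleId, DateTimeOffset StartedAt);

public sealed class Trip
{
    public Guid Id { get; private set; }
    public string DriverName { get; private set; } = "";
    public string? VehicleId { get; private set; }

    // Old rows: no vehicle information was captured.
    public void Apply(TripStarted e)
    {
        Id = e.TripId;
        DriverName = e.DriverName;
        VehicleId = null;
    }

    // New rows: same handling plus the added field.
    public void Apply(TripStartedV2 e)
    {
        Id = e.TripId;
        DriverName = e.DriverName;
        VehicleId = e.VehicleId;
    }

    // Hand-rolled replay for illustration; dispatches each event to the matching Apply overload.
    public static Trip Replay(IEnumerable<object> events)
    {
        var trip = new Trip();
        foreach (var e in events)
        {
            switch (e)
            {
                case TripStarted v1: trip.Apply(v1); break;
                case TripStartedV2 v2: trip.Apply(v2); break;
            }
        }
        return trip;
    }
}
```

Because each version is its own registered type, old rows keep deserializing with the old shape and no stored payload has to be rewritten.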

Replaces the "No upcaster support" constraint section with a "Schema
evolution — use versioned event types" section that gives the
recommended pattern with code samples + a sub-section on the
"why-not-in-place" tradeoff + a note on mixing binary + JSON.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>