#4712 + #4717 — per-tenant partitioning daemon: safe-harbor high-water + per-tenant progression (JasperFx 2.9.4) by jeremydmiller · Pull Request #4714 · JasperFx/marten

jeremydmiller · 2026-06-10T14:15:06Z

Fixes #4712. Follow-up to #4705.

Problem

Under UseTenantPartitionedEvents on a sharded conjoined store, composite projection rebuilds hang. The daemon logs a SafeHarborTime of 0001-01-01 (≈ DateTime.MinValue + the 3s stale threshold), the gap-skip becomes a no-op, and the store-global high-water agent loops forever — silently freezing some composite rebuilds.

Root cause (the #4705 bug class, one query that was missed)

The store-global HighWaterStatisticsDetector reads select last_value from mt_events_sequence for HighestSequence. Under per-tenant partitioning the store-global mt_events_sequence is never advanced (each tenant draws seq_ids from its own mt_events_sequence_{suffix}), so HighestSequence reads 1 while the true mark is far higher → the agent treats the store as perpetually Stale. And because no store-global HighWaterMark progression row is read, HighWaterStatistics.Timestamp is left at default(DateTimeOffset) = 0001-01-01 — the source of the bogus SafeHarborTime.
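
The mismatch described above can be reproduced with a toy model. This is only a sketch in Python, not Marten code: the `Sequence` class is a hypothetical stand-in for a PostgreSQL sequence, and the 40-event count and 3-second threshold are taken from the numbers quoted in this PR.

```python
from datetime import datetime, timedelta, timezone


class Sequence:
    """Toy stand-in for a PostgreSQL sequence; only last_value is modeled."""

    def __init__(self):
        self.last_value = 1   # a never-used sequence reports its start value
        self._called = False

    def nextval(self):
        # The first call hands out the start value, later calls increment it.
        if self._called:
            self.last_value += 1
        self._called = True
        return self.last_value


store_global = Sequence()     # models mt_events_sequence (never advanced)
tenant_sequence = Sequence()  # models mt_events_sequence_{suffix}

# Under per-tenant partitioning every append draws from the tenant sequence.
mt_events = [tenant_sequence.nextval() for _ in range(40)]

highest_from_sequence = store_global.last_value  # what the detector read
highest_from_events = max(mt_events, default=0)  # the true mark

# With no progression row read, the timestamp stays at its default value,
# so adding the stale threshold still lands on 0001-01-01.
default_timestamp = datetime.min.replace(tzinfo=timezone.utc)
safe_harbor_time = default_timestamp + timedelta(seconds=3)
```

In this model `highest_from_sequence` stays at 1 while `highest_from_events` is 40, and `safe_harbor_time` is three seconds past the minimum date.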

Fix (mirrors the #4705 `FetchHighestEventSequenceNumber` change)

- Read coalesce(max(seq_id), 0) from mt_events when UseTenantPartitionedEvents.
- Stamp Timestamp from that first result (which always returns a row), so it can never be left at 0001-01-01 when the progression row is absent.
- Non-partitioned stores keep reading last_value from mt_events_sequence.
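
The shape of the fix can be sketched as follows. This is an illustrative Python model, not the actual C# change: the two SQL strings are the ones quoted in this PR, while `highest_sequence_sql`, `read_statistics`, and the row shapes are hypothetical names and simplifications introduced here.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional, Tuple

LAST_VALUE_SQL = "select last_value from mt_events_sequence"
MAX_SEQ_ID_SQL = "select coalesce(max(seq_id), 0) from mt_events"


def highest_sequence_sql(use_tenant_partitioned_events: bool) -> str:
    """Pick the query used for HighestSequence (hypothetical helper)."""
    if use_tenant_partitioned_events:
        return MAX_SEQ_ID_SQL
    return LAST_VALUE_SQL


@dataclass
class HighWaterStatistics:
    highest_sequence: int = 0
    current_mark: int = 0
    timestamp: Optional[datetime] = None  # None models default(DateTimeOffset)


def read_statistics(
    first_row: Tuple[int, datetime],
    progression_row: Optional[int],
) -> HighWaterStatistics:
    """Assumed row shapes: first_row always exists, progression_row may not."""
    stats = HighWaterStatistics()
    # Stamp the timestamp from the first result, which always returns a row.
    stats.highest_sequence, stats.timestamp = first_row
    # The progression row only contributes the current mark when present.
    if progression_row is not None:
        stats.current_mark = progression_row
    return stats


now = datetime.now(timezone.utc)
stats = read_statistics((40, now), None)
```

Even with the progression row absent, `stats.timestamp` is a real value, which is the property the fix relies on.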

Test

Bug_4712_safe_harbor_high_water drives HighWaterDetector.Detect directly under per-tenant partitioning. Before the fix: HighestSequence=1, Timestamp=0001-01-01 (with CurrentMark=40). After: HighestSequence=40 and a real timestamp. Deterministic single-DB single-tenant repro (the detector-level seam; the sharded multi-composite hang is the downstream symptom). Non-partitioned high-water detection tests unchanged (10/10).

🤖 Generated with Claude Code

…nant partitioning

Follow-up to #4705. The store-global HighWaterStatisticsDetector read `select last_value from mt_events_sequence` for HighestSequence. Under UseTenantPartitionedEvents the store-global sequence is never advanced (each tenant draws seq_ids from its own mt_events_sequence_{suffix}), so HighestSequence read 1 while the true mark was far higher. The store-global high-water agent then treated the store as perpetually Stale and, because no store-global HighWaterMark progression row was read, left HighWaterStatistics.Timestamp at default(DateTimeOffset) = 0001-01-01, which the daemon turned into a bogus SafeHarborTime (0001-01-01 + 3s threshold), making the gap-skip a no-op and hanging composite projection rebuilds.

Fix (mirrors the #4705 FetchHighestEventSequenceNumber change):
- read coalesce(max(seq_id),0) from mt_events when UseTenantPartitionedEvents;
- stamp Timestamp from that first result (which always returns a row) so it can never be left at default/0001-01-01 when the progression row is absent.

Non-partitioned stores keep reading last_value from mt_events_sequence.

Regression test Bug_4712_safe_harbor_high_water drives HighWaterDetector.Detect directly under per-tenant partitioning: before the fix HighestSequence=1 and Timestamp=0001-01-01 (with CurrentMark=40); after, HighestSequence=40 and a real timestamp. Single-DB single-tenant: per-tenant partitioning is the only factor.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…l progression (skipped)

Follow-up to #4712/#4714. Demonstrates the #4717 requirement: under UseTenantPartitionedEvents the async daemon must persist PER-TENANT progression records (per tenant per projection) plus a per-tenant high-water, because each tenant's events use its own mt_events_sequence_{suffix} starting at 1. A single store-global <Projection>:All shard cannot track multiple tenants.

Bug_4717_per_tenant_progression runs two tenants of DIFFERENT heights (20 and 12 events) on one per-tenant-partitioned database, with BOTH a composite projection and a standalone async projection running continuously, then asserts a per-tenant progression row per (projection, tenant) at that tenant's own height plus a per-tenant HighWaterMark row.

Proven RED on master/JasperFx 2.9.2. mt_event_progression holds only:

  20 | bug4717-composite:All / Bug4717Count:All / Bug4717Standalone:All / Bug4717Trip:All
  20 | HighWaterMark

(no <Projection>:tenant rows; tenant B's 12 events untracked). The continuous daemon starts one store-global agent per projection (JasperFxAsyncDaemon.StartAllAsync) and never fans out per tenant; the per-tenant high-water machinery is read/route-only.

Skipped pending the JasperFx per-tenant continuous-progression fix (separate PR); un-skip + bump once it ships.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…high-water persistence

JasperFx 2.9.4 makes the async daemon fan out a continuous agent per (shard, tenant) under UseTenantPartitionedEvents, so each tenant's projection advances against its own high-water and persists its own <Projection>:All:<tenant> progression row (marten#4717). 2.9.4 also carries the projections-rebuild subscription fix (#438).

Marten side:
- Bump JasperFx* 2.9.2 -> 2.9.4.
- HighWaterDetector.MarkHighWaterForTenantAsync: implement the new IHighWaterDetector hook to persist a durable per-tenant HighWaterMark:<tenant> row (keyed on HighWaterShardIdentity.PerTenant), invoked by JasperFx's TenantedHighWaterCoordinator.
- Un-skip Bug_4717_per_tenant_progression: now green. Two tenants of different heights, composite + standalone, each get per-tenant projection rows AND per-tenant high-water rows at their own height.
- sharded_daemon_per_shard_progression: POLL for the per-tenant rows/docs instead of asserting immediately after WaitForNonStaleData. That helper's caught-up check counts store-global shards, so it can return before a per-tenant agent commits on a partitioned store, making the immediate assert racy. (Per-tenant progression itself is correct; only the test wait was racy. Hardening WaitForNonStaleData for partitioned stores is a separate follow-up.)

Full TenantPartitionedEventsTests 186/186 on clean DBs; sharded tests stable across repeated runs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
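
As a rough illustration of the row naming described in this commit, the sketch below builds the progression rows the un-skipped test would expect for two tenants of heights 20 and 12. It is a Python model only: the `<Projection>:All:<tenant>` and `HighWaterMark:<tenant>` patterns come from the commit message, while the function name and the tenant ids `tenant-a` and `tenant-b` are hypothetical.

```python
def expected_progression_rows(projections, tenant_heights):
    """Map each expected progression row name to the tenant's own height."""
    rows = {}
    for tenant, height in tenant_heights.items():
        for projection in projections:
            # One projection row per (projection, tenant).
            rows[f"{projection}:All:{tenant}"] = height
        # One durable high-water row per tenant.
        rows[f"HighWaterMark:{tenant}"] = height
    return rows


rows = expected_progression_rows(
    ["Bug4717Count", "Bug4717Standalone"],
    {"tenant-a": 20, "tenant-b": 12},  # hypothetical tenant ids
)
```

Each tenant's rows carry that tenant's height, in contrast to the single store-global `:All` rows at 20 seen before the fix.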

jeremydmiller · 2026-06-10T17:55:46Z

Follow-up: #4717 per-tenant progression (consume JasperFx 2.9.4)

Building on the #4712 fix, this PR now also adopts the per-tenant continuous-progression work shipped in JasperFx.Events 2.9.4 and resolves #4717.

- Bumped JasperFx* 2.9.2 → 2.9.4 (per-tenant continuous agents fan out one per (shard, tenant); also carries the projections-rebuild subscription fix, #438).
- HighWaterDetector.MarkHighWaterForTenantAsync: Marten implements the new IHighWaterDetector hook to persist a durable HighWaterMark:<tenant> row.
- Un-skipped Bug_4717_per_tenant_progression, which is now green: two tenants of different heights, composite + standalone, each get per-tenant <Projection>:All:<tenant> rows and per-tenant high-water rows at their own height.
- sharded_daemon_per_shard_progression (×2) now poll for the per-tenant rows/docs. WaitForNonStaleData's caught-up check counts store-global shards, so it can return before a per-tenant agent commits on a partitioned store, which made the immediate assert racy. Per-tenant progression itself is correct; only the test wait was racy.
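
The change from an immediate assert to polling can be sketched generically. This is a minimal Python illustration of the pattern, not the test code in this PR: `poll_until` and its parameters are hypothetical names, and the predicate stands in for "the per-tenant rows/docs exist".

```python
import time


def poll_until(predicate, timeout_seconds=5.0, interval_seconds=0.05):
    """Re-check predicate until it holds or the timeout elapses."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_seconds)


# Usage: a condition that only becomes true on the third check, the way a
# per-tenant agent might commit shortly after the wait helper returns.
checks = {"count": 0}


def rows_are_present():
    checks["count"] += 1
    return checks["count"] >= 3


found = poll_until(rows_are_present, timeout_seconds=2.0, interval_seconds=0.01)
```

An immediate assert corresponds to checking the predicate exactly once, which fails whenever the commit lands after that single check.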

Verification: full TenantPartitionedEventsTests 186/186 on clean DBs; the sharded tests are stable across repeated runs (including under within-run shard accumulation).

Investigation note: the earlier "sharded regression" / "1-event-per-tenant commit bug" turned out to be a flaky WaitForNonStaleData + dirty-DB artifact — not a product bug. Per-tenant progression is production-correct, so no JasperFx 2.9.5 was needed. Hardening WaitForNonStaleData to be per-tenant-aware (so the public helper is reliable on partitioned stores) is a worthwhile separate follow-up.

…ssion)

2.9.4 carried a source-generator regression (#432) that dropped the generated dispatcher for self-aggregating projections, so EventSourcingTests (SingleStreamProjection<SimpleAggregate, Guid>) failed CI with "No source-generated dispatcher found". Fixed in JasperFx 2.9.5 (#439): the Pipeline-1 dedupe now uses a set distinct from the cross-pipeline `seen`.

EventSourcingTests Aggregation green against published 2.9.5; per-tenant suite 186/186 unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

erdtsieck mentioned this pull request Jun 10, 2026

Per-tenant event partitioning needs per-tenant progression records — a single store-global :All shard can't track multiple tenants #4717

Closed

jeremydmiller and others added 2 commits June 10, 2026 10:47

jeremydmiller changed the title ~~Fix #4712 — composite rebuilds hang under per-tenant partitioning (SafeHarborTime 0001-01-01)~~ #4712 + #4717 — per-tenant partitioning daemon: safe-harbor high-water + per-tenant progression (JasperFx 2.9.4) Jun 10, 2026

jeremydmiller mentioned this pull request Jun 10, 2026

Fix #432 regression — self-aggregating projection dispatcher dropped (2.9.5) JasperFx/jasperfx#439

Merged

jeremydmiller merged commit 41e6d9d into master Jun 10, 2026
8 checks passed

jeremydmiller deleted the fix/4712-safe-harbor-high-water branch June 10, 2026 18:41

jeremydmiller mentioned this pull request Jun 10, 2026

Bump JasperFx 2.9.5 → 2.9.6 #4719

Merged

erdtsieck mentioned this pull request Jun 12, 2026

Composite projection stalls under UseTenantPartitionedEvents + managed distribution on the pinned Marten 9.6.0 (missing marten#4712) JasperFx/wolverine#3084

Open

jeremydmiller merged 4 commits into master from fix/4712-safe-harbor-high-water
