Fix #4761 + #4763: per-tenant non-stale wait + sharded reassignment tenant_count by jeremydmiller · Pull Request #4764 · JasperFx/marten

jeremydmiller · 2026-06-18T12:22:30Z

Closes #4761. Closes #4763. Reproductions courtesy of #4762 (@erdtsieck) — those two repro tests are included here and now pass. (#4751 from that same repro PR needs a JasperFx-side change and is being handled separately.)

#4761 — `WaitForNonStaleData` never completes for a multi-tenant shard

Under MultiTenantedWithShardedDatabases + UseTenantPartitionedEvents, when two tenants share a shard, each tenant has its own mt_events_sequence_<suffix> sequence, so their seq_ids overlap. WaitForNonStaleDataAsync checked projections.All(x => x.Sequence >= initial.EventSequenceNumber), where initial is the global max seq_id. A tenant with fewer events legitimately tops out below that max (e.g. HighWaterMark:tenant_y = 2 while global = 3), so the check could never pass: the wait timed out ("...reaching the initial sequence of 3") even though both tenants' data was fully projected.

Fix: WaitForNonStaleDataAsync is now per-tenant aware under partitioning — it requires each registered projection shard to have caught its own tenant up to that tenant's HighWaterMark:<tenant> mark (and guards against a premature pass before the daemon has done any work). The non-partitioned path is byte-for-byte unchanged.
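To make the difference between the two checks concrete, here is a minimal Python model of the logic described above. It is a sketch, not Marten's C# implementation: `ProgressRow`, `caught_up_global`, `caught_up_per_tenant`, and the projection shard name "Trips" are hypothetical names chosen for illustration. Only the `HighWaterMark:<tenant>` idea and the tenant_y = 2 / global = 3 numbers come from the description.

```python
from dataclasses import dataclass

HIGH_WATER = "HighWaterMark"  # stands in for the HighWaterMark:<tenant> rows


@dataclass(frozen=True)
class ProgressRow:
    """One progression row (hypothetical shape, for illustration only)."""
    shard: str     # projection shard name, or HIGH_WATER for a mark row
    tenant: str
    sequence: int


def caught_up_global(rows, global_max):
    """Old behaviour: every row must reach the store-global max seq_id."""
    return all(row.sequence >= global_max for row in rows)


def caught_up_per_tenant(rows, registered_shards):
    """New behaviour: each registered shard must reach its own tenant's mark."""
    marks = {r.tenant: r.sequence for r in rows if r.shard == HIGH_WATER}
    if not marks:
        # Nothing recorded yet: do not report "non-stale" prematurely.
        return False
    progress = {
        (r.shard, r.tenant): r.sequence for r in rows if r.shard != HIGH_WATER
    }
    for tenant, mark in marks.items():
        for shard in registered_shards:
            seq = progress.get((shard, tenant))
            # A missing progression row counts as "not caught up".
            if seq is None or seq < mark:
                return False
    return True


# Two tenants on one shard: tenant_x has 3 events, tenant_y has 2.
rows = [
    ProgressRow(HIGH_WATER, "tenant_x", 3),
    ProgressRow(HIGH_WATER, "tenant_y", 2),
    ProgressRow("Trips", "tenant_x", 3),
    ProgressRow("Trips", "tenant_y", 2),
]
```

With these rows, `caught_up_global(rows, 3)` is `False` because tenant_y can never reach 3, while `caught_up_per_tenant(rows, ["Trips"])` is `True`. Passing only the two mark rows (`rows[:2]`) yields `False`, which models the guard against passing before any projection progress has been recorded.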

#4763 — sharded reassignment leaves the source shard's `tenant_count` inflated

ShardedTenancy.AssignTenantAsync recomputed tenant_count only for the target shard. Re-assigning a tenant A→B never decremented A, so A stayed inflated forever and UseSmallestDatabaseAssignment kept mis-ranking it as fuller.

Fix: capture the tenant's prior shard before the upsert and recompute both the source and target shard counts.
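The effect of recomputing only the target shard can be reproduced with a small in-memory model. This is an illustrative Python sketch under stated assumptions, not the actual ShardedTenancy code: the dictionaries stand in for the assignment and shard tables, and `recompute_count`, `assign_target_only`, `assign_source_and_target`, and `smallest_shard` are hypothetical helpers.

```python
def recompute_count(assignments, counts, shard):
    """Recount the tenants currently assigned to one shard."""
    counts[shard] = sum(1 for s in assignments.values() if s == shard)


def assign_target_only(assignments, counts, tenant, target):
    """Models the bug: only the target shard's count is refreshed."""
    assignments[tenant] = target
    recompute_count(assignments, counts, target)


def assign_source_and_target(assignments, counts, tenant, target):
    """Models the fix: remember the prior shard, then refresh both counts."""
    source = assignments.get(tenant)  # captured before the upsert
    assignments[tenant] = target
    recompute_count(assignments, counts, target)
    if source is not None and source != target:
        recompute_count(assignments, counts, source)


def smallest_shard(counts):
    """Pick the shard with the lowest recorded count (ties broken by name)."""
    return min(counts, key=lambda shard: (counts[shard], shard))


def run_scenario(assign):
    """Start with three tenants on shard A, then move two of them to B."""
    assignments = {"t1": "A", "t2": "A", "t3": "A"}
    counts = {"A": 3, "B": 0}
    assign(assignments, counts, "t1", "B")
    assign(assignments, counts, "t2", "B")
    return counts
```

In this model, `run_scenario(assign_target_only)` leaves the counts at A = 3, B = 2 even though A really holds one tenant, so `smallest_shard` picks B. `run_scenario(assign_source_and_target)` gives A = 1, B = 2, and `smallest_shard` picks A.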

Tests

Bug_4761_per_tenant_progression_same_shard and Bug_4763_reassign_count_divergence (from #4762, "Failing repros: sharded tenant-partitioned daemon catch-up + tenant-count divergence (#4751, #4761, #4763)") now pass.
Full TenantPartitionedEventsTests suite green (191); non-partitioned WaitForNonStaleData consumers (e.g. querying_with_non_stale_data) unaffected.

🤖 Generated with Claude Code

@erdtsieck

…ount

Two independent multi-tenant + tenant-partitioned bugs (repros from #4762, thanks @erdtsieck):

#4761: WaitForNonStaleData never reports non-stale when multiple tenants share a shard. Under UseTenantPartitionedEvents each tenant has its own mt_events_sequence, so seq_ids overlap and a single store-global "initial" (the max across tenants) is not a valid bar for every per-tenant progression row. A tenant with fewer events legitimately tops out below the global max, so the "all rows >= initial" check could never pass and the wait timed out even though every tenant's data was fully projected. WaitForNonStaleDataAsync is now per-tenant aware under partitioning: it requires each registered projection shard to have caught its OWN tenant up to that tenant's HighWaterMark:<tenant> mark, and guards against a premature pass before any work is done. The non-partitioned path is unchanged.

#4763: ShardedTenancy.AssignTenantAsync only recomputed the TARGET shard's tenant_count on a re-assignment, leaving the SOURCE shard inflated forever (which made UseSmallestDatabaseAssignment mis-rank it). It now captures the tenant's prior shard before the upsert and recomputes both the source and target counts.

Full TenantPartitionedEventsTests suite green (191); non-partitioned WaitForNonStaleData consumers unaffected.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…ant-partitioned (#4767)

Locks in the fix from #4764. Under MultiTenantedWithShardedDatabases + UseTenantPartitionedEvents, an async CompositeProjection was not driven to a non-stale state by the normal daemon catch-up path (StartAllAsync + WaitForNonStaleData): the member read models stayed empty after catch-up returned.

Root cause was shared with #4761: WaitForNonStaleData's "caught up" check could be satisfied by the HighWaterMark rows alone, without requiring the composite's per-tenant projection-progression row. #4764 fixed that.

This test pins the behavior: a sharded, tenant-partitioned composite must materialize BOTH stages via the async daemon (not just via RebuildProjectionAsync). Passes on JasperFx 2.13.0 (the merged #4764 fix is sufficient; the defensive JasperFx ExecutionStage guard from #457 is a separate invariant assertion that rides the next routine JasperFx bump).

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

jeremydmiller mentioned this pull request Jun 18, 2026

ForceAllMartenDaemonActivityToCatchUpAsync does not catch up an async CompositeProjection under MultiTenantedWithShardedDatabases + UseTenantPartitionedEvents #4751

Closed

jeremydmiller merged commit 504bda0 into master Jun 18, 2026
9 checks passed

jeremydmiller deleted the fix-4761-4763-sharded-tenant-partitioned branch June 18, 2026 12:50

This was referenced Jun 18, 2026

Failing repros: sharded tenant-partitioned daemon catch-up + tenant-count divergence (#4751, #4761, #4763) #4762

Closed

#4751: regression test for async composite catch-up under sharded + tenant-partitioned #4767

Merged

erdtsieck mentioned this pull request Jun 30, 2026

UseTenantPartitionedEvents: co-located tenants on a shard still share one store-global high-water (no per-tenant progression) → lagging tenant's later appends skipped (9.12.0; #4761 persists) #4798

Open

This was referenced Jun 30, 2026

chore: Bump Marten from 8.37.3 to 9.12.0 erintyler/Nexus#424

Closed

chore: Bump Marten and 3 others erintyler/Nexus#427

Closed

chore: Bump Marten and WolverineFx.Marten erintyler/Nexus#428

Closed

chore: Bump Marten and 3 others erintyler/Nexus#431

Open

jeremydmiller merged 1 commit into master from fix-4761-4763-sharded-tenant-partitioned