Stabilize recurring CI flake cluster (#4310) #4311

Merged
jeremydmiller merged 1 commit into master from fix-4310-flaky-tests
Apr 28, 2026

@jeremydmiller
Member

Closes #4310.

Root causes

Three independent issues feed the same flake pattern, and a fourth change is defensive hardening. Each is localized; this PR addresses all of them.

1. Shared static random sequence in Target.cs

`private static readonly Random _random = new Random(67)` is consumed by ~80 tests across LinqTests + DocumentDbTests. Every `Target.Random()` / `GenerateRandomData()` call advances the shared sequence, so a test's effective data depends on how many values sibling tests consumed before it. xUnit discovery order isn't perfectly stable across CI workers and TFMs, so a small order shift consumes a different slice of the sequence and silently flips assertions that depend on exact counts or distributions.

Fix. Introduce Target.ResetRandomSeed(int) and call it in the tests that genuinely depend on specific data:

  • Bug_605_unary_expressions_in_where_clause_of_compiled_query (3 facts)
  • Bug_3337_select_page.try_it_out
  • query_against_child_collections.buildUpTargetData (covers can_query_on_enum_properties and other variants)

Also tightened Bug_605 assertions: .ShouldBe(15) was a fragile hardcode; the real point of the test is "compiled query == inline LINQ query". Compare against expected.Count.
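
A minimal sketch of the resettable-seed idea, not the actual Target.cs code. `SharedRandomSketch` and `NextValue` are hypothetical names; only the seed value 67 and the `ResetRandomSeed` name come from this description. It assumes the fix amounts to rebinding the static field to a fresh `Random`.

```csharp
using System;

// Hypothetical stand-in for the shared generator in Target.cs.
public static class SharedRandomSketch
{
    private const int DefaultSeed = 67;

    // Not readonly, so the field can be rebound to a fresh generator.
    private static Random _random = new Random(DefaultSeed);

    // Rewinds the shared sequence to a known starting point.
    public static void ResetRandomSeed(int seed = DefaultSeed)
    {
        _random = new Random(seed);
    }

    // Every call advances the shared sequence for all later callers.
    public static int NextValue() => _random.Next(1000);
}
```

Calling `ResetRandomSeed()` at the start of a test makes the values it draws independent of how many values sibling tests consumed first. `System.Random` is not thread-safe, so the sketch assumes tests sharing the generator do not run concurrently.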

2. DateTimeOffset.UtcNow inside a shared LINQ expression

child_collection_queries.cs:67 registered:

    @where(x => x.Children.Any(c => c.NullableDateOffset <= DateTimeOffset.UtcNow));

That expression runs in BOTH the in-memory LINQ-to-objects "expected" provider AND the LINQ-to-SQL "actual" provider. Each evaluates DateTimeOffset.UtcNow at its own moment. Target.NullableDateOffset values are ±60 seconds of "now" from random data; values within microseconds of either provider's "now" can land on opposite sides of <= and disagree.

Fix. Capture asOf = DateTimeOffset.UtcNow.AddDays(1) once in the static ctor and use that as the boundary. The expression now embeds a constant timestamp both providers see identically. AddDays(1) is well beyond the test data range so the predicate remains meaningful.
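
The capture-once pattern can be sketched as follows. `ChildSketch`, `FixedBoundarySketch`, and `CountMatches` are hypothetical names for illustration, and the sketch only exercises the in-memory side; it assumes a LINQ provider would read the same static field when translating the expression.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical document type with the one property the predicate needs.
public sealed class ChildSketch
{
    public DateTimeOffset? NullableDateOffset { get; set; }
}

public static class FixedBoundarySketch
{
    // Evaluated once at type initialization, so every consumer of the
    // expression compares against the same instant.
    private static readonly DateTimeOffset AsOf = DateTimeOffset.UtcNow.AddDays(1);

    // The shared predicate references the captured value, not UtcNow.
    public static readonly Expression<Func<ChildSketch, bool>> Predicate =
        c => c.NullableDateOffset <= AsOf;

    // In-memory evaluation; null values never satisfy the lifted <=.
    public static int CountMatches(IEnumerable<ChildSketch> children)
    {
        Func<ChildSketch, bool> compiled = Predicate.Compile();
        return children.Count(compiled);
    }
}
```

Because the boundary is fixed, evaluating the predicate twice at different moments gives the same answer for the same data, which is the property the comparison between providers relies on.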

3. Ordering assumptions on server-generated Guids

Bug_4282 did ids.ShouldHaveTheSameElementsAs(doc1.Id, doc3.Id) after OrderBy(x => x.Id). Server-generated Guids don't necessarily sort in declaration order. Switched to set-membership: Count == 2 + ShouldContain per expected id.
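
An order-independent check along these lines might look like the sketch below. `SetMembershipSketch.HasSameMembers` is a hypothetical helper, not the assertion used in Bug_4282, and it assumes the expected ids are distinct.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SetMembershipSketch
{
    // True when actual holds exactly the expected ids, in any order.
    // Assumes the expected ids contain no duplicates.
    public static bool HasSameMembers(IReadOnlyCollection<Guid> actual, params Guid[] expected)
    {
        return actual.Count == expected.Length
            && expected.All(id => actual.Contains(id));
    }
}
```

The count check guards against extra rows, and the membership check guards against wrong rows, so the pair is as strict as the ordered comparison except for order.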

4. Defensive: ShouldBeEqualWithDbPrecision tolerance

The helper used to round both sides to 100µs with truncation, then ShouldBe. The math works in the common case but the assertion was fragile on loaded-runner clock-comparison edges. Switched to a 1ms tolerance check (well above PostgreSQL's worst-case 9-tick truncation, still tight enough to catch real differences), with a clearer failure message.
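
The difference between the two styles can be sketched as below. `DbPrecisionSketch` and its method names are hypothetical; the real ShouldBeEqualWithDbPrecision may differ. One way truncation can disagree, offered as an assumption rather than the diagnosed cause, is that two values a few ticks apart straddle a 100µs bucket boundary.

```csharp
using System;

// Hypothetical helper contrasting truncate-then-compare with a tolerance.
public static class DbPrecisionSketch
{
    // From the description: 1ms, far above a 9-tick truncation.
    private static readonly TimeSpan Tolerance = TimeSpan.FromMilliseconds(1);

    // Old style: truncate both sides to 1000-tick (100µs) buckets, then compare.
    public static bool EqualAfterTruncation(DateTimeOffset a, DateTimeOffset b)
    {
        return a.UtcTicks / 1000 * 1000 == b.UtcTicks / 1000 * 1000;
    }

    // New style: pass when the absolute difference is within the tolerance.
    public static void ShouldBeWithinTolerance(DateTimeOffset actual, DateTimeOffset expected)
    {
        TimeSpan delta = (actual - expected).Duration();
        if (delta > Tolerance)
        {
            throw new InvalidOperationException(
                $"Expected {expected:O} within {Tolerance.TotalMilliseconds}ms, " +
                $"but got {actual:O} (difference {delta.TotalMilliseconds}ms).");
        }
    }
}
```

In this sketch, two timestamps 9 ticks apart on either side of a bucket boundary fail the truncated comparison but pass the tolerance check, while a 2ms difference still fails.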

Test plan

  • 5x consecutive stress runs of LinqTests.Bugs (178 tests each) — 0 failures
  • 123 tests across Bug_605 + Bug_4282 + Bug_3337 + query_against_child_collections + child_collection_queries — all pass
  • Bug_2283 in DocumentDbTests passes

🤖 Generated with Claude Code

Three independent root causes feed the test-flake pattern that's been
hitting recent PRs (#4279, #4281, #4292, #4295, #4296, #4302). All three
are localized; this PR addresses each.

## Root cause 1: shared static random sequence

Target.cs defines `private static readonly Random _random = new Random(67)`
that is consumed by ~80 tests across LinqTests + DocumentDbTests. Each
Target.Random() / GenerateRandomData() call advances the shared sequence,
so a test's effective random data depends on which sibling tests ran
before it. xUnit discovery order is mostly stable but NOT guaranteed
identical run-to-run, especially across CI workers with different load,
.NET TFM combinations, etc. A small order shift consumes a different
slice of the sequence and produces different test data — silently flipping
assertions that depend on exact counts or distributions.

Fix: introduce `Target.ResetRandomSeed(int seed = 67)` so a test that
genuinely depends on specific random data can pin the sequence at the
start. Remove the readonly modifier on _random to allow rebinding.

Updated tests to call ResetRandomSeed():
- Bug_605_unary_expressions_in_where_clause_of_compiled_query (3 facts)
- Bug_3337_select_page.try_it_out
- query_against_child_collections.buildUpTargetData (covers
  can_query_on_enum_properties and many more)

Also tightened Bug_605's assertion: it was hardcoded to `.ShouldBe(15)`
but the real point of the test is "compiled query == inline LINQ query for
the same expression"; the page size of 15 is incidental. Compare against
expected.Count instead so the test is robust to data variance.

## Root cause 2: DateTimeOffset.UtcNow inside a shared LINQ expression

`child_collection_queries.cs:67` was registering this where-clause for the
acceptance suite:

    @where(x => x.Children.Any(c => c.NullableDateOffset <= DateTimeOffset.UtcNow));

That expression runs in BOTH the in-memory LINQ-to-objects "expected"
provider AND the LINQ-to-SQL "actual" provider. Each provider evaluates
DateTimeOffset.UtcNow at its own moment. Target.NullableDateOffset values
are ±60 seconds of "now" from random data; values within microseconds of
either provider's "now" can land on opposite sides of <= and disagree.

Fix: capture a fixed `asOf = DateTimeOffset.UtcNow.AddDays(1)` in the
static ctor and use that as the boundary. The expression now embeds a
constant timestamp that both providers see identically. AddDays(1) puts
it well beyond the ±60 second test data range, so every non-null
NullableDateOffset satisfies the predicate in both providers.

## Root cause 3: ordering assumptions on server-generated Guids

Bug_4282 asserted `ids.ShouldHaveTheSameElementsAs(doc1.Id, doc3.Id)`
after `OrderBy(x => x.Id)`. The IDs are server-generated Guids; their
sort order does not in general match declaration order (Marten uses
sequential Guids in many configs but not always, depending on the
StoreOptions in scope and the underlying provider). Switched to
set-membership: `Count == 2` plus ShouldContain for each expected id.

## Root cause 4 (defensive): ShouldBeEqualWithDbPrecision tolerance

The helper used to truncate both sides to 100µs (`Ticks / 1000 * 1000`)
and then call ShouldBe. The math works in the common case, but the
assertion was fragile under loaded-runner clock-comparison edge cases.
Switched to a 1ms tolerance check, which is well above the worst-case
PostgreSQL truncation (9 ticks ≈ 0.9µs) but still tight enough to catch
real semantic differences. It also produces a clearer failure message
when it does fire.

## Verification

Stress-ran the previously-flaky suites locally: 5x consecutive runs of
all 178 LinqTests.Bugs tests, no failures. All 123 tests across Bug_605,
Bug_4282, Bug_3337, query_against_child_collections, and
child_collection_queries pass. Bug_2283 in DocumentDbTests passes.

Closes #4310.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
jeremydmiller merged commit 2282ea5 into master on Apr 28, 2026
6 checks passed
jeremydmiller deleted the fix-4310-flaky-tests branch on April 28, 2026 20:18
jeremydmiller added a commit that referenced this pull request Apr 28, 2026
LinqTests.Acceptance shares a Target[] dataset built from the static
Target.GenerateRandomData helper. That helper consumes a process-wide
Random(67) which other tests in the same assembly may have already
advanced. When the LinqTests fixture happened to land on an unlucky
slice of that sequence, we'd see two related flakes:

- select_clauses: SelectTransform.Compare picks the first Target with
  StringArray.Length > 0, NumberArray.Length > 0, and Inner != null.
  If no document matched, target was null and line 27 NRE'd through
  every select_clauses theory.
- take_and_skip: OrderBy(x => x.Long).Skip(N).Take(M) compared against
  Postgres ORDER BY which is unstable on Long ties; non-deterministic
  data occasionally produced a tie.

Calling Target.ResetRandomSeed() in the fixture constructor before
generating Documents/FSharpDocuments makes the dataset stable seed-67
data: 0 Long-value collisions across 1000 docs and 3 select_clauses-
eligible Targets. Continues the flake-stabilization work from #4310 /
#4311.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
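
The commit above addresses the take_and_skip flake by pinning the seed so that no Long ties occur. A different technique, not the one used in the source, is to add a unique tiebreaker to the sort key so that paging is deterministic even when ties exist. The sketch below shows that idea in LINQ-to-objects only; `RowSketch` and `StablePagingSketch` are hypothetical names, and whether the test suite could adopt it is an open assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row with a possibly-tied sort key and a unique id.
public sealed record RowSketch(Guid Id, long Long);

public static class StablePagingSketch
{
    // Sorting by (Long, Id) gives a total order, so the page contents
    // do not depend on the order in which rows arrive.
    public static List<Guid> Page(IEnumerable<RowSketch> rows, int skip, int take)
    {
        return rows
            .OrderBy(r => r.Long)
            .ThenBy(r => r.Id)
            .Skip(skip)
            .Take(take)
            .Select(r => r.Id)
            .ToList();
    }
}
```

With only `OrderBy(r => r.Long)`, rows that share a Long value have no defined relative order in SQL, which is why a tie can move a row across a page boundary.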
jeremydmiller added a commit that referenced this pull request Apr 28, 2026
…4301) (#4313)

* Add UsingStore<T> declarative enrichment from ancillary stores (#4300)

Plumbs IServiceProvider through StoreOptions and IStorageOperations.Services
so the JasperFx.Events 1.31 EntityStep<TEntity>.UsingStore<TStore>() built-in
can resolve ancillary IDocumentStore types from DI at enrichment time. Adds
a Lazy<TStore> overload (AncillaryStoreEnrichmentExtensions.UsingStore) for
projections that already hold a Lazy<> reference, plus an end-to-end test
demonstrating projection enrichment from a separate ancillary store.

Supersedes #4301 — rebased onto current master (JasperFx 1.28.1 /
JasperFx.Events 1.31.0). Original work by Anne Erdtsieck.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Reset Target.Random in TargetSchemaFixture for deterministic LINQ data

LinqTests.Acceptance shares a Target[] dataset built from the static
Target.GenerateRandomData helper. That helper consumes a process-wide
Random(67) which other tests in the same assembly may have already
advanced. When the LinqTests fixture happened to land on an unlucky
slice of that sequence, we'd see two related flakes:

- select_clauses: SelectTransform.Compare picks the first Target with
  StringArray.Length > 0, NumberArray.Length > 0, and Inner != null.
  If no document matched, target was null and line 27 NRE'd through
  every select_clauses theory.
- take_and_skip: OrderBy(x => x.Long).Skip(N).Take(M) compared against
  Postgres ORDER BY which is unstable on Long ties; non-deterministic
  data occasionally produced a tie.

Calling Target.ResetRandomSeed() in the fixture constructor before
generating Documents/FSharpDocuments makes the dataset stable seed-67
data: 0 Long-value collisions across 1000 docs and 3 select_clauses-
eligible Targets. Continues the flake-stabilization work from #4310 /
#4311.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Anne Erdtsieck <anne.erdtsieck@topicus.nl>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

Stabilize the recurring LINQ + datetime acceptance-test flake cluster on CI