Adjust ESIntegTestCase.getLiveDocs method to account for pruned sequence numbers by tlrx · Pull Request #143999 · elastic/elasticsearch

tlrx · 2026-03-11T10:18:43Z

The method ESIntegTestCase.getLiveDocs verifies that the primary and the replica have the same set of documents. This method must be adapted to account for sequence numbers that can be merged away on the shard when IndexSettings.DISABLE_SEQUENCE_NUMBERS is set.

This method was previously adjusted for synthetic id and synthetic source to rely on the Engine's changes snapshot API to retrieve Lucene documents. At that time, LuceneChangesSnapshot and LuceneSyntheticSourceChangesSnapshot were changed to accommodate a missing id/source. That was already a bit ugly, and now that _seq_no is also pruned it would require even larger changes in those Lucene*ChangesSnapshot classes, only for testing, since _seq_no values are loaded at a lower level in Lucene*ChangesSnapshot.

So I changed ESIntegTestCase to no longer use the changes snapshot API, reverted the changes in the Lucene*ChangesSnapshot classes, and now simply bulk load documents from the reader directly.

Relates #136305
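
For readers unfamiliar with the check, here is a minimal plain-Java sketch of the idea: compare the live documents of two shard copies while tolerating sequence numbers that have been pruned on one side. This is not the code from this PR. `LiveDoc`, `LiveDocsComparison` and `diff` are hypothetical names, and the rule "skip the _seq_no comparison when either copy no longer has one" is an assumption about what "account for pruned sequence numbers" could mean, not something the description confirms.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.OptionalLong;
import java.util.TreeMap;

final class LiveDocsComparison {

    /** Hypothetical stand-in for a live document read from a shard copy; seqNo is empty when it was pruned. */
    record LiveDoc(String id, long version, OptionalLong seqNo) {}

    /**
     * Returns human-readable mismatches between two shard copies.
     * Assumption: sequence numbers are compared only when both copies still have them.
     */
    static List<String> diff(List<LiveDoc> primary, List<LiveDoc> replica) {
        // Index the replica's documents by id so each primary document can be looked up.
        Map<String, LiveDoc> replicaById = new TreeMap<>();
        for (LiveDoc doc : replica) {
            replicaById.put(doc.id(), doc);
        }
        List<String> mismatches = new ArrayList<>();
        for (LiveDoc p : primary) {
            LiveDoc r = replicaById.remove(p.id());
            if (r == null) {
                mismatches.add("missing on replica: " + p.id());
                continue;
            }
            if (p.version() != r.version()) {
                mismatches.add("version mismatch: " + p.id());
            }
            // A pruned _seq_no on either side is not treated as a difference.
            boolean bothHaveSeqNo = p.seqNo().isPresent() && r.seqNo().isPresent();
            if (bothHaveSeqNo && p.seqNo().getAsLong() != r.seqNo().getAsLong()) {
                mismatches.add("seq_no mismatch: " + p.id());
            }
        }
        // Whatever is left was never matched by a primary document.
        for (String id : replicaById.keySet()) {
            mismatches.add("missing on primary: " + id);
        }
        return mismatches;
    }
}
```

An empty list from `diff` means the two copies agree under these assumptions. The real test helper reads documents from the Lucene reader, which this sketch deliberately leaves out.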


elasticsearchmachine · 2026-03-11T10:19:08Z

Pinging @elastic/es-distributed (Team:Distributed)

romseygeek

+65 -151

My favourite kind of commit. LGTM!

fcofdez

LGTM

This commit adds tests to verify that CCR works correctly with pruned sequence numbers. The test is inspired by SeqNoPruningIT. Note: made by Cursor, adjusted by me. Also requires elastic#143999 to pass. Relates elastic#136305


tlrx · 2026-03-11T15:03:37Z

Thanks Francisco and Alan!



tlrx requested review from fcofdez, martijnvg and romseygeek March 11, 2026 10:18

tlrx added >test Issues or PRs that are addressing/adding tests :Distributed/Engine Anything around managing Lucene and the Translog in an open shard. v9.4.0 labels Mar 11, 2026

elasticsearchmachine added the Team:Distributed Meta label for distributed team. label Mar 11, 2026

tlrx mentioned this pull request Mar 11, 2026

Trim _seq_no doc values when no longer needed #136305

Closed

19 tasks

romseygeek approved these changes Mar 11, 2026

View reviewed changes

[CI] Auto commit changes from spotless

17c7638

fcofdez approved these changes Mar 11, 2026

View reviewed changes

tlrx mentioned this pull request Mar 11, 2026

[Test] Add CCR integration tests for pruned sequence numbers #144013

Merged

tlrx added 2 commits March 11, 2026 13:11

fix nested docs

6f6ca6d

Merge branch 'main' into 2026/03/11-adjust-test-framework-seq-no-disabled

26e951f

tlrx added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Mar 11, 2026

Merge branch 'main' into 2026/03/11-adjust-test-framework-seq-no-disabled

f490711

elasticsearchmachine merged commit 51601c6 into elastic:main Mar 11, 2026
36 checks passed

tlrx deleted the 2026/03/11-adjust-test-framework-seq-no-disabled branch March 11, 2026 15:03

elasticsearchmachine merged 5 commits into elastic:main from tlrx:2026/03/11-adjust-test-framework-seq-no-disabled
