Fix replica writes after _seq_no doc values are pruned #144180

Merged
tlrx merged 5 commits into elastic:main from tlrx:2026/03/13-fix-replica-op-after-seq-no-pruned on Mar 13, 2026
@tlrx tlrx commented Mar 13, 2026

When sequence numbers are disabled, PruningMergePolicy removes _seq_no doc values from merged segments once the global checkpoint advances past them.
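
As a rough illustration of the pruning described above, here is a toy model in which a segment is just a map from document id to `_seq_no`. It is a sketch under stated assumptions, not the actual `PruningMergePolicy`: the class `SeqNoPruningSketch` and the helper `pruneOnMerge` are hypothetical names, and "advances past" is assumed to mean that values at or below the global checkpoint are dropped.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: the real PruningMergePolicy operates on Lucene segments during merges.
class SeqNoPruningSketch {

    // Hypothetical helper: returns the per-document _seq_no values that survive a merge.
    // Assumption: a value is pruned once it is at or below the global checkpoint.
    static Map<Integer, Long> pruneOnMerge(Map<Integer, Long> seqNoByDocId, long globalCheckpoint) {
        Map<Integer, Long> merged = new HashMap<>();
        for (Map.Entry<Integer, Long> entry : seqNoByDocId.entrySet()) {
            if (entry.getValue() > globalCheckpoint) {
                merged.put(entry.getKey(), entry.getValue());
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<Integer, Long> segment = new HashMap<>();
        segment.put(0, 3L); // below the checkpoint used here, so it is pruned
        segment.put(1, 7L); // above the checkpoint, so it is kept
        Map<Integer, Long> merged = pruneOnMerge(segment, 5L);
        System.out.println("doc 0 still has _seq_no: " + merged.containsKey(0));
        System.out.println("doc 1 still has _seq_no: " + merged.containsKey(1));
    }
}
```

The point of the model is only that the document itself survives the merge while its `_seq_no` value does not.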

A subsequent write (update, delete) for the same document on the replica then fails in compareOpToLuceneDocBasedOnSeqNo because readNumericDocValues expects the doc value to exist.
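
The failure mode can be sketched with a strict read that assumes every document has a `_seq_no` value. This is a simplified stand-in, not the real `readNumericDocValues`: `MissingDocValueSketch` and `readSeqNoStrict` are hypothetical names, the segment is modelled as a plain map, and the exception type is chosen for illustration only.

```java
import java.util.Map;

class MissingDocValueSketch {

    // Hypothetical strict read: the caller assumes the doc value is always present.
    static long readSeqNoStrict(Map<Integer, Long> seqNoByDocId, int docId) {
        Long value = seqNoByDocId.get(docId);
        if (value == null) {
            throw new IllegalStateException("document [" + docId + "] has no _seq_no doc value");
        }
        return value;
    }

    public static void main(String[] args) {
        // Merged segment in which doc 0's _seq_no was pruned and doc 1's was kept.
        Map<Integer, Long> mergedSegment = Map.of(1, 7L);
        System.out.println("doc 1 seqNo: " + readSeqNoStrict(mergedSegment, 1));
        try {
            readSeqNoStrict(mergedSegment, 0);
        } catch (IllegalStateException e) {
            System.out.println("replica operation would fail: " + e.getMessage());
        }
    }
}
```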

With assertions enabled, this throws an AssertionError that leaves the primary waiting for the replica response (see test failure here). In production (no assertions), this would cause a replica shard failure.

The fix skips loading _seq_no doc values when sequence numbers are disabled and returns UNASSIGNED_SEQ_NO in docAndSeqNo instead, which then matches this condition in compareOpToLuceneDocBasedOnSeqNo:

} else if (op.seqNo() > docAndSeqNo.seqNo) {
    status = OpVsLuceneDocStatus.OP_NEWER;
}

and the operation would be processed normally (OP_NEWER) on the replica.
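
To make the reasoning concrete, the following is a minimal sketch of why returning UNASSIGNED_SEQ_NO leads to OP_NEWER. It is not the engine code: `ReplicaOpCompareSketch`, its `lookupSeqNo` and `compare` methods, and the map-based segment are simplified assumptions. The constant mirrors `SequenceNumbers.UNASSIGNED_SEQ_NO` (-2), and the `loadSeqNo` flag follows the parameter added in this PR.

```java
import java.util.Map;
import java.util.Set;

class ReplicaOpCompareSketch {

    // Mirrors the value of SequenceNumbers.UNASSIGNED_SEQ_NO in Elasticsearch.
    static final long UNASSIGNED_SEQ_NO = -2L;

    enum OpVsLuceneDocStatus { OP_NEWER, OP_STALE_OR_EQUAL, LUCENE_DOC_NOT_FOUND }

    // Hypothetical lookup: returns null if the document is unknown. When loadSeqNo is
    // false, the doc-value read is skipped and UNASSIGNED_SEQ_NO is returned instead.
    static Long lookupSeqNo(Set<Integer> docIds, Map<Integer, Long> seqNoByDocId, int docId, boolean loadSeqNo) {
        if (!docIds.contains(docId)) {
            return null;
        }
        if (!loadSeqNo) {
            return UNASSIGNED_SEQ_NO;
        }
        Long value = seqNoByDocId.get(docId);
        if (value == null) {
            throw new IllegalStateException("document [" + docId + "] has no _seq_no doc value");
        }
        return value;
    }

    // Simplified version of the comparison quoted above.
    static OpVsLuceneDocStatus compare(long opSeqNo, Long docSeqNo) {
        if (docSeqNo == null) {
            return OpVsLuceneDocStatus.LUCENE_DOC_NOT_FOUND;
        } else if (opSeqNo > docSeqNo) {
            return OpVsLuceneDocStatus.OP_NEWER;
        } else {
            return OpVsLuceneDocStatus.OP_STALE_OR_EQUAL;
        }
    }

    public static void main(String[] args) {
        Set<Integer> docIds = Set.of(0);
        Map<Integer, Long> prunedSegment = Map.of(); // doc 0 exists but its _seq_no was pruned
        Long docSeqNo = lookupSeqNo(docIds, prunedSegment, 0, false);
        System.out.println(compare(8L, docSeqNo));
    }
}
```

Because sequence numbers assigned to real operations are non-negative, any incoming operation compares greater than -2, so the first branch that involves the stored value is the one taken.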

Not marking as bug since it's not released yet.

Relates #136305

@tlrx tlrx added the >non-issue, :Distributed/Engine, and v9.4.0 labels Mar 13, 2026
@tlrx tlrx requested review from burqen, fcofdez and romseygeek March 13, 2026 10:44
@elasticsearchmachine elasticsearchmachine added the Team:Distributed label Mar 13, 2026
@elasticsearchmachine

Pinging @elastic/es-distributed (Team:Distributed)

@romseygeek romseygeek left a comment

LGTM


/** Return null if id is not found. */
- DocIdAndSeqNo lookupSeqNo(BytesRef id, LeafReaderContext context) throws IOException {
+ DocIdAndSeqNo lookupSeqNo(BytesRef id, LeafReaderContext context, boolean loadSeqNo) throws IOException {
Contributor

Maybe rename this to lookupDocIdAndSeqNo?

Member Author

I pushed fcb00b8

@fcofdez fcofdez left a comment

LGTM 👍

@tlrx tlrx merged commit 1cb9630 into elastic:main Mar 13, 2026
36 checks passed
@tlrx tlrx deleted the 2026/03/13-fix-replica-op-after-seq-no-pruned branch March 13, 2026 15:20
tlrx commented Mar 13, 2026

Thanks Alan & Francisco!

szybia added a commit to szybia/elasticsearch that referenced this pull request Mar 13, 2026
…elocations

* upstream/main: (72 commits)
  [Test] Randomly disable sequence numbers in CcrTimeSeriesDataStreamsIT (elastic#143930)
  Fix AsyncSearchIndexServiceTests.testCircuitBreaker failure (elastic#144058)
  Refine GenerativeIT some more, this time with accounting for some added (elastic#144220)
  ESQL: Physical Planning on the Lookup Node (elastic#143707)
  Mute org.elasticsearch.xpack.esql.CsvIT test {csv-spec:approximation.Approximate stats by with zero variance} elastic#144240
  Trigger counter metrics in test for delta temporality measurements (elastic#144193)
  fix capabiltiy approximation_v3 (elastic#144230)
  [ci] Add PR pipeline for testing ipv6 and fix tests not working with ipv6 (elastic#140473)
  update (elastic#144095)
  Make from/to optional in TBUCKET when Kibana timestamp filter is present (elastic#144057)
  Extract reroute behavior from create-index request classes (elastic#144140)
  ESQL: Fix release build only failures (elastic#144122)
  ES|QL query approximation: move sample correction to data node (elastic#144005)
  Add indexing pressure tracking to OTLP endpoints (elastic#144009)
  Fix replica writes after _seq_no doc values are pruned (elastic#144180)
  allow tests to configure supportsLoadingConfig (elastic#144061)
  [ES|QL] Unmute testGiantTextFieldInSubqueryIntermediateResultsWithSort (elastic#144126)
  [ESQL][DOCS] Add CPS page (unpublished for moment) (elastic#144206)
  ESQL: Forbid "load" unmapped_fields for certain commands (elastic#144115)
  Add CCS Remote Views Detection (elastic#143384)
  ...
michalborek pushed a commit to michalborek/elasticsearch that referenced this pull request Mar 23, 2026