Sporadic failure in testForceStaleReplicaToBePromotedToPrimary #35497

@DaveCTurner

#34140 introduces an assertion in PrimaryAllocationIT#testForceStaleReplicaToBePromotedToPrimary that sometimes fails. See for instance https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+multijob-unix-compatibility/os=fedora/56/console:

   > Throwable #1: java.lang.AssertionError: expected:<[]> but was:<[tuWXzhckTEu0-GBybc-VJQ]>
   > 	at __randomizedtesting.SeedInfo.seed([6DA82B0AC7639B17:3BB0A62AD8D3F5B0]:0)
   > 	at org.elasticsearch.cluster.routing.PrimaryAllocationIT.testForceStaleReplicaToBePromotedToPrimary(PrimaryAllocationIT.java:222)
   > 	at java.lang.Thread.run(Thread.java:748)

I think the issue is that the test allocates a stale or empty primary via a reroute command, then grabs the cluster state, and then asserts that the in-sync IDs in that cluster state are what they should be straight after the reroute. However, if the shard has actually been allocated by the time we get hold of the cluster state, the in-sync set is no longer in its post-reroute form (in the failure above it already contains an allocation ID) and the assertion fails.
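
To make the suspected race concrete, here is a toy sketch in plain Java. It is not the actual test and uses no Elasticsearch API: `InSyncRaceSketch`, `reroute`, `shardStarted` and `observe` are made-up names. It also assumes, going only by the failure message above, that the in-sync set is expected to be empty straight after the reroute and gains an allocation ID once the shard is allocated.

```java
import java.util.HashSet;
import java.util.Set;

public class InSyncRaceSketch {
    // Toy stand-in for the in-sync allocation IDs held in the cluster state.
    private final Set<String> inSyncIds = new HashSet<>();

    // Hypothetical reroute: assumed to leave the in-sync set empty.
    public void reroute() {
        inSyncIds.clear();
    }

    // Hypothetical "shard allocated" event: the allocation ID becomes in-sync.
    public void shardStarted(String allocationId) {
        inSyncIds.add(allocationId);
    }

    // Copy of the in-sync set, like reading it from a fetched cluster state.
    public Set<String> snapshot() {
        return new HashSet<>(inSyncIds);
    }

    // Runs reroute, optionally lets the shard start first, then takes the snapshot.
    public static Set<String> observe(boolean shardStartsBeforeSnapshot) {
        InSyncRaceSketch cluster = new InSyncRaceSketch();
        cluster.reroute();
        if (shardStartsBeforeSnapshot) {
            cluster.shardStarted("tuWXzhckTEu0-GBybc-VJQ");
        }
        return cluster.snapshot();
    }

    public static void main(String[] args) {
        // Snapshot wins the race: matches what the assertion expects.
        System.out.println(observe(false));
        // Shard allocation wins the race: matches the observed failure.
        System.out.println(observe(true));
    }
}
```

In the real test the ordering is not under our control, so an assertion on the state "straight after the reroute" only holds when the snapshot happens to win.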

@vladimirdolzhenko could you take a look?

Metadata

Labels

:Distributed Indexing/Recovery (Anything around constructing a new shard, either from a local or a remote source.)
>test-failure (Triaged test failures from CI)
