Closed
Labels
:Distributed Indexing/Recovery (Anything around constructing a new shard, either from a local or a remote source.)
>test-failure (Triaged test failures from CI)
Description
Example build failure
Jenkins build: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+default-distro+bwc/BWC_VERSION=7.2.1,nodes=centos-7&&immutable/502/
Build scan: https://gradle-enterprise.elastic.co/s/o257sqrvlr37k
Reproduction line
When I tried to reproduce this locally, I got an unrelated error. Perhaps I'm not set up correctly for these integration tests.
REPRODUCE WITH: ./gradlew ':qa:full-cluster-restart:v7.2.1#upgradedClusterTest' --tests "org.elasticsearch.upgrades.FullClusterRestartIT.testOperationBasedRecovery" \
-Dtests.seed=E537B7AF25FA7DE1 \
-Dtests.security.manager=true \
-Dtests.locale=es-AR \
-Dtests.timezone=America/Buenos_Aires \
-Dtests.distribution=default \
-Dcompiler.java=13
REPRODUCE WITH: ./gradlew ':x-pack:qa:full-cluster-restart:v7.2.1#upgradedClusterTest' --tests "org.elasticsearch.xpack.restart.CoreFullClusterRestartIT.testOperationBasedRecovery" \
-Dtests.seed=E537B7AF25FA7DE1 \
-Dtests.security.manager=true \
-Dtests.locale=zh-TW \
-Dtests.timezone=MIT \
-Dtests.distribution=default \
-Dcompiler.java=13
Example relevant log:
java.lang.AssertionError:
Expected: an empty collection
but: <[{name=_0.cfe, length_in_bytes=405, reused=false, recovered_in_bytes=405}, {name=_0.si, length_in_bytes=383, reused=false, recovered_in_bytes=383}, {name=_0_2_Lucene80_0.dvm, length_in_bytes=160, reused=false, recovered_in_bytes=160}, {name=_2.si, length_in_bytes=383, reused=false, recovered_in_bytes=383}, {name=_0.cfs, length_in_bytes=4542, reused=false, recovered_in_bytes=4542}, {name=_2.cfe, length_in_bytes=405, reused=false, recovered_in_bytes=405}, {name=_0_2.fnm, length_in_bytes=906, reused=false, recovered_in_bytes=906}, {name=_2.cfs, length_in_bytes=2637, reused=false, recovered_in_bytes=2637}, {name=_0_2_Lucene80_0.dvd, length_in_bytes=97, reused=false, recovered_in_bytes=97}, {name=segments_5, length_in_bytes=443, reused=false, recovered_in_bytes=443}]>
at __randomizedtesting.SeedInfo.seed([E537B7AF25FA7DE1:F8DBBB8A237C8796]:0)
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
at org.junit.Assert.assertThat(Assert.java:956)
at org.junit.Assert.assertThat(Assert.java:923)
at org.elasticsearch.test.rest.ESRestTestCase.assertNoFileBasedRecovery(ESRestTestCase.java:1137)
at org.elasticsearch.upgrades.FullClusterRestartIT.testOperationBasedRecovery(FullClusterRestartIT.java:1295)
[…]
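To make the log above easier to read: the assertion expected the recovery to report no files (an empty collection), but ten files were listed, each with reused=false and fully recovered. The following is a minimal Java sketch of that kind of check, using the file entries copied from the log. It is not the ESRestTestCase implementation; the names RecoveryFileCheck, FileDetail, copiedFiles, totalRecoveredBytes and sampleFromLog are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RecoveryFileCheck {

    /** Hypothetical holder for one entry of the file list shown in the log. */
    static final class FileDetail {
        final String name;
        final long lengthInBytes;
        final boolean reused;
        final long recoveredInBytes;

        FileDetail(String name, long lengthInBytes, boolean reused, long recoveredInBytes) {
            this.name = name;
            this.lengthInBytes = lengthInBytes;
            this.reused = reused;
            this.recoveredInBytes = recoveredInBytes;
        }
    }

    /** Returns the entries that were copied rather than reused. */
    static List<FileDetail> copiedFiles(List<FileDetail> details) {
        List<FileDetail> copied = new ArrayList<>();
        for (FileDetail d : details) {
            if (!d.reused) {
                copied.add(d);
            }
        }
        return copied;
    }

    /** Sums recovered_in_bytes over all entries. */
    static long totalRecoveredBytes(List<FileDetail> details) {
        long total = 0;
        for (FileDetail d : details) {
            total += d.recoveredInBytes;
        }
        return total;
    }

    /** The ten entries from the failure message, transcribed by hand. */
    static List<FileDetail> sampleFromLog() {
        return Arrays.asList(
            new FileDetail("_0.cfe", 405, false, 405),
            new FileDetail("_0.si", 383, false, 383),
            new FileDetail("_0_2_Lucene80_0.dvm", 160, false, 160),
            new FileDetail("_2.si", 383, false, 383),
            new FileDetail("_0.cfs", 4542, false, 4542),
            new FileDetail("_2.cfe", 405, false, 405),
            new FileDetail("_0_2.fnm", 906, false, 906),
            new FileDetail("_2.cfs", 2637, false, 2637),
            new FileDetail("_0_2_Lucene80_0.dvd", 97, false, 97),
            new FileDetail("segments_5", 443, false, 443));
    }

    public static void main(String[] args) {
        List<FileDetail> details = sampleFromLog();
        // An operation-based recovery would be expected to leave this list empty.
        System.out.println(copiedFiles(details).size() + " files copied, "
            + totalRecoveredBytes(details) + " bytes");
    }
}
```

Run against the entries from the log, this reports 10 copied files totalling 10361 bytes, which is consistent with the whole (small) shard having been copied rather than replayed from operations.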
Frequency
This failure began to appear yesterday and has recurred across several cycles of scheduled BWC tests. We have 42 failures so far, according to build tests. The first failures came just after #51189 ("Use Lucene index in peer recovery and resync") was merged. That PR touched the FullClusterRestartIT class that is now failing, so it seems like a good place to start looking.