SnapshotLifecycleRestIT » testBasicTimeBasedRetenion failure #48017

Closed
jrodewig opened this issue Oct 14, 2019 · 6 comments · Fixed by #51075
Labels
:Data Management/ILM+SLM Index and Snapshot lifecycle management >test-failure Triaged test failures from CI

Comments

@jrodewig
Contributor

jrodewig commented Oct 14, 2019

This test failed in the 7.x branch following 1702667.

I was not able to reproduce it locally.

[2019-10-14T12:58:57,730][INFO ][o.e.x.s.SnapshotLifecycleRestIT] [testBasicTimeBasedRetenion] before test
[2019-10-14T12:59:00,370][INFO ][o.e.x.s.SnapshotLifecycleRestIT] [testBasicTimeBasedRetenion] --> checking to see if snapshot has been deleted...
[2019-10-14T12:59:00,389][INFO ][o.e.x.s.SnapshotLifecycleRestIT] [testBasicTimeBasedRetenion] --> checking to see if snapshot has been deleted...
[2019-10-14T12:59:00,400][INFO ][o.e.x.s.SnapshotLifecycleRestIT] [testBasicTimeBasedRetenion] --> checking to see if snapshot has been deleted...
[2019-10-14T12:59:00,419][INFO ][o.e.x.s.SnapshotLifecycleRestIT] [testBasicTimeBasedRetenion] --> checking to see if snapshot has been deleted...
[2019-10-14T12:59:00,438][INFO ][o.e.x.s.SnapshotLifecycleRestIT] [testBasicTimeBasedRetenion] --> checking to see if snapshot has been deleted...
[2019-10-14T12:59:00,462][INFO ][o.e.x.s.SnapshotLifecycleRestIT] [testBasicTimeBasedRetenion] --> checking to see if snapshot has been deleted...
[2019-10-14T12:59:00,503][INFO ][o.e.x.s.SnapshotLifecycleRestIT] [testBasicTimeBasedRetenion] --> checking to see if snapshot has been deleted...
[2019-10-14T12:59:00,582][INFO ][o.e.x.s.SnapshotLifecycleRestIT] [testBasicTimeBasedRetenion] --> checking to see if snapshot has been deleted...
[2019-10-14T12:59:00,723][INFO ][o.e.x.s.SnapshotLifecycleRestIT] [testBasicTimeBasedRetenion] --> checking to see if snapshot has been deleted...
[2019-10-14T12:59:00,990][INFO ][o.e.x.s.SnapshotLifecycleRestIT] [testBasicTimeBasedRetenion] --> checking to see if snapshot has been deleted...
[2019-10-14T12:59:01,511][INFO ][o.e.x.s.SnapshotLifecycleRestIT] [testBasicTimeBasedRetenion] --> checking to see if snapshot has been deleted...
[2019-10-14T12:59:02,695][INFO ][o.e.x.s.SnapshotLifecycleRestIT] [testBasicTimeBasedRetenion] --> checking to see if snapshot has been deleted...
[2019-10-14T12:59:04,771][INFO ][o.e.x.s.SnapshotLifecycleRestIT] [testBasicTimeBasedRetenion] --> checking to see if snapshot has been deleted...
[2019-10-14T12:59:08,894][INFO ][o.e.x.s.SnapshotLifecycleRestIT] [testBasicTimeBasedRetenion] --> checking to see if snapshot has been deleted...
[2019-10-14T12:59:17,109][INFO ][o.e.x.s.SnapshotLifecycleRestIT] [testBasicTimeBasedRetenion] --> checking to see if snapshot has been deleted...
[2019-10-14T12:59:33,534][INFO ][o.e.x.s.SnapshotLifecycleRestIT] [testBasicTimeBasedRetenion] --> checking to see if snapshot has been deleted...
[2019-10-14T13:00:06,337][INFO ][o.e.x.s.SnapshotLifecycleRestIT] [testBasicTimeBasedRetenion] --> checking to see if snapshot has been deleted...
[2019-10-14T13:00:07,631][INFO ][o.e.x.s.SnapshotLifecycleRestIT] [testBasicTimeBasedRetenion] after test

Build scan: https://gradle-enterprise.elastic.co/s/fw2xlp2ojjp7o/failure?openFailures=WzBd&openStackTraces=WzFd#top=0

@jrodewig jrodewig added >test-failure Triaged test failures from CI :Distributed Coordination/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) labels Oct 14, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-distributed (:Distributed/Allocation)

@jrodewig jrodewig added :Data Management/ILM+SLM Index and Snapshot lifecycle management and removed :Distributed Coordination/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) labels Oct 14, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-core-features (:Core/Features/ILM+SLM)

@original-brownbear
Member

@dakrone @gwbrown can one of you take a look here? I tried, but I don't really understand how retention runs are counted (that's what's eventually failing here):


2> REPRODUCE WITH: ./gradlew ':x-pack:plugin:ilm:qa:multi-node:integTestRunner' --tests "org.elasticsearch.xpack.slm.SnapshotLifecycleRestIT.testBasicTimeBasedRetenion" -Dtests.seed=7830545AB5631D78 -Dtests.security.manager=true -Dtests.locale=nl -Dtests.timezone=America/Aruba -Dcompiler.java=12 -Druntime.java=8
--
2> java.lang.AssertionError:
Expected: <1>
but: was <68>
at __randomizedtesting.SeedInfo.seed([7830545AB5631D78:ED373C7A97F51630]:0)
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
at org.junit.Assert.assertThat(Assert.java:956)
at org.junit.Assert.assertThat(Assert.java:923)
at org.elasticsearch.xpack.slm.SnapshotLifecycleRestIT.lambda$testBasicTimeBasedRetenion$11(SnapshotLifecycleRestIT.java:405)
at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:893)
at org.elasticsearch.xpack.slm.SnapshotLifecycleRestIT.testBasicTimeBasedRetenion(SnapshotLifecycleRestIT.java:386)

Maybe you'll have an easier time here than me? :)
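
For context on the trace above: the `assertBusy` frame and the widening gaps between the "checking to see if snapshot has been deleted..." log lines are consistent with a helper that retries an assertion with growing delays until a timeout. The sketch below illustrates that general pattern only. `BusyAssert`, its signature, and the one-second cap on the delay are assumptions made for illustration, not the actual `ESTestCase` implementation.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical retrying-assertion helper; NOT the real ESTestCase.assertBusy. */
final class BusyAssert {
    private BusyAssert() {}

    /** Re-runs {@code check} with doubling sleeps until it passes or the deadline has passed. */
    static void assertBusy(Runnable check, long maxWaitMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(maxWaitMillis);
        long sleepMillis = 1;
        while (true) {
            try {
                check.run();
                return; // the assertion held, stop polling
            } catch (AssertionError e) {
                if (System.nanoTime() >= deadline) {
                    throw e; // out of time: surface the most recent failure
                }
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                throw new AssertionError("interrupted while waiting", ie);
            }
            // Double the delay between attempts; the 1s cap is an assumption.
            sleepMillis = Math.min(sleepMillis * 2, 1000L);
        }
    }

    public static void main(String[] args) {
        int[] attempts = {0};
        assertBusy(() -> {
            attempts[0]++;
            if (attempts[0] < 3) {
                throw new AssertionError("not ready yet");
            }
        }, 2000);
        System.out.println("passed after " + attempts[0] + " attempts");
    }
}
```

Retrying helps only when the checked value eventually converges on the expected one. An exact-equality check against a counter that has already overshot (here `<68>` against an expected `<1>`) fails on every attempt until the timeout.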

@gwbrown gwbrown self-assigned this Oct 23, 2019
@dliappis
Contributor

Another occurrence: https://gradle-enterprise.elastic.co/s/27s2vumgbjtpk

dliappis added a commit to dliappis/elasticsearch that referenced this issue Oct 24, 2019
@dliappis
Contributor

These have been failing for a while on various branches, so I raised PRs to mute them in master/7.x and 7.5.

dliappis added a commit that referenced this issue Oct 24, 2019
dliappis added a commit to dliappis/elasticsearch that referenced this issue Oct 24, 2019
dliappis added a commit to dliappis/elasticsearch that referenced this issue Oct 24, 2019
dliappis added a commit that referenced this issue Oct 24, 2019
dliappis added a commit that referenced this issue Oct 24, 2019
@gwbrown
Contributor

gwbrown commented Oct 29, 2019

I've added some extra logging in #48612 that will hopefully help catch this failure. I'll try to keep an eye out, but if you see this test fail again, please ping me.

dakrone added a commit to dakrone/elasticsearch that referenced this issue Jan 15, 2020
These policies store statistics, but since stats updating is asynchronous, it's
possible for the update from one test to bleed into a separate one. This change
switches the tests to use separate policy ids so that their stats are tracked
independently. It also relaxes the checking constraint in one of the tests.

Hopefully this:
Resolves elastic#48531
Resolves elastic#48017
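
The commit message above describes stats updates from one test bleeding into another because they are applied asynchronously. A minimal, deterministic sketch of that effect follows. It models the delayed update with a manually drained queue instead of real threads, and every name in it (`PolicyStatsSketch`, `recordRunAsync`, `flush`, the policy ids) is hypothetical rather than taken from the SLM code.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

/** Hypothetical model of per-policy stats that are updated asynchronously. */
final class PolicyStatsSketch {
    private final Map<String, Integer> retentionRuns = new HashMap<>();
    private final Queue<Runnable> pendingUpdates = new ArrayDeque<>();

    /** Queues a stats update instead of applying it right away. */
    void recordRunAsync(String policyId) {
        pendingUpdates.add(() -> retentionRuns.merge(policyId, 1, Integer::sum));
    }

    /** Applies every queued update, standing in for the async work completing. */
    void flush() {
        Runnable update;
        while ((update = pendingUpdates.poll()) != null) {
            update.run();
        }
    }

    int runs(String policyId) {
        return retentionRuns.getOrDefault(policyId, 0);
    }

    public static void main(String[] args) {
        // Shared id: "test A" finishes before its update lands, then "test B"
        // reuses the same policy id and sees both updates in its counter.
        PolicyStatsSketch shared = new PolicyStatsSketch();
        shared.recordRunAsync("slm-policy"); // from test A
        shared.recordRunAsync("slm-policy"); // from test B
        shared.flush();
        System.out.println(shared.runs("slm-policy")); // prints 2, not the 1 test B expects

        // Separate ids: the late update from test A no longer affects test B.
        PolicyStatsSketch separate = new PolicyStatsSketch();
        separate.recordRunAsync("policy-test-a");
        separate.recordRunAsync("policy-test-b");
        separate.flush();
        System.out.println(separate.runs("policy-test-b")); // prints 1
    }
}
```

With separate ids, an exact-count assertion in the second test is unaffected by the first test's late update. Relaxing the check (for example, asserting "at least one" rather than "exactly one") is a complementary safeguard, which seems to be what the commit means by relaxing the checking constraint.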
dakrone added a commit that referenced this issue Jan 17, 2020
dakrone added a commit to dakrone/elasticsearch that referenced this issue Jan 17, 2020
…1075)

dakrone added a commit that referenced this issue Jan 17, 2020
SivagurunathanV pushed a commit to SivagurunathanV/elasticsearch that referenced this issue Jan 23, 2020
…1075)
