Skip to content

Conversation

@tlrx
Copy link
Member

@tlrx tlrx commented Jul 22, 2019

Deleting a follower index does not delete its ShardFollowTasks, potentially leaving many persistent tasks in the cluster that cannot be allocated on nodes and unnecessary fill the logs. This pull request adds a cluster state listener (ShardFollowTaskCleaner) that completes (with a failure) any persistent task that refers to a non existent follower index.

I think that this bug has been introduced by #34404: before this change the task would have been completed as failed and removed from the cluster state.

@tlrx tlrx added >bug :Distributed Indexing/CCR Issues around the Cross Cluster State Replication features v8.0.0 v7.4.0 labels Jul 22, 2019
@tlrx tlrx requested review from jasontedor and martijnvg July 22, 2019 14:41
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed

@tlrx
Copy link
Member Author

tlrx commented Jul 23, 2019

@elasticmachine run elasticsearch-ci/1
@elasticmachine run elasticsearch-ci/default-distro
@elasticmachine run elasticsearch-ci/packaging-sample

Copy link
Member

@jasontedor jasontedor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for picking this up. I left a suggestion.

@tlrx tlrx requested a review from jasontedor July 23, 2019 12:09
@tlrx
Copy link
Member Author

tlrx commented Jul 23, 2019

Thanks @jasontedor, I applied your feedback. Can you please have another look?

Copy link
Member

@jasontedor jasontedor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@jasontedor
Copy link
Member

Can you apply this to 6.8 as well?

@tlrx tlrx merged commit 5afdca2 into elastic:master Jul 24, 2019
@tlrx tlrx deleted the clean-up-shard-follow-tasks-for-deleted-indices branch July 24, 2019 08:13
@tlrx
Copy link
Member Author

tlrx commented Jul 24, 2019

Thanks @jasontedor. I'll backport this change down the line to 6.8.

tlrx added a commit to tlrx/elasticsearch that referenced this pull request Jul 24, 2019
Deleting a follower index does not delete its ShardFollowTasks, potentially
leaving many persistent tasks in the cluster that cannot be allocated on
nodes and unnecessary fill the logs. This commit adds a cluster state listener
(ShardFollowTaskCleaner) that completes (with a failure) any persistent task
that refers to a non existent follower index.

I think that this bug has been introduced by elastic#34404: before this change the
task would have been completed as failed and removed from the cluster state.
tlrx added a commit to tlrx/elasticsearch that referenced this pull request Jul 24, 2019
Deleting a follower index does not delete its ShardFollowTasks, potentially
leaving many persistent tasks in the cluster that cannot be allocated on
nodes and unnecessary fill the logs. This commit adds a cluster state listener
(ShardFollowTaskCleaner) that completes (with a failure) any persistent task
that refers to a non existent follower index.

I think that this bug has been introduced by elastic#34404: before this change the
task would have been completed as failed and removed from the cluster state.
tlrx added a commit that referenced this pull request Jul 24, 2019
…ex (#44801)

This commit unmutes and renames the test that failed on CI (#44796) 
after #44702 has been merged.

This test assumes that follow stats still exist after a follower index has 
been deleted. The follow stats are based on persistent tasks, and 
since #44702 the persistent tasks of deleted following indices are now
 automatically cleaned up to avoid to bloat the cluster state.

I don't think we should report any follow stats for deleted indices and I 
don't think that this test makes much sense now the tasks are cleaned 
up. This is why the test has been renamed.

closes #44796
tlrx added a commit to tlrx/elasticsearch that referenced this pull request Jul 24, 2019
…ex (elastic#44801)

This commit unmutes and renames the test that failed on CI (elastic#44796)
after elastic#44702 has been merged.

This test assumes that follow stats still exist after a follower index has
been deleted. The follow stats are based on persistent tasks, and
since elastic#44702 the persistent tasks of deleted following indices are now
 automatically cleaned up to avoid to bloat the cluster state.

I don't think we should report any follow stats for deleted indices and I
don't think that this test makes much sense now the tasks are cleaned
up. This is why the test has been renamed.

closes elastic#44796
tlrx added a commit to tlrx/elasticsearch that referenced this pull request Jul 24, 2019
…ex (elastic#44801)

This commit unmutes and renames the test that failed on CI (elastic#44796)
after elastic#44702 has been merged.

This test assumes that follow stats still exist after a follower index has
been deleted. The follow stats are based on persistent tasks, and
since elastic#44702 the persistent tasks of deleted following indices are now
 automatically cleaned up to avoid to bloat the cluster state.

I don't think we should report any follow stats for deleted indices and I
don't think that this test makes much sense now the tasks are cleaned
up. This is why the test has been renamed.

closes elastic#44796
tlrx added a commit to tlrx/elasticsearch that referenced this pull request Jul 24, 2019
…ex (elastic#44801)

This commit unmutes and renames the test that failed on CI (elastic#44796)
after elastic#44702 has been merged.

This test assumes that follow stats still exist after a follower index has
been deleted. The follow stats are based on persistent tasks, and
since elastic#44702 the persistent tasks of deleted following indices are now
 automatically cleaned up to avoid to bloat the cluster state.

I don't think we should report any follow stats for deleted indices and I
don't think that this test makes much sense now the tasks are cleaned
up. This is why the test has been renamed.

closes elastic#44796
tlrx added a commit that referenced this pull request Jul 25, 2019
Deleting a follower index does not delete its ShardFollowTasks, potentially
leaving many persistent tasks in the cluster that cannot be allocated on
nodes and unnecessary fill the logs. This commit adds a cluster state listener
(ShardFollowTaskCleaner) that completes (with a failure) any persistent task
that refers to a non existent follower index.

I think that this bug has been introduced by #34404: before this change the
task would have been completed as failed and removed from the cluster state.

Backport of #44702 and #44801 on 7.x
tlrx added a commit that referenced this pull request Jul 25, 2019
Deleting a follower index does not delete its ShardFollowTasks, potentially
leaving many persistent tasks in the cluster that cannot be allocated on
nodes and unnecessary fill the logs. This commit adds a cluster state listener
(ShardFollowTaskCleaner) that completes (with a failure) any persistent task
that refers to a non existent follower index.

I think that this bug has been introduced by #34404: before this change the
task would have been completed as failed and removed from the cluster state.

Backport of #44702 and #44801 to 7.3
tlrx added a commit that referenced this pull request Jul 25, 2019
Deleting a follower index does not delete its ShardFollowTasks, potentially
leaving many persistent tasks in the cluster that cannot be allocated on
nodes and unnecessary fill the logs. This commit adds a cluster state listener
(ShardFollowTaskCleaner) that completes (with a failure) any persistent task
that refers to a non existent follower index.

I think that this bug has been introduced by #34404: before this change the
task would have been completed as failed and removed from the cluster state.

Backport of #44702 and #44801 on 6.8
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

>bug :Distributed Indexing/CCR Issues around the Cross Cluster State Replication features v6.8.3 v7.3.1 v7.4.0 v8.0.0-alpha1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants