Skip to content

Conversation

@original-brownbear
Copy link
Contributor

On master failover we have to resent all the shard failed messages,
but the transport requests remain the same in the eyes of equals.
If the master failover is registered and the requests to the new master
are sent before all the callbacks have executed and the request to the
old master removed from the deduplicator then the requuests to the new
master will incorrectly fail and the snapshot get stuck.

Closes #51253

backport of #51270

On master failover we have to resent all the shard failed messages,
but the transport requests remain the same in the eyes of `equals`.
If the master failover is registered and the requests to the new master
are sent before all the callbacks have executed and the request to the
old master removed from the deduplicator then the requuests to the new
master will incorrectly fail and the snapshot get stuck.

Closes elastic#51253
@original-brownbear original-brownbear added :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs backport labels Jan 22, 2020
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (:Distributed/Snapshot/Restore)

@original-brownbear
Copy link
Contributor Author

Jenkins run elasticsearch-ci/1

@original-brownbear original-brownbear merged commit a2a81a7 into elastic:7.6 Jan 22, 2020
@original-brownbear original-brownbear deleted the 51270-7.6 branch January 22, 2020 13:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants