Fix deadlock in 10-node cluster convergence #2467

manishrjain · 2018-07-01T02:55:46Z

This PR fixes #2286.

- CheckQuorum was causing multiple issues. During a 5-node Zero cluster
  bootstrap, it would cause the leader to step down when the cluster size was
  2, which then blocked all remaining joins indefinitely. It would also cause
  the leader to step down in a seemingly healthy cluster that was processing
  proposals. CheckQuorum was mandated by raft.ReadOnlyLeaseBased, which is a
  less safe option for linearizable reads. Switch ReadOnlyOption back to
  raft.ReadOnlySafe. Moreover, we don't need quorum-based linearizable reads
  in the Alpha servers, because of the switch to proposing and then applying
  transaction updates.
- raft.ReadIndex is not working for some reason, so its usage is commented out
  in Zero (and removed from Alpha permanently). This needs to be fixed once
  the following issue is resolved: "ReadIndex doesn't provide ReadStates"
  (etcd-io/etcd#9893).
- The logic for linearizable reads was duplicated in both Zero and Alpha.
  Refactor it into one place in conn/node.go.
- Retry conf change proposals if they time out. This mechanism is similar to
  the one introduced for normal proposals in a previous commit, 06ea4c.
- Use a lock to allow only one JoinCluster call at a time. Block JoinCluster
  until node.AddToCluster succeeds (or returns an error).
- Set the raft library to 3.2.23. Before the upgrade, we were at 3.2.6.


manishrjain added 8 commits June 28, 2018 20:57

Trying to understand why JoinCluster doesn't work properly.

8d89e98

Fucking works. Fucking works.

4c9cfa1

It all works now.

984d42b

More Dgraph servers. Found a new issue where requesting read quorum doesn't respond, unless there's a proposal.

b178009

Refactor wait lin read code and move it to conn/node.go

cfd0407

Some more experimentation. But, no luck.

4768347

Remove lin read wait for server, because txn timestamp should be sufficient for waiting. Also, for the time being, comment out lin read wait from Zero as well.

8797a47

Remove printfs

154f46e

manishrjain mentioned this pull request Jul 1, 2018

Possible deadlock in JoinCluster #2286

Closed

manishrjain merged commit eb3910c into master Jul 1, 2018

manishrjain deleted the mrjn/joincluster branch July 1, 2018 03:17

dshekhar95 mentioned this pull request Nov 29, 2022

fix(JoinCluster): Avoid retrying JoinCluster indefinitely (#7961) #8467

Closed
