Joint consensus may get stuck if either C<old> or C<new> could not reach a quorum after entering the begin-conf-change stage #192
Comments
Hi @Fullstop000, it's so cool you're testing out joint consensus! Yes, the situation you describe is a bad one to be in. I think a rollback mechanism does make sense. I'm curious whether a formal definition of such a thing exists. (cc @ongardie?) In TiKV we're focusing on a "Replace Node" use case for this right now. However, in order to use joint consensus fully we do need to consider this. I think allowing the leader to roll back is a valid idea... What do you think @BusyJay / @hicqu / @overvenus ?
Rolling back is a nice-to-have feature. But it can't be implemented unless the conf change takes effect immediately on receipt.
Do we have any plan to refactor the current implementation so that ConfChange takes effect immediately? @BusyJay
@Fullstop000 Our current implementation's docs do suggest you apply the conf change after receiving the message, not when applying the entry.
I'm not aware of any formal spec for joint consensus, @Hoverbear. You can search this file for the word "configuration" to see what I implemented in LogCabin: https://github.com/logcabin/logcabin/blob/master/Server/RaftConsensus.cc There is a rollback mechanism in LogCabin. It's probably not explicitly described in the original paper because rolling back log entries is something that Raft already does. In LogCabin and the Raft paper, the configuration a server uses is the last one in its log (which may not be committed). So, when extraneous entries are removed from its log, its configuration rolls back.
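The point above, that a server's effective configuration is simply the last configuration entry in its log (committed or not), so truncating the log rolls the configuration back for free, can be sketched as follows. This is a minimal illustration with hypothetical types, not LogCabin or raft-rs code:

```rust
#[derive(Clone, Debug)]
enum Entry {
    Normal(String),
    Config(Vec<u64>), // a membership-change entry carrying the new voter set
}

/// The effective configuration is the last Config entry in the log,
/// committed or not; `base` is the configuration in force before the log.
fn effective_config(base: &[u64], log: &[Entry]) -> Vec<u64> {
    log.iter()
        .rev()
        .find_map(|e| match e {
            Entry::Config(c) => Some(c.clone()),
            _ => None,
        })
        .unwrap_or_else(|| base.to_vec())
}

fn main() {
    let base = vec![1, 2, 3]; // C<old> = [A, B, C]
    let mut log = vec![
        Entry::Normal("x=1".into()),
        Entry::Config(vec![4, 5]), // uncommitted change toward C<new>
    ];
    // While the entry is in the log, the server already uses C<new>.
    assert_eq!(effective_config(&base, &log), vec![4, 5]);

    // If a new leader is elected without this entry, the follower truncates
    // it, and the configuration rolls back to C<old> automatically.
    log.truncate(1);
    assert_eq!(effective_config(&base, &log), vec![1, 2, 3]);
}
```

This is why LogCabin needs no dedicated rollback code path: ordinary log truncation during leader changes already reverts an uncommitted configuration.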
@ongardie Thanks for this really useful insight. :)
Seems we may be able to solve this with the work from @hicqu ! :)
After entering the begin-conf-change stage in joint consensus, if we can't get a majority of responses from `C<new>` (maybe due to network isolation), log replication will always fail and the `C<old>` cluster seems to hang forever, because both the `C<old>` and `C<new>` quorums must be satisfied.

Below are some immature personal thoughts about this scenario:

- If we lose the quorum of both `C<old>` and `C<new>`, clearly we can do nothing in Raft to avoid this.
- If we can still reach a quorum in `C<old>` or `C<new>`, I think the cluster has a chance to move forward.

Basically, maybe we should let `C<old>` keep working via some rollback mechanism, because in real-world situations, such as a transfer involving two data centers, we always care about cluster stability.

In a more general situation, we can define the relationship between `C<old>` and `C<new>`:

1. `C<new>` is totally exclusive of `C<old>`, e.g. `[A, B, C]` to `[D, E]`.

   This indicates that if we can still reach a quorum from `C<old>` but fail from `C<new>`, the rollback may perform like dismissing the previous begin-conf-change stage after a timeout (similar to Leader Transfer), because `C<old>` can still be trusted to work well and we need to fix the `C<new>` problems outside the Raft lib.

2. `C<new>` overlaps `C<old>`.

   Things become kind of complicated now if we still want a rollback to `C<old>`. Let's say `M<old>` is the majority of `C<old>`, and similarly `M<new>` for `C<new>`:

   - `M<old>` is still in `C<new>` and is also `M<new>`, e.g. `[A, B, C]` to `[A, B, D]` --- S1
   - `M<old>` is still in `C<new>` and is not `M<new>`, e.g. `[A, B, C]` to `[A, B, D, E, F]` --- S2
   - `M<old>` is not in `C<new>`, e.g. `[A, B, C]` to `[A, D, E, F]` --- S3

I'm still trying to figure out the problem in these situations :(
But since the Raft paper and thesis don't mention anything about the above, maybe we can still stick to the current implementation and add some comments.
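To make the stall concrete, here is a minimal sketch of the joint-consensus commit rule described above (hypothetical helper names, not the raft-rs API): an entry commits only when acknowledged by a majority of both `C<old>` and `C<new>`, so losing either quorum blocks progress even if the other is healthy.

```rust
use std::collections::HashSet;

/// Returns true if `acks` contains a majority of `cfg`.
fn has_quorum(cfg: &HashSet<u64>, acks: &HashSet<u64>) -> bool {
    let agreed = cfg.intersection(acks).count();
    agreed > cfg.len() / 2
}

/// Joint-consensus commit rule: both C<old> and C<new> must reach quorum.
fn joint_committed(c_old: &HashSet<u64>, c_new: &HashSet<u64>, acks: &HashSet<u64>) -> bool {
    has_quorum(c_old, acks) && has_quorum(c_new, acks)
}

fn main() {
    // The exclusive case above: [A, B, C] -> [D, E], with A=1 ... E=5,
    // and C<new> unreachable due to network isolation.
    let c_old: HashSet<u64> = [1, 2, 3].into_iter().collect();
    let c_new: HashSet<u64> = [4, 5].into_iter().collect();
    let acks: HashSet<u64> = [1, 2, 3].into_iter().collect(); // only C<old> responds

    assert!(has_quorum(&c_old, &acks)); // C<old> alone is healthy...
    assert!(!joint_committed(&c_old, &c_new, &acks)); // ...but the joint quorum is stuck
}
```

The proposed timeout-based rollback would amount to detecting that `joint_committed` has returned false for too long while `has_quorum(c_old, ...)` still holds, and then dismissing the begin-conf-change entry.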