Extended Raft algorithm with witness support #133

Open
joshuazh-x opened this issue Jan 25, 2024 · 2 comments

Comments

@joshuazh-x
Contributor

joshuazh-x commented Jan 25, 2024

Raft makes progress only while a majority quorum of servers is available, so tolerating a single server failure requires at least three servers. This isn't an issue for large systems, but it can be challenging for budget-limited customers who need fewer servers [1].
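The quorum arithmetic behind this constraint can be sketched in a few lines of Go. This is only an illustration of standard majority quorums; the function names `majorityQuorum` and `toleratedFailures` are made up for this example and are not part of etcd-io/raft.

```go
package main

import "fmt"

// majorityQuorum returns the smallest number of voters that forms a
// strict majority in a cluster of n voters.
// (Hypothetical helper for illustration only.)
func majorityQuorum(n int) int {
	return n/2 + 1
}

// toleratedFailures returns how many voters may be unavailable while a
// majority can still be assembled from the remaining ones.
// (Hypothetical helper for illustration only.)
func toleratedFailures(n int) int {
	return n - majorityQuorum(n)
}

func main() {
	for n := 1; n <= 5; n++ {
		fmt.Printf("voters=%d quorum=%d tolerated failures=%d\n",
			n, majorityQuorum(n), toleratedFailures(n))
	}
}
```

With two voters the quorum is two, so losing either one stalls the cluster; three voters is the smallest size that survives one failure. That gap is what a witness is meant to fill.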

Efforts have been made in both academia [2][3] and industry [4][5][6] to resolve this issue. However, all existing research and implementations for small-scale clusters (with two servers, for example) either depend on another HA solution or require a standalone server as a witness, which adds deployment complexity and a potential performance bottleneck.

I propose the extended Raft algorithm, a variant of Raft designed for clusters of regular servers plus a single witness. It minimizes data traffic and access to the witness while maintaining all key Raft properties. The witness is well suited to implementation as a storage object, with options such as NFS, SMB, or cloud storage.

The extended Raft algorithm is backward compatible with Raft, meaning any cluster running Raft can be seamlessly upgraded to support a witness.

The correctness of the algorithm is established by a formal proof in https://github.com/joshuazh-x/extended-raft-paper. In addition, we validated the formal specification of the algorithm with the TLC model checker.

I look forward to your suggestions and feedback.

@pav-kv @serathius @ahrtr @tbg @Elbehery @erikgrinaker @lemmy

Footnotes

  1. https://github.com/etcd-io/etcd/issues/8934#issuecomment-398175955

  2. Pâris, Jehan-François, and Darrell D. E. Long. "Pirogue, a lighter dynamic version of the Raft distributed consensus algorithm." 2015 IEEE 34th International Performance Computing and Communications Conference (IPCCC). IEEE, 2015.

  3. Yadav, Ritwik, and Anirban Rahut. "FlexiRaft: Flexible Quorums with Raft." The Conference on Innovative Data Systems Research (CIDR). 2023.

  4. https://github.com/tikv/tikv

  5. https://platform9.com/blog/transforming-the-edge-platform9-introduces-highly-available-2-node-kubernetes-cluster/

  6. https://www.spectrocloud.com/blog/two-node-edge-kubernetes-clusters-for-ha-and-cost-savings

@serathius
Member

+1 to witness support. My main concern would be creation of a test plan to ensure correctness. I think the TLC model checker will be crucial here.

@joshuazh-x
Contributor Author

Yes, indeed. Besides the algorithm itself, we need tests to ensure the correctness of the implementation. In our PoC work on etcd, we managed to make all existing tests (e2e, integration, robustness) run on a cluster with a single witness. Additional tests would still be needed to cover witness-specific functionality.
