Mencius: Building efficient replicated state machines for WANs

Yanhua Mao, Flavio P. Junqueira, and Keith Marzullo

Summary

In this paper, the authors aim to design a protocol for general state machine replication with high performance in a wide-area network. Previous approaches all have their own limitation and thus cannot achieve the design goals listed in this paper: (1) high throughput under high client load and (2) low latency under low client load. Paxos is limited by its single leader design. A single leader can lead to high throughput if requests generated by clients are on the same site. However, clients in other sites have higher latency. As well, a single leader also has the drawback of unbalanced communication pattern. On the other hand, Fash Paxos and CoReFP have low latency under low load but have lower throughput under high load due to their higher message complexity.

By concluding the limitations of these previous works, the authors present Mencius, a multi-leader state machine replication protocol, which derives from Paxos. The main idea of Mencius is to partition the sequence of consensus protocol instances among the servers. By doing so, it amortizes the load of being a leader, which increases throughput when the system is CPU-bound. Even if the network becomes the bottleneck, a partitioned leader scheme can fully utilize the available bandwidth to increase throughput. The latency is also reduced because of using a local server as the leader for the requests from clients. Apart from partitioning sequence numbers, Mencius also address key performance problems including adapting to changing client load and to asymmetric network bandwidth. At last, the authors discuss how to choose parameters for Mencius as well as future work and open issues.

Contributions

The authors present a novel multi-leader state machine replication protocol, Mencius. It can not only partition sequence numbers but also address key performance problems in the real world with changing client load and asymmetric network bandwidth.
This paper considers the state machine replication protocol under real-world scenario (i.e., wide-area network).

Strengths

Mencius can be further refined to take advantage of specific network or application requirements.
Mencius is built on a simplified consensus. Servers in Mencius can skip their turns without reaching an agreement with others. It is worth noting that such skipping actions only result in little or no communication and computation overhead.

Weaknesses

Mencius does not consider about Byzantine failures. The reason why Mencius is efficient is that it uses a simplified version of consensus, which allows servers with low client load to skip their turns without reaching an agreement with other servers. This core technology is not built on a quorum abstraction, so it is difficult to extend current Mencius to tolerate Byzantine failures.
The slowest server may increase Mencius's commit latency.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

2018.03.06-Mencius.md

2018.03.06-Mencius.md

Mencius: Building efficient replicated state machines for WANs

Yanhua Mao, Flavio P. Junqueira, and Keith Marzullo

Summary

Contributions

Strengths

Weaknesses

Files

2018.03.06-Mencius.md

Latest commit

History

2018.03.06-Mencius.md

File metadata and controls

Mencius: Building efficient replicated state machines for WANs

Yanhua Mao, Flavio P. Junqueira, and Keith Marzullo

Summary

Contributions

Strengths

Weaknesses