In this paper, the authors present XFT (cross fault tolerance), a novel approach to building efficient resilient distributed system, which can tolerate both non-crash faults and network faults. They observe that Byzantine faults are usually independent of asynchrony. Thus, they design such model, XFT, to handle different faults under different network scenarios. Particularly, XFT is correct only when all the following conditions hold: (1) only crash faults happen in asynchronous periods, and (2) non-crash faults (Byzantine faults) happen only in synchronous periods. By relaxing the guarantees, the authors build XPaxos which has a comparable performance of CFT systems and tolerates Byzantine faults. Apart from the design and implementation details, the authors also include implementation optimizations, comprehensive evaluations, and reliability analysis in this paper.
- The authors propose a novel network and node fault model called XFT, which can tolerate f Byzantine nodes with 2f+1 replicas.
- In this paper, the authors also implement XPaxos, the first XFT SMR protocol, and deploy it in a real geo-replicated setting across Amazon EC2 data centers worldwide. Furthermore, they also integrate XPaxos within Apache ZooKeeper.
- The authors also evaluate the performance of XPaxos on real EC2 data centers. Their results show that XFT performs almost as well as a WAN-optimized variant of Paxos, and significantly better than BFT protocols.
- XFT does not need extra replica resources compared with asynchronous CFT.
- XFT preserve all reliability guarantees of asynchronous CFT and also provides safety and liveness even when Byzantine faults occur.
- Under the assumption that machine faults and network faults occur independent and identically distributed random variables, XFT can offer strictly stronger reliability guarantees than state-of-the-art BFT.
- XFT’s performance does not scale well, and it is very efficient for small f (e.g., f = 1). Although XPaxos requires 2f+1 to tolerate f faults, it still requires all-to-all multicast in the agreement stage, which therefore results in exponential message complexity.