EPaxos is an efficient, leaderless replication protocol. The name stands for Egalitarian Paxos -- EPaxos is based on the Paxos consensus algorithm. As such, it can tolerate up to F concurrent replica failures with 2F+1 total replicas.
To function effectively as a replication protocol, Paxos has to rely on a stable leader replica (this optimization is known as Multi-Paxos). The leader can become a bottleneck for performance: it has to handle more messages than the other replicas, and remote clients have to contact the leader, thus experiencing higher latency. Other Paxos variants either also rely on a stable leader, or have a pre-established scheme that allows different replicas to take turns in proposing commands (such as Mencius). This latter scheme suffers from tight coupling of the performance of the system from that of every replica -- i.e., the system runs at the speed of the slowest replica.
EPaxos is an efficient, leaderless protocol. It provides strong consistency with optimal wide-area latency, perfect load-balancing across replicas (both in the local and the wide area), and constant availability for up to F failures. EPaxos also decouples the performance of the slowest replicas from that of the fastest, so it can better tolerate slow replicas than previous protocols.
We have an SOSP 2013 paper that describes EPaxos in detail.
A simpler, more straightforward explanation is coming here soon.
This repository contains the Go implementations of:
Egalitarian Paxos (EPaxos), a new distributed consensus algorithm based on Paxos EPaxos achieves three goals: (1) availability without interruption as long as a simple majority of replicas are reachable---its availability is not interrupted when replicas crash or fail to respond; (2) uniform load balancing across all replicas---no replicas experience higher load because they have special roles; and (3) optimal commit latency in the wide-area when tolerating one and two failures, under realistic conditions. Egalitarian Paxos is to our knowledge the first distributed consensus protocol to achieve all of these goals efficiently: requiring only a simple majority of replicas to be non-faulty, using a number of messages linear in the number of replicas to choose a command, and committing commands after just one communication round (one round trip) in the common case or after at most two rounds in any case.
The struct marshaling and unmarshaling code was generated automatically using the tool available at: https://code.google.com/p/gobin-codegen/
The repository also contains a machine-readable (and model-checkable) specification of EPaxos in TLA+.
Iulian Moraru, David G. Andersen -- Carnegie Mellon University
Michael Kaminsky -- Intel Labs