The load, capacity, and availability of quorum systems Naor & Wool, SIAM J Computing 1998
This is the paper that Howard et al referenced inFlexible Paxos as defining the “fundamental theorem of quorum intersection.” I like a good fundamental theorem, how could I possibly resist? When you first open up the pdf (or glance over your printout, if you’re that way inclined) it looks pretty daunting, but there’s a lot of great information in here once you dig in.
There are two main sections to the paper. First we are provided with some thinking tools and theorems for reasoning about quorum systems, and then the authors present four novel (as of 1998) quorum systems that have optimal or near optimal load, and high availability. As a bonus we get an analysis of some existing quorum systems too. If you’re looking for the proof that majority quorum systems are far from optimal, this is your paper…
Reasoning about quorum systems
Let’s start out by defining a quorum system: a quorum system is a collection of sets S = {S 1 ,S 2 ,…S m } with each S (called a quorum ) drawn from some underlying universe U . Furthermore, for every S,R ∈ S , S and R have a non-empty intersection. This is the property that makes quorum systems useful: we are guaranteed to be able to pass an unbroken chain of information through such a system because at every step there is at least one participant that knows the ‘latest.’ The same property guarantees that a user will always have a consistent view of the system state.
There are lots of different schemes for guaranteeing the non-intersection property, the best-known and simplest of which is that every set S ∈ S must contain a majority of the members of U . We can evaluate these schemes using three primary criteria: the load it places on each member of the universe; the overall capacity of the system to handle requests; and the availability of the system, i.e. how resilient it is to failure.
Load
Take a quorum system that selects quorums from some universe of members U . There is a strategy in place for selecting quorums, which results in some load on each member of U – the number of times it is accessed.
The load( S ) of a quorum system S is the minimal load on the busiest element.
In other words, no matter what strategy you come up with, there’s going to be at least one element that gets hit at least load( S ) times.
If the minimum quorum size in a given system is denoted by cardinality(S), then:
load(S) ≥ max(1/cardinality(S) , cardinality(S)/n)
Capacity
We’d like the system to handle as many requests as possible. If a system is going to handle a accesses within a period of k time units, it’s clear that every element in U can be accessed at most k times.
By an analogy to hypergraph theory, the authors demonstrate that:
capacity(S) = 1/load(S)
And therefore all the information regarding the capacity of a quorum system is actually captured by knowing the load of the system (the minimal load on the busiest element of the system). This I believe is the fundamental theorem I alluded to at the start of this post.
Availability
If each element fails with probability p , we’re interested in the probability F p that the surviving elements do not contain any quorum.
This failure probability measures how resilient the system is, and we would like F p to be as small as possible.
If individual elements are more prone to failure, then the load of the system is higher. This trade-off is captured by the equation:
F p ≥ p n.load(S)
The most interesting quorum systems will offer low load (and hence high capacity / throughput) coupled with high availability. Interestingly, when p < 1/2 then a majority is the best quorum system with respect to availability, yet it ‘fails miserably’ on load with a load score of 1/2 also. In fact, all voting systems have a load of at least 1/2, which is very high.
It’s tempting to use quorum size as a measure of load (i.e., intuitively it seems that systems requiring smaller quorum sizes should be better). But this turns out to be a mistake in isolation:
Several authors have emphasized the criterion of having small quorums. This is an important parameter since it captures the message complexity of a protocol using the quorum system. However, it does not tell us how to use the quorum so each element is used as infrequently as possible. Moreover, our lower bounds show that if the quorum size is small (ie. < √n) then decreasing it any further actually increases the load . We therefore argue that when analyzing a quorum system one should consider both its quorum size and load (and of course its availability) since each measures a different aspect of the system’s quality.
The Paths quorum system
The paths quorum system is the best overall quorum system presented in the paper. It has a load of O(1/√n) and a failure probability e -Ω√n when elements fail with probability p < 1/2 . Even in the presence of faults, the load is still O(1/&sqrt;n) with very high probability.
In the paths system each element (e.g. node) in the system is represented by an edge between two vertices is a specially constructed grid. A quorum system is then formed by joining the edges (elements) intersected by any left-to-right traversal of the grid with the edges (elements) intersected by any top-to-bottom traversal.
A Paths quorum system of order d has n = 2d 2 + 2d + 1 elements. A quorum system involving 41 nodes therefore, would have d = 4 (and the minimum system size would be d=1 = 5 nodes).
Here’s how to make the grid (I’m going to do it for d = 1 since that’s quite complex enough, thank you!):
Start off by laying out a d+1 x d+2 set of vertices, and create edges between them all except for the first and last columns, giving the grid G(d).
Make a copy of that grid and rotate it counter-clockwise by 90 degrees, giving G*(d)
Superimpose G*(d) on G(d) at offset 1/2,-1/2
Still with me? Here come’s the part that feels really counter-intuitive to me. You see, I really want those vertices to represent my nodes (that’s what happens in just about every other diagram of this kind right?), but they don’t. Each node is represented by the point where an edge from G(d) crosses an edge from G*(d):
To form a quorum, take any path along edges crossing the grid from left-right, and any path along edges crossing the grid from top-to-bottom. Your quorum members are the elements (nodes) that sit in the edges you traversed: