
Unifying Synchronous and Asynchronous Message-Passing Models

Maurice Herlihy

Computer Science Department Brown University Providence, RI 02912 [email protected]

Sergio Rajsbaum

Instituto de Matematicas U.N.A.M., D.F. 04510, Mexico [email protected]

Mark R. Tuttle

Digital Equipment Corporation Cambridge Research Lab One Kendall Square, Building 700 Cambridge, MA 02139 [email protected]

March 17, 1998

Abstract

We take a significant step toward unifying the synchronous, semi-synchronous, and asynchronous message-passing models of distributed computation. The key idea is the concept of a pseudosphere, a new combinatorial structure in which each process from a set of processes is independently assigned a value from a set of values. Pseudospheres have a number of nice combinatorial properties, but their principal interest lies in the observation that the behavior of protocols in the three models can be characterized as simple unions of pseudospheres, where the exact structure of these unions is determined by the timing properties of the model. We use this pseudosphere construction to derive new and remarkably succinct proofs of bounds on consensus and k-set agreement in the asynchronous and synchronous models, as well as the first lower bound on wait-free k-set agreement in the semi-synchronous model.

To appear in the 16th Annual ACM Symposium on Principles of Distributed Computing (PODC98), Puerto Vallarta, Mexico, June 1998.

1 Introduction

The field of distributed computing embraces a bewildering variety of models [LL90, GY97]. A fundamental dimension along which these models differ is the degree to which process activity is synchronized. At one end of the spectrum is the synchronous model in which computation proceeds in a sequence of rounds. In each round, a process sends messages to the other processes, receives the messages sent to it by the other processes in that round, and changes state. All processes take steps at exactly the same rate, and all messages are delivered with exactly the same message delivery time. At the other end is the asynchronous model in which there is no bound on the amount of time that can elapse between process steps, and there is no bound on the time it can take for a message to be delivered. Between these extremes is the semi-synchronous model in which process step times and message delivery times can vary, but are bounded between constant upper and lower bounds.

Proving a lower bound in any of these models requires a deep understanding of the global states that can arise in the course of a protocol's execution, and of how these global states are related. The notion of indistinguishability or similarity [FLP85, HM90] has played a fundamental role in nearly every lower bound in distributed computation. Two global states are considered indistinguishable to a process if that process has the same local state in both, and therefore cannot distinguish between them. The graph-theoretic representation of similarity, in which two global states are joined by an edge labeled with a process $P$ if the global states are indistinguishable to $P$, has proven to be immensely powerful.

While the classical notion of similarity captures the notion of two states being indistinguishable to a single process, higher degrees of similarity have proved essential for understanding problems such as k-set agreement [Cha91] and renaming [ABND+87, ABND+90].
For example, it may matter that a pair of global states are indistinguishable to two processes, to three processes, and so on. To capture these higher degrees of similarity it is convenient to represent the global state of a system of $n+1$ processes with the $n$-dimensional analog of a triangle, called a simplex, where each vertex of the simplex representing a global state is labeled with the local state of one process in this global state. The set of all global states represented in this way forms a simplicial complex, sometimes called the protocol complex. The degree of similarity between two global states is represented geometrically by the number of vertices the corresponding simplexes have in common: two global states similar to one process share one vertex (namely the vertex labeled with the local state that this process has in both global states), states similar to two processes share two vertices, states similar to three processes share three vertices, and so on.

In this paper, we take a significant step toward unifying the synchronous, semi-synchronous, and asynchronous models of computation. The key, unifying idea in this paper is the concept of a pseudosphere, a simplicial complex in which each process from a set of processes is independently assigned a value from a set of values. Pseudospheres have a number of nice combinatorial properties,

but their principal interest lies in the observation that protocol complexes in the three models can be characterized as simple unions of pseudospheres, where the exact structure of these unions is determined by the timing properties of the model. Because of the simple combinatorial properties of pseudospheres, reasoning about these unions can be accomplished by straightforward combinatorial arguments. We use this pseudosphere construction to derive new and remarkably succinct proofs of bounds on consensus [PSL80, Fis83] and k-set agreement [Cha91] in the asynchronous and synchronous models, as well as the first lower bound on wait-free k-set agreement in the semi-synchronous model.

Figure 1: Construction of a three-process binary pseudosphere. (Panels, left to right: a triangle labeled with process ids P, Q, R; two labeled copies of the triangle; the complete construction on the vertices (P,0), (P,1), (Q,0), (Q,1), (R,0), (R,1).)

A pseudosphere can be defined very simply. Start with an $n$-dimensional simplex where each vertex is labeled with a process id, and choose a finite set of values taken from an arbitrary domain. The pseudosphere is the complex constructed by taking multiple copies of this simplex and independently labeling each vertex with a value from the domain. For example, Figure 1 shows how to construct a pseudosphere by independently assigning binary values to a set of three processes. The left-hand figure shows a triangle labeled with process ids $P$, $Q$, and $R$. The central figure shows an intermediate stage where two copies of the triangle are each labeled with zeros and ones. The right-hand figure shows the complete construction, where copies of the triangle are labeled with all combinations of zeros and ones. We can just as easily assign values from a larger set than $\{0, 1\}$, although the result is harder to illustrate. We call this construct a pseudosphere because it is easily shown that the result of assigning binary values to $n+1$ processes is topologically equivalent to an $n$-dimensional sphere. The collection of initial global states for consensus or k-set agreement clearly forms a pseudosphere whose vertices are labeled with input values.
For example, the right-hand figure in Figure 1 is the input complex for three-process consensus. The basic insight underlying the work presented in this paper is that protocol complexes in the models of interest have natural representations as unions of pseudospheres, except that the vertices are labeled with timing and failure information instead of input values. Reasoning about these protocol complexes reduces to the purely combinatorial problem of reasoning about unions of pseudospheres, and indeed the formal manipulations needed to derive our results are remarkably similar in all three models.

In each model, we define a "round" structure appropriate for that model, and express the one-round executions as the union of pseudospheres. An $r$-round execution is constructed by inductively replacing each simplex in the single-round execution with the union of pseudospheres produced by the $(r-1)$-round protocol. The protocol complex produced by this iterative construction represents only a subset of the global states reachable in the model, but this set is large enough to prove the desired results for consensus, k-set agreement, renaming, and so on.

2 Related Work

Of course, we are not the first to propose iterative constructions representing the global states at the end of a protocol. As one example, Borowsky and Gafni [BG93] state a construction for the asynchronous model, and while the intuition behind the construction is compelling, it is not easy to write down a formal description of that construction. In a later paper [BG97], they define an iterated immediate snapshot model that has a recursive structure and has proven to be useful [HS97]. As another example, Chaudhuri, Herlihy, Lynch, and Tuttle [CHLT93] give an inductive construction for the synchronous model, and while the resulting "Bermuda Triangle" is visually appealing and an elegant combination of proof techniques from the literature, there is a fair amount of machinery needed in the formal description of the construction. In this sense, the formal presentation of our construction is substantially more succinct than these constructions.

We are also not the first to attempt to unify synchronous and asynchronous models of computation. All such attempts, including ours, restrict attention to a subset of well-behaved, round-based executions. There are a number of ways one could consider unifying models. One approach is to translate algorithms written for a synchronous model into an asynchronous model. This was the approach used by Awerbuch [Awe85] in the message-passing model when he constructed his synchronizer and showed how (in the absence of faults) synchronous protocols can be run in asynchronous systems in the presence of a synchronizer. This was also the approach used by Gafni and Yang [GY97] in the shared-memory model when they described a round-by-round failure detector and how these detectors can be used to run in an asynchronous model an algorithm written for a synchronous model. Another approach, which is basically our approach, is to identify a set of concepts that can be used to describe or reason about multiple models.
This was the approach used by Moses and Rajsbaum [MR98] when they showed how the concepts of communication layering and mobile faults can be used to reason in a uniform way about the synchronous and asynchronous models. The translation approach assumes that the synchronous model is an easier model in which to work. This assumption seems justified for algorithm design, but lower bounds are typically easier to derive in the asynchronous model. Our pseudosphere construction illustrates this distinction in a striking way: in the asynchronous model, a single round of computation from a global state yields a single pseudosphere.

We are the first, however, to unify the synchronous, semi-synchronous, and asynchronous models of message-passing computation with a single concept, namely the pseudosphere. Gafni and Yang [GY97] do their most formal work in a synchronous and asynchronous shared-memory model, and while they sketch how their ideas might be extended to a semi-synchronous message-passing model, this extension requires changing the nature of the failure detector. Moses and Rajsbaum [MR98] focus on synchronous and asynchronous models. Their stated results apply to consensus whereas our results apply to both consensus and k-set agreement.

One indication that our construction is fundamental is that the pseudosphere constructions originally developed to unify the synchronous and asynchronous models extended cleanly to the semi-synchronous model. We consider this significant. Although variants of the semi-synchronous model have been around for a long time, we are aware of only one substantial lower bound in this model: the consensus bound of Attiya, Dwork, Lynch, and Stockmeyer [ADLS94]. The absence of other results suggests that it is very difficult to prove significant lower bounds in this model, and that results and proof techniques from other models do not translate into the semi-synchronous model as easily as one might hope. With our pseudosphere construction, however, we can prove the first lower bound for wait-free k-set agreement in this model.

Another indication is that our construction can be used to simplify the proof of known results. For example, our protocol complex construction is significantly more succinct than the construction used by Herlihy and Shavit [HS93] in their asynchronous computability theorem, and our construction could be used to simplify the proof of one direction of that theorem.
And, as mentioned, the formal analysis underlying our construction can be presented considerably more succinctly than the constructions used by Borowsky and Gafni [BG93] and by Chaudhuri, Herlihy, Lynch, and Tuttle [CHLT93].

Our constructions are guided by concepts and theorems taken from elementary combinatorial topology. As described above, we believe our results are interesting in their own right, even to readers unfamiliar with or uninterested in topological techniques. For readers interested in applications of topology to distributed computing, however, our constructions should be even more interesting. Our approach here replaces the existential arguments used by Herlihy and Shavit [HS93] to analyze protocol complexes in the asynchronous model with a constructive, inductive analysis. Although the existential analysis encompasses the entire complex, and ours is restricted to a well-structured subcomplex, we feel that the concise and constructive nature of our treatment makes a contribution, both in terms of simplicity and brevity, and in terms of intuitive appeal.

Like us, Chaudhuri, Herlihy, Lynch, and Tuttle [CHLT93] explicitly construct a subset of the protocol complex in the synchronous model, although a smaller subset than ours. It is not clear, however, how to translate that construction into the asynchronous model. Moreover, since that construction, it has been a personal goal to find an inductive, round-by-round construction of the synchronous protocol complex.

3 Model

A set of $n+1$ sequential threads of control, called processes, communicate by message-passing. An initial or final state of a process is modeled as a vertex $\vec{v} = \langle P, v \rangle$, a pair consisting of a process id $P$ and a value $v$ (either input or output). We speak of the vertex as being colored with the process id. A set of $d+1$ mutually compatible initial or final states is modeled as a $d$-dimensional simplex (or $d$-simplex) $S^d = (\vec{s}_0, \ldots, \vec{s}_d)$. We say that $\vec{s}_0, \ldots, \vec{s}_d$ span $S^d$. Simplex $S$ is a (proper) face of $T$ if the vertexes of $S$ form a (proper) subset of the vertexes of $T$. Where convenient, we use superscripts to indicate dimensions of simplexes. By convention, a simplex of dimension $d < 0$ is an empty simplex. The set of process ids associated with simplex $S^n$ is denoted by $\mathit{ids}(S^n)$, and the set of values by $\mathit{vals}(S^n)$.

The complete set of possible initial (or final) states is represented by a set of simplexes, closed under containment, called a simplicial complex (or complex) $\mathcal{K}$. The dimension of a complex is the highest dimension of any of its simplexes. In this paper all the complexes of dimension $n$ are full in the sense that every simplex is contained in some $n$-simplex. A map $\phi: \mathcal{K} \to \mathcal{L}$ carrying vertexes to vertexes is simplicial if it also carries simplexes to simplexes. A complex is colored if each vertex is labeled with a process id, and each $m$-simplex is labeled with $m+1$ distinct colors. If $\mathcal{K}$ and $\mathcal{L}$ are colored complexes, then $\phi$ is color-preserving if $\mathit{id}(\vec{v}) = \mathit{id}(\phi(\vec{v}))$. Two colored complexes $\mathcal{K}$ and $\mathcal{L}$ are isomorphic, written $\mathcal{K} \cong \mathcal{L}$, if there is a surjective and one-to-one simplicial map $\phi: \mathcal{K} \to \mathcal{L}$.

A protocol is a program in which each process receives a private input value, communicates with the other processes via message-passing, and eventually halts with a private output value. Processes may crash, which means they fail by halting in the middle of the protocol, and we assume there is a bound $f$ on the number of processes that can fail.

At any point in time, the view of a process consists of its input and the sequence of messages it has received. Any protocol has an associated protocol complex $\mathcal{P}$, defined as follows. Each vertex is labeled with a process id and a possible view for that process. A set of vertexes $\langle P_{i_1}, v_{i_1} \rangle, \ldots, \langle P_{i_d}, v_{i_d} \rangle$ spans a simplex of $\mathcal{P}$ if and only if there is some execution in which $P_{i_1}, \ldots, P_{i_d}$ finish the protocol with respective views $v_{i_1}, \ldots, v_{i_d}$. Each simplex thus corresponds to an equivalence class of executions that "look the same" to the processes at its vertexes. The subcomplex of $\mathcal{P}$ corresponding to executions starting from a fixed simplex $S^m$ is denoted $\mathcal{P}(S^m)$. The protocol complex $\mathcal{P}$ depends both on the protocol and the kinds of interleavings permitted.

A protocol is uniquely determined by its message function and its decision function. The message function determines what messages a process should send in a given state, and the decision function determines what output value a process should choose in a given state (if any). A protocol is a full-information protocol [Had83, FL82, PSL80] if the message function causes each process to send its entire local state when it sends a message. We can assume without loss of generality that all protocols $\mathcal{P}$ we consider are full-information protocols [Had83, FL82, PSL80, DM90].

(define asynchronous, synchronous, and semi-synchronous models)
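The definitions of this section can be made concrete with a small sketch. It assumes a vertex is a `(process id, value)` pair and a simplex is a set of such pairs; the helper names `closure`, `dimension`, `is_colored`, and `shared_vertices` are hypothetical and chosen only for illustration.

```python
from itertools import combinations

def closure(facets):
    """Return every non-empty face of the given facets (closed under containment)."""
    faces = set()
    for facet in facets:
        verts = sorted(facet)
        for size in range(1, len(verts) + 1):
            faces.update(frozenset(c) for c in combinations(verts, size))
    return faces

def dimension(faces):
    """Dimension of a complex: highest dimension of any of its simplexes."""
    return max(len(face) for face in faces) - 1

def is_colored(faces):
    """A complex is colored if every simplex carries pairwise distinct process ids."""
    return all(len({pid for pid, _ in face}) == len(face) for face in faces)

def shared_vertices(s, t):
    """Vertices two global states have in common (their degree of similarity)."""
    return set(s) & set(t)

# Two global states of three processes that differ only in the state of R.
state_a = {("P", 0), ("Q", 0), ("R", 0)}
state_b = {("P", 0), ("Q", 0), ("R", 1)}
K = closure([state_a, state_b])
```

Here `state_a` and `state_b` share the two vertices belonging to P and Q, so under these assumptions they are similar to exactly those two processes.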

4 Basic topology

Let $\mathcal{P}$ be a protocol in an arbitrary message-passing model of computation, $S^n$ an input simplex, and $\mathcal{P}(S^n)$ the protocol complex for $\mathcal{P}$ on $S^n$. If $S^m$ is a face of $S^n$, then we define $\mathcal{P}(S^m)$ to be the subcomplex of $\mathcal{P}(S^n)$ corresponding to executions in which only the processes in $\mathit{ids}(S^m)$ take steps (i.e., the rest fail before sending any messages). If there are no such executions, then $\mathcal{P}(S^m)$ is empty.

Lemma 1: For any full-information protocol, $\mathcal{P}(S_0^m \cap S_1^m) = \mathcal{P}(S_0^m) \cap \mathcal{P}(S_1^m)$ for all input simplexes $S_0^m$ and $S_1^m$.

Proof: To see that $\mathcal{P}(S_0^m \cap S_1^m) \subseteq \mathcal{P}(S_0^m) \cap \mathcal{P}(S_1^m)$, simply note that $S_0^m \cap S_1^m$ is a face of $S_0^m$, so $\mathcal{P}(S_0^m \cap S_1^m)$ is a subcomplex of $\mathcal{P}(S_0^m)$, and similarly for $\mathcal{P}(S_1^m)$. For the reverse inclusion, let $P_i$ be a process id in $\mathit{ids}(S_0^m) - \mathit{ids}(S_1^m)$. If vertex $\vec{s}$ is labeled with the id of a process that received a message from $P_i$, then $\vec{s}$ cannot lie in $\mathcal{P}(S_0^m) \cap \mathcal{P}(S_1^m)$.

Informally, a complex is $k$-connected if it has no holes in dimensions $k$ or less. More precisely,

Definition 2: A complex $\mathcal{K}$ is $k$-connected if every continuous map of the $k$-sphere to $\mathcal{K}$ can be extended to a continuous map of the $(k+1)$-disk [Spa66, p.51].

This definition says that a complex is 0-connected if it is connected in the graph-theoretic sense. By convention, a complex is $(-1)$-connected if it is nonempty, and every complex is $k$-connected for $k < -1$. When $k = 1$, $k$-set agreement is consensus, and 0-connectivity is the usual graph-theoretic notion of connectivity.

The following is an elementary consequence of the Mayer-Vietoris sequence [Spa66, p.186].

Theorem 3: If $\mathcal{K}$ and $\mathcal{L}$ are complexes such that $\mathcal{K}$ and $\mathcal{L}$ are $k$-connected, and $\mathcal{K} \cap \mathcal{L}$ is nonempty and $(k-1)$-connected, then $\mathcal{K} \cup \mathcal{L}$ is $k$-connected.

We will be most interested in applying Theorem 3 to computing the connectivity of unions of pseudospheres, which we now define.
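The lowest-dimensional case of Theorem 3 can be checked by hand: for k = 0 it says that two connected graphs with a nonempty intersection have a connected union. The sketch below illustrates only that case, assuming a graph is a `(vertices, edges)` pair; the helpers `is_connected`, `union`, and `intersection` are hypothetical names, not part of the paper.

```python
def is_connected(vertices, edges):
    """0-connectivity: nonempty and connected in the graph-theoretic sense."""
    vertices = set(vertices)
    if not vertices:
        return False
    neighbors = {v: set() for v in vertices}
    for edge in edges:
        a, b = tuple(edge)
        neighbors[a].add(b)
        neighbors[b].add(a)
    seen, stack = set(), [next(iter(vertices))]
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(neighbors[v] - seen)
    return seen == vertices

def union(g, h):
    return g[0] | h[0], g[1] | h[1]

def intersection(g, h):
    return g[0] & h[0], g[1] & h[1]

K = ({"a", "b", "c"}, {frozenset("ab"), frozenset("bc")})
L = ({"c", "d"}, {frozenset("cd")})   # meets K in the vertex c
M = ({"d", "e"}, {frozenset("de")})   # disjoint from K
```

With these inputs, `union(K, L)` is connected because the intersection is the single vertex `c`, while `union(K, M)` is not, which shows why the hypothesis on the intersection cannot be dropped.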

5 Connectivity theorem

Let $2^K$ denote the powerset of a finite set $K$. The basic result used in this paper to analyze $k$-set agreement is the statement, "If $\mathcal{P}(S^m)$ is $(m-(n-k)-1)$-connected for all $m$ with $n-f \le m \le n$, then $\mathcal{P}$ cannot solve $k$-set agreement in the presence of $f$ failures." This result is already essentially proved elsewhere [HR94], but for the sake of making this paper self-contained, we reprove this result here.

The point-set occupied by a complex $\mathcal{C}$ is called its polyhedron, and is denoted by $|\mathcal{C}|$. It is a standard result that any simplicial map $\phi: \mathcal{A} \to \mathcal{B}$ induces a piece-wise linear map $|\phi|: |\mathcal{A}| \to |\mathcal{B}|$ that agrees with $\phi$ on vertexes of $\mathcal{A}$. A subdivision of a complex $\mathcal{A}$ is a complex $\mathcal{B}$ such that (1) the vertexes of $\mathcal{B}$ are points of $|\mathcal{A}|$, (2) if $S^n$ is a simplex of $\mathcal{B}$ there is some simplex $T^n \in \mathcal{A}$ such that $|S^n| \subseteq |T^n|$, and (3) the piece-wise linear map $|\mathcal{B}| \to |\mathcal{A}|$ mapping each vertex of $\mathcal{B}$ to the corresponding point of $|\mathcal{A}|$ is a homeomorphism.

Let $\mathcal{C}$ be a complex, and $\vec{w}$ a point with the property that any ray emanating from $\vec{w}$ intersects $|\mathcal{C}|$ in at most one point. Define the cone $\vec{w} \cdot \mathcal{C}$ to be the collection of all simplexes of the form $(\vec{w}, \vec{s}_0, \ldots, \vec{s}_k)$, where $(\vec{s}_0, \ldots, \vec{s}_k)$ is a simplex of $\mathcal{C}$, together with all faces of such simplexes. This cone is itself a complex, having $\mathcal{C}$ as a subcomplex [Mun84, p. 44].

Let $\sigma$ be a subdivision of $\mathrm{skel}^{\ell-1}(\mathcal{C})$, and $S_0^\ell, \ldots, S_L^\ell$ the $\ell$-simplexes of $\mathrm{skel}^{\ell}(\mathcal{C})$. For $0 \le i \le L$, let $\vec{w}_i$ be an interior point of $|S_i^\ell|$. Each cone $\vec{w}_i \cdot \sigma(\mathrm{skel}^{\ell-1}(S_i^\ell))$ is a subdivision of $S_i^\ell$, and the union of these cones as $i$ ranges from 0 to $L$ is a subdivision of $\mathrm{skel}^{\ell}(\mathcal{C})$ that agrees with $\sigma$ on the $(\ell-1)$-skeleton [Mun84, p. 85]. The result is called the subdivision of $\mathrm{skel}^{\ell}(\mathcal{C})$ obtained by starring $\sigma$.

We use the following variant of Sperner's Lemma [Lef49, Lemma 5.5]:

Lemma 4 (Sperner's Lemma): Let $\sigma(S^n)$ be a subdivision of simplex $S^n$. If $F: \sigma(S^n) \to S^n$ is a map sending each vertex of $\sigma(S^n)$ to a vertex in its carrier, then there is at least one $n$-simplex $T^n = (\vec{t}_0, \ldots, \vec{t}_n)$ in $\sigma(S^n)$ such that the $F(\vec{t}_i)$ are all distinct.

We also exploit the following lemma, which appears in Glaser [Gla70, Theorem IV.2].

Lemma 5: Let $\mathcal{A}$, $\mathcal{B}$, and $\mathcal{C}$ be complexes such that $\mathcal{A} \subseteq \mathcal{B}$, and $f: |\mathcal{B}| \to |\mathcal{C}|$ is a continuous map such that $f$ restricted to $|\mathcal{A}|$ is simplicial. There exists a subdivision $\sigma$ of $\mathcal{B}$ such that $\sigma(\mathcal{A}) = \mathcal{A}$, and a simplicial map $\phi: \sigma(\mathcal{B}) \to \mathcal{C}$ extending the restriction of $f$ to $|\mathcal{A}|$.

Recall that the decision map $\delta: \mathcal{P} \to \mathcal{O}$ carries a vertex labeled with a process id and protocol execution view to a vertex labeled with the same process and the value that process chooses at the end of the execution. Define the map $\delta_1$ by $\delta_1(\vec{p}) = \mathit{val}(\delta(\vec{p}))$, carrying each vertex of the protocol complex to the decision value alone. The following theorem is model independent, in the sense that it depends on general topological properties of the protocol complex, not on explicit timing or failure properties of the model.

Theorem 6: Let $\mathcal{P}$ be a protocol with the property that for every $n$-dimensional pseudosphere $\psi(P_0, \ldots, P_n; V)$, where $V$ is non-empty, $\mathcal{P}(\psi(P_0, \ldots, P_n; V))$ is $(k-1)$-connected. If there are more than $k$ possible input values, then $\mathcal{P}$ cannot solve $k$-set agreement.

Proof: Let $S^k = (\vec{s}_0, \ldots, \vec{s}_k)$ be a simplex in which each vertex $\vec{s}_i$ is labeled with a distinct value $v_i$ from $V$. The set of values labeling a face $S^j$ of $S^k$ is denoted by $\mathit{vals}(S^j)$. For $0 \le \ell \le k$, we will inductively build up a sequence of subdivisions $\sigma_\ell$ and simplicial maps

$$\phi_\ell: \sigma_\ell(\mathrm{skel}^{\ell}(S^k)) \to \mathcal{P}$$

with the property that for every vertex $\vec{s} \in \sigma_\ell(\mathrm{skel}^{\ell}(S^k))$,

$$\phi_\ell(\vec{s}) \in \mathcal{P}(\psi(P_0, \ldots, P_n; \mathit{vals}(\mathrm{carrier}(\vec{s}, S^k)))). \qquad (1)$$

Before undertaking this construction, we observe why it suffices to prove the theorem. Note that $\mathrm{skel}^{k}(S^k) = S^k$. Define $F: \sigma_k(S^k) \to S^k$ to carry each vertex $\vec{s}$ of $\sigma_k(S^k)$ to the unique vertex of $S^k$ labeled with $\delta_1(\phi_k(\vec{s}))$. Equation 1 guarantees that

$$\phi_k(\sigma_k(S^k)) \subseteq \mathcal{P}(\psi(P_0, \ldots, P_n; \mathit{vals}(S^k))).$$

Because $k$-set agreement requires that the value chosen by any process have been some process's input, $\delta_1(\phi_k(\vec{s})) \in \mathit{vals}(S^k)$ for every vertex $\vec{s} \in \sigma_k(S^k)$, so $F$ is well-defined. Moreover, Equation 1 guarantees that $F(\vec{s}) \in \mathrm{carrier}(\vec{s}, S^k)$. By Lemma 4, there is a simplex $T^k = (\vec{t}_0, \ldots, \vec{t}_k)$ in $\sigma_k(S^k)$ such that for $0 \le i \le k$, the $F(\vec{t}_i)$ are distinct. As a result, the values of $\delta_1(\phi_k(\vec{t}_i))$ are distinct, implying that the simplex $\phi_k(T^k)$ in $\mathcal{P}$ has dimension $k$. But this $\phi_k(T^k)$ corresponds to a protocol execution in which $k+1$ distinct processes choose $k+1$ distinct values, a contradiction.

We now turn our attention to the inductive construction. For the base case, let $\phi_0$ carry each vertex $\vec{s}_i$ of $S^k$ to any vertex of $\mathcal{P}(\psi(P_0, \ldots, P_n; \{v_i\}))$, the protocol complex over the input simplex where all processes begin with input $v_i$. It is immediate that $\phi_0$ satisfies Equation 1.

For the induction step, assume a subdivision $\sigma_{\ell-1}$ and simplicial map

$$\phi_{\ell-1}: \sigma_{\ell-1}(\mathrm{skel}^{\ell-1}(S^k)) \to \mathcal{P}$$

satisfying Equation 1. Let $\sigma$ be the subdivision of $\mathrm{skel}^{\ell}(S^k)$ obtained by starring $\sigma_{\ell-1}$. Let $S_0^\ell, \ldots, S_L^\ell$ be all the $\ell$-simplexes in $\mathrm{skel}^{\ell}(S^k)$, where $L = \binom{k+1}{\ell+1}$. For each $S_i^\ell$, $0 \le i \le L$, let $\mathcal{A}_i$ be the complex $\sigma(\mathrm{skel}^{\ell-1}(S_i^\ell))$, and $\mathcal{B}_i$ the complex $\sigma(S_i^\ell)$. Equation 1 implies that $\phi_{\ell-1}$ sends $\mathcal{A}_i$ into $\mathcal{P}(\psi(P_0, \ldots, P_n; \mathit{vals}(S_i^\ell)))$. Let

$$g_i: |\mathcal{A}_i| \to |\mathcal{P}(\psi(P_0, \ldots, P_n; \mathit{vals}(S_i^\ell)))|$$

be the piece-wise linear map induced on the polyhedra. Because $\mathcal{A}_i$ is homeomorphic to an $(\ell-1)$-sphere, $g_i$ is continuous (being piece-wise linear), and $\mathcal{P}(\psi(P_0, \ldots, P_n; \mathit{vals}(S_i^\ell)))$ is $(k-1)$-connected, where $k \ge \ell$, $g_i$ can be extended to a continuous

$$f_i: |\mathcal{B}_i| \to |\mathcal{P}(\psi(P_0, \ldots, P_n; \mathit{vals}(S_i^\ell)))|.$$

By Lemma 5, there exists a subdivision $\sigma_i$ of $\mathcal{B}_i$ and a simplicial map

$$\phi_i: \sigma_i(\mathcal{B}_i) \to \mathcal{P}(\psi(P_0, \ldots, P_n; \mathit{vals}(S_i^\ell)))$$

extending the restriction of $\phi_{\ell-1}$ to $|\mathcal{A}_i|$. By construction, the $\sigma_i$ subdivisions agree on the intersections of their domains, so together they define a subdivision $\sigma'$ of $\sigma(\mathrm{skel}^{\ell}(S^k))$. Let $\sigma_\ell(S^k) = \sigma'(\sigma(S^k))$. The maps $\phi_i$ also agree on the intersections of their domains, and together they define a simplicial map

$$\phi_\ell: \sigma_\ell(\mathrm{skel}^{\ell}(S^k)) \to \mathcal{P}$$

satisfying Equation 1.

Figure 2: Pseudospheres $\psi(\{P_0, P_1\}; \{0, 1\})$ and $\psi(\{P_0, P_1\}; \{0, 1, 2\})$. (Vertices are labeled with pairs such as P0,0; P0,1; P0,2; P1,0; P1,1; P1,2.)
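Sperner's Lemma (Lemma 4), on which the proof above relies, is easy to check exhaustively in dimension one. The sketch below covers only that one-dimensional case, with a subdivided edge whose endpoints are labeled 0 and 1; interior vertices have the whole edge as carrier, so either label is allowed. The function names are hypothetical.

```python
from itertools import product

def bichromatic_segments(labels):
    """Count segments of the subdivided edge whose two endpoints carry distinct labels."""
    return sum(1 for a, b in zip(labels, labels[1:]) if a != b)

def sperner_holds_for_all(interior_count):
    """Try every labeling of the interior vertices; endpoints are fixed to 0 and 1."""
    return all(
        bichromatic_segments((0,) + middle + (1,)) % 2 == 1
        for middle in product((0, 1), repeat=interior_count)
    )
```

The count of fully labeled segments is always odd, hence at least one, which is the one-dimensional instance of the statement that some simplex of the subdivision receives all distinct labels.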

6 Pseudospheres

Informally, a pseudosphere is a combinatorial structure in which each process from a set of processes is independently assigned a value from a set of values.

Definition 7: Let $S^m = (\vec{s}_0, \ldots, \vec{s}_m)$ be a simplex and $U_0, \ldots, U_m$ be a sequence of finite sets. The pseudosphere $\psi(S^m; U_0, \ldots, U_m)$ is the following complex. Each vertex is a pair $\langle \vec{s}_i, u_i \rangle$, where $\vec{s}_i$ is a vertex of $S^m$ and $u_i \in U_i$. Vertexes $\langle \vec{s}_{i_0}, u_{i_0} \rangle, \ldots, \langle \vec{s}_{i_\ell}, u_{i_\ell} \rangle$ span a simplex of $\psi(S^m; U_0, \ldots, U_m)$ if and only if the $\vec{s}_{i_j}$ are distinct. A pseudosphere in which all $U_i$ equal $U$ is simply written $\psi(S^m; U)$.

We call this construct a pseudosphere because if $S^n$ is an $n$-dimensional simplex, then $\psi(S^n; \{0, 1\})$ is homeomorphic to an $n$-dimensional sphere. Pseudospheres are important because every complex considered here is either a pseudosphere or the union of pseudospheres. Notice that the input complexes for consensus and set agreement are pseudospheres.
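The definition can be enumerated directly for small cases. The following is a minimal sketch, assuming processes are identified by strings and a simplex is a set of `(process id, value)` pairs with pairwise distinct ids; `pseudosphere`, `facets`, and `euler_characteristic` are hypothetical helper names. The Euler characteristic is used only as a sanity check that is consistent with the sphere claim, not as a proof of it.

```python
from itertools import combinations, product

def pseudosphere(ids, value_sets):
    """All non-empty simplexes: one independently chosen value per chosen process."""
    table = dict(zip(ids, value_sets))
    simplexes = set()
    for size in range(1, len(ids) + 1):
        for chosen in combinations(ids, size):
            for values in product(*(sorted(table[p]) for p in chosen)):
                simplexes.add(frozenset(zip(chosen, values)))
    return simplexes

def facets(simplexes):
    """Simplexes of maximal dimension."""
    top = max(len(s) for s in simplexes)
    return {s for s in simplexes if len(s) == top}

def euler_characteristic(simplexes):
    """Alternating sum of the number of simplexes in each dimension."""
    return sum((-1) ** (len(s) - 1) for s in simplexes)

# Three processes, binary values: the complex of Figure 1.
binary = pseudosphere(["P", "Q", "R"], [{0, 1}] * 3)
# Two processes, three values: the larger complex of Figure 2.
ternary = pseudosphere(["P0", "P1"], [{0, 1, 2}] * 2)
```

Under these assumptions `binary` has 6 vertices and 8 triangles (an octahedron), with Euler characteristic 2 as for a 2-sphere, while `ternary` has 6 vertices and 9 edges.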

Lemma 8: Pseudospheres satisfy the following combinatorial properties.

1. If $U$ is a singleton set, then $\psi(S^m; U) \cong S^m$.

2. Let $S^m = (\vec{s}_0, \ldots, \vec{s}_m)$, and $S^{m-1} = (\vec{s}_0, \ldots, \hat{\vec{s}_i}, \ldots, \vec{s}_m)$, where circumflex denotes omission. If $U_i = \emptyset$, then $\psi(S^m; U_0, \ldots, U_m) \cong \psi(S^{m-1}; U_0, \ldots, \hat{U_i}, \ldots, U_m)$.

3. $\psi(S_0; U_0, \ldots, U_m) \cap \psi(S_1; V_0, \ldots, V_m) = \psi(S_0 \cap S_1; U_0 \cap V_0, \ldots, U_m \cap V_m)$.
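The three properties of Lemma 8 can be spot-checked on small instances. The sketch below is an illustration under assumptions, not a proof: it represents a simplex as a set of `(process id, value)` pairs, uses a hypothetical `pseudosphere` helper, and checks property 3 only for two pseudospheres over the same base simplex.

```python
from itertools import combinations, product

def pseudosphere(ids, value_sets):
    """All non-empty simplexes with pairwise distinct ids and values drawn from each set."""
    table = dict(zip(ids, value_sets))
    simplexes = set()
    for size in range(1, len(ids) + 1):
        for chosen in combinations(ids, size):
            for values in product(*(sorted(table[p]) for p in chosen)):
                simplexes.add(frozenset(zip(chosen, values)))
    return simplexes

ids = ["P", "Q", "R"]

# Property 1: with a singleton value set, forgetting the value is a bijection
# onto the non-empty faces of the base simplex.
single = pseudosphere(ids, [{7}] * 3)
faces_of_simplex = {frozenset(p for p, _ in s) for s in single}

# Property 2: a process with an empty value set simply drops out.
with_empty = pseudosphere(ids, [{0, 1}, set(), {0, 1}])
without_q = pseudosphere(["P", "R"], [{0, 1}, {0, 1}])

# Property 3 (same base simplex): intersect the value sets componentwise.
U = [{0, 1}, {0, 1}, {1, 2}]
V = [{1, 2}, {0}, {2}]
lhs = pseudosphere(ids, U) & pseudosphere(ids, V)
rhs = pseudosphere(ids, [u & v for u, v in zip(U, V)])
```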

7 Protocols

The next theorem shows how to exploit the nice combinatorial properties of pseudospheres. It states that if applying a protocol to a single simplex preserves connectivity below some dimension, then applying that protocol to any input pseudosphere also preserves that degree of connectivity. It is actually a theorem in topology, and so it applies to any model of computation.

Theorem 9: Let $\mathcal{P}$ be a protocol, $S^m$ be a simplex, and $c$ be a constant. If for every face $S^\ell$ of $S^m$ and for every sequence $V_0, \ldots, V_\ell$ of singleton sets the protocol complex $\mathcal{P}(\psi(S^\ell; V_0, \ldots, V_\ell))$ is $(\ell - c - 1)$-connected, then for every sequence $U_0, \ldots, U_m$ of nonempty sets the protocol complex $\mathcal{P}(\psi(S^m; U_0, \ldots, U_m))$ is $(m - c - 1)$-connected.

Proof: Define the following partial order on sequences of sets: $(U_0, \ldots, U_m) \prec (V_0, \ldots, V_m)$ if each $U_i \subseteq V_i$, and for at least one set, the inclusion is strict. We proceed by induction on $m$. For the base case, let $m = c$. Since $\mathcal{P}(\psi(S^m; U_0, \ldots, U_m))$ is the union of the $\mathcal{P}(\psi(S^m; V_0, \ldots, V_m))$ as the singletons $V_i$ range over the values in $U_i$, and since each $\mathcal{P}(\psi(S^m; V_0, \ldots, V_m))$ is $(-1)$-connected (meaning non-empty), so is the union $\mathcal{P}(\psi(S^m; U_0, \ldots, U_m))$.

For the induction step for $m$, assume the hypothesis holds for simplexes of dimension less than $m$. We proceed by induction on the partially-ordered sequences $U_0, \ldots, U_m$. For the base case, when all the $U_i$ are singletons, the claim is just the hypothesis. For the induction step for the set sequences, assume the claim for every sequence less than $U_0, \ldots, U_m$. There must be some index $i$ such that $U_i = V_i \cup \{v\}$, where $V_i$ is non-empty and $v \notin V_i$. The complex $\mathcal{P}(\psi(S^m; U_0, \ldots, U_m))$ is the union of $\mathcal{K} = \mathcal{P}(\psi(S^m; U_0, \ldots, V_i, \ldots, U_m))$ and $\mathcal{L} = \mathcal{P}(\psi(S^m; U_0, \ldots, \{v\}, \ldots, U_m))$. By the induction hypothesis for the sets, both $\mathcal{K}$ and $\mathcal{L}$ are $(m - c - 1)$-connected. Moreover,

$$\begin{aligned}
\mathcal{K} \cap \mathcal{L} &= \mathcal{P}(\psi(S^m; U_0, \ldots, V_i, \ldots, U_m) \cap \psi(S^m; U_0, \ldots, \{v\}, \ldots, U_m)) \\
&= \mathcal{P}(\psi(S^m; U_0, \ldots, \emptyset, \ldots, U_m)) \\
&= \mathcal{P}(\psi(S^{m-1}; U_0, \ldots, \hat{\emptyset}, \ldots, U_m)).
\end{aligned}$$

This last expression is $\mathcal{P}$ applied to an $(m-1)$-dimensional pseudosphere, which is $(m - c - 2)$-connected by the induction hypothesis for $m$. We have shown that $\mathcal{K}$ and $\mathcal{L}$ are $(m - c - 1)$-connected, and their intersection is $(m - c - 2)$-connected, so their union is $(m - c - 1)$-connected by Theorem 3.

A consequence of this theorem is that any $n$-dimensional pseudosphere is $(n-1)$-connected (just let $\mathcal{P}$ be the trivial protocol in which each process halts immediately):

Corollary 10: If $U_0, \ldots, U_m$ are all non-empty, then $\psi(S^m; U_0, \ldots, U_m)$ is $(m-1)$-connected.
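The lowest cases of Corollary 10 can be checked by inspecting the 1-skeleton, since 0-connectivity is ordinary graph connectivity. The sketch below assumes string process ids and relies on the fact that any two pseudosphere vertices with distinct ids span an edge; `one_skeleton` and `is_connected` are hypothetical helper names. It does not test higher connectivity.

```python
def one_skeleton(ids, value_sets):
    """Vertices and edges of a pseudosphere: distinct ids always span an edge."""
    vertices = {(p, v) for p, values in zip(ids, value_sets) for v in values}
    edges = {(a, b) for a in vertices for b in vertices if a[0] < b[0]}
    return vertices, edges

def is_connected(vertices, edges):
    """Graph-theoretic connectivity of a non-empty vertex set."""
    if not vertices:
        return False
    neighbors = {v: set() for v in vertices}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    seen, stack = set(), [next(iter(vertices))]
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(neighbors[v] - seen)
    return seen == vertices

# m = 1: (m - 1) = 0-connected, so the graph must be connected.
line = one_skeleton(["P", "Q"], [{0, 1, 2}, {0, 1, 2}])
# m = 0 with two values: non-empty, i.e. (-1)-connected, but not connected.
points = one_skeleton(["P"], [{0, 1}])
```

The second example shows that the bound in the corollary cannot be raised in general: a 0-dimensional pseudosphere with two values is two isolated points.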

Naively, one might think that $S^m$ is always $m$-connected, but note that although the empty simplex has dimension $-1$, it is not $(-1)$-connected. We can generalize Theorem 9 to multiple pseudospheres.

Theorem 11: Let $\mathcal{P}$ be a protocol satisfying the precondition of Theorem 9, and let $A_0, \ldots, A_\ell$ be a sequence of finite sets. If $\bigcap_{i=0}^{\ell} A_i \neq \emptyset$ then

$$\mathcal{P}\left(\bigcup_{i=0}^{\ell} \psi(S^m; A_i)\right)$$

is $(m - c - 1)$-connected.

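Proof: We argue by induction on $\ell$. The base case, when $\ell = 0$, follows directly from Theorem 9. For the induction step, assume that

$$\mathcal{K} = \mathcal{P}\left(\bigcup_{i=0}^{\ell-1} \psi(S^m; A_i)\right)$$

is $(m - c - 1)$-connected. By Theorem 9,

$$\mathcal{L} = \mathcal{P}(\psi(S^m; A_\ell))$$

is $(m - c - 1)$-connected. The intersection is

$$\mathcal{K} \cap \mathcal{L} = \mathcal{P}\left(\left(\bigcup_{i=0}^{\ell-1} \psi(S^m; A_i)\right) \cap \psi(S^m; A_\ell)\right) = \mathcal{P}\left(\bigcup_{i=0}^{\ell-1} \psi(S^m; A_i \cap A_\ell)\right).$$

Define

$$B_i = A_i \cap A_\ell \quad \text{for } 0 \le i \le \ell - 1, \qquad \mathcal{B} = \bigcup_{i=0}^{\ell-1} \psi(S^m; B_i).$$

$\mathcal{B}$ is the union of the $\ell$ pseudospheres indexed by $0, \ldots, \ell-1$, and $\bigcap_{i=0}^{\ell-1} B_i \neq \emptyset$, so by the induction hypothesis,

$$\mathcal{K} \cap \mathcal{L} = \mathcal{P}(\mathcal{B}) = \mathcal{P}\left(\bigcup_{i=0}^{\ell-1} \psi(S^m; B_i)\right)$$

is $(m - c - 1)$-connected. The claim now follows from Theorem 3.

Letting $\mathcal{P}$ be the trivial protocol in which each process decides its input:

Corollary 12: If $\bigcap_{i=0}^{\ell} A_i \neq \emptyset$, then $\bigcup_{i=0}^{\ell} \psi(S^m; A_i)$ is $(m-1)$-connected.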

8 Asynchronous Computation

In this section, we define the $r$-round asynchronous protocol complex $\mathcal{A}^r(S^m)$. We restrict our attention to a subset of well-behaved, round-based executions defined as follows: in each round, each process sends its state to all other processes, receives the messages delivered to it in that round, and undergoes a state transition. Because the model is asynchronous, the message sent from $P_i$ to $P_j$ in round $r$ may not be delivered in that round. When that message is delivered, however, all other undelivered messages from $P_i$ to $P_j$ are delivered at the same time (so message delivery is FIFO). In each round, each process receives at least $n - f + 1$ messages, including its message to itself. The only failures that occur are initial failures in which certain processes never take steps at all.

Let $\mathcal{A}^1(S^n)$ be the protocol complex of all one-round, $f$-faulty, $(n+1)$-process protocol executions with input simplex $S^n$. Let $\mathbb{P}$ be the set of all $n+1$ processes, and if $U$ is a set, then let $2^U_k$ denote the subset of $2^U$ consisting of sets of size at least $k$. Our first result says that the one-round protocol complex is a single pseudosphere:

Lemma 13: $\mathcal{A}^1(S^n) = \psi(S^n; 2^{P - \{P_0\}}_{n-f}, \ldots, 2^{P - \{P_n\}}_{n-f})$.

Proof: Each vertex in $\mathcal{A}^1(S^n)$ has the form $\langle P_i, M \rangle$, where $P_i \in P$ and $M$ is the set of messages delivered to $P_i$ during the round. Every process receives a message from itself, and also from at least $n - f$ other processes (since at most $f$ processes can fail). Since each process can hear from an independently assigned set of at least $n - f$ other processes, these sets induce a pseudosphere on $S^n$. Define the vertex map from $\mathcal{A}^1(S^n)$ to $\psi(S^n; 2^{P - \{P_0\}}_{n-f}, \ldots, 2^{P - \{P_n\}}_{n-f})$ that sends $\langle P_i, M \rangle$ to $\langle \vec{s}_i, \mathrm{ids}(M) - \{P_i\} \rangle$. It is easy to see that this map is simplicial (sending simplexes to simplexes) and one-to-one.

Let $\mathcal{A}^r(S^m)$ be the $r$-round protocol complex, defined by induction to be the union of $\mathcal{A}^{r-1}(T)$ over every simplex $T$ in $\mathcal{A}^1(S^m)$. We first prove that the one-round protocol complex $\mathcal{A}^1(S^m)$ is $(m - (n - f) - 1)$-connected for all $n \ge m \ge 0$, and then iterate this argument to prove that the $r$-round complex is also highly connected.
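As an illustration of Lemma 13, the following Python sketch enumerates the views each process may hold after one asynchronous round. It assumes processes are numbered 0 to n and represents a view as the set of other processes heard from; the function names are hypothetical and are not part of the paper.

```python
from itertools import combinations


def async_one_round_views(n, f):
    """Map each process i in 0..n to the sets of OTHER processes it may hear from.

    Each set has size at least n - f, matching the value sets in Lemma 13.
    """
    procs = range(n + 1)
    views = {}
    for i in procs:
        others = [p for p in procs if p != i]
        views[i] = [
            frozenset(c)
            for size in range(n - f, n + 1)
            for c in combinations(others, size)
        ]
    return views


def count_top_simplexes(views):
    """In a pseudosphere, every process picks its view independently."""
    total = 1
    for options in views.values():
        total *= len(options)
    return total


views = async_one_round_views(n=2, f=1)
num_vertices = sum(len(options) for options in views.values())
num_top = count_top_simplexes(views)
```

For three processes and one possible failure, each process has three possible views, so the sketch yields 9 vertices and 27 top-dimensional simplexes. The product in `count_top_simplexes` is exactly the independence property that makes the complex a pseudosphere.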

Lemma 14: For $n \ge m \ge 0$, $\mathcal{A}^1(S^m)$ is $(m - (n - f) - 1)$-connected.

Proof: There are three cases: $m = n$, $n - f \le m < n$, and $m < n - f$.

When $m = n$, by Lemma 13, $\mathcal{A}^1(S^n)$ is an $n$-dimensional pseudosphere, which is $(n - 1)$-connected by Corollary 10. The claim follows because $n - 1 \ge n - (n - f) - 1 = f - 1$.

When $n - f \le m < n$, recall that $\mathcal{A}^1(S^m)$ is the set of executions in which only the processes in $S^m$ take steps. This protocol is equivalent to the complex of $(m+1)$-process, $(f - n + m)$-faulty protocol executions, which is an $m$-dimensional pseudosphere, and hence $(m - 1)$-connected. The claim follows because $m - 1 \ge m - (n - f) - 1$.

When $m < n - f$, $m - (n - f) - 1 < -1$, and $\mathcal{A}^1(S^m)$ is empty, so the condition holds vacuously.

Figure 3: Construction of a one-round three-process protocol complex. (Panels: no failures; P fails; Q fails; R fails. Vertex labels have the form P;PQR, P;PQ, P;PR, and similarly for Q and R.)

Lemma 15: $\mathcal{A}^r(S^m)$ is $(m - (n - f) - 1)$-connected.

Proof: We argue by induction on $r$. When $r = 1$, the claim follows from Lemma 14. The induction hypothesis is that $\mathcal{A}^{r-1}(S^m)$ is $(m - (n - f) - 1)$-connected. For $n - f \le m \le n$, $\mathcal{A}^1(S^m)$ is an $m$-dimensional pseudosphere (Lemma 13). By the induction hypothesis and Theorem 9, $\mathcal{A}^r(S^m) = \mathcal{A}^{r-1}(\mathcal{A}^1(S^m))$ is also $(m - (n - f) - 1)$-connected.

It follows that the asynchronous protocol complex is $(f - 1)$-connected (when $m = n$), and thus we can prove the impossibility of asynchronous $k$-set agreement [BG93, HS93, SZ93].

Theorem 16: If $I$ is the set of input values, and $\mathcal{I} = \psi(S^n; I)$ the input complex, then $\mathcal{A}^r(\mathcal{I})$ is $(f - 1)$-connected for any $r > 0$.

Proof: For each input simplex $S^m$, $\mathcal{A}^r(S^m)$ is $(m - (n - f) - 1)$-connected (Lemma 15), so $\mathcal{A}^r(\psi(S^m; I))$ is also $(m - (n - f) - 1)$-connected by Theorem 9.

Corollary 17: There is no asynchronous $f$-resilient $k$-set agreement protocol for $k \le f$.

9 Synchronous Computation

We now define the $r$-round synchronous protocol complex $\mathcal{S}^r(S^m)$. Here too, we consider only a subset of all possible executions: executions in which no more than $k$ processes fail in any round. Informally, we will show that the one-round protocol complex is the union of pseudospheres, where each pseudosphere corresponds to the set of executions in which a fixed set of processes fail.

For example, Figure 3 illustrates the possible executions of a one-round protocol for three processes, $P$, $Q$, and $R$, starting from a fixed input simplex, in which no more than one process fails. Here, each vertex is labeled with a process, followed by the processes from which it has received messages. The figure on the left represents the execution in which no processes fail: this is a (degenerate) pseudosphere in which each process receives the same set of messages. The figure in the middle represents the executions in which $R$ alone fails. This complex is a pseudosphere: $P$ and $Q$ independently do or do not receive a message from $R$. The figure on the right represents the entire one-faulty protocol complex. It is the union of the failure-free pseudosphere with the three single-failure pseudospheres.
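The three-process construction just described can be reproduced with a short Python sketch. It assumes the labeling used in the figure (a process paired with the sorted names of the processes it heard from); the helper name `one_round_simplexes` is hypothetical.

```python
from itertools import combinations, product

PROCS = ("P", "Q", "R")


def one_round_simplexes(failed):
    """Top simplexes of the pseudosphere in which exactly `failed` crash.

    Every survivor hears from all survivors and, independently of the
    other survivors, from an arbitrary subset of the failed processes.
    """
    alive = [p for p in PROCS if p not in failed]
    heard_extra = [
        set(c)
        for r in range(len(failed) + 1)
        for c in combinations(sorted(failed), r)
    ]
    top = set()
    for choice in product(heard_extra, repeat=len(alive)):
        top.add(frozenset(
            (p, "".join(sorted(set(alive) | extra)))
            for p, extra in zip(alive, choice)
        ))
    return top


# Union of the failure-free pseudosphere and the three single-failure ones.
failure_sets = [set(), {"P"}, {"Q"}, {"R"}]
complex_top = set().union(*(one_round_simplexes(K) for K in failure_sets))
vertices = {v for simplex in complex_top for v in simplex}
```

Under these assumptions the union has nine vertices: the three "heard from everyone" vertices shared by all four pseudospheres, plus two new vertices contributed by each single-failure pseudosphere.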

9.1 Single-Round Protocols

Let $\mathcal{S}^1(S^n)$ be the complex of one-round executions of an $(n+1)$-process protocol with input simplex $S^n$ in which at most $k$ processes fail. It is the union of complexes $\mathcal{S}^1_K(S^n)$ of one-round executions starting from $S^n$ in which exactly the processes in $K$ fail. Given a set $K$ of process ids, let $S^n \setminus K$ be the face of $S^n$ labeled with the process ids not in $K$. Our next result says that $\mathcal{S}^1_K(S^n)$ is a pseudosphere, which means that $\mathcal{S}^1(S^n)$ is a union of pseudospheres:

Lemma 18: $\mathcal{S}^1_K(S^n) = \psi(S^n \setminus K; 2^K)$.

Proof: Each vertex in $\mathcal{S}^1_K(S^n)$ has the form $\langle P_i, M \rangle$, where $P_i \in \mathrm{ids}(S^n \setminus K)$ and $M$ is a set of messages received from other processes in the round. Every nonfaulty process receives a message from every other nonfaulty process, and also from some subset of faulty processes. Define the vertex map from $\mathcal{S}^1_K(S^n)$ to $\psi(S^n \setminus K; 2^K)$ that sends $\langle P_i, M \rangle$ to $\langle \vec{s}_i, \mathrm{ids}(M) \cap K \rangle$. This vertex map is simplicial and one-to-one.

Note the similarity between the two models: the one-round complex is a single pseudosphere in the asynchronous model (Lemma 13) and a union of pseudospheres in the synchronous model (Lemma 18). To compute the connectivity of this union using Theorem 3, we need to understand the intersections. The next lemma shows that these intersections have a simple structure: they are themselves unions of pseudospheres.

Order the process sets lexicographically: the empty set first, followed by singleton sets, followed by two-element sets, and so on. Let $K_0, \ldots, K_\ell$ be the sequence of sets of process ids less than or equal to $K_\ell$, listed in lexicographic order.

Lemma 19:

$$\bigcup_{i=0}^{\ell-1} \mathcal{S}^1_{K_i}(S^n) \cap \mathcal{S}^1_{K_\ell}(S^n) = \bigcup_{j \in K_\ell} \psi(S^n \setminus K_\ell; 2^{K_\ell - \{j\}}).$$

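Proof: For the $\subseteq$ inclusion, notice that for each $i$,

$$\mathcal{S}^1_{K_i}(S^n) \cap \mathcal{S}^1_{K_\ell}(S^n) = \psi(S^n \setminus K_i; 2^{K_i}) \cap \psi(S^n \setminus K_\ell; 2^{K_\ell}) = \psi(S^n \setminus K_i \cap S^n \setminus K_\ell; 2^{K_i \cap K_\ell}).$$

Because $K_i \neq K_\ell$ and $K_\ell$ is not a subset of $K_i$, there exists a $P_j \in K_\ell - K_i$, and therefore $K_i \cap K_\ell \subseteq K_\ell - \{j\}$. Finally, $S^n \setminus K_i \cap S^n \setminus K_\ell \subseteq S^n \setminus K_\ell$, so $\psi(S^n \setminus K_i \cap S^n \setminus K_\ell; 2^{K_i \cap K_\ell})$ is a subcomplex of $\psi(S^n \setminus K_\ell; 2^{K_\ell - \{j\}})$.

For the reverse inclusion $\supseteq$, because the $K_i$ are sorted in lexicographic order, each $K_\ell - \{j\}$ precedes $K_\ell$ in the sequence, and

$$\begin{aligned}
\mathcal{S}^1_{K_\ell - \{j\}}(S^n) \cap \mathcal{S}^1_{K_\ell}(S^n)
&= \psi(S^n \setminus (K_\ell - \{j\}); 2^{K_\ell - \{j\}}) \cap \psi(S^n \setminus K_\ell; 2^{K_\ell}) \\
&= \psi(S^n \setminus (K_\ell - \{j\}) \cap S^n \setminus K_\ell; 2^{(K_\ell - \{j\}) \cap K_\ell}) \\
&= \psi(S^n \setminus K_\ell; 2^{K_\ell - \{j\}}).
\end{aligned}$$

The last step follows because $(K_\ell - \{j\}) \cap K_\ell = K_\ell - \{j\}$, and $S^n \setminus K_\ell \subseteq S^n \setminus (K_\ell - \{j\})$.

Let $\mathcal{S}^1(S^n)$ denote the protocol complex for a one-round synchronous $(n+1)$-process protocol with input simplex $S^n$ where no more than $k$ processes fail.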
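The intersection structure of Lemma 19 can be checked by brute force on small instances. The Python sketch below makes one modeling assumption that should be kept in mind: each vertex is labeled with the full set of processes heard from (as in the three-process figure), so that labels are comparable across different failure sets. Under that labeling, the pseudosphere written $\psi(S^n \setminus K_\ell; 2^{K_\ell - \{j\}})$ corresponds to the views of the survivors of $K_\ell$ that include the message from $P_j$. All function names are hypothetical.

```python
from itertools import combinations, product


def subsets(s):
    s = sorted(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]


def complex_of(procs, views):
    """All nonempty simplexes: pick a face of `procs`, then one view per vertex."""
    procs = sorted(procs)
    out = set()
    for r in range(1, len(procs) + 1):
        for face in combinations(procs, r):
            for chosen in product(views, repeat=r):
                out.add(frozenset(zip(face, chosen)))
    return out


def sync_round(all_procs, K, must_hear=frozenset()):
    """One-round complex when exactly K fails; optionally keep only the
    views that contain every process in `must_hear`."""
    alive = all_procs - K
    views = [alive | U for U in subsets(K) if must_hear <= U]
    return complex_of(alive, views)


def check_lemma19(n, k):
    all_procs = frozenset(range(n + 1))
    # Order by size first, then lexicographically, as in the text.
    order = sorted(
        (K for K in subsets(all_procs) if len(K) <= k),
        key=lambda K: (len(K), sorted(K)),
    )
    for ell in range(1, len(order)):
        K_ell = order[ell]
        L = sync_round(all_procs, K_ell)
        earlier = set().union(*(sync_round(all_procs, K_i) for K_i in order[:ell]))
        lhs = earlier & L
        rhs = set().union(
            *(sync_round(all_procs, K_ell, frozenset({j})) for j in K_ell)
        )
        if lhs != rhs:
            return False
    return True
```

Calling `check_lemma19(3, 2)` compares both sides of the lemma for every failure set of size at most two among four processes. This is only a sanity check on small cases, not a substitute for the proof.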

Lemma 20: If $n \ge 2k$, then $\mathcal{S}^1(S^m)$ is $(m - (n - k) - 1)$-connected.

Proof: There are three cases: $m = n$, $n - k \le m < n$, and $m < n - k$. First consider $m = n$. Let $K_0, \ldots, K_\ell$ be the sequence of all sets of process ids less than or equal to $K_\ell$, listed in lexicographic order. If $K_\ell$ is the maximal set of $k$ process ids, then

$$\mathcal{S}^1(S^n) = \bigcup_{i=0}^{\ell} \mathcal{S}^1_{K_i}(S^n).$$

We will show inductively that if $|K_\ell| \le k$,

$$\bigcup_{i=0}^{\ell} \mathcal{S}^1_{K_i}(S^n) \text{ is } (k-1)\text{-connected.}$$

The claim then follows when $K_\ell$ is the maximal set of size $k$. For the base case, when $\ell = 0$, $K_0 = \emptyset$, and $\mathcal{S}^1_{K_0}(S^n) = \psi(S^n; 2^{\emptyset}) = S^n$, which is $(n-1)$-connected, and $n - 1 \ge k - 1$. For the induction step, assume

$$K = \bigcup_{i=0}^{\ell-1} \mathcal{S}^1_{K_i}(S^n) \text{ is } (k-1)\text{-connected.}$$

By Lemma 18,

$$L = \mathcal{S}^1_{K_\ell}(S^n) = \psi(S^n \setminus K_\ell; 2^{K_\ell}).$$

Since $\dim(S^n \setminus K_\ell) \ge n - k$, $L$ is $(n - k - 1)$-connected by Corollary 10, and because $n \ge 2k$, $n - k - 1 \ge k - 1$. By Lemma 19,

$$K \cap L = \bigcup_{i=0}^{\ell-1} \mathcal{S}^1_{K_i}(S^n) \cap \mathcal{S}^1_{K_\ell}(S^n) = \bigcup_{i \in K_\ell} \psi(S^n \setminus K_\ell; 2^{K_\ell - \{i\}}).$$

Let $A_i = 2^{K_\ell - \{i\}}$. $\bigcap_{i \in K_\ell} A_i \neq \emptyset$, and $S^n \setminus K_\ell$ has dimension at least $n - k$, so Corollary 12 implies that $K \cap L$ is $(n - k - 1)$-connected. Since $n \ge 2k$, $n - k - 1 \ge k - 1$. The induction step now follows from Theorem 3.

When $n - k \le m < n$, recall that $\mathcal{S}^1(S^m)$ is the set of executions in which only the processes in $S^m$ take steps. This protocol is equivalent to the complex of $(m+1)$-process, $(k - n + m)$-faulty protocol executions. Substituting $m$ for $n$ and $k - n + m$ for $k$, $\mathcal{S}^1(S^m)$ is $(m - (n - k) - 1)$-connected.

When $m < n - k$, $m - (n - k) - 1 < -1$, and $\mathcal{S}^1(S^m)$ is empty, so the condition holds vacuously.

The reader is invited to compare both the statements and proofs of Lemmas 14 and 20.

9.2 Multi-Round Protocols

Let $\mathcal{S}^r(S^n)$ be the protocol complex for an $r$-round synchronous $(n+1)$-process protocol with input simplex $S^n$ where no more than $k$ processes fail in each round. We can decompose this complex as follows. Let $K_0, \ldots, K_\ell$ be a sequence of sets of $k$ or fewer process ids in lexicographic order. Recall that $\mathcal{S}^1_{K_i}(S^n) = \psi(S^n \setminus K_i; 2^{K_i})$ is the complex of one-round executions in which exactly the processes in $K_i$ fail. The set of $r$-round executions in which exactly the processes in $K_i$ fail in the first round can be written as $\mathcal{S}^{r-1}_i(\mathcal{S}^1_{K_i}(S^n))$, where $\mathcal{S}^{r-1}_i$ is the complex for an $(r-1)$-round, $(f - |K_i|)$-faulty, $(n - |K_i| + 1)$-process full-information protocol. The $\mathcal{S}^{r-1}_i$ are considered distinct protocols because the $\mathcal{S}^1_{K_i}(S^n)$ have varying dimensions. Taking the union over all the $K_i$, we have

$$\mathcal{S}^r(S^n) = \bigcup_{i=0}^{\ell} \mathcal{S}^{r-1}_i(\mathcal{S}^1_{K_i}(S^n)).$$

The connectivity of a protocol $\mathcal{P}$ depends on the degree of the protocol. Consider the multi-round executions of $\mathcal{P}$ in which $f_i$ is the maximum number of processes that fail at round $i$. The degree of $\mathcal{P}$ is the minimum $f_i$ over all rounds. Define $\tilde{\mathcal{S}}^{r-1}_\ell$ to be the protocol identical to $\mathcal{S}^{r-1}_\ell$ except that it fails at most $k - 1$ processes in its first round. While $\mathcal{S}^{r-1}_\ell$ has degree $k$, $\tilde{\mathcal{S}}^{r-1}_\ell$ has degree $k - 1$. Our next result implies that intersections of the complexes comprising $\mathcal{S}^r$ are equivalent to $\tilde{\mathcal{S}}^{r-1}_\ell$ applied to a union of pseudospheres, which makes it possible to use Theorem 3 to analyze the connectivity of $\mathcal{S}^r$.

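Lemma 21:

$$\bigcup_{i=0}^{\ell-1} \mathcal{S}^{r-1}_i(\mathcal{S}^1_{K_i}(S^n)) \cap \mathcal{S}^{r-1}_\ell(\mathcal{S}^1_{K_\ell}(S^n)) = \tilde{\mathcal{S}}^{r-1}_\ell\left(\bigcup_{j \in K_\ell} \psi(S^n \setminus K_\ell; 2^{K_\ell - \{j\}})\right).$$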

Proof: For the $\subseteq$ inclusion, we have seen in the proof of Lemma 19 that for each $i$, there is some $j \in K_\ell$ such that

$$\psi(S^n \setminus K_i \cap S^n \setminus K_\ell; 2^{K_i \cap K_\ell}) \subseteq \psi(S^n \setminus K_\ell; 2^{K_\ell - \{j\}}).$$

Under what circumstances can $\mathcal{S}^{r-1}_i$ and $\mathcal{S}^{r-1}_\ell$ agree on this complex? Because $P_j$ has already failed in $\mathcal{S}^{r-1}_\ell$, the only executions of $\mathcal{S}^{r-1}_i$ that agree with $\mathcal{S}^{r-1}_\ell$ are ones in which $P_j$ fails without sending any messages to non-faulty processes. But then $\mathcal{S}^{r-1}_i$ (and therefore $\mathcal{S}^{r-1}_\ell$) can fail at most $k - 1$ processes that do send messages to non-faulty processes. Any such execution is also an execution of $\tilde{\mathcal{S}}^{r-1}_\ell$.

For the reverse inclusion $\supseteq$, we have seen that for each $j \in K_\ell$,

$$\mathcal{S}^1_{K_\ell - \{j\}}(S^n) \cap \mathcal{S}^1_{K_\ell}(S^n) = \psi(S^n \setminus K_\ell; 2^{K_\ell - \{j\}}).$$

The executions in which the two protocols agree are exactly those in which $\mathcal{S}^{r-1}_i$ immediately fails $P_j$, and in which $\mathcal{S}^{r-1}_\ell$ fails no more than $k - 1$ processes. These executions comprise $\tilde{\mathcal{S}}^{r-1}_\ell$.

Lemma 22: If $n \ge rk + k$, and $\mathcal{S}^r$ is an $r$-round, $(n+1)$-process protocol with degree $k$, then $\mathcal{S}^r(S^m)$ is $(m - (n - k) - 1)$-connected.

Proof: We argue by induction on $r$. When $r = 1$, the claim follows from Lemma 20. As induction hypothesis, assume the claim for any $(r-1)$-round protocol.

First, consider the case where $m = n$. Let $K_0, \ldots, K_\ell$ be the sequence of all sets of process ids less than or equal to $K_\ell$, listed in lexicographic order. We will show inductively that if $|K_\ell| \le k$,

$$\bigcup_{i=0}^{\ell} \mathcal{S}^{r-1}_i(\mathcal{S}^1_{K_i}(S^n)) \text{ is } (k-1)\text{-connected.}$$

The claim then follows when $K_\ell$ is the maximal set of size $k$. For the base case, when $\ell = 0$, $K_0 = \emptyset$, $\mathcal{S}^1_{\emptyset}(S^n)$ is $\psi(S^n; 2^{\emptyset}) = S^n$, and $\mathcal{S}^{r-1}_0(S^n)$ is $(k-1)$-connected by the induction hypothesis for $r$. For the induction step for $\ell$, assume

$$K = \bigcup_{i=0}^{\ell-1} \mathcal{S}^{r-1}_i(\mathcal{S}^1_{K_i}(S^n)) \text{ is } (k-1)\text{-connected.}$$

By Lemma 18,

$$L = \mathcal{S}^{r-1}_\ell(\mathcal{S}^1_{K_\ell}(S^n)) = \mathcal{S}^{r-1}_\ell(\psi(S^n \setminus K_\ell; 2^{K_\ell})).$$

Recall that $\mathcal{S}^{r-1}_\ell$ is a $k$-faulty, $(n - |K_\ell| + 1)$-process, $(r-1)$-round protocol, where $n - |K_\ell| \ge (r-1)k$, so by the induction hypothesis, for each simplex $S^d \in \psi(S^n \setminus K_\ell; 2^{K_\ell})$, $\mathcal{S}^{r-1}_\ell(S^d)$ is $(d - (n - |K_\ell| - k) - 1)$-connected. By Theorem 9, $\mathcal{S}^{r-1}_\ell(\psi(S^n \setminus K_\ell; 2^{K_\ell}))$ is $(k-1)$-connected. By Lemma 21,

$$K \cap L = \tilde{\mathcal{S}}^{r-1}_\ell\left(\bigcup_{i \in K_\ell} \psi(S^n \setminus K_\ell; 2^{K_\ell - \{i\}})\right).$$

This protocol has $r - 1$ rounds and degree $k - 1$. By the induction hypothesis, for any simplex $S^{n-k}$, $\tilde{\mathcal{S}}^{r-1}_\ell(S^{n-k})$ is $(k-2)$-connected. Let $A_i = 2^{K_\ell - \{i\}}$, for $i \in K_\ell$. $\bigcap_{i \in K_\ell} A_i \neq \emptyset$, so $K \cap L$ is also $(k-2)$-connected by Theorem 11. The claim now follows from Theorem 3.

If $n > m \ge n - k$, $\mathcal{S}^r(S^m)$ is equivalent to an $m$-process protocol in which at most $k - (n - m)$ processes fail in the first round, and $k$ thereafter. This protocol has degree $k - (n - m)$, so $\mathcal{S}^r(S^m)$ is $(k - (n - m) - 1)$-connected, which is $(m - (n - k) - 1)$.

When $m < n - k$, $m - (n - k) - 1 < -1$, and $\mathcal{S}^r(S^m)$ is empty, so the condition holds vacuously.

The connectivity of this protocol complex implies the lower bound for synchronous $k$-set agreement [CHLT93]:

Theorem 23: If $n \ge f + k$, then any synchronous $f$-resilient $k$-set agreement protocol requires $\lfloor f/k \rfloor + 1$ rounds. If $n < f + k$, then any synchronous $f$-resilient $k$-set agreement protocol requires $\lfloor f/k \rfloor$ rounds.

Proof: If $n - k \ge f$, then $\mathcal{S}^{\lfloor f/k \rfloor}(\mathcal{I})$ is $(k-1)$-connected. If $n - k < f$, then $\mathcal{S}^{\lfloor f/k \rfloor - 1}(\mathcal{I})$ is $(k-1)$-connected. Either way, 6 states that the protocol cannot solve $k$-set agreement.
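The two cases of Theorem 23 reduce to simple integer arithmetic. The following Python sketch evaluates the stated round bound; the function name and argument checks are illustrative assumptions, with $n + 1$ processes as in the rest of the paper.

```python
def sync_round_lower_bound(n, f, k):
    """Rounds required by Theorem 23 for n + 1 processes, f failures, k-set agreement."""
    if k < 1 or f < 0 or n < 0:
        raise ValueError("expected k >= 1, f >= 0, n >= 0")
    if n >= f + k:
        return f // k + 1
    return f // k


rounds_many_processes = sync_round_lower_bound(n=10, f=4, k=2)
rounds_few_processes = sync_round_lower_bound(n=4, f=4, k=2)
```

With $k = 1$ (consensus) and $n \ge f + 1$, the function returns $f + 1$, the familiar round bound for synchronous consensus.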

10 Semi-Synchronous Computation

Finally, we define the $r$-round semi-synchronous protocol complex $\mathcal{M}^r(S^m)$. In this model, the time between two consecutive steps of a process is at least $c_1$ and at most $c_2$, and the time to deliver a message is at most $d$. The values $c_1$, $c_2$, and $d$ are known constants, and we define $C = c_2/c_1$. As in the previous two sections, we will show that any full-information protocol complex in the semi-synchronous model has a well-behaved subcomplex that can be expressed as a well-structured union of pseudospheres. The structure of this union is more complex than in the synchronous case, but the formal analysis is nearly the same.

10.1 Failure Patterns

Once again, we restrict attention to round-structured executions defined as follows. Each round takes exactly time $d$. All messages sent during a round are delivered at the very end of that round (at multiples of time $d$). All processes take steps in lockstep as quickly as possible (at multiples of time $c_1$). The interval between process steps is called a microround, and there are $\mu = \lfloor d/c_1 \rfloor$ microrounds per round. Each message is labeled with the microround in which it was sent. The view of a process $P_i$ at the end of a round is a sequence $\langle \mu_0, \ldots, \mu_n \rangle$, where $\mu_j$ is the microround of the last message received from $P_j$ (or 0 if no message was received from $P_j$).

Consider a set $K$ of processes, and consider all single-round executions in which $K$ is the set of failing processes. A failure pattern for $K$ is a function $F$ mapping each process $P_j \in K$ to the microround $\mu_j$ in which it fails. At the end of the round, there are a number of views consistent with $F$, since the last message received by $P_i$ from a process $P_j$ failing in microround $\mu_j$ will be sent either in microround $\mu_j$ or $\mu_j - 1$. We define $[F]$ to be the set of possible views $\langle \mu_0, \ldots, \mu_n \rangle$ produced by $F$:

$$\mu_i = \begin{cases} F(P_i) - 1 \text{ or } F(P_i) & \text{if } P_i \in K \\ \mu & \text{otherwise.} \end{cases}$$

We define $[F \uparrow j]$ to be the subset of $[F]$ in which $P_j$'s last message is delivered in microround $\mu_j$ (and not $\mu_j - 1$). If $F$ is a failure pattern for $K$, and $j \in K$, then $[F \uparrow j]$ is defined to be the set of views $\langle \mu_0, \ldots, \mu_n \rangle$, where

$$\mu_i = \begin{cases} F(P_i) - 1 \text{ or } F(P_i) & \text{if } P_i \in K - \{j\} \\ F(P_i) & \text{if } i = j \\ \mu & \text{otherwise.} \end{cases}$$
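To make the view sets concrete, here is a small Python sketch that enumerates $[F]$ and $[F \uparrow j]$ as sets of tuples. It assumes processes are numbered 0 to n, that a failure pattern is a dictionary from failing process to failure microround, and that every failure microround is at least 1 (so the formula $F(P_i) - 1$ never goes negative). The names `view_set` and `pinned` are hypothetical.

```python
from itertools import product


def view_set(F, n, mu, pinned=None):
    """Enumerate [F], or [F ^ pinned] when `pinned` names a failing process.

    F maps each failing process to the microround in which it fails.
    Non-failing processes always contribute mu to the view.
    """
    options = []
    for i in range(n + 1):
        if i not in F:
            options.append([mu])
        elif i == pinned:
            options.append([F[i]])
        else:
            options.append([F[i] - 1, F[i]])
    return set(product(*options))


mu = 8 // 2                    # floor(d / c1) with d = 8, c1 = 2
F = {1: 3, 2: 2}               # P1 fails in microround 3, P2 in microround 2
F_later = {1: 4, 2: 2}         # same pattern, but P1 fails one microround later

views_F = view_set(F, n=2, mu=mu)
views_F_up_1 = view_set(F, n=2, mu=mu, pinned=1)
overlap = views_F & view_set(F_later, n=2, mu=mu)
```

In this example `overlap` coincides with `views_F_up_1`: moving one process's failure a microround later and intersecting the view sets leaves exactly the views in which that process's last message carries its original failure microround. An identity of this form is what the intersection argument for these pseudospheres relies on.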

We order failure patterns for a fixed set $K$ in reverse lexicographical order: the first pattern fails all processes in $K$ at microround $\mu$, and the last at 0.

Let $\mathcal{M}^1(S^n)$ be the complex of one-round executions of an $(n+1)$-process protocol with input simplex $S^n$ in which at most $k$ processes fail. It is the union of complexes $\mathcal{M}^1_{K,F}(S^n)$, each denoting the protocol complex of one-round executions starting from $S^n$ in which the processes in $K$ fail with pattern $F$. The next result says that $\mathcal{M}^1_{K,F}(S^n)$ is a pseudosphere, so the one-round protocol complex is a union of pseudospheres:

Lemma 24: $\mathcal{M}^1_{K,F}(S^n) = \psi(S^n \setminus K; [F])$.

Proof: Each vertex in $\mathcal{M}^1_{K,F}(S^n)$ has the form $\langle P_i, M \rangle$, where $P_i \in \mathrm{ids}(S^n \setminus K)$ and $M$ is a set of messages received from other processes in the round. This set contains a message from every nonfaulty process in every microround, and a message from every faulty process in some (possibly empty) prefix of microrounds. Each message is labeled with the microround in which it was sent. Consider the map sending each vertex $v$ of $\mathcal{M}^1_{K,F}(S^n)$ to the vertex $v'$ of $\psi(S^n \setminus K; [F])$ where the $j$-th component of the view labeling $v'$ is the microround of the last message from $P_j$ in the set of messages labeling $v$. This is an isomorphism.

Once again, notice the similarity among the models (Lemmas 13, 18, and 24). In each case, the one-round complex is the union of pseudospheres, where the structure of the union reflects the timing and failure properties of the model.

The pseudospheres $\psi(S^n \setminus K; [F])$ forming the one-round complex $\mathcal{M}^1(S^n)$ are lexicographically ordered by the ordering on process sets $K$ and the ordering on failure patterns $F$, ordering first by $K$ and then by $F$. Let $\psi(S^n \setminus K_0; [F_0]), \ldots, \psi(S^n \setminus K_\ell; [F_\ell])$ be the sequence of pseudospheres less than or equal to $\psi(S^n \setminus K_\ell; [F_\ell])$, listed in this order. Let

$$K = \bigcup_{i=0}^{\ell-1} \psi(S^n \setminus K_i; [F_i]) \quad \text{and} \quad L = \psi(S^n \setminus K_\ell; [F_\ell]).$$

Lemma 25:

$$K \cap L = \bigcup_{j \in K_\ell} \psi(S^n \setminus K_\ell; [F_\ell \uparrow j]).$$

Proof: For the $\subseteq$ inclusion, Lemma 8 implies that

$$\psi(S^n \setminus K_i; [F_i]) \cap \psi(S^n \setminus K_\ell; [F_\ell]) = \psi(S^n \setminus K_i \cap S^n \setminus K_\ell; [F_i] \cap [F_\ell]),$$

and $S^n \setminus K_i \cap S^n \setminus K_\ell \subseteq S^n \setminus K_\ell$. If $K_i$ is distinct from $K_\ell$, then it precedes it lexicographically, and there is some $j \in K_\ell - K_i$. If $[F_i] \cap [F_\ell]$ is non-empty, then $F_\ell(j) = \mu$, because $F_i$ does not fail $P_j$. Otherwise, if $K_i = K_\ell$, then $F_i$ precedes $F_\ell$, and there is some $j$ such that $F_i(j) > F_\ell(j)$. In either case, $[F_i] \cap [F_\ell] \subseteq [F_\ell \uparrow j]$, proving the inclusion.

Consider the reverse inclusion. Suppose $F_\ell(j) = \mu$. Let $F$ be the failure pattern that agrees with $F_\ell$ on $K_\ell - \{j\}$, but does not fail $P_j$. $[F] \cap [F_\ell] = [F_\ell \uparrow j]$ because every view has $\mu$ in position $j$, and either $F(i) - 1$ or $F(i)$ in position $i \in K_\ell - \{j\}$. $\psi(S^n \setminus (K_\ell - \{j\}); [F])$ precedes $\psi(S^n \setminus K_\ell; [F_\ell])$ in the sequence, and hence

$$\psi(S^n \setminus (K_\ell - \{j\}); [F]) \cap \psi(S^n \setminus K_\ell; [F_\ell]) = \psi(S^n \setminus K_\ell; [F_\ell \uparrow j]) \subseteq K \cap L.$$

Suppose instead that $F_\ell(j) < \mu$. Let $F$ be the failure pattern that agrees with $F_\ell$ on $K_\ell - \{j\}$, but $F(j) = F_\ell(j) + 1$. $[F] \cap [F_\ell] = [F_\ell \uparrow j]$. $\psi(S^n \setminus K_\ell; [F])$ precedes $\psi(S^n \setminus K_\ell; [F_\ell])$ in the sequence, and hence

$$\psi(S^n \setminus K_\ell; [F]) \cap \psi(S^n \setminus K_\ell; [F_\ell]) = \psi(S^n \setminus K_\ell; [F_\ell \uparrow j]) \subseteq K \cap L.$$

Let $\mathcal{M}^1(S^n)$ denote the protocol complex for a one-round semi-synchronous $(n+1)$-process protocol with input simplex $S^n$ where no more than $k$ processes fail. Lemma 25 says that the intersections of the pseudospheres making up $\mathcal{M}^1(S^n)$ are just unions of other pseudospheres. This makes it possible to use Theorem 3 to compute the connectivity of their union $\mathcal{M}^1(S^n)$. For the case of one-round protocols, we can prove that one particular union $\mathcal{M}^1(S^m)$ of pseudospheres is $(m - (n - k) - 1)$-connected for all $m \le n$ when $n \ge 2k$. Iterating this construction $r$ times to define the $r$-round protocol complex $\mathcal{M}^r(S^m)$, using techniques analogous to those used in the synchronous model, we can prove that $\mathcal{M}^r(S^m)$ is also highly connected.

Lemma 26: If $n \ge 2k$, then $\mathcal{M}^1(S^m)$ is $(m - (n - k) - 1)$-connected.

Proof: There are three cases: $m = n$, $n - k \le m < n$, and $m < n - k$. First consider $m = n$. Let $\psi(S^n \setminus K_0; [F_0]), \ldots, \psi(S^n \setminus K_\ell; [F_\ell])$ be the sequence of pseudospheres less than or equal to $\psi(S^n \setminus K_\ell; [F_\ell])$, listed in this order. If $K_\ell$ is the maximal set of $k$ process ids, and $F_\ell$ its maximal failure pattern, then

$$\mathcal{M}^1(S^n) = \bigcup_{i=0}^{\ell} \mathcal{M}^1_{K_i,F_i}(S^n).$$

We will show inductively that if $|K_\ell| \le k$,

$$\bigcup_{i=0}^{\ell} \mathcal{M}^1_{K_i,F_i}(S^n) \text{ is } (k-1)\text{-connected.}$$

The claim then follows when $\psi(S^n \setminus K_\ell; [F_\ell])$ is the maximal pseudosphere with $|K_\ell| = k$. For the base case, when $\ell = 0$, $K_0 = \emptyset$, $F_0$ fails no processes, and $\mathcal{M}^1_{K_0,F_0}(S^n) = \psi(S^n; [F_0]) = S^n$, which is $(n-1)$-connected, and $n - 1 \ge k - 1$. For the induction step, assume

$$K = \bigcup_{i=0}^{\ell-1} \mathcal{M}^1_{K_i,F_i}(S^n) \text{ is } (k-1)\text{-connected.}$$

By Lemma 24,

$$L = \mathcal{M}^1_{K_\ell,F_\ell}(S^n) = \psi(S^n \setminus K_\ell; [F_\ell]).$$

Since $\dim(S^n \setminus K_\ell) \ge n - k$, $L$ is $(n - k - 1)$-connected by Corollary 10, and because $n \ge 2k$, $n - k - 1 \ge k - 1$. By Lemma 25,

$$K \cap L = \bigcup_{i=0}^{\ell-1} \mathcal{M}^1_{K_i,F_i}(S^n) \cap \mathcal{M}^1_{K_\ell,F_\ell}(S^n) = \bigcup_{i \in K_\ell} \psi(S^n \setminus K_\ell; [F_\ell \uparrow i]).$$

Let $A_i = [F_\ell \uparrow i]$. $\bigcap_{i \in K_\ell} A_i \neq \emptyset$, and $S^n \setminus K_\ell$ has dimension at least $n - k$, so Corollary 12 implies that $K \cap L$ is $(n - k - 1)$-connected. Since $n \ge 2k$, $n - k - 1 \ge k - 1$.

When $n - k \le m < n$, recall that $\mathcal{M}^1(S^m)$ is the set of executions in which only the processes in $S^m$ take steps. This protocol is equivalent to the complex of $(m+1)$-process, $(k - n + m)$-faulty protocol executions. Substituting $m$ for $n$ and $k - n + m$ for $k$, $\mathcal{M}^1(S^m)$ is $(m - (n - k) - 1)$-connected.

When $m < n - k$, $m - (n - k) - 1 < -1$, and $\mathcal{M}^1(S^m)$ is empty, so the condition holds vacuously.

The reader is invited to note the similarities between the statements and proofs of Lemmas 14, 20, and 26.

10.2 Multi-Round Protocols

We now consider multi-round protocol executions in which at most $f_i$ processes fail at each round. As before, the degree of a protocol is the smallest value of $f_i$ for any round. Let $\mathcal{M}^r$ be the protocol complex for $r$-round, $(n+1)$-process protocol executions where no more than $k$ processes fail in each round. We can decompose this complex as follows. Let $\psi(S^n \setminus K_0; [F_0]), \ldots, \psi(S^n \setminus K_\ell; [F_\ell])$ be the sequence of pseudospheres less than or equal to $\psi(S^n \setminus K_\ell; [F_\ell])$, where $K_\ell$ is the maximal set of $k$ process ids, and $F_\ell$ the maximal failure pattern for $K_\ell$. Recall that $\mathcal{M}_{K_i,F_i}(S^n) = \psi(S^n \setminus K_i; [F_i])$ is the complex of one-round executions in which exactly the processes in $K_i$ fail with pattern $F_i$. The set of $r$-round executions in which exactly the processes in $K_i$ fail in the first round with pattern $F_i$ can be written as $\mathcal{M}^{r-1}_i(\mathcal{M}_{K_i,F_i}(S^n))$, where $\mathcal{M}^{r-1}_i$ is the complex for an $(r-1)$-round, $(n - |K_i| + 1)$-process full-information protocol. The $\mathcal{M}^{r-1}_i$ are considered distinct protocols because the $\mathcal{M}_{K_i,F_i}(S^n)$ have varying dimensions. Taking the union over all the $K_i$, we have

$$\mathcal{M}^r(S^n) = \bigcup_{i=0}^{\ell} \mathcal{M}^{r-1}_i(\mathcal{M}_{K_i,F_i}(S^n)).$$

Define $\tilde{\mathcal{M}}^{r-1}_\ell$ to be the protocol identical to $\mathcal{M}^{r-1}_\ell$ except that it fails at most $k - 1$ processes in its first round. While $\mathcal{M}^{r-1}_\ell$ has degree $k$, $\tilde{\mathcal{M}}^{r-1}_\ell$ has degree $k - 1$.

Lemma 27: If $F_\ell(j) < \mu$ for some $j \in K_\ell$,

$$\bigcup_{i=0}^{\ell-1} \mathcal{M}^{r-1}_i(\mathcal{M}^1_{K_i}(S^n)) \cap \mathcal{M}^{r-1}_\ell(\mathcal{M}^1_{K_\ell}(S^n)) = \mathcal{M}^{r-1}_\ell\left(\bigcup_{j \in K_\ell} \psi(S^n \setminus K_\ell; [F_\ell \uparrow j])\right).$$

Proof: For the $\subseteq$ inclusion, we have seen in the proof of Lemma 25 that for each $i$ there is a $j$ such that

$$\mathcal{M}^1_{K_i}(S^n) \cap \mathcal{M}^1_{K_\ell}(S^n) \subseteq \psi(S^n \setminus K_\ell; [F_\ell \uparrow j]).$$

It follows that

$$\mathcal{M}^{r-1}_i(\mathcal{M}^1_{K_i}(S^n)) \cap \mathcal{M}^{r-1}_\ell(\mathcal{M}^1_{K_\ell}(S^n)) \subseteq \mathcal{M}^{r-1}_\ell(\psi(S^n \setminus K_\ell; [F_\ell \uparrow j])).$$

For the reverse inclusion, let $F$ be the failure pattern that agrees with $F_\ell$ on $K_\ell - \{j\}$, but $F(j) = F_\ell(j) + 1$. Then

$$\mathcal{M}^1_{K_\ell,F}(S^n) \cap \mathcal{M}^1_{K_\ell,F_\ell}(S^n) = \psi(S^n \setminus K_\ell; [F]) \cap \psi(S^n \setminus K_\ell; [F_\ell]) = \psi(S^n \setminus K_\ell; [F_\ell \uparrow j]).$$

Both $\mathcal{M}^{r-1}_i$ and $\mathcal{M}^{r-1}_\ell$ agree on this intersection.

Lemma 28: If $F_\ell(j) = \mu$ for all $j \in K_\ell$,

$$\bigcup_{i=0}^{\ell-1} \mathcal{M}^{r-1}_i(\mathcal{M}^1_{K_i}(S^n)) \cap \mathcal{M}^{r-1}_\ell(\mathcal{M}^1_{K_\ell}(S^n)) = \tilde{\mathcal{M}}^{r-1}_\ell\left(\bigcup_{j \in K_\ell} \psi(S^n \setminus K_\ell; [F_\ell \uparrow j])\right).$$

Proof: Consider the $\subseteq$ inclusion. Since $F_\ell$ is the least failure pattern for $K_\ell$, any $K_i$ lexicographically precedes $K_\ell$, and there exists $j \in K_\ell - K_i$. We have seen in the proof of Lemma 25 that

$$\psi(S^n \setminus K_i \cap S^n \setminus K_\ell; [F_i] \cap [F_\ell]) \subseteq \psi(S^n \setminus K_\ell; [F_\ell \uparrow j]).$$

Under what circumstances can $\mathcal{M}^{r-1}_i$ and $\mathcal{M}^{r-1}_\ell$ agree on this complex? Because $P_j$ has already failed in $\mathcal{M}^{r-1}_\ell$, the only executions of $\mathcal{M}^{r-1}_i$ that agree with $\mathcal{M}^{r-1}_\ell$ are ones in which $P_j$ fails without sending any messages to non-faulty processes. But then $\mathcal{M}^{r-1}_i$ (and therefore $\mathcal{M}^{r-1}_\ell$) can fail at most $k - 1$ processes that do send messages to non-faulty processes. Any such execution is also an execution of $\tilde{\mathcal{M}}^{r-1}_\ell$.

For the reverse inclusion, let $F$ be the failure pattern that agrees with $F_\ell$ on $K_\ell - \{j\}$, but does not fail $P_j$. $[F] \cap [F_\ell] = [F_\ell \uparrow j]$. The executions in which the two protocols agree are exactly those in which $\mathcal{M}^{r-1}_i$ immediately fails $P_j$, and in which $\mathcal{M}^{r-1}_\ell$ fails no more than $k - 1$ processes. These executions comprise $\tilde{\mathcal{M}}^{r-1}_\ell$.

Lemma 29: If $n \ge (r + 1)k$, then $\mathcal{M}^r(S^m)$ is $(m - (n - k) - 1)$-connected.

Proof: We argue by induction on $r$. When $r = 1$, the claim follows from Lemma 26. As induction hypothesis, assume the claim for any $(r-1)$-round protocol.

First, consider the case where $m = n$. Let $\psi(S^n \setminus K_0; [F_0]), \ldots, \psi(S^n \setminus K_\ell; [F_\ell])$ be the sequence of pseudospheres less than or equal to $\psi(S^n \setminus K_\ell; [F_\ell])$, listed in this order. We will show inductively that if $|K_\ell| \le k$,

$$\bigcup_{i=0}^{\ell} \mathcal{M}^{r-1}_i(\mathcal{M}^1_{K_i}(S^n)) \text{ is } (k-1)\text{-connected.}$$

The claim then follows when $\psi(S^n \setminus K_\ell; [F_\ell])$ is the maximal pseudosphere for $|K_\ell| = k$. For the base case, when $\ell = 0$, $K_0 = \emptyset$, $\mathcal{M}^1_{K_0}(S^n)$ is $\psi(S^n; [F_0]) = S^n$, and $\mathcal{M}^{r-1}_0(S^n)$ is $(k-1)$-connected by the induction hypothesis for $r$. For the induction step for $\ell$, assume

$$K = \bigcup_{i=0}^{\ell-1} \mathcal{M}^{r-1}_i(\mathcal{M}^1_{K_i}(S^n)) \text{ is } (k-1)\text{-connected.}$$

By Lemma 24,

$$L = \mathcal{M}^{r-1}_\ell(\mathcal{M}^1_{K_\ell}(S^n)) = \mathcal{M}^{r-1}_\ell(\psi(S^n \setminus K_\ell; [F_\ell])).$$

Recall that $\mathcal{M}^{r-1}_\ell$ is a $k$-faulty, $(n - |K_\ell| + 1)$-process, $(r-1)$-round protocol, where $n - |K_\ell| \ge (r-1)k$, so by the induction hypothesis, for each simplex $S^d \in \psi(S^n \setminus K_\ell; [F_\ell])$, $\mathcal{M}^{r-1}_\ell(S^d)$ is $(d - (n - |K_\ell| - k) - 1)$-connected. By Theorem 9, $\mathcal{M}^{r-1}_\ell(\psi(S^n \setminus K_\ell; [F_\ell]))$ is $(k-1)$-connected. By Lemmas 27 and 28, either

$$K \cap L = \mathcal{M}^{r-1}_\ell\left(\bigcup_{i \in K_\ell} \psi(S^n \setminus K_\ell; [F_\ell \uparrow i])\right)$$

or

$$K \cap L = \tilde{\mathcal{M}}^{r-1}_\ell\left(\bigcup_{i \in K_\ell} \psi(S^n \setminus K_\ell; [F_\ell \uparrow i])\right).$$

This protocol has $r - 1$ rounds and degree either $k$ or $k - 1$. By the induction hypothesis, for any simplex $S^{n-k}$, $\tilde{\mathcal{M}}^{r-1}_\ell(S^{n-k})$ is at least $(k-2)$-connected. Let $A_i = [F_\ell \uparrow i]$, for $i \in K_\ell$. $\bigcap_{i \in K_\ell} A_i \neq \emptyset$, so $K \cap L$ is also $(k-2)$-connected by Theorem 11. The claim now follows from Theorem 3.

If $n > m \ge n - k$, $\mathcal{M}^r(S^m)$ is equivalent to an $m$-process protocol in which at most $k - (n - m)$ processes fail in the first round, and $k$ thereafter. This protocol has degree $k - (n - m)$, so $\mathcal{M}^r(S^m)$ is $(k - (n - m) - 1)$-connected, which is $(m - (n - k) - 1)$.

When $m < n - k$, $m - (n - k) - 1 < -1$, and $\mathcal{M}^r(S^m)$ is empty, so the condition holds vacuously.

10.3 Retiming

So far, we have shown that if $n \ge (r + 1)k$, the $r$-round protocol executions in which at most $k$ processes fail comprise a $(k-1)$-connected complex, which cannot solve $k$-set agreement. These executions take time $rd$, which is short. We now show how to "stretch" the protocol.

First, consider the following wait-free case. There are exactly $n = (r+1)k + 1$ processes, and we have a "failure budget" of $f = (r+1)k$. We have established that $k$-set agreement has no decision map on $\mathcal{M}^r(\mathcal{I})$. Let $\mathcal{M}^{r+1}(\mathcal{I})$ denote the protocol complex at time $(r+1)d - \epsilon$ for $d > \epsilon > 0$, that is, at time $\epsilon$ before round $r + 1$. We claim that $k$-set agreement has no decision map on $\mathcal{M}^{r+1}(\mathcal{I})$. No process has received a message since time $rd$, so any decision it could make after waiting $d - \epsilon$ without a message could have been made at time $rd$, implying that $\mathcal{M}^r(\mathcal{I})$ has a decision map, a contradiction.

Let $\mathcal{M}^{r+1}_1(\mathcal{I})$ denote the protocol complex corresponding to the following executions: for each vertex $\vec{v}$ in $\mathcal{M}^r(\mathcal{I})$, where $P = \mathrm{id}(\vec{v})$, fail all processes but $P$, and run $P$ as slowly as possible (taking steps at multiples of time $c_2$). At time $rd + Cd$, $P$ will time out, but at time $rd + Cd - \epsilon$, this execution is indistinguishable to $P$ from the corresponding execution in $\mathcal{M}^{r+1}(\mathcal{I})$, and therefore $\mathcal{M}^{r+1}_1(\mathcal{I})$ has no decision map. We have just shown a worst-case lower bound of time

$$\left(\frac{f}{k} - 1\right) d + Cd$$

to solve $k$-set agreement wait-free with $f = n$ failures.

For $f < n$, we use a group $G$ of $n - f + 1$ processes to "simulate" a single $P_i$ as follows. Each process in $G$ starts with the same input value, receives exactly the same messages, and every other process receives all or none of the messages sent by $G$. Each member of $G$ receives messages from outside the group at intervals of $d$, as before, but it receives messages sent from other members of $G$ at multiples of $d/c_2$. The resulting $r$-round, $(n+1)$-process protocol complex $\mathcal{G}^r_i(\mathcal{I})$ is not the same as the $r$-round, $(f+1)$-process protocol complex $\mathcal{M}^r(\mathcal{I})$, but the two complexes have the same homotopy type (just contract each simplex spanned by processes in $G$ to a point), and are therefore both $(k-1)$-connected. In every round, every process in $G$ receives exactly the same messages. The resulting protocol complex is not a pseudosphere, but it has the same homotopy type as an $f$-dimensional pseudosphere. Using this result and the execution-stretching techniques from [ADLS94], we can prove that

Corollary 30: Any wait-free protocol for $k$-set agreement and $n + 1$ processes in the semi-synchronous model requires time nk d + Cd.

As noted above, this corollary is the first substantial new lower bound for the semi-synchronous model since the Attiya, Dwork, Lynch, and Stockmeyer consensus bound of 1993 [ADLS94]. We believe that this result can be extended to the $f$-resilient case, but this will require further work.
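The retiming argument combines two quantities: $r$ rounds of length $d$, plus one final stretch of length $Cd$ with $C = c_2/c_1$. The Python sketch below evaluates that sum exactly using rational arithmetic. It follows the wait-free derivation in this section, where $f = (r+1)k$, rather than the closed form of the corollary; the function name and parameters are hypothetical.

```python
from fractions import Fraction


def stretched_time(f, k, d, c1, c2):
    """Time r*d + C*d from the retiming argument, assuming f = (r + 1) * k."""
    if f % k != 0 or f < k:
        raise ValueError("this sketch assumes f is a positive multiple of k")
    r = f // k - 1                 # number of full rounds already ruled out
    C = Fraction(c2, c1)           # timing uncertainty ratio c2 / c1
    return r * d + C * d


bound = stretched_time(f=6, k=2, d=10, c1=1, c2=3)
```

With $f = 6$, $k = 2$, $d = 10$, $c_1 = 1$, and $c_2 = 3$, there are $r = 2$ full rounds and $C = 3$, so the sketch yields $2 \cdot 10 + 3 \cdot 10 = 50$ time units. For $k = 1$ the expression becomes $(f - 1)d + Cd$.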

References

[ABND+87] Hagit Attiya, Amotz Bar-Noy, Danny Dolev, Daphne Koller, David Peleg, and Rüdiger Reischuk. Achievable cases in an asynchronous environment. In Proceedings of the 28th IEEE Symposium on Foundations of Computer Science, pages 337-346, October 1987.

[ABND+90] Hagit Attiya, Amotz Bar-Noy, Danny Dolev, David Peleg, and Rüdiger Reischuk. Renaming in an asynchronous environment. Journal of the ACM, July 1990.

[ADLS94] Hagit Attiya, Cynthia Dwork, Nancy Lynch, and Larry Stockmeyer. Bounds on the time to reach agreement in the presence of timing uncertainty. Journal of the ACM, 41(1):122-152, January 1994.

[Awe85] Baruch Awerbuch. Complexity of network synchronization. Journal of the ACM, 32:801-823, October 1985.

[BG93] Elizabeth Borowsky and Eli Gafni. Generalized FLP impossibility result for t-resilient asynchronous computations. In Proceedings of the 25th ACM Symposium on Theory of Computing, May 1993.

[BG97] Elizabeth Borowsky and Eli Gafni. A simple algorithmically reasoned characterization of wait-free computations. In Proceedings of the 16th Annual ACM Symposium on Principles of Distributed Computing, pages 189-198, August 1997.

[Cha91] Soma Chaudhuri. Towards a complexity hierarchy of wait-free concurrent objects. In Proceedings of the 3rd IEEE Symposium on Parallel and Distributed Processing. IEEE, December 1991. Also appeared as Technical Report No. 91-024, Iowa State University, 1991.

[CHLT93] Soma Chaudhuri, Maurice Herlihy, Nancy Lynch, and Mark R. Tuttle. A tight lower bound for k-set agreement. In Proceedings of the 34th IEEE Symposium on Foundations of Computer Science. IEEE, October 1993. To appear.

[DM90] Cynthia Dwork and Yoram Moses. Knowledge and common knowledge in a Byzantine environment: Crash failures. Information and Computation, 88(2):156-186, October 1990.

[Fis83] Michael J. Fischer. The consensus problem in unreliable distributed systems (a brief survey). In Marek Karpinsky, editor, Proceedings of the 10th International Colloquium on Automata, Languages, and Programming, pages 127-140. Springer-Verlag, 1983. A preliminary version appeared as Yale Technical Report YALEU/DCS/RR-273.

[FL82] Michael J. Fischer and Nancy A. Lynch. A lower bound for the time to assure interactive consistency. Information Processing Letters, 14(4):183-186, June 1982.

[FLP85] Michael J. Fischer, Nancy A. Lynch, and Michael S. Paterson. Impossibility of distributed consensus with one faulty processor. Journal of the ACM, 32(2):374-382, April 1985.

[Gla70] L. C. Glaser. Geometrical Combinatorial Topology, volume 1. Van Nostrand Reinhold, New York, 1970.

[GY97] Eli Gafni and Jiong Yang. A round-by-round failure detector: unifying synchrony and asynchrony. Unpublished manuscript, April 1997.

[Had83] Vassos Hadzilacos. A lower bound for Byzantine agreement with fail-stop processors. Technical Report TR-21-83, Harvard University, 1983.

[HM90] Joseph Y. Halpern and Yoram Moses. Knowledge and common knowledge in a distributed environment. Journal of the ACM, 37(3):549-587, July 1990.

[HR94] Maurice Herlihy and Sergio Rajsbaum. Set consensus using arbitrary objects. In Proceedings of the 13th Annual ACM Symposium on Principles of Distributed Computing, pages 324-333, August 1994.

[HS93] Maurice P. Herlihy and Nir Shavit. The asynchronous computability theorem for t-resilient tasks. In Proceedings of the 25th ACM Symposium on Theory of Computing, pages 111-120. ACM, May 1993.

[HS97] Gunnar Hoest and Nir Shavit. Towards a topological characterization of asynchronous complexity. In Proceedings of the 16th Annual ACM Symposium on Principles of Distributed Computing, pages 199-208, August 1997.

[Lef49] S. Lefschetz. Introduction to Topology. Princeton University Press, Princeton, New Jersey, 1949.

[LL90] Leslie Lamport and Nancy Lynch. Distributed computing: Models and methods. In J. van Leeuwen, editor, Formal Models and Semantics, volume B of Handbook of Theoretical Computer Science, chapter 19, pages 1157-1199. MIT Press, 1990.

[MR98] Yoram Moses and Sergio Rajsbaum. The unified structure of consensus: a layered analysis approach. Unpublished manuscript, January 1998.

[Mun84] J. R. Munkres. Elements of Algebraic Topology. Addison Wesley, Reading MA, 1984.

[PSL80] Marshall Pease, Robert Shostak, and Leslie Lamport. Reaching agreement in the presence of faults. Journal of the ACM, 27(2):228-234, 1980.

[Spa66] Edwin H. Spanier. Algebraic Topology. Springer-Verlag, New York, 1966.

[SZ93] Michael Saks and Fotis Zaharoglou. Wait-free k-set agreement is impossible: The topology of public knowledge. In Proceedings of the 25th ACM Symposium on Theory of Computing, May 1993.