Chapter 40
Optimal Time Randomized Consensus
Making Resilient Algorithms Fast in Practice*
Michael Saks†    Nir Shavit‡    Heather Woll†

1 Introduction
1.1 Motivation
Abstract In practice,
the design of distributed
ten geared
towards
ity of algorithms in which
in “normal”
at most
cur, while visions
optimizing a small
paper
we present
shared-memory
rithm
that
less failures
expected resilient
sensus algorithm, up resilient
n processors, a black with
previously
algo-
ical
if
“normal
expected
failures cur.
generates
properties,
that
time in executions
algorithm
as
an algorithm
The
ures occur.
Mail Code C-014, University of California, San Diego, La Jolla, CA 92093-0114. ‡IBM Almaden Research Center, 650 Harry Road, San Jose, CA 95120.
memory
on no
a growing
ocinter-
can be proven
[7, 11, 13, 22, 16].
It
of asynchronous by
problem systems
lem:
for
reliable
is provably
[19])
Attiya,
Lynch
1
a paradigmatic that
(cf.
optimally
of failures
that
algorithms
vides gorithm
been
in the context
consensus
shared
even
patholog-
ones in which
number
properties
[7].
time
goal
runs
namely,
algorithms
memory
and Shavit
that
has recently
such
goal of hav-
running
practical
oc-
as bridg-
extremely
a small
was introduced
runs in only
*This work was supported by NSF contract CCR8911388. †Department of Computer Science and Engineering,
There
shared
where no fail-
the
executions,)’ or only
to have
on
good
opti-
of failures
the theoretical
exhibits
and
designing
perform
number
an algorithm
est in devising
for speeding-
for any decision problem
box, it modularly
behavior,
of
can be viewed
with
system
of having
issue that
a small
algorithms
the
known
the
algorithms
only
an algorithm
when
Using the novel con-
resilient
resilient when These
ing
re-
time
addresses
ing the gap between
In this
consensus
polynomial
occurred.
a highly
the same strong
cur.
in safety pro-
we show a method
given
O(log n) expected
Every
required
algorithms:
oc-
fast and highly
paper
mally
and takes at most O(*)
time for any f. algorithm
highly
O(log n) expected
occur,
time even if no faults
i.e. ones
failures.
randomized
runs in only
This
of failures
building many
an optimally
silient @or
number
against
complex-
executions,
at the same time
to protect
systems is of-
the time
runs
for (defined
illustration
systems
there
in constant
no deterministic
asynchronous below)
pro-
of the
prob-
is a trivial time,
algorithm
but
althere
that
1 [11, 13, 22, 16] treat it in the context of synchronous message passing systems.
is
guaranteed
to
solve
processor
might
gorithms
have
an expected in the ily
fail.
gorithms
been
if
developed time
that
that
of processors, price
for
in the number
measure
real
time
time
In
The the
quadratic
processor
i gets
as input
and returns
as output
its
value)
decision
same
initial
turned
a boolean
subject
Validity
straints:
a boolean
:
If
value,
are equal
then
to that
number
We the
standard
of
owner,
communicate
all
processors possibly
operate
is possible
for
to halt
before
fail-stop
fault.
impossible between that
of the during
more
but
executes
10 operations
interval, while time
it.
one another
is at most
We time
Attiya, [5]
number
Dolev
achieved
algorithms
and
an algorithm
O(a)
(here
Shavit
use only
running
breakthrough
of faulty
a similar
that
to each pro-
A
time
expected
(unknown)
opera-
[6]
running bounded
~
processor); and
Asp-
time
with
size memory.
it
1.3
Our
results
a
it is
In
this
paper,
consensus
those
use the (see, e.g., is de-
unit
n.
in
runs
the
expected rithms
time for
expected faults
we present
algorithm
that
a new matches
performance
randomized the
of the
G
< f
< n, yet
time
O(log
n) in the
point
for our algorithm
0( J&)
above
exhibits
algo-
optimal
presence
of -
or less.
in the execution
which
at least
and
in
to
solution
expected
that is the
The
to distinguish failed
available
[4] yielded
nes
causing
a model
about
exponential
[9] and
solutions
assumption the
a
to the
Li
case the
coins
to
version
and
and Herlihy
later
processors
task,
was
the as-
access
the first
in the latter
sev-
solutions
by Aspnes
its
manner,
one time
interval
during
reg-
In addition
non-faulty.
of asynchronous
algorithm
the elapsed
have
21]) in which
to be a minimal
some
read
in such
processors that
other
shared
of the the
that
other
notion
processor
can
n
and
consen-
under
has
in the first
of the random
time
of
one processor,
speeds.
or
Note
are delayed
[3, 17, 18,20, fined
one
processors
standard
each
Each
completing
for
consist
in an asynchronous
different
in
shared-
with
processors
at very
problem
systems
by only
tion
that,
Israeli,
but
cessor,
the
a probabilistic
[1] provided
a strong
syn-
algorithm,
are randomized
Abrahamson problem,
of its
of many
processor
the
requires
is finite.
asynchronous
that
but
the ex-
by the processor
value
Such
can be written
each
Chor,
a set of shared-registers.
ister
have shown
sumption
there
from
and
to solve
eral researchers
coin,
proved
a comprehensive
by a deterministic that
in this solution
problem,
is impossible
of termination.
systems.
processors
it
All
and Ter-
process,
a
(See also [23, 8]).
guarantee
consensus
model
While
sus problem
that
construction
that
taken
the
primitives.
problem
a decision
consider
memory
via
of steps
on the
chronization
fair
Consistency:
time
can be deduced
fundamental
re-
: For each non-faulty it returns
of this
values
are the same;
in
take
was directly
[17] presents
all decision
mination pected
con-
Herlihy
the
value;
under
to
shown
result
have
values
T
to run
be no deterministic This
processors
decision
before
z;
d_i (called
to the following
all
returned
value
value
can
implications
each
problem
time
processor
it has been
there
[12, 15].
consensus
in
A is the maximum
by [2, 9, 20] and implicitly
problem
fault-tolerant
where
to the problem.
of processors.
consensus
runs
is guaranteed
a non-faulty
Remarkably,
study
1.2
. A
T
for
model
and syn-
at least
that
of time,
step. al-
guarantee:
reliable
an algorithm
required
these
this
is fully
require
al-
this
if arbitrar-
pay
they
that
guarantee
even However,
the system
one
is polynomial
fail.
a stiff
even
randomization,
processors
even when chronous
problem
Using
execution
number
many
the
The
each non-faulty
one step. processor performs 10 time
Thus
if
performs 100, then units.
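The 10-versus-100 example of the time measure can be checked with a small sketch. It assumes the standard measure the text refers to, in which a time unit is a minimal interval during which every non-faulty processor executes at least one step. The function name and the dictionary layout are hypothetical, chosen only for illustration.

```python
from typing import Dict


def elapsed_time_bound(steps_by_processor: Dict[str, int]) -> int:
    """Upper bound on elapsed time units for one interval.

    Assumption (hypothetical helper, not from the paper): every key is a
    non-faulty processor and the value is the number of steps it performed
    during the interval. Each time unit contains at least one step of every
    non-faulty processor, so a processor that took k steps witnesses at
    most k time units. The slowest processor gives the tightest bound.
    """
    if not steps_by_processor:
        raise ValueError("need at least one non-faulty processor")
    return min(steps_by_processor.values())


# One processor performs 10 operations while another performs 100.
bound = elapsed_time_bound({"p": 10, "q": 100})
```

Under this measure a fast processor's extra steps cost no additional time, so running time is governed by the slowest non-faulty processor.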
Note
plified
starting and
Herlihy
streamlined
algorithm.
2A straightforward lower
bound
version
From
there,
modification
of [7] implies
of the
an Ω(log
is a simAspnes-
we reduce
the
of the deterministic n)
lower
bound.
running
time
using
several
are potentially
applicable
ory
The
problems.
lows processors memory processors
property
that
non-trivial sors
to
studied tations
models
three
distinct
construct time
n)
The
assumptions
about
(z) the
step
have
writes
above
logarithmic time.
tree to collect
of
of the
our
pected rithm in
general,
both
a variation
algorithms
“slow”
primitive
the
case ~,
of
in terms
achieves
by
the pro-
n)
expected f run
and time
is not
implementagive
rise
to
to the consen-
the
an informal timing
in expected
time
last
a fast
work.
and
2
An
im-
the
extensions
we
correctness
algorithms.
timing
and
possible,
of the
of the
of
describes
and concludes
When
indication
in the final
section
concerning
analysis
pear
This
Proofs
analysis
will
ap-
paper.
Outline
section
of
contains
the consensus simple
Each
has
when
used
but
very
efficient
fi,
problem,
a Consensus
in
slow
sections
the
algo-
consensus
algorithm
bounded, 0(~).
the
we express
using which
implementation,
this
which
yields
algorithm.
paper,
a correct The
presented one,
the
are new highly
implementations
properties
main in
for these
these implementations yields
for
write-once-vector.
algorithm
following
exand
a shared
this
Using
algorithm
shared-coin,
consensus of
randomized
with
main
which
a “natural”
primitives. rithm
our
abstractions: a
contributions
number
n + j),
specially
can be scan and
and in Sec-
application
of our
and
- Herlihy
algorithm
4, we present
The
remarks
give
is
our fast implementations
primitive.
some
by
second
O(log
Aspnes
this
of the scan primitive
improvements
two
optimal
our
of the
which
of the two primitives,
5, we describe coin
a pre-
algorithm
that
In Section
plementation
as fol-
we present
Algorithm for
the
in failure-free
the first
procedure
bounded
runs
need
of memory
that
is
with
that
is organized
section
We show
shared-flip.
computation from
solutions
algo-
algorithm
time
consensus
was used in [4] and
different
solution
complexity,
paper
on the structure
algorithm.
the
the
what
two
any
given
one can modularly
wait-free
next
about
scan
the-
P,
of our the
to the vector.
of O(log when
of the
two
algorithm
achieves
rest In
uses a binary
In
time
The lows.
which
processors,
first
mem-
box,
expected
algorithm
wait-free
algorithm.
n)
scan
our
wait-free
time
collective
global
different
O(log
of correctness
operations
of shared
is obtained
written
summary,
number
it uses registers
information
have
failing
in
eliminates
expected as a black
an expected
only
problem
of [7],
powerful
executions.
with
standard
addressable for
decision
algorithm
method
the following
the above-mentioned
the
the randomized
the
a deterministic
sus
that
assumptions:
one by replacing performing
two
reads
We provide
algorithm
~.
in the
time
in
tion
W,
of
size and has low local
This
cessors
the
and
algorithm
~
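The validity and consistency constraints of the consensus problem stated in the introduction can be expressed as a check on a finished execution. The sketch below is illustrative only: the function name and the dictionary representation of initial and decision values are assumptions, and the termination constraint (finite expected number of steps) is a property of the algorithm that a single trace cannot verify.

```python
from typing import Dict


def satisfies_consensus(inputs: Dict[int, bool],
                        decisions: Dict[int, bool]) -> bool:
    """Check validity and consistency for one finished execution.

    inputs:    initial boolean value of every processor.
    decisions: decision value of every processor that returned one
               (processors that failed before deciding are absent).
    Both names are hypothetical; the paper gives no such interface.
    """
    returned = set(decisions.values())
    # Consistency: all decision values returned are the same.
    consistent = len(returned) <= 1
    # Validity: if all initial values are equal, every returned
    # decision value must equal that common value.
    initial = set(inputs.values())
    valid = len(initial) != 1 or returned <= initial
    return consistent and valid
```

For example, with initial values {0: True, 1: True} a decision of False by any processor violates validity, while with mixed initial values either common decision value is acceptable.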