Optimal Time Randomized Consensus

Chapter 40

Optimal Time Randomized Consensus: Making Resilient Algorithms Fast in Practice*

Michael Saks†   Nir Shavit‡   Heather Woll†

1 Introduction

1.1 Motivation

Abstract In practice,

the design of distributed

ten geared

towards

ity of algorithms in which

in “normal”

at most

cur, while visions

optimizing a small

paper

we present

shared-memory

rithm

that

less failures

expected resilient

sensus algorithm, up resilient

n processors, a black with

previously

algo-

ical

if

“normal

expected

failures cur.

generates

properties,

that

time in executions

algorithm

as

an algorithm

The

ures occur.

Mail Code C-014, University of California, San Diego, La Jolla, CA 92093-0114. ‡IBM Almaden Research Center, 650 Harry Road, San Jose, CA 95120.

memory

on no

a growing

ocinter-

can be proven

[7, 11, 13, 22, 16].

It

of asynchronous by

problem systems

lem:

for

reliable

is provably

[19])

Attiya,

Lynch

1

a paradigmatic that

(cf.

optimally

of failures

that

algorithms

vides gorithm

been

in the context

consensus

shared

even

patholog-

ones in which

number

properties

[7].

time

goal

runs

namely,

algorithms

memory

and Shavit

that

has recently

such

goal of hav-

running

practical

oc-

as bridg-

extremely

a small

was introduced

runs in only

*This work was supported by NSF contract CCR8911388. †Department of Computer Science and Engineering,

There

shared

where no fail-

the

executions,)’ or only

to have

on

good

opti-

of failures

the theoretical

exhibits

and

designing

perform

number

an algorithm

est in devising

for speeding-

for any decision problem

box, it modularly

behavior,

of

can be viewed

with

system

of having

issue that

a small

algorithms

the

known

the

algorithms

only

an algorithm

when

Using the novel con-

resilient

resilient when These

ing

re-

time

addresses

ing the gap between

In this

consensus

polynomial

occurred.

a highly

the same strong

cur.

in safety pro-

we show a method

given

O(log n) expected

Every

required

algorithms:

oc-

fast and highly

paper

mally

and takes at most O(*)

time for any f. algorithm

highly

O(log n) expected

occur,

time even if no faults

i.e. ones

failures.

randomized

runs in only

This

of failures

building many

an optimally

silient @or

number

against

complex-

executions,

at the same time

to protect

systems is of-

the time

runs

for (defined

illustration

systems

there

in constant

no deterministic

asynchronous below)

pro-

of the

prob-

is a trivial time,

algorithm

but

althere

that

1 [11, 13, 22, 16] treat it in the context of synchronous message passing systems.

is

SAKS ET AL.


guaranteed

to

solve

processor

might

gorithms

have

an expected in the ily

fail.

gorithms

been

if

developed time

that

that

of processors, price

for

in the number

measure

real

time

time

In

The the

quadratic

processor

z gets

as input

and returns

as output

its

value)

decision

same

initial

turned

a boolean

subject

Validity

straints:

a boolean

:

If

value,

are equal

then

to that

number

We the

standard

of

owner,

communicate

all

processors possibly

operate

is possible

for

to halt

before

fail-stop

fault.

impossible between that

of the during

more

but

executes

10 operations

interval, while time

it.

one another

is at most

We time

Attiya, [5]

number

Dolev

achieved

algorithms

and

an algorithm

O(a)

(here

Shavit

use only

running

breakthrough

of faulty

a similar

that

to each pro-

A

time

expected

(unknown)

opera-

[6]

running bounded

~

processor); and

Asp-

time

with

size memory.

it

1.3

Our

results

a

it is

In

this

paper,

consensus

those

use the (see, e.g., is de-

unit

n.

in

runs

the

expected rithms

time for

expected faults

we present

algorithm

that

a new matches

performance

randomized the

of the

G

< f

< n, yet

time

O(log

n) in the

point

for our algorithm

0( J&)

above

exhibits

algo-

optimal

presence

of -

or less.

in the execution

which

at least

and

in

to

solution

expected

that is the

The

to distinguish failed

available

[4] yielded

nes

causing

a model

about

exponential

[9] and

solutions

assumption the

a

to the

Li

case the

coins

to

version

and

and Herlihy

later

processors

task,

was

the as-

access

the first

in the latter

sev-

solutions

by Aspnes

its

manner,

one time

interval

during

reg-

In addition

non-faulty.

of asynchronous

algorithm

the elapsed

have

21]) in which

to be a minimal

some

read

in such

processors that

other

shared

of the the

that

other

notion

processor

can

n

and

consen-

under

has

in the first

of the random

time

of

one processor,

speeds.

or

Note

are delayed

[3, 17, 18,20, fined

one

processors

standard

each

Each

completing

for

consist

in an asynchronous

different

in

shaTed-

with

processors

at very

problem

systems

by only

tion

that,

Israeli,

but

cessor,

the

a probabilistic

[1] provided

a strong

syn-

algorithm,

are randomized

Abrahamson problem,

of its

of many

processor

the

requires

is finite.

asynchronous

that

but

the ex-

by the processor

value

Such

can be written

each

Chor,

a set of shared-registers.

ister

have shown

sumption

there

from

and

to solve

eral researchers

coin,

proved

a comprehensive

by a deterministic that

in this solution

problem,

is impossible

of termination.

systems.

processors

it

All

and Ter-

process,

a

(See also [23, 8]).

guarantee

consensus

model

While

sus problem

that

construction

that

taken

the

primitives.

problem

a decision

consider

memory

via

of steps

on the

chronization

fair

Consistency:

time

can be deduced

fundamental

re-

: For each non-faulty it returns

of this

values

are the same;

in

take

was directly

[17] presents

all decision

mination pected

con-

Herlihy

the

value;

under

to

shown

result

have

values

T

to run

be no deterministic This

processors

decision

before

z;

d_i (called

to the following

all

returned

value

value

can

implications

each

problem

time

processor

it has been

there

[12, 15].

consensus

in

A is the maximum

by [2, 9, 20] and implicitly

problem

fault-tolerant

where

to the problem.

of processors.

consensus

runs

is guaranteed

a non-faulty

Remarkably,

study

1.2

. A

T

for

model

and syn-

at least

that

of time,

step. al-

guarantee:

reliable

an algorithm

required

these

this

is fully

require

al-

this

if arbitrar-

pay

they

that

guarantee

even However,

the system

one

is polynomial

fail.

a stiff

even

randomization,

processors

even when chronous

problem

Using

execution

number

many

the

The

each non-faulty

one step. processor performs 10 time

Thus

if

performs 100, then units.

Note

plified

starting and

Herlihy

streamlined

algorithm.

2A straightforward lower

bound

version

From

there,

modification

of [7] implies

of the

an Ω(log

is a simAspnes-

we reduce

the

of the deterministic n)

lower

bound.

running

time

using

several

are potentially

applicable

ory

The

problems.

lows processors memory processors

property

that

non-trivial sors

to

studied tations

models

three

distinct

construct time

n)

The

assumptions

about

(z) the

step

have

writes

above

logarithmic time.

tree to collect

of

of the

our

pected rithm in

general,

both

a variation

algorithms

“slow”

primitive

the

case ~,

of

in terms

achieves

by

the pro-

n)

expected f run

and time

is not

implementagive

rise

to

to the consen-

the

an informal timing

in expected

time

last

a fast

work.

and

2

An

im-

the

extensions

we

correctness

algorithms.

timing

and

possible,

of the

of the

of

describes

and concludes

When

indication

in the final

section

concerning

analysis

pear

This

Proofs

analysis

will

ap-

paper.

Outline

section

of

contains

the consensus simple

Each

has

when

used

but

very

efficient

fi,

problem,

a Consensus

in

slow

sections

the

algo-

consensus

algorithm

bounded, 0(~).

the

we express

using which

implementation,

this

which

yields

algorithm.

paper,

a correct The

presented one,

the

are new highly

implementations

properties

main in

for these

these implementations yields

for

write-once-vector.

algorithm

following

exand

a shared

this

Using

algorithm

shared-coin,

consensus of

randomized

with

main

which

a “natural”

primitives. rithm

our

abstractions: a

contributions

number

n + j),

specially

can be scan and

and in Sec-

application

of our

and

- Herlihy

algorithm

4, we present

The

remarks

give

is

our fast implementations

primitive.

some

by

second

O(log

Aspnes

this

of the scan primitive

improvements

two

optimal

our

of the

which

of the two primitives,

5, we describe coin

a pre-

algorithm

that

In Section

plementation

as fol-

we present

Algorithm for

the

in failure-free

the first

procedure

bounded

runs

need

of memory

that

is

with

that

is organized

section

We show

shared-flip.

computation from

solutions

algo-

algorithm

time

consensus

was used in [4] and

different

solution

complexity,

paper

on the structure

algorithm.

the

the

what

two

any

given

one can modularly

wait-free

next

about

scan

the-

P,

of our the

to the vector.

of O(log when

of the

two

algorithm

achieves

rest In

uses a binary

In

time

The lows.

which

processors,

first

mem-

box,

expected

algorithm

wait-free

algorithm.

n)

scan

our

wait-free

time

collective

global

different

O(log

of correctness

operations

of shared

is obtained

written

summary,

number

it uses registers

information

have

failing

in

eliminates

expected as a black

an expected

only

problem

of [7],

powerful

executions.

with

standard

addressable for

decision

algorithm

method

the following

the above-mentioned

the

the randomized

the

a deterministic

sus

that

assumptions:

one by replacing performing

two

reads

We provide

algorithm

~.

in the

time

in

tion

W,

of

size and has low local

This

cessors

the

and

algorithm

~