Coordination Failure and Congestion in Information Networks

3 downloads 0 Views 994KB Size Report
lize the scenario posed by Arthur (1994) as a sim- ... optimizing agents will not increase consumption of ... in aggregate attendance that Arthur's simulations.
From: AAAI-00 Proceedings. Copyright © 2000, AAAI (www.aaai.org). All rights reserved.

Coordination

Failure and Congestion

A. M. Bell*, W.

A. Setharest,

in Information

and J. A. Bucklewl

than sixty people

Abstract

The El Far-01or Santa Fe bar problem

the action of other agents, may be an important source of congestion in large decentralized systems.

sizes the difficulty

The El Far01 or Santa Fe bar problem provides

independent

a simple paradigm for congestion and coordina-

nism.

tion problems that may arise with over utilization

decentralized

of the Internet.

This paper recasts the problem framework and derives a simple

go, but no one has a good time

when the bar is overcrowded.

Coordination failure, or agents’ uncertainty about

in a stochastic

Networks

of coordinating

agents without

The analogy

between

a centralized

resource allocation

adaptive strategy that has intriguing optimization

work

(Sethares

sion makers, each acting in their own best interests

the dynamics

and with limited knowledge, converge to a solution

Internet

that (optimally)

with public goods prevail.

solves a complex congestion and

consider

as well as in our 1998).

and Huberman

(1994)

(1997)

when externalities

and

is noted by Green-

and Bell

and Huberman

properties; a large collection of decentralized de&

of

mecha-

the bar problem

wald, Mishra and Parikh (1998), previous

empha-

the actions

Glance

and Lukose

of congestion

on the

similar to those found Unlike the standard pub-

social coordination problem. A variation in which

lic good framework,

agents are allowed access to full information is not

optimizing

agents will not increase consumption

of

nearly as successful.

a publicly

available resource until it experiences

an

inefficient

level of congestion.

Introduction This paper

focuses

coordination

diet the behavior

on imperfect

failure

across

information

agents

as a source

and of

in this scenario fully informed

of other agents perfectly

a good

time.

The only

We uti-

(1994)

as a sim-

of agents to coordinate

plified model of a large class of congestion

and coor-

congestion

in large decentralized

dination

posed by Arthur

problems

and economic

that arise in modern engineering

systems.

El Far01 is a bar in Santa

Fe. The bar is popular,

but becomes

when more than sixty people evening.

Everyone

systems.

enjoys

overcrowded

attend on any given

themselves

when fewer

*NASA Ames Research Center, Mail Stop 269-3, Moffett Field, CA 94035-1000, [email protected] nasa.gov. ‘Department of Electrical and Computer Engineering, University of Wisconsin-Madison. tDepartment of Electrical and Computer Engineering, University of Wisconsin-Madison. Copyright @ 2000, American Association for Artificial Intelligence (www.aaai.org).

All rights reserved.

the bar

would never be crowded and all patrons would have least in a deterministic

lize the scenario

if agents could %pre-

In

a previous

1998)

we proposed

gorithm

based

in aggregate

of congestion,

their actions.

treatment

(Sethares

formation

attendance

and

adaptive which

in a decentralized

the seemingly

at

is the inability

a deterministic

on habit

agents to coordinate while avoiding

source

framework,

Bell al-

enabled

environment

random

fluctuations

that Arthur’s

simulations

demonstrated. Here we consider tic setting

where agents’

ized by a probability time.

the bar problem strategies

of attending

version of the adaptive

a clearer problem

statement,

are character-

that evolves over

There are several advantages

the stochastic

in a stochas-

to considering learning r+le:

a simpler

algorithm

that

is amenable

general results librium

to detailed We analyze

characteristics

the mixed

analysis,

of the system

and pure strategy

responding

imize pleasure

to

of the cor-

game.

observed

limiting

agents

leads them

Pareto

efficient

information

of the equilibria

crucially

available

we show that

to agents.

to successfully

emphasize

leviating

congestion

failure.

For

coordinate

example,

role that

to

the agent goes more often the bar is uncrowded,

supplying about

system-wide

and al-

routers may not

information

parameter

p. This learning

Let agents

an agent receives g is the payoff uncrowded

for attending

bar.

Without

M be the total maximum

capacity

of two strategies, stay home all s-f, ui(l,

the ith

(iteration)

of agents

a characteristic

agent

of agents

counter

attending

pure

>

N - 1. where

are (‘Qb”) such equilibria.

parameter

that

ric pure strategy no agent another

worse

strategy

Because

better

equilibrium

of the variance

Let

by 1) and

value

Bernoulli

evolution

of the p;(k)

pi(k) - p(iV(L) pi(k) - p(N(k) - p(N(lc)

The operation

-

of the algorithm

otherwise

to N(k) -N

and decreasing

it proportionally

occurs

bar is crowded.

making

each period. that

the equilibrium

When the agent attends,

proportionally

is one symmet-

results is not

attends p percent

to N(k) -N

Since the pi(k) represent

pi(k + 1) = pi(L). makes it feasible

z;(k)

is zero and

simplicity

in section

does not

of the scheme

to analyze the resulting

and as demonstrated In Arthur’s

The

if the

probabili-

Note that the stepsize

over time.

it

to lie within 0 and 1.

the agent does not attend

decrease

increasing

if the bar is uncrowded

ties, they must be constrained When

behavior,

.

formulation

of the problem,

agents

have access to information

about attendance

at the

bar even on evenings attend. rithm

efficient.

(2)

At each time k the agent flips a biased coin, attend-

utilize

strategy

de-

is uncomplicated.

N - 1, and

Pure

is then

N)‘zi(k) > 1

q(k)

-N)

pi(lc) is adjusted,

There

and zero

z;(k) < 0

-N)

pi(k).

agents

random

pi(k)

then the parameter

where every agent

Suppose that the agent initially

(1)

ing with probability

in equilibrium

of attending

Zi(lc)

= 0 for

are no symmet-

in attendance

from agents’ mixed strategies Pareto

how much

to new informa-

the instantaneous

are independent

The

bar. The game

off without

There

otherwise.

if

efficient:

off.

has the same probability

be the

fined by pi (k + 1) =

1

equilibria.

are Pareto

can be made agent

ric mixed

Nash

defines

that are 1 with probability

p;(k)

equilibrium

There

variables

and N be the

sixty agents choose to attend.

Nash equilibria

and N(lc)

Let

of pi at the time Ic. Let

if

5

setting a Nash

an

where Si consists

= g when I,_,

strategies

when exactly

is pi.

at time /c. Let p be

tion and let pi(k) designate

let h,

be zero.

by 0) and ui(O,s_i)

= b when C,_i

attends

each agent changes pi in response

0

home,

go to the bar (indicated

In a deterministic only

loss of generality

u;(s~, s-i)]

(indicated

ui(l,s_;)

s-i)

for attending

of an uncrowded

is then G = [M, {Si},

there are M

notation

for the N spaces at the bar. The

where the pi

a crowded bar and

for staying

number

of

i=l

b is the payoff

payoffs:

an agent receives

the payoff received

the state

performance.

Statement

have identical

Over time,

about

or stimulus-response.

N(lc) = 5

Algorithm

if

rule can be interpreted

the previous

that

L be a time number

p slightly)

to go less often

this in the form of the

as a kind of habit formation

probability

to max-

experiences,

(increases

but prefers

and ‘remembers’

Summarizing

environment

congestion

the bar,

painful

p) if the bar is crowded

gathers

agents competing

increased

individual

the agent

on a Our

with the desire

and minimize

more

arises from coordination

in a complex

with more information result in better

outcome.

may play in creating that

such as the Internet

available

while providing

the critical

exchange

of

In particular,

the information

equilibrium

ac-

on the nature

leads to an inefficient

information

Consistent

(to decrease

depend

the information

results

of the time.

and equi-

in relation

equilibria

The type and characteristics tually

and more

the dynamic

This

when they do not themselves

can be incorporated

into

(a), giving the update pi(lc + 1) =

0 if pi(k) - ,u(N(~) -N)

< 0

the algo-

if pi(k) - p(N(k)

1 p;(k) which

Arthur’s

otherwise

(3)

the information

structure

used by

agents.

mation

As will become

structure

of the algorithm.

clear, this infor-

as in (2) they utilize

In contrast,

(3) utilizes ‘%_ill

to whether

algorithm

the bar is crowded

if pi(Ic) - p sgn(N(k)

1

if p;(k)

- N)

- p sgn(l\r(k) -N)

- ,U sgn(N(lc)

-N)

Behavior explores

z;(k)

xi(k)

for 60 that

ing 40 agents attend less and less frequently,

with

their probability

This

parameter

of the population

very near zero. appears

nowhere in the

algorithm statement;

rather, it is an emergent prop-

or

erty of the adaptive

solution

to the El Far01 prob-

lem. When agents follow this adaptive

0

section

themselves

parameter

up-

not: pi(k + 1) =

system

the agents have divided The probability

pi(k)

run. By the

they go to El Far01 nearly every time. The remain-

division

A related version of the stochastic

This

final iteration,

simulation

of the agents has risen very near 1, indicating

agents base their updates

because agents base their decisions on

Generic

Figure 2 shows values of the probabilities over the course of a typical

When

the full record of attendance.

pi(k)

value and

about half the time.

into two groups.

“partial information”.

dates according

about the optimal

the bar is overcrowded

is a key element in the behavior

on only their own experiences information”

far greater excursions

> 1

- N)

- p(N(L)

mimics

-N)

Farol looks more like Cheers.

< 0

Despite

tic nature of agents of the adaptive converges to a pure strategy

zi(le) > 1

strategy, El the stochas-

learning rule it

Nash equilibrium.

(4)

otherwise

of the Algorithms

the generic

behavior

of the

when each of the M = 100 agents follows

the strategy

defined by (2) above.

of the various

simulations

Though

differ, a typical

illustrated

in Figure 1. The probabilities

initialized

randomly.

details case is

pi (0) were

80 $

70 15’00

Ej 60 z 0 50 t?i < 40

time (iteration number)

1500

1000

500

Figure 2: Probabilities

time (iteration number)

with Partial Information

Finally Figures 3 and 4 use the “full information” Figure I: Attendance

with Partial Information

algorithm

(3) to investigate

the effect of allowing

the agents to update their probabilities Perhaps

the most striking

aspect of these simu-

lations is the rapid convergence

to near the optimal

value of N = 60 and the associated variance

of attendance.

decline in the

The outcome

approaches

that which would be chosen with centralized

con-

eration, whether they have personally in Arthur’s proximately over time,

simulations.

Note that the transient

ences.

In comparison,

attendance

is ap-

does not decline

and often the bar is overcrowded. behavior

in the initial pe-

that is, on its own experi-

riods is masked by the long time scale.

in Arthur’s

should

setup, there are

the

structure

that seats in the bar ,often

remain unfilled,

information,

Mean

60, but the variance indicating

and makes the decision on local

attended

bar or not. This reflects the information

trol, despite the fact that each agent is autonomous, to go (or not to go) based

at every it-

be compared

to Figure

Figufe

2; the probability

4

parameters

for these agents continue to bounce ran-

tration,”

or similar responses

domly about some fixed value as their probabilities

tion, and that consequently,

all increase

lead to increased congestion.

or decrease

simultaneously

in response

to common

informa-

more information

can

to the same signals.

Analysis

of the Adaptive

Solutions

The first step in the analysis of the dynamic ior of the algorithms

is to determine

under which the means of the p(k) that is, to determine aged system.

’ 0

m

500

1500

1000

We consider

the partial

Taking the expectation E{Pi(k+

with Full Information

1)) = E{&(k))

assuming exactly

1

-@{(N(k)

that the pi(k)

information

-N)

G(k)),

are not at the boundary

when the update

remains unchanged

portion

is zero, that is,

when

_$

09

E{(N(k)

’c

08

Using (1) this can be rewritten

3 2 $2 a& Em gr

07

2 %

fixed;

of both sides of (2) gives

points 0 or 1. This expectation

x .z .z

remain

the steady states of the aver-

case first.

time (iteration number)

Figure 3: Attendance

behav-

the conditions

El&

0.6

Xj(k)-N)

- N)

G(k))

Q(k)}

= S{(F

j=l

0.5

since the xi(k)

03

otherwise.

0.2

pendent of pi(k)

1000

500

1500

is 1 with probability

R(k))

pi(lc) and zero

Because the term in parenthesis

dent Bernoulli

OO

z:j(k)+l-N) j=1 j#a

04

0.1

= 0.

( recall that the zj(k) random

variables)

is inde-

are indepen-

this becomes

M

2000

= (1 -N

time (iteration number)

+ ~E{q(k)})

pi(k)

~

j=l j#i

M

Figure 4: Probabilities Somewhat

their

a Pareto

efficient

access

with Full Information

paradoxically,

ordinate

behavior

noted a similar phenomena ulations

sue a strict

Several

authors

in transportation

strategy,

as a whole tion.

Arnott,

pur-

changing

their over

De Palma

show that congestion

of the system

if more than 25% of drivers

have access to real time information and Lindsey

PI

(1-N+Cjf2~jW)

pa(k)

tk> .

(5)

i Consider

1991)

can arise because of “concen-

any candidate

ones and N - M zeroes.

steady

state p* with N

Let II be the indices of

the ones and IO be the indices of the zeroes. there are two kinds of terms in (5).

When

Then i E 11,

C$-pj=N-landso

about conges(1996,

(1- N + C;, c(k))

rout-

how small the improvement

degrades

‘&&+)) pi(k).

have

(1991) use sim-

their current choice, the performance

(I- N +

In full vector form this is

achieves

that when individuals

best response

co-

only when agents have

and Jayakrishnan

to demonstrate

route no matter

agents successfully

and the system

outcome

to less information.

ing. Mahmassani

=

M

(l-N+xp;)

p;’ = (I-N+N-1)

pi’ = O.l (6)

When i

x%-ipj’ = N,

IO,

E

pt = 0, and hence

can be derived as an approximation neous gradient

to an instanta-

descent for minimization

of the cost

function j#a

Hence p* is a steady

0 < pi

the relevant

(7)

< 1 for at least one n.

M

i=l

i=l

In this case,

term in (5) is

M

M

is not of the form of N ones and N - M zeroes. Thus

-N)”

where

state.

any p* for which CyZ1 pj = N that

Now consider

= (E{N(k)}

J(lc)

j=l

i=l

(8) is the expected typical

number

gradient

of attendees

strategy

at time k. The

is to update

the state us-

ing

j=l

j=l

j#n

pi(k + I) = Pi(k) - P(#.

=(l-N+N-p;)p;=(l-p;)p:. This state.

With

Hence the only steady states of algorithm

are at p* consisting In particular,

at pj

=

is not a steady

In contrast, out for the

Nash

a similar

“full information”

analysis

algorithm.

Taking

states

= E(pi(k)}

-PE{Fsj(k)

occur when E{Cg,

dpi(k)

-N}.

zj(k)

which

-N}

= 0,

N.

j=l

j=l

any

p’

of this

with

j=l

xy=ip;

algorithm.

mixed strategy

equilibria

state

ual probabilities a higher

these

are not

of the El Far01 game un-

less pi = .6 for every agent. the steady

is an instantaneous of J(k).

gradient

In the

limited

The expected

return in

state.

is lower for agents whose individare lower than average as they face

actually

observed

and vice versa for those whose individual ties are higher than average.

probabili-

Consequently,

a sensi-

information Similarly,

the

(N(k) - N),

assuming

analogy,

To further

understand

system

we relate

viduals

to a global

the global behavior

the algorithms cost function.

utilized The

of the

by indialgorithm

(LMS)

in the context adaptive

J(k)

these algorithms

Mean Square

oc-

regardless

Adding the a priori lim(2) and (3).

= N in a steady

the limited information equilibria

based

on the

althe

with full

costs will be $-Var[N

algorithm

equilibrium

their parame-

update

(k)].

sign of

4, can be derived from the absolute

value cost function

adjust

.(IO)

costs will be 0, whereas

the expected

rule might converge to a pure strategy agents

this

to a pure strategy

ble learning

ters slowly overtime.

case

E{N(k)}

However, because converges

that the bar will be crowded,

-N).

every iteration

attendance.

algorithms

to the

= 1, in the full information

occurs

gorithm

probability

P (N(k)

its on pi(k) then gives the algorithms For both

value gives

this into (9) gives

information

curs only when xi(k) of the agent’s

-N.

approximation

Substituting

case this update

N is a steady

=

Note that

= I, and hence

N(k) -N

M

pi(k + I) = pi(k) -

E{erj(k)] =gE{zj(k)] =cpj(k) = state

is w

by its instantaneous

-dJ(k)

i e., whenever

Hence

E{N(k)}

Replacing

of both sides of (3) gives

+ I)}

d;$L;)}. *

3 % =E(N(k)}

carried

j=l

Steady

From (8), the derivative

-N)

1, b x

(for g =

state of this algorithm.

consider

the expectation

E(G(~

mixed strategy

.6 for all j

as in (7),

dJO =(E{N(k)} dpi(k)

(2)

of N ones and M - N zeroes.

the symmetric

equilibrium

-0.98)

J(lc)

be zero and hence p’ is not a steady

cannot

(9)

2

LMS algorithm.

are variants

algorithms

of linear

filtering;

= IE{N(k)}

system

-

nil.

By

of the Least

which are common identification

(4) is an analog

and

of the signed

The

adaptive

mechanism ized decision interests

thus

provides

a large collection

makers,

a simple

of decentral-

each acting in their own best

and with only limited knowledge,

a complex lem.

solution

whereby

congestion

Moreover,

atively

rapid

can solve

and social coordination

convergence

(depending

prob-

to the solution

on the initial

is rel-

conditions)

and robust.

References Arnott,

R.;

De Palma

Information Facilities

and Usage of Free-access with Stochastic

International Arnott, Does

Capacity Review

Economic

R.;

De Palma

Providing

Traffic

A.; and Lindsey,

R. 1996.

Congestible and Demand.

37(1):181-203.

A.; and Lindsey,

Information

Transportation

Congestion?

R. 1991.

to Drivers

Reduce

Research

A

25A(5):309-318. Arthur,

W.

B.

1994.

Bounded

Rationality:

American

Economic

Inductive

Reasoning

EZ Far01

The Review:

Papers

and

Problem.

and Proceed-

ings 1994 84(May):406-411. Glance,

N

S.;

Dynamics

and Huberman,

of Social

B. A. 1994. Scientific

Dilemmas.

The

Ameri-

can (March):76-83. Greenwald, The

A.; Mishra,

Santa

Fe Bar

cal and Practical

B.;

and Parikh,

Problem

Revisited:

Implications.

R. 1998. Theoreti-

Technical

Report,

New York University. Huberman,

B. A.;

cial Dilemmas 277(July

and Lukose,

and Internet

R. M. 1997.

Congestion.

So-

Science

25):535-537.

Mahmassani, System

H. S.;

and Jayakrishnan,

Performance

Real-time ridor ,”

and User

Information Transportation

in a Congested Research

R.

Response A

1991. Under

Traffic Cor25A(5):293-

307. Sethares,

W. A.; and Bell, A. M. 1998. An Adap-

tive Solution

of the

to the El Far01 Problem.

36th Annual

munication, Sept. 1998.

Control,

Allerton

Proceedings

Conference

and Computing,

on Com-

Allerton

IL,