Load Balancing in Dynamic Networks - CiteSeerX

Load Balancing in Dynamic Networks

Robert Elsässer

Burkhard Monien

Stefan Schamberger

Fakultät für Elektrotechnik, Informatik und Mathematik Universität Paderborn Fürstenallee 11, D-33102 Paderborn email: {elsa,bm,schaum}@uni-paderborn.de

Abstract

Efficient load balancing algorithms are the key to many efficient parallel applications. Until now, research in this area has mainly focused on static networks. However, observations show that diffusive algorithms, originally designed for these networks, can also be applied in non-static scenarios. In this paper we prove that the general diffusion scheme can be deployed on dynamic networks and show that its convergence rate depends on the average value of the quotient of the second smallest eigenvalue and the maximum vertex degree of the networks occurring during the iterations. In the presented experiments we illustrate that even if communication links of static networks fail with high probability, load can still be balanced quite efficiently. By simulating diffusion on ad-hoc networks, we demonstrate that diffusive schemes provide a reliable and efficient load balancing strategy in mobile environments as well.

Keywords: Load balancing, diffusion schemes, dynamic networks, mobile ad-hoc networks



This work was partly supported by the German Science Foundation (DFG) project SFB-376 and by the IST Program of the EU under contract numbers IST-1999-14186 (ALCOM-FT) and IST-2001-33116 (FLAGS).


1 Introduction

Load balancing is a very important prerequisite for the efficient use of parallel computers. Many parallel applications produce work load dynamically, and its amount per processor often changes dramatically during run time. Therefore, to reduce the overall computation time, the total work load of the network has to be distributed evenly among all nodes while the computation proceeds. Obviously, we can ensure an overall benefit of the computation only if the balancing scheme itself is highly efficient.

Formally, the load balancing problem is defined as follows. Given a graph $G$ representing the network with $n$ nodes, where each node $i$ contains work load $w_i$, the goal is to move load across the edges so that finally the weight of each node is (approximately) equal to

$$\bar{w}_i := \frac{\sum_{j=1}^{n} w_j}{n}.$$

If the global imbalance vector $w - \bar{w}$ is known, it is possible to compute a balancing flow by solving a

linear system of equations [1]. However, assuming that processors of the parallel network may only access information of their direct neighbors, load information has to be exchanged locally in iterations until a balanced load situation is obtained.

In [2] and [3], the authors introduce the token distribution problem, which deals with unit-sized jobs called tokens. In this model, load cannot be split further, and in every iteration only a single token can be migrated over an edge. The results provide bounds for the convergence rate of simple load balancing algorithms using the expansion properties of the graphs. Authors of other papers [4, 5] only guarantee a linear gap between upper and lower bounds on the number of iterations needed to approximately balance the load.

In contrast to the token distribution model, our model allows load to be divided arbitrarily, and any amount can be shifted over the network's edges. There is plenty of work focusing on local iterative load balancing algorithms on static networks using this model [6, 7, 8]. Two subclasses of such algorithms are the diffusion schemes [6, 9] and the dimension exchange schemes [6, 10]. These two classes reflect different communication abilities of the network. Diffusion algorithms assume that a processor can send and receive messages to/from all of its neighbors simultaneously, while the dimension exchange approach is more restrictive and only allows a processor to communicate with one of its neighbors during each iteration.

We concentrate on the general diffusion scheme defined in [6]. Let $w_i^k$ denote the load after the $k$th iteration

step on node $i$ of the graph $G = (V, E)$. Then $w_i^k$ satisfies the equation

$$w_i^k = w_i^{k-1} - \sum_{\{i,j\} \in E} \alpha_{i,j} \left( w_i^{k-1} - w_j^{k-1} \right), \qquad (1)$$

where $\alpha_{i,j}$ is a properly chosen constant ensuring convergence, as described later. The flow moved over each edge depends on the load of its adjacent nodes. Thus, these schemes are performed locally: each step of the algorithm can be implemented with all nodes consulting only information of their neighbors, which avoids expensive global communication. As presented e.g. in [8], the convergence rate of this scheme can be described by the condition number of the unweighted Laplacian defined in the next section. There, it is also shown that known diffusion schemes can be generalized to edge-weighted graphs. For node-weighted networks this has been shown in [11], where load balancing on heterogeneous processor networks has been analyzed.

However, the mentioned papers only consider static, synchronous networks, meaning that the topology of the network is fixed during the iterations. In this paper, we deal with diffusion schemes in dynamic networks and show that the general diffusion scheme from [6] can be generalized to these scenarios. We consider dynamic networks in which the set of communication edges may vary at each time step. In any time step, a live edge is one that can transmit messages in each direction. We also assume that each node knows which of its incident edges are live. The dynamic network can be described by a sequence of undirected graphs $(G_k)_{k \ge 0}$, where $G_k = (V, E_k)$ represents the network with edge set $E_k$ occurring at time step $k \ge 0$.
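One iteration of equation 1 on a fixed graph can be sketched in a few lines of Python. This is only an illustration: the function name `diffusion_step`, the three-node path, and the uniform weight `alpha = 1/3` are assumptions made for the example and are not taken from the paper.

```python
def diffusion_step(load, edges, alpha):
    """Apply equation 1 once, using the same weight alpha on every edge."""
    new_load = list(load)
    for i, j in edges:
        # Flow over edge {i, j}; a negative value means load moves from j to i.
        flow = alpha * (load[i] - load[j])
        new_load[i] -= flow
        new_load[j] += flow
    return new_load


# Hypothetical example: path 0 - 1 - 2 with all load initially on node 0.
edges = [(0, 1), (1, 2)]
load = [9.0, 0.0, 0.0]
alpha = 1.0 / 3.0  # assumption for this sketch, not a value from the paper

for _ in range(50):
    load = diffusion_step(load, edges, alpha)

print(load)
```

Each node only reads the load of its direct neighbors, and every unit of flow subtracted at one endpoint is added at the other, so the total load is preserved in every step.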

In [2] and [3], the already mentioned token distribution problem is also studied for dynamic networks. Moreover, the authors analyze asynchronous networks by modeling them as dynamic networks. In [12], another model of asynchronous networks is studied. There, it is assumed that a load transfer can take some time to complete and that a message sent from a processor to its neighbor is received with a delay. It is shown that if the latency on the edges is bounded by some constant (a condition called partial asynchronism), then convergence can be guaranteed, but it is unknown how fast the system converges to a balanced state.

One of the most common approaches for load balancing is the 2-step model (e.g. [8]). The first step calculates a balancing flow. This flow is then used in the second step, in which load elements are migrated accordingly. One approach for computing a balancing flow consists of the general diffusion scheme defined

by equation 1 and denoted as the First Order Scheme in [8]. In [7] another scheme, called the Second Order Scheme, has been introduced, which reduces the number of iteration steps needed to compute a balancing flow by an almost quadratic factor. In [8] other schemes have been introduced, and the authors have shown that all these schemes compute an $l_2$-minimal flow. However, the 2-step model cannot be used for dynamic networks, since there is no guarantee that links used for the calculation will be present when needed for migration. Thus, in this scenario, calculating the flow and distributing the load must be performed simultaneously. This results in some overhead, since in early iterations more load can be sent than required, which is corrected later.

The results of this paper concentrate on the general diffusion scheme of equation 1. For other schemes (like the Second Order Scheme), we could not show convergence on dynamic networks.

The rest of the paper is organized as follows. In Section 2 we prove that the general diffusion scheme can be applied on dynamic networks, and we also determine its convergence rate. Section 3 contains the results of our experiments performed on different graphs in various dynamic scenarios. In the last section we draw some conclusions.

2 Theoretical Results

As mentioned in the preceding section, this paper deals with diffusion schemes in dynamic networks. We describe dynamic networks by a sequence of undirected graphs $(G_k = (V, E_k))_{k \ge 0}$, and every node $v_i \in V$ contains a load $w_i$ initially. The goal is to construct a local iterative load balancing algorithm which, in each iteration $k$, migrates load only via the edges in $E_k$. At the end of the computation, all nodes must contain an approximately equal amount of work load. We now generalize the diffusion scheme defined in equation 1 to dynamic networks.

The general diffusion scheme can be written in matrix notation as

$$w^k = M w^{k-1}, \quad \text{where } M = I - \tilde{L}, \qquad (2)$$

with $I$ being the identity matrix and $\tilde{L}$ the weighted Laplacian of $G$, which is defined as follows: $\tilde{L}$ contains $-\alpha_{i,j}$ at position $(i,j)$ and the weighted node degree $\sum_{j \in N(i)} \alpha_{i,j}$ as diagonal entry, where $N(i)$ represents the set of neighbors of $v_i$ (all other entries of $\tilde{L}$ are 0). Furthermore, for the unweighted Laplacian $L$ of $G$ it holds that $\alpha_{i,j} = 1$ iff $\{i,j\} \in E$, and $\alpha_{i,j} = 0$ for $i \ne j$ otherwise.
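The matrix form in equation 2 can be checked against the local update with a small pure-Python sketch. The helper names (`weighted_laplacian`, `diffusion_matrix`, `mat_vec`) and the edge weights of the triangle below are hypothetical and chosen only for illustration.

```python
def weighted_laplacian(n, alpha):
    """Build the weighted Laplacian; alpha maps an edge (i, j) to its weight."""
    L = [[0.0] * n for _ in range(n)]
    for (i, j), a in alpha.items():
        L[i][j] -= a  # off-diagonal entries hold -alpha_{i,j}
        L[j][i] -= a
        L[i][i] += a  # diagonal entries accumulate the weighted degree
        L[j][j] += a
    return L


def diffusion_matrix(L):
    """Return M = I - L."""
    n = len(L)
    return [[(1.0 if r == c else 0.0) - L[r][c] for c in range(n)]
            for r in range(n)]


def mat_vec(M, w):
    """Multiply matrix M with vector w."""
    return [sum(m * x for m, x in zip(row, w)) for row in M]


# Hypothetical triangle with non-uniform edge weights.
alpha = {(0, 1): 0.25, (1, 2): 0.25, (0, 2): 0.5}
M = diffusion_matrix(weighted_laplacian(3, alpha))
w0 = [6.0, 3.0, 0.0]
w1 = mat_vec(M, w0)
print(w1)
```

Every row of `M` sums to 1, and because `M` is symmetric the same holds for its columns, which is why the total load is unchanged by the multiplication.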

The parameters $\alpha_{i,j}$ control the amount of load sent over an edge $\{i,j\}$ and, in order to ensure convergence, they have to be chosen such that all eigenvalues of $M$ are in the range $(-1, 1]$. We call $e^k = w^k - \bar{w}$ the error after iteration $k$. As shown in [8], equation 2 implies that

$$\|e^k\|_2 \le \max\{|\mu_2|, |\mu_n|\} \cdot \|e^{k-1}\|_2,$$

where $\mu_2$ and $\mu_n$ are the second largest and the smallest eigenvalue of $M$, respectively. Furthermore, we call a system $\varepsilon$-balanced if $\|e^k\|_2 / \|e^0\|_2 \le \varepsilon$. As shown in [7], load in synchronous and static networks can be $\varepsilon$-balanced in

$$O\!\left( \ln(1/\varepsilon) \cdot \frac{d_{\max}}{\lambda_2} \right)$$

steps, where $\lambda_2$ represents the second smallest eigenvalue of the unweighted Laplacian $L$ of $G$ and $d_{\max}$ is the maximum vertex degree of $G$.

The diffusion scheme defined in equation 1 can be generalized to dynamic networks by using a different diffusion matrix $M_k$ corresponding to the graph $G_k$ in each iteration. Let $\tilde{L}_k$ be the weighted Laplacian of $G_k$, where edge $\{u,v\} \in E_k$ is assigned a weight of $\alpha_{u,v} = \frac{1}{c \max\{d_u^k, d_v^k\}}$, with $1 < c < 2$ being a constant and $d_v^k$ being $v$'s degree in iteration $k$. The diffusion matrix $M_k$ is set to $M_k = I - \tilde{L}_k$. In the next lemma, we show that the smallest eigenvalue $\mu_n^k$ of $M_k$ is bounded from below by the constant $\frac{c-2}{c}$ for any $k \ge 0$. Then, we prove that the convergence rate at time step $k$ is determined by the second smallest eigenvalue $\lambda_2^k$ of the corresponding unweighted Laplacian $L_k$, and that the convergence rate of the diffusion algorithm depends on the average value $\frac{1}{k} \sum_{i=1}^{k} \lambda_2^i / d_{\max}^i$, where $d_{\max}^i$ and $\lambda_2^i$ are the maximum vertex degree and the second smallest eigenvalue of the graph $G_i$ occurring in iteration $i$.

Lemma 1 Let $(G_k = (V, E_k))_{k \ge 0}$ be a sequence of graphs, and let $\tilde{L}_k$ be the corresponding weighted Laplacians. Then the smallest eigenvalue $\mu_n^k$ of $M_k = I - \tilde{L}_k$ fulfills the inequality $\mu_n^k \ge \frac{c-2}{c}$.

Proof Denote the entry $(i,j)$ of $\tilde{L}_k$ by $l_{ij}^k$. Gerschgorin's theorem (see e.g. [13]) implies that for the largest eigenvalue $\tilde{\lambda}_n^k$ of $\tilde{L}_k$,

$$\tilde{\lambda}_n^k \le \max_{i=1}^{n} \left\{ \sum_{j=1}^{n} |l_{ij}^k| \right\} \le \frac{2}{c} \max_{i=1}^{n} \left( \sum_{j \in N(i)} \frac{1}{\max\{d_i^k, d_j^k\}} \right) \le \frac{2}{c},$$

where $d_i^k$ represents the degree of vertex $i$ and $N(i)$ is the set of the neighbors of $i$. This yields

$$\mu_n^k \ge 1 - \frac{2}{c} = \frac{c-2}{c}.$$

□
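The edge weights $\alpha_{u,v} = 1/(c \max\{d_u^k, d_v^k\})$ and the row-sum bound used in the proof of Lemma 1 can be illustrated with the following sketch. Everything specific in it is an assumption made for the example: the helper names, the ring of 8 nodes, the constant `c = 1.5`, and the model in which each link is live with probability 0.7 in every step.

```python
import random


def dynamic_weights(n, edges, c):
    """alpha_{u,v} = 1 / (c * max(d_u, d_v)) for the live edges of one step."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return {(u, v): 1.0 / (c * max(deg[u], deg[v])) for u, v in edges}


def gerschgorin_bound(n, alpha):
    """Largest absolute row sum of the weighted Laplacian."""
    incident = [0.0] * n
    for (u, v), a in alpha.items():
        incident[u] += a
        incident[v] += a
    # Diagonal entry plus off-diagonal magnitudes: twice the incident weight.
    return 2.0 * max(incident)


def dynamic_step(load, alpha):
    """One diffusion step over the currently live, weighted edges."""
    new_load = list(load)
    for (u, v), a in alpha.items():
        flow = a * (load[u] - load[v])
        new_load[u] -= flow
        new_load[v] += flow
    return new_load


rng = random.Random(0)  # fixed seed so the run is reproducible
n, c = 8, 1.5           # hypothetical network size and constant with 1 < c < 2
ring = [(i, (i + 1) % n) for i in range(n)]
load = [float(n)] + [0.0] * (n - 1)
worst_bound = 0.0

for k in range(200):
    live = [e for e in ring if rng.random() < 0.7]  # assumed link-failure model
    alpha = dynamic_weights(n, live, c)
    worst_bound = max(worst_bound, gerschgorin_bound(n, alpha))
    load = dynamic_step(load, alpha)

print(worst_bound, 2.0 / c)
print(load)
```

In this sketch the largest row sum observed over all time steps stays at or below $2/c$ regardless of which links fail, matching the bound on $\tilde{\lambda}_n^k$, and the total load is preserved in every step.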

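Next, we show that the second smallest eigenvalue $\tilde{\lambda}_2^k$ of $\tilde{L}_k$ is bounded from below by $\frac{\lambda_2^k}{c\, d_{\max}^k}$, where $\lambda_2^k$ is the second smallest eigenvalue of the corresponding unweighted Laplacian $L_k$ and $d_{\max}^k$ is the largest vertex degree in $G_k$. Since $M_k = I - \tilde{L}_k$, this is equivalent to an upper bound on the second largest eigenvalue $\mu_2^k$ of $M_k$.

Lemma 2 Let $(G_k = (V, E_k))_{k \ge 0}$ be a sequence of graphs, and let $\tilde{L}_k$ be the corresponding weighted Laplacians. For each $k$ let $\lambda_2^k$ be the second smallest eigenvalue of the unweighted Laplacian $L_k$ of $G_k$. Then the second largest eigenvalue $\mu_2^k$ of $M_k$ fulfills the inequality

$$\mu_2^k \le 1 - \frac{\lambda_2^k}{c\, d_{\max}^k},$$

where $d_{\max}^k$ is the largest vertex degree in $G_k$.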

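Proof First, we show that if we increase the weight of an edge in a graph $G$, then the second smallest eigenvalue $\tilde{\lambda}_2$ of the weighted Laplacian $\tilde{L}$ of $G$ cannot decrease. Since $\mathbf{1} = (1, \dots, 1)$ is an eigenvector of $\tilde{L}$ with eigenvalue 0, it holds that

$$\tilde{\lambda}_2 = \min_{x \perp \mathbf{1}} \frac{\sum_{\{i,j\} \in E} \alpha_{i,j} (x_i - x_j)^2}{\sum_{i \in V} x_i^2},$$

where $x \in \mathbb{R}^n$ and $x_i$ denotes the $i$th entry of the vector $x$. Let $G'$ be the graph obtained from $G$ by increasing the weight of $\{u,v\} \in E$ by $\alpha'_{u,v}$. Then for the second smallest eigenvalue $\tilde{\lambda}_2'$ of the corresponding Laplacian $\tilde{L}'$ we have

$$\tilde{\lambda}_2' = \min_{x \perp \mathbf{1}} \frac{\sum_{\{i,j\} \in E} \alpha_{i,j} (x_i - x_j)^2 + \alpha'_{u,v} (x_u - x_v)^2}{\sum_{i \in V} x_i^2}.$$

Assume that

$$\tilde{\lambda}_2' = \frac{\sum_{\{i,j\} \in E} \alpha_{i,j} (y_i - y_j)^2 + \alpha'_{u,v} (y_u - y_v)^2}{\sum_{i \in V} y_i^2}$$

for a $y \perp \mathbf{1}$. It follows that

$$\tilde{\lambda}_2' \ge \frac{\sum_{\{i,j\} \in E} \alpha_{i,j} (y_i - y_j)^2}{\sum_{i \in V} y_i^2}$$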