
Optimal and Near Optimal Web Proxy Placement Algorithms for Networks with Planar Graph Topologies

Abd-Elhamid M. Taha
Dept. of Electrical and Computer Eng., Queen's University, Kingston, ON K7L 3N6, Canada

Ahmed E. Kamal
Dept. of Electrical and Computer Eng., Iowa State University, Ames, IA 50011-3060, U.S.A.

Abstract

This paper investigates the placement of a finite set of web cache proxies in a computer network with a planar graph topology, based on a dynamic programming approach. Three criteria are defined for the web proxy placement: 1) the min-sum criterion with consolidated request forwarding (MS-CRF); 2) the min-sum criterion with immediate request forwarding (MS-IRF); and 3) the min-max criterion with immediate request forwarding (MM-IRF). Optimal placement is obtained under MS-CRF. Under MS-IRF and MM-IRF, a near optimal placement is achieved, which can be improved through a refinement strategy. The paper deals with the cylindrical mesh topology as a general topology on which any planar graph can be embedded.

1. Introduction

Recently, the cache location problem has been receiving increasing interest as a means of alleviating degrading Internet performance. In [3], it was shown that placing the caches at core nodes provides a more significant improvement in network performance than placing them at the networks' ends. Binary Linear Programming (BLP) has been used to solve problems of this type, e.g., [4]. However, BLP requires a routing algorithm to be run a priori, and its solution techniques usually require exponential time. Meanwhile, algorithms based on dynamic programming (DP) have polynomial time complexity and exhibit more flexibility towards changes in link or node costs. DP was used in [6] for tree topology graphs. It was also employed in [5] on tree, linear and ring networks according to the p-median (min-sum) criterion. As an example of approximations, [2] presents heuristics for the p-center (min-max) problem. The tree topology was considered in [6, 5] because routing algorithms used in internets construct the shortest routes in the form of spanning trees. While this approach is optimal for proxy allocation given the spanning tree, it may not be optimal if the two problems are treated jointly.

Based on a DP approach, this paper investigates the optimal placement of a finite set of web proxies in a computer network according to three different criteria:

1. Min-sum with consolidated request forwarding (MS-CRF): Minimizing the total communication cost, where the cost of information reachability from a node is taken as the cost of the direct link to the neighboring node(s).
2. Min-sum with immediate request forwarding (MS-IRF): Minimizing the total search cost, where the information reachability cost from a node is the total cost from the node to the proxy/server.
3. Min-max with immediate request forwarding (MM-IRF): Minimizing the maximum information reachability cost, where the cost of information reachability is the same as in MS-IRF.

A general topology of a two-dimensional unidirectional grid is considered, such that any planar graph can be mapped onto the grid in linear time [1]. The joint routing/proxy allocation problem is treated as a single problem. The routes in this case may not be the shortest, but will produce the optimal objective function. An optimal allocation strategy is developed under the MS-CRF criterion, while near optimal allocation strategies, improved through a proposed refinement, are developed for the MS-IRF and MM-IRF criteria.

The rest of the paper is organized as follows. Section 2 defines the problem and presents the DP definitions. Section 3 presents the algorithms, while Section 4 discusses a numerical example. Finally, conclusions are given in Section 5.

Proceedings of the 23 rd International Conference on Distributed Computing Systems Workshops (ICDCSW’03) 0-7695-1921-0/03 $17.00 © 2003 IEEE

2. Problem formulation

2.1. The Network and the Problem

The web proxy placement is studied on a simple, yet very versatile network, namely, the two-dimensional grid shown in Figure 1. The grid is unidirectional: all the edges in a row (column) have the same direction as the edges in all other rows (columns). The grid can be represented by the graph G = (V, E), where V is the set of graph vertices and E is the set of edges. In addition, we define D as the set of weights associated with the edges, and W as the set of weights associated with the vertices. Vertices correspond to nodes, while edges correspond to links. w(v_i) ∈ W is the weight associated with node v_i, which may represent, for example, the traffic generated by the node. d(v_i, v_j) ∈ D is the weight associated with link (v_i, v_j), which is used to represent link cost, hop count, etc. Nodes are numbered 0 through N − 1, where N is the number of nodes, i.e., |V| = N. The Width of the grid is the number of columns and the Height is the number of rows, such that N = Width × Height.

The problem at hand can then be stated as follows: Given the two-dimensional grid represented by G, and given a total of P proxies, find a way to distribute the proxies among the |V| vertices in the graph, such that one of the proxies is at node 0, the root, and a certain objective function, φ, is minimized.

Figure 1. A two-dimensional grid. (Figure not reproduced: a grid with columns 0 to 3 and rows 0 to 3, with nodes numbered 0 through 15.)

2.2. Dynamic Programming Definitions

A proxy is always placed at node 0, the root. The optimal distribution of the remaining MaxAvail = P − 1 proxies is found using our criteria. The algorithm for each criterion proceeds row by row, traversing the nodes in a row from left to right. Each stage of the DP corresponds to a new node being considered, according to the above order.

With each node, and given the number of available proxies after processing the previous node, avail, the DP may or may not allocate a proxy at the node. Allocating a proxy decrements the value of avail, if non-zero. The binary variable choose indicates whether a proxy has been allocated at the node (1) or not (0). Therefore, there are 2 · MaxAvail + 1 possible allocations of proxies to a node (see footnote 1).

Footnote 1: When avail = 0, we only have one choice for the number of proxies at the node, namely, 0.

MS-CRF: After processing i nodes, given that the number of proxies allocated at the ith node is choose and that the number of available proxies was avail, the objective function is given by:

\[
\phi_{\text{MS-CRF}}(i, \mathit{avail}, \mathit{choose}) = \min_{s \in \{0,1\},\; k \in \{\text{neighboring nodes}\}} \left\{ \phi_{\text{MS-CRF}}(i-1, \mathit{avail}+s, s) + w(i) \cdot \mathrm{cost}_{\text{MS-CRF}}(i, \mathit{avail}, \mathit{choose}, k) \right\} \tag{1}
\]

where the cost of communication, per traffic unit, is given by

\[
\mathrm{cost}_{\text{MS-CRF}}(i, \mathit{avail}, \mathit{choose}, j) =
\begin{cases}
d(i, j) & \mathit{choose} = 0 \\
0 & \mathit{choose} = 1
\end{cases} \tag{2}
\]

MS-IRF: The form of the objective function is the same as that for the MS-CRF case, and is repeated here for completeness:

\[
\phi_{\text{MS-IRF}}(i, \mathit{avail}, \mathit{choose}) = \min_{s \in \{0,1\},\; k \in \{\text{neighboring nodes}\}} \left\{ \phi_{\text{MS-IRF}}(i-1, \mathit{avail}+s, s) + w(i) \cdot \mathrm{cost}_{\text{MS-IRF}}(i, \mathit{avail}, \mathit{choose}, k) \right\} \tag{3}
\]

However, the cost of communication, per traffic unit, is given by

\[
\mathrm{cost}_{\text{MS-IRF}}(i, \mathit{avail}, \mathit{choose}, j) =
\begin{cases}
d(i, j) + \min_{l,k} \left\{ \mathrm{cost}_{\text{MS-IRF}}(j, \mathit{avail}+l, l, k) \right\} & \mathit{choose} = 0 \\
0 & \mathit{choose} = 1
\end{cases} \tag{4}
\]

MM-IRF: With MM-IRF, the objective function at node i is calculated as

\[
\phi_{\text{MM-IRF}}(i, \mathit{avail}, \mathit{choose}) = \min_{s \in \{0,1\},\; k \in \{\text{neighboring nodes}\}} \left\{ \phi_{\text{MM-IRF}}(i-1, \mathit{avail}+s, s),\; w(i) \cdot \mathrm{cost}_{\text{MM-IRF}}(i, \mathit{avail}, \mathit{choose}, k) \right\} \tag{5}
\]


and, similar to MS-IRF, the cost of communication, per traffic unit, is given by

\[
\mathrm{cost}_{\text{MM-IRF}}(i, \mathit{avail}, \mathit{choose}, j) =
\begin{cases}
d(i, j) + \min_{l,k} \left\{ \mathrm{cost}_{\text{MM-IRF}}(j, \mathit{avail}+l, l, k) \right\} & \mathit{choose} = 0 \\
0 & \mathit{choose} = 1
\end{cases} \tag{6}
\]

The number of available proxies, when inspecting a node i, can result from instances (avail, 0) or (avail + 1, 1) at node i − 1. Referring to these two instances at node i − 1 as states 0 and 1, respectively, the value of the state is stored in the variable comeFrom[i][avail][choose] in order to keep track of how avail was reached at node i.

2.3. Backtracking, Tie Breaking and Revision

The communication cost at any node i ≥ Width depends on both the previous stage and the node in the same column, but in the previous row. Thus, one has four options for calculating the minimal cost: two for horizontal cost considerations, one for each originating state, and likewise two for vertical cost considerations. To cater for the vertical choice, a backtracking operation to the node above is employed, during which the variable comeFrom defined above is used. As the outcome of two options may result in a tie, three schemes were considered for tie-breaking, namely, choosing the option corresponding to state 0, choosing the option corresponding to state 1, or breaking the tie randomly. Note that since decisions are made on a stage-by-stage basis, backtracked paths could overlap. This could result in a minor sub-optimality under criteria MS-IRF and MM-IRF (see footnote 2). Finally, when initially considering the first node in a row, the last node in that row is not considered, since the latter's instances are yet to be calculated. Therefore, the row is repeatedly re-evaluated until a stable value is reached.

Footnote 2: In the cases in which the planar graph does not allow the overlapping of the backtracked paths, the solutions are guaranteed to be optimal.

3. The Algorithms

3.1. The First Criterion: MS-CRF

Here, we apply equation (1) when the cost of communication, per traffic unit, is given by equation (2):

    For node = 1:N-1, step 1
        For avail = 0:MaxAvail, step 1
            For choose = 0:min(1, avail), step 1
                evaluate φ_MS-CRF(node, avail, choose) using equations (1) and (2).
                update cost and comeFrom variables.
            End For
        End For
    End For

The above algorithm has a computational complexity that is O(NP).

3.2. The Second (MS-IRF) and Third (MM-IRF) Criteria

The objective function here is similar to the first one and, hence, the algorithm is also similar. However, the cost function is different, and row revisions must be performed:

    For row = 0:Height-1, step 1
        revNo = 0
        revDiff = ∞
        While (revDiff > 0)
            For column = 0:Width-1, step 1
                node = row*Width + column
                For avail = 0:MaxAvail, step 1
                    For choose = 0:min(1, avail), step 1
                        If (revNo = 0 and column = 0) use d(node, j) = ∞, where j is the node to the left.
                        evaluate φ_MS-IRF(node, avail, choose) using equations (3) and (4).
                        update cost and comeFrom variables.
                    End For
                End For
            End For
            If (revNo != 0) revDiff = max over nodes in row (difference in cost between this and previous iteration)
            revNo = revNo + 1
        End While
    End For

This algorithm is more expensive than that of the first criterion: it has a computational complexity that is O(N^2 P^2). For MM-IRF, the algorithm is identical to that employed in MS-IRF, except that the maximization employs equations (5) and (6). It also has the same computational complexity.

3.3. A Refinement Algorithm

A refinement could be made to the second and third algorithms in order to enhance the solutions. Basically, the main algorithm is run twice, once with each tie-breaking scheme. Then, the mesh is tried with the better of the two schemes while toggling the decision scheme at each node individually. When the decision at a certain node results in a better solution, the new decision is maintained. The procedure is repeated until the last node is reached. This increases the computational complexity of the algorithm to O(N^3 P^2).
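As an illustration of the first criterion (Section 3.1), the following is a minimal Python sketch of an O(NP) dynamic program in the spirit of equations (1) and (2). It is a simplified reading, not the authors' implementation: it assumes that a node without a proxy pays w(i) times its cheapest outgoing direct link, it collapses the (avail, choose) pair into a single count of proxies still available, and it breaks ties in favor of not placing a proxy. The function name `ms_crf_placement` and the dictionary `d` of directed link costs are hypothetical.

```python
from math import inf


def ms_crf_placement(w, d, max_avail):
    """Sketch of an MS-CRF style DP (assumed simplification, see lead-in).

    w         -- list of node weights, w[0] belongs to the root
    d         -- dict {(i, j): cost} of directed links i -> j (hypothetical input)
    max_avail -- number of proxies to place besides the root (P - 1)
    Returns (total cost, sorted list of proxy nodes including node 0).
    """
    n = len(w)
    # Cheapest direct link leaving each node; assumes every non-root node has one.
    cheapest = [inf] * n
    for (i, _j), cost in d.items():
        cheapest[i] = min(cheapest[i], cost)

    # best[a]: minimal cost of the nodes processed so far with `a` proxies left.
    best = [inf] * (max_avail + 1)
    best[max_avail] = 0.0  # the root holds a proxy and costs nothing
    come_from = []         # come_from[i - 1][a] is 1 if node i received a proxy

    for i in range(1, n):
        new_best = [inf] * (max_avail + 1)
        choice = [0] * (max_avail + 1)
        for a in range(max_avail + 1):
            skip = best[a] + w[i] * cheapest[i]            # choose = 0
            place = best[a + 1] if a < max_avail else inf  # choose = 1
            if place < skip:
                new_best[a], choice[a] = place, 1
            else:
                new_best[a], choice[a] = skip, 0           # ties: no proxy
        best = new_best
        come_from.append(choice)

    # Pick the best final state, then backtrack through the stored choices.
    a = min(range(max_avail + 1), key=lambda k: best[k])
    total = best[a]
    proxies = [0]
    for i in range(n - 1, 0, -1):
        if come_from[i - 1][a]:
            proxies.append(i)
            a += 1
    return total, sorted(proxies)
```

For example, with weights `[1.0, 0.5, 0.2, 0.9]`, links `{(1, 0): 4, (2, 1): 10, (2, 0): 12, (3, 2): 3, (3, 1): 5}` and one proxy besides the root, the sketch places the extra proxy at node 3, the node whose weighted cheapest link is the most expensive to keep.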


4. Numerical Examples

To test our algorithms, we mapped the network in [6] onto a rectilinear graph, and obtained the same results. Figure 2 shows a 4 × 5 randomly generated grid, with the nodes' weights uniformly distributed between 0.1 and 1, while the links' weights are uniformly distributed between 1 and 10. Five proxies are to be allocated to this grid, including the root. We applied the three criteria, including the refined versions of algorithms 2 and 3. The results are shown in Table 1. The optimal solution was obtained by complete enumeration. We have also applied the shortest path algorithm in order to obtain the spanning tree with the root (node 0) as the source, and then applied our algorithms to that tree, in order to show the effect of treating the routing and proxy location problems disjointly.

Figure 2. A sample two dimensional grid. (Figure not reproduced: a 4 × 5 grid whose nodes are labeled with weights w and whose links are labeled with link weights.)

Table 1. Results of the three criteria when applied to the network of Figure 2. The last three columns are the algorithm results.

Alg.   | Value         | Tree Network  | Optimal Solution | First fit     | Second fit    | With refinement
MS-CRF | Objective fn. | 27.7          | 22.7             | 22.7          | 22.7          | 22.7
MS-CRF | Proxy nodes   | 1, 13, 15, 16 | 1, 13, 15, 16    | 1, 13, 15, 16 | 1, 13, 15, 16 | 1, 13, 15, 16
MS-IRF | Objective fn. | 55.8          | 49.2             | 51.2          | 50.8          | 49.2
MS-IRF | Proxy nodes   | 1, 3, 6, 15   | 1, 10, 13, 16    | 1, 3, 13, 15  | 1, 6, 13, 15  | 1, 10, 13, 16
MM-IRF | Objective fn. | 10.8          | 6.0              | 6.0           | 6.0           | 6.0
MM-IRF | Proxy nodes   | 1, 13, 15, 19 | 1, 13, 15, 16    | 1, 13, 15, 16 | 1, 13, 15, 16 | 1, 13, 15, 16

For the MS-CRF criterion, the algorithm results in the optimal solution, as expected. The values for the MS-IRF criterion show how close the first and second fit are to the optimal (51.2 and 50.8, respectively, against an optimum of 49.2), as well as the improvement made by the refinement, which reaches the optimum in this example. Finally, the results of the first and second fit algorithms for MM-IRF, in addition to the refinement, are all optimal. The disjoint treatment of the routing and proxy assignment problems produced worse results under all three criteria. Notice that, for the MS-CRF criterion, although all solutions result in the same choice of proxies, the use of these proxies by the different nodes is not the same, which leads to the sub-optimality of the disjoint design approach (see footnote 3).

5. Conclusions

The web proxy placement problem in a two-dimensional grid was investigated. Three objectives were employed, namely, the min-sum problem where the link cost is the search cost, the min-sum problem where the search cost is the total cost of a node reaching its nearest proxy, and the min-max problem. Comparison to exhaustive enumeration showed that the algorithms were optimal for the MS-CRF criterion, and near optimal for the other two. A refinement was introduced to enhance the solutions of the latter two. The computational complexities of all algorithms are polynomial: O(NP) for MS-CRF, O(N^2 P^2) for MS-IRF and MM-IRF, and O(N^3 P^2) with the refinement.
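The complete-enumeration baseline mentioned in Section 4 is not specified in detail. The sketch below shows one way such a check could be written for the min-sum objective with immediate request forwarding, assuming that each node routes over its cheapest directed path to the nearest proxy. The function names, the dictionary `d` of directed link costs, and the use of a multi-source Dijkstra search are all assumptions made for illustration, and the enumeration is only practical for small grids.

```python
import heapq
from itertools import combinations
from math import inf


def nearest_proxy_cost(n, d, proxies):
    """Cheapest directed-path cost from every node to its nearest proxy.

    Runs a multi-source Dijkstra search over the reversed links, so that
    dist[i] is the cost of the cheapest path i -> ... -> some proxy.
    """
    rev = [[] for _ in range(n)]
    for (i, j), cost in d.items():
        rev[j].append((i, cost))  # walk link i -> j backwards
    dist = [inf] * n
    heap = []
    for p in proxies:
        dist[p] = 0.0
        heap.append((0.0, p))
    heapq.heapify(heap)
    while heap:
        du, u = heapq.heappop(heap)
        if du > dist[u]:
            continue  # stale queue entry
        for v, cost in rev[u]:
            if du + cost < dist[v]:
                dist[v] = du + cost
                heapq.heappush(heap, (dist[v], v))
    return dist


def enumerate_min_sum(w, d, p):
    """Try every set of p proxies that contains node 0; keep the cheapest.

    Returns (best cost, tuple of proxy nodes), or (inf, None) if no
    placement lets every node reach a proxy.
    """
    n = len(w)
    best_cost, best_set = inf, None
    for extra in combinations(range(1, n), p - 1):
        proxies = (0,) + extra
        dist = nearest_proxy_cost(n, d, proxies)
        cost = sum(w[i] * dist[i] for i in range(n))
        if cost < best_cost:
            best_cost, best_set = cost, proxies
    return best_cost, best_set
```

On a four-node chain 3 -> 2 -> 1 -> 0 with unit link costs and unit weights, `enumerate_min_sum([1, 1, 1, 1], {(3, 2): 1, (2, 1): 1, (1, 0): 1}, 2)` selects nodes 0 and 2, leaving nodes 1 and 3 one hop away from a proxy.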

References

[1] M. Chrobak and T. H. Payne, "A linear-time algorithm for drawing a planar graph on a grid", Info. Proc. Letters, Vol. 54, No. 4, pp. 241-246, 1995.
[2] E. Cronin et al., "Constrained Mirror Placement on the Internet", IEEE Journal on Selected Areas in Communications, Vol. 20, No. 7, pp. 1369-1382, 2002.
[3] P. B. Danzig et al., "A Case for Caching File Objects Inside Internetworks", ACM Sigcomm, pp. 239-243, Sept. 1993.
[4] A. E. Kamal and H. El-Rewini, "On the Optimal Selection of Proxy Agents in Mobile Network Backbones", Proc. of Intl. Conf. on Parallel Processing, 2001.
[5] P. Krishnan, D. Raz and Y. Shavitt, "The Cache Location Problem", IEEE/ACM Transactions on Networking, Vol. 8, No. 5, pp. 568-582, 2000.
[6] B. Li et al., "On the Optimal Placement of Web Proxies in the Internet", Proc. of INFOCOM'99, March 1999.

Footnote 3: Several other examples were conducted and they show similar trends. They are not shown due to space limitations.
