A Hybrid Estimation of Distribution Algorithm for ... - Semantic Scholar

A Hybrid Estimation of Distribution Algorithm for CDMA Cellular System Design Jianyong Sun1 , Qingfu Zhang2 , Jin Li1 , and Xin Yao1 1

Cercia, School of Computer Science, University of Birmingham, Edgbaston, Birmingham, B15 2TT {j.sun,j.li,x.yao}@cs.bham.ac.uk 2 Department of Computer Science, University of Essex, Wivenhoe Park, CO4 3SQ [email protected]

Abstract: While code division multiple access (CDMA) is becoming a promising cellular communication system, the design for a CDMA cellular system configuration has posed a practical challenge in optimisation. The study in this paper proposes a hybrid estimation of distribution algorithm (HyEDA) to optimize the design of a cellular system configuration. HyEDA is a two-stage hybrid approach built on estimation of distribution algorithms (EDAs), coupled with a K-means clustering method and a simple local search algorithm. Compared with the simulated annealing method on some test instances, HyEDA has demonstrated its superiority in terms of both the overall performance in optimisation and the number of fitness evaluations required. Keywords: CDMA cellular system configuration design, hybrid estimation of distribution algorithm

1

Introduction

In the last decade, code division multiple access (CDMA) has become a promising technology for the mobile communication [7]. The rapidly increasing demands for cellular communication raise an important optimisation challenge for cellular service providers. In the design of a cellular communication network, the quality of service, the service coverage and the network cost are three most concerned objectives among many others. The three objectives are largely influenced by certain design parameters, such as the number of based stations (BSs), the locations of BSs, as well as the powers and antennae heights adopted in BSs. The cellular system configuration problem in this study is taken from [12]: Given a static, spatial distribution of customer demands, find a CDMA cellular system configuration, i.e., the locations of BSs, their corresponding transmitting powers and antennae heights, with the aim of optimizing call quality, service coverage and total cost of system configuration. To simplify this configuration design problem, Melachrinoudis and Rosyidi [12] have transformed this multiobjective optimisation problem into a single objective optimisation problem by aggregating three objectives with weights. In this way, the simulated annealing (SA) method was adopted by [12] to search for optimal solutions. In this paper, we shall present a hybrid estimation of distribution algorithm (HyEDA). We apply HyEDA to tackle the problem using the transformed single-

objective function given in [12]. HyEDA is a two-stage hybrid evolutionary approach, built on estimation of distribution algorithm [5]. It is incorporated with the well-known K-means clustering and a simple local search algorithm. The first stage aims to find optimal or near-optimal locations of BSs using the hybrid estimation of distribution algorithm (EDA). Given the locations of BSs found in the first stage, the second stage is intended to find optimal or nearoptimal corresponding power and antennae height for each BS using a similar simple local search method. Taking this two-stage optimisation is motivated by the intuition that the locations of BSs plays a vital role in achieving good overall performance of a cellular service network in terms of three objectives discussed aforementioned. The effectiveness of this two-stage approach has been further illustrated by our experimental results in this study, in comparison to SA. The rest of the paper is organized as follows. Section 2 describes the mathematical model of the cellular system configuration design problem proposed by Melachrinoudis and Rosyidi [12]. In Section 3, we shall discuss HyEDA in more detail. Experimental results of HyEDA on several test problems, in comparison to that of the simulated annealing method, are given in Section 4. Section 5 concludes the paper.

2

Mathematical Model

In this section, we would like to describe, in details, some notations for the design problem of a CDMA cellular system configuration, as well as, in brief, the final transformed single-objective mathematical model. The concepts discussed below are generally borrowed from the work done by Melachrinoudis and Rosyidi [12], simply because their formulation is more close to the reality of the problem, thereby suited to be taken for the application. Given a service area, A, which is a two dimensional geographical region, we could discretise A into a lattice of grid points, each of which, e.g., g(j, k), is identified by their discrete coordinates (j, k), where 1 ≤ j ≤ M and 1 ≤ k ≤ K. M and K are the maximum row and column of the lattice respectively, which are set by service engineers. Grid points are used to denote the locations of potential BSs. A cell i is defined as a set of grid points covered by BSi . Suppose that for a cellular system, n BSs and the BSs’ corresponding powers and antenna heights are going to be configured. The values of power and antenna height can only take discrete values from P = [pmin , pmax ] and H = [hmin , hmax ] with cardinalities |P | and |H|, respective. The overall performance of the system configuration is measured by three objectives, i.e., call quality, service coverage and the cost of system. The call quality can be measured by the bit error rate (BER) at MSs in the process of demodulation. The smaller the BER, the clearer the voice, and the higher the call quality. The BER value of a MS depends on its location within a cell. We denote the maximum (worst) BER value within a cell as a measure of that cell’s call quality, e.g., max(j,k)∈Si BERjk for cell i. For the whole area covered with all cells, optimizing call quality is transformed into minimizing the

maximum BER among all cells. In practice, the deviation value d+ 1i from t1 (a threshold for the BER) is defined to reflect the call quality of Cell i as follows: max(j,k)∈Si BERjk − t1 , if max(j,k)∈Si BERjk ≥ t1 ; + d1i = (1) 0, otherwise. Similarly, for the measurement of the service coverage for Cell i, the deviation d+ 2i of uncovered service (UCS) from a threshold, t2 is defined as follows. U CSi − t2 , if U CSi ≥ t2 ; + d2i = (2) 0, otherwise. The goal regarding service coverage is to minimize the max d+ 2i . i

The total cost of the system configuration, denoted by TCsystem, includes the cost of the base stations construction, their costs of powers and antennae heights. A normalized figure, TC = TCsystem/TCsystem max, is used, where TCsystem max is the maximum cost of of the whole system, which is defined as the total cost of n BSs, built at the most expensive locations using the largest possible size, the highest possible antenna. Note that the Tchebycheff distance among grid points is used to calculate the evaluation factors for the system configuration [12]. Eventually, the goal of the cellular system configuration design problem is to minimize the following objective function Z: + Z = w1 max d+ 1i + w2 max d2i + w3 T C. i

i

(3)

with the locations of BSs, their corresponding powers and antenna heights as decision variables. Readers please refer to [12] for details of the considered problem and its formulation.

3 3.1

HyEDA Algorithm Framework

HyEDA is built mainly on the hybrid estimation of distribution algorithms (EDA) [1][2][9][5]. EDA is a type of evolutionary algorithm (EAs). EAs, such as genetic algorithms (GAs), generate new offspring by using genetic operators like crossover and mutation. Different from GAs, the main characteristic of EDA is that new offspring is generated by sampling from a probabilistic distribution model. EDA extracts global statistical information from visited promising solutions to build the probabilistic distribution model. However, EDA alone cannot efficiently search for optimal or near-optimal solutions of difficult optimisation problems [5]. To improve the efficiency of EDAs, hill climibing algorithms and other techniques can be incorporated [8]. In our previous work, we have hybridized EDA with hill climbing algorithms and other techniques to solve some N P-hard optimisation problems, such as maximum clique problem [17], and the

routing and wavelength assignment problem under shared-risk-link-group constraints [16] arisen in the telecommuncation area, and continuous optimisation problems [15][6]. The general template of the hybrid estimation of distribution algorithm is summarized as follows. Initialization. Initial the probability model p0 (x), sample popsize feasible solutions from it, apply local search algorithm to each solution. The resultant solutions consists of the initial population P (0); Set t := 0; While (stop criteria are not met), do : 1) Selection Select selsize solutions as the parent set Q(t); 2) Modeling Construct a probabilistic model pt (x) according to Q(t); 3) Sampling Sample cresize offspring from pt (x). Apply local search algorithm to each sampled solution; 4) Replacement Replace partially the current population with the offspring to constitute a new population P (t + 1); set t := t + 1; The template consists of five major components, including the fitness definition, the selection, the probability construction, the sampling and the replacement. In the evolutionary process of the proposed first-stage algorithm for the cellular system design problem, the Z value of a solution is defined as its fitness. The truncation selection is adopted. At generation t, N solutions with the smallest Z values are selected to constitute the parent set Q(t) (Step 1). We use the Kmeans clustering to help the construction of the probability model pt (x) (Step 2). Details of Step 2 will be described in Section 3.3. The sampling operation creates new offspring using the previously created probability model (step 3). We take the “Accept-Reject” sampling method, which shall be described in Section 3.3. Notably, a local search algorithm is incorporated into the template, in order to tune each solution into local optimum after sampling both in population initialization and in offspring generation. We shall describe the local search algorithm in Section 3.4. To produce P (t + 1), the popsize solutions with the smallest Z values are selected from the union of the sampled solutions so far and the solutions in P (t). 3.2

Solution Representation and Probability Model Construction

A solution x of the cellular system configuration can be represented as a vector [(x1 , y1 , p1 , h1 ), ..., (xn , yn , pn , hn )], where (xi , yi ) represents the location of BS i, pi and hi are the power and antenna height of BS i, respectively. In the first stage of HyEDA, for each BS, its power and antenna height values are randomly picked from P and H. Those values are fixed over the whole process of optimisation. In the second stage, given its location, (xi , yi ) is found, we use a simple local search algorithm to optimize (pi , hi ) with respect to Function Z for each BS. The construction of the probability model resorts to the K-means clustering. In [4][11], the K-means clustering was used to help the construction of a Gaussian distribution model for continuous optimisation problem. Here, the K-means clustering method is used to cluster a set of grid points with discrete values.

The value of pij indicates the probability that a BS is located in (i, j). Probability model defines the way of assigning a probability value to each grid at each generation. At generation t, to construct the probability model from the parent K set Q(t), i.e., pt (x) (here, pt (x) = {pij }M i=1 j=1 ), we first cluster the totally N × n points distributed in the service area into n groups. Then we assign a probability pij to each grid point (i, j) based on the results of clustering. Suppose that the coordinates of the cluster centroid are {(x∗k , yk∗ ), 1 ≤ k ≤ n} after clustering, and the corresponding standard deviation of distances between points to the centroid of cluster is {σk , 1 ≤ k ≤ n}. Generally speaking, a larger σk means a bigger uncertainty to locate base station k at (x∗k , yk∗ ). Whereas a smaller σk means a smaller uncertainty, i.e., we have more confidence to set base station k at the centroid (x∗k , yk∗ ). Let dl denote the Tchebycheff distance (The Tchebycheff distance between two vectors a and b is defined as maxi |ai − bi |) between the grid point (i, j) and the centroid (x∗l , yl∗ ) for 1 ≤ l ≤ n, and k = arg min dl , then pij is assigned as follows l

pij =

+ N ( dkdk ; 0, σk2 ) for dk ≤ dkmax ; max , otherwise.

(4)

where is a positive value to guarantee that all grid point has a probability to be chosen; N is the normal probability density function (pdf); dk is the Tchebycheff distance between (i, j) to its nearest centroid (xk , yk ); and dkmax is the maximum Tchebycheff distance among the points of cluster k to the centroid. Based on Equation (4), it can be seen that the degree of variation in the values, pij , assigned to all grid points in a cluster k, would crucially depend on the size of the value of σk . If σk were smaller, the resultant probability values would vary to a greater extent. Otherwise, they are not much different. 3.3

Sampling Method

EDA employs the same sampling method for generating both an initial population and a number of offspring populations over the generations during evolution. For generating an initial population, probabilities of all M × K grid points are each set to be 1/(M ∗ K). Whereas, in generating a population of offspring, the pij for each grid point produced in Step 2 is taken. The sampling process for an individual solution is as follows. To locate the n grid points required for the solution, we select locations based on the probability model one by one. In each step, we employ the well-known roulette wheel selection method to select a location from all available grid points based on their probabilities. A grid point with a higher probability is more likely to be selected. Once the grid point is picked as a BS location, its neighborhoods are not considered as the candidates for the next potential BS location any more. We achieve this through setting the probabilities of its neighborhood to be zero. The neighborhood set of the grid point gi is defined as D(gi ) = {z : d(z, gi ) ≤ dmin }, where dmin is a user-defined parameter and d(z, gi ) is the Tchebycheff distance between z and gi .

3.4

Local Search

A local search algorithm has been used in the first stage of HyEDA, to tune each solution after initialization and offspring generation. In the second stage, HyEDA merely takes this local search method to find optimal or near-optimal values of (p, h) for each BS. The only difference of the local method between two stages lies in the definitions of neighborhood. A solution z = [(x1 , y1 ), · · · , (xn , yn )] = (g1 , · · · , gn ) represents the BS locations assigned in the first stage. For each location gi = (xi , yi ), its neighborhood N (gi ) is defined as the other 8 grid points around gi , i.e., N (gi ) = {u : d(u, gi ) = 1.}. Whereas, for each pi (,hi ), to be optimized in the second stage, its neighborhood N (pi ) (,N (hi )) is defined as all of other |P | − 1 (|H| − 1) possible values. The local search algorithm starts from a randomly initialized solution, for each component of the solution, the algorithm searches the component’s neighborhood for the “best” component value, which minimize the current objective function value. The best component value updates the current one to form a new solution. The search continues until no better solution can be found.

4

Experimental Results

In [12], a case study for a cellular system configuration was conducted for the service area of the city of Cambridge and its vicinity in Easter Massachusetts, and SA was applied to solve the problem. For that case, the service area is divided into 13 × 20 grid points, with n = 11 base stations supposed to be located for the whole area, and the weights for the objectives set to be wi = 1/3, 1 ≤ i ≤ 3. We denote this problem as problem 1. To our best knowledge from literature, we are not aware of other algorithms which have been used to address the cellular system configuration design problem, except for the simulated annealing (SA) approach given in [12]. Therefore, our experimental comparison is carried out only between the SA and HyEDA. To make the comparison between SA and HyEDA more rigourously, apart from the above problem 1, additional four problems, which are newly derived by us based on problem 1, have also been used to compare effectiveness of both algorithms. All 5 problems share the same settings of the service area and the number of BSs required to be located, but have a different scalable setting in resolutions of grid points, which determines how precisely 11 BSs can be located in the area. In problem 2 to 5, the same area are divided into 20 × 30, 26 × 40, 39 × 60, and 65 × 100 grid points respectively. As a result, finding an optimal locations turns to be even harder progressively from problem 1 to problem 5, simply because of the enlarged search space. In our experiment, the dmin values are set to 2 for the first two problems, 3 for problems 3 and 4, 4 for problem 5. The population size is set to 10 for problems 1 and 2, 20 for the rest problems. The algorithms will terminate if either the algorithm cannot find a better solution in successively 10 generations after 30 generations or a maximum 1,000,000 objective function evaluations are achieved.

Table 1. Comparisons of SA and HyEDA in terms of the mean best objective function values (Mean) and the number of function evaluations (Nfe). problem SA HyEDA instance Mean Nfe. Mean Nfe. 1 0.2414 157,893 0.2302 59,140 2 0.2440 451,120 0.2310 209,053 3 0.2491 618,278 0.2415 380,304 4 0.3167 1,000,000 0.2662 482,288 5 0.3618 1,000,000 0.3058 529,205

For each problem, we run both SA and HyEDA 10 times each. Listed in Table 1 are the average fitness objective value and the average number of fitness evaluations of 10 runs, produced by both methods on each problem. For each of 5 problems, the average fitness value produced by HyEDA is smaller than that achieved by SA, whilst the number of objective funcation evaluations required to achieve the results by HyEDA is less than that in SA. The results have demonstrated that HyEDA outperforms SA in terms of both the solution quality and the number of fitness evaluations that are required. In other words, HyEDA is more likely to find better solutions than SA using a smaller number of fitness evaluations in solving a cellular CDMA system configuration design problem.

5

Conclusion

In this paper, we proposed a two-stage hybrid estimation of distribution algorithm, HyEDA, for solving the CDMA cellular system configuration design problem. HyEDA resorts to the K-means clustering method for probability model construction, and a simple local search algorithm for solution quality improvement. HyEDA has been applied to tackle a problem given in [12], as well as some of its derived and more difficult cases. The experimental results have demonstrated that the proposed algorithm outperforms the simulated annealing approach used in [12], in terms of the quality of the solutions found and the quantity of function evaluations used. In the future, HyEDA could be improved in several ways. Firstly, we would like to carry out more experiments to thoroughly understand the effects of the components of HyEDA including the two-stage framework, the clustering method, and the Techebycheff distance metric. Secondly, we will try to improve HyEDA itself for its convergence and robustness for solving optimisation problems. Thirdly, we may enhance HyEDA by embedding multiobjective search engines [10], thereby enabling itself to handle the inherently multiobjective optimisation in the CDMA cellular system configuration design. Finally, we shall explore the principle of HyEDA further in solving similar other optimisation problems, such as the terminal assignment [13], the task assignment problem [14] and other optimisation problems raised in telecommunication area.

References 1. S. Baluja, Population-based Incremental Learning: A Method for Integrating Genetic Search Based Function Optimization and Competitive Learning, CMU-CS94-163, Carnegie Mellon University, Pittersburgh, PA, 1994. 2. H. M¨ uhlenbein and G. Paaß, From Recombination of Gens to the Estimation of Distribution I: Binary Parameters, in Lecture Notes in Computer Science 1411: Parallel Problem Solving from Nature - PPSN IV, pp. 178-187, 1996. 3. V.K. Garg, K. Smolik and J.E. Wilkes, Applications of CDMA in Wireless/Personal Communications, Prentice-Hall, Upper Saddle River, NJ, 1997. 4. M. Pelikan and D.E. Goldberg, Genetic Algorithms, Clustering, and the Breaking of Symmetry, Illinois Genetic Algorithm Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL, 2000. 5. J. Sun, Hybrid Estimation of Distribution Algorithms for Optimization Problems, PhD thesis, University of Essex, 2005. 6. J. Sun, Q. Zhang and E.P.K. Tsang, “DE/EDA: A New Evolutionary Algorithm for Global Optimization,” Information Sciences, 169(3), pp. 249–262, 2005. 7. K. I. Kim, “Handbook of CDMA System Design, Engineering and Optimisation,” Prentice Hall PTR, 1999. 8. N. Krasnogor, W.Hart and J. Smith (editors), Recent Advances in Memetic Algorithms and Related Search Technologies, to appear, Springer, 2003. 9. P. Larraanaga and J.A. Lozano, Estimation of Distribution Algorithms: A New Tool for Evolutionary Computation. Kluwer Academic Publishers, Norwell, MA, USA, 2001. 10. J. Li, S. Taiwo, Enhancing Financial Decision Making Using Multi-Objective Financial Genetic Programming, to appear in Proceeding of 2006 IEEE Congress on Evolutionary Computation (CEC’06), July 16-21, 2006 Sheraton Vancouver Wall Centre, Vancouver, BC, Canada , 2006 11. Q. Lu and X. Yao, “Clustering and Learning Gaussian Distribution for Continuous Optimisation,” IEEE Transcations on Systems, Man, and Cybernetics, Part C: 35(2):195-204, May 2005. 12. E. Melachrinoudis, B. Rosyidi, “Optimizing the Design of a CDMA Cellular System Configuration with Multiple Criteria,” Annals of Operations Research, 106, pp. 307-329, 2001. 13. S. Salcedo-Sanz, Y. Xu and X. Yao, “Hybrid Meta-Heuristic Algorithms for Task Assignment in Heterogeneous Computing Systems,” Computers and Operations Research, 33(3):820-835, 2006. 14. S. Salcedo-Sanz and X. Yao, “A Hybrid Hopfield Network – Genetic Algorithm Approach for the Terminal Assignment Problem, ” IEEE Transactions on System, Man and Cybernetics, Part B: Cybernetics, 34(6): 2343-2353, December, 2004. 15. Q. Zhang, J. Sun, E.P.K. Tsang, J.A. Ford, “Hybrid estimation of distribution algorithm for global optimisation,” Engineering Computations, 21(1), pp. 91-107, 2003. 16. Q. Zhang, J. Sun, G. Xiao and E. Tsang, “Evolutionary Algorithm Refining a Heuristic: A Hybrid Method for Shared Path Protection in WDM Networks under SRLG Constraints,” to be appeared in IEEE Transactions on System, Man and Cybernetics, 2006. 17. Q. Zhang, J. Sun and E.P.K. Tsang, “Evolutionary Algorithm with the Guided Mutation for the Maximum Clique Problem”, IEEE Transactions on Evolutionary Computation, 9(2), pp. 192-200, 2005.