Unconstrained and Constrained Blackbox ... - Semantic Scholar

8 downloads 0 Views 918KB Size Report
ward blackbox optimization (BBO)|optimization in presence of little domain ... In blackbox optimization, the objective function is often available as a black box, i.e. ...
Unconstrained and Constrained Blackbox Optimization: The SEARCH Perspective Hillol Kargupta, Vijay Hanagandi

David E. Goldberg

Computational Science Methods division Department of General Engineering Los Alamos National Laboratory University of Illinois at Urbana-Champaign Los Alamos, NM, USA. Urbana, IL, USA.

Abstract

The SEARCH (Search Envisioned As Relation & Class Hierarchizing) framework developed elsewhere (Kargupta, 1995a; Kargupta & Goldberg, 1995) o ered an alternate perspective toward blackbox optimization (BBO)|optimization in presence of little domain knowledge. The SEARCH framework investigates the conditions essential for transcending the limits of random enumerative search using a framework developed in terms of relations, classes and partial ordering. This paper rst reviews some of the main results of that work. A closed form bound on the sample complexity in terms of the cardinality of the relation space, class space, desired quality of the solution and the reliability is presented. The class of order-k delineable problems that can be eciently solved is identi ed and some general algorithm independent strategies for making BBO algorithms ecient are identi ed. This paper also introduces an extension of SEARCH for constrained BBO (cBBO)|the cSEARCH framework. Fundamental issues of cBBO are viewed in the light of SEARCH and analyzed. Finally, the SEARCH perspective of genetic algorithms is developed to identify their strengths and weaknesses.

1 Introduction The increasing desire for solving large, high dimensional optimization problems with little available domain knowledge has aided the development of many algorithms such as genetic algorithms (De Jong, 1975; Holland, 1975), simulated annealing (Kirpatrick, Gelatt, & Vecchi, 1983), and tabu search (Glover, 1989). Unfortunately each of these algorithms has created a separate eld on itself with little understanding about the fundamental processes and little cross-fertilization. The SEARCH (Search Envisioned As Relation and Class Hierarchizing) framework introduced elsewhere (Kargupta, 1995a) made an e ort to capture the fundamental computations in blackbox optimization (BBO)|optimization in presence of little domain knowledge|in terms of relations, classes and partial ordering. SEARCH is primarily motivated by the observation that searching for optimal solution in a BBO is essentially an inductive process (Michalski, 1983) and in absence of any relation among the members of the search space, induction is no better than enumeration (Watanabe, 1969). SEARCH decomposed BBO into (1) relation, (2) class, and (3) sample spaces. SEARCH also identi ed the importance of searching for appropriate relations in BBO. No BBO algorithm can eciently solve a reasonably general class of problems unless it searches for relations. Kargupta (1995a) also showed that the class of order-k delineable problems can be solved in SEARCH with sample complexity polynomial in problem size, desired quality and reliability of the solution.  The author can be reached at, P.O. Box 1663, XCM, Mail Stop F645, Los Alamos National Laboratory, Los Alamos,

NM 87545, USA. e-mail: [email protected]

1

These results are relevant for any BBO algorithm. Therefore evolutionary optimization algorithms must pay attention to them. In this paper we present a brief description of the SEARCH e ort. We present some of the main results without going through the detailed derivations. Section 2 reviews the previous e orts in this area. Section 3 brie y describes the main concepts and the foundation of SEARCH. Section 4 describes what it means to be a BBO algorithm in SEARCH. Section 4.1 and Section 4.2 point out some possible ways to make BBO algorithms ecient. This is followed by a discussion on problem diculty in Section 5. Section 6 specializes the results for sequence representation and identi es the class of order-k delineable problems, that can be solved in polynomial sample complexity in SEARCH. Section 7 introduces the cSEARCH|an extension of the SEARCH framework for constrained BBO. Computational issues in constrained BBO (cBBO) are viewed in the light of SEARCH and analyzed for developing the cSEARCH. Section 10 discusses the pros and cons of genetic algorithms in the light of SEARCH. Finally, Section 11 concludes this paper.

2 Previous Work In blackbox optimization, the objective function is often available as a black box, i.e., for a given decision variable, x in the feasible domain, it returns the function value (x). No local or global information about the function is assumed. Let us denote the nite input and the output spaces by X and Y , respectively. The general blackbox optimization problem can be formally de ned as follows. Given a blackbox that somehow computes (x) for an input x, :X !Y (1) The objective of a maximization problem is to nd some x 2 X such that (x)  (x) for all x 2 X . Performance of an optimization algorithm in this model depends on the information collected by sampling di erent regions of the search space. Genetic algorithms (GAs), simulated annealing (SA), tabu search are some examples of popular BBO algorithms. With all these di erent BBO algorithms in our arsenal, it is quite natural to ask whether they can be studied on common grounds using common principles. Several previous e orts have been made to address this question. Although Holland's (1975) framework for adaptive search was primarily motivated by evolutionary computation, the underlying concepts of search based on schema processing and decision making are fundamental issues that are equally relevant in the context of any other adaptive BBO algorithms. In fact, Holland's work (1975) is the root of the current thesis. Davis (1987) made an e ort to put literature on SAs and GAs under a common title. Unfortunately, this book did not make any direct e ort to link them; rather, it simply discussed them separately. Sirag and Weisser (1987) combined several genetic operators into a uni ed thermodynamic operator and used it to solve traveling salesperson problems. However, their paper did not study the fundamental similarities and di erences between SAs and GAs. Goldberg (1990) addressed this issue. He presented a common ground to understand the e ects of the di erent operators of SAs and GAs. He also proposed the Boltzmann tournament selection operator, which attempts to achieve Boltzmann distribution over the population. Mahfoud and Goldberg (1992) introduced a parallel genetic version of simulated annealing called parallel recombinative simulated annealing. This algorithm attempted to harness the strengths of both SAs and GAs. Recently Rudolph (1994) developed a Markov chain formulation of SAs and GAs for analyzing their similarities and di erences. Jones and Stuckman (1992) made an interesting e ort to relate GAs with Bayesian approaches to global optimization. They noted the similarities and di erences between these two approaches and concluded that they share many common grounds. They also developed hybrid algorithms that try to harness the strengths of both approaches. Recently Jones (1995) proposed a framework to study 2

f # ##

# # # f

Relation space

1 # ## 0 # ##

1 1 0 0 1 0 1 1

# # #1

0 1 1 0

# # #0

0 0 0 1

Class space

Sample space

Figure 1: Decomposition of blackbox optimization in SEARCH. the correspondence between evolutionary algorithms and heuristic state space search of graph theory. In this approach the search domain of the objective function is viewed as a directed, labeled graph. Jones and Forrest (1995) also proposed the tness-distance-correlation measure for quantifying search diculty and applied this measure to several classes of objective functions. Wolpert and Macready (1995) proposed the so called NFL theorem that con rmed the fundamental lessons of Watanabe's (Watanabe, 1969) ugly duckling theorem. (Radcli e & Surry, 1995) made a similar e ort and emphasized the need for representational bias. Unfortunately, very few of the previous works actually made a quantitative e ort to study the computational capabilities of blackbox search algorithms that lead to identifying a class of BBO problems that can be solved eciently. Little attention has been paid to quantifying the role of relations in BBO, which is essential for transcending the limits of random enumerative search. We still lack any common framework that describes these di erent algorithms in terms of the basic concepts of theory of computation. The SEARCH framework, makes an attempt to do that. The following section presents a brief description of SEARCH.

3 The SEARCH Framework This section presents a brief review of the SEARCH framework. The main objective is to understand the di erent decision makings in blackbox search and to identify the class of BBO problems that can be eciently solved. Section 3.1 describes the general concepts. Section 3.2 discusses the bound on sample complexity in SEARCH.

3.1 Foundation

The foundation of SEARCH is laid on a decomposition of the blackbox search problem into relation, class, and sample spaces. A relation is a set of ordered pairs. For example, in a set of cubes, some white and some black, the color of the cubes de nes a relation that divides the set of cubes into two subsets|set of white cubes and set of black cubes. Consider a 4-bit binary sequence. There are 24 such binary sequences. This set can be divided into two classes using the equivalence relation1 f ###, where f denotes position of equivalence; the # character matches with any binary value. This equivalence relation divides up the complete set into two equivalence classes, 1### and 0###. The class 1### contains all the sequences with 1 in the leftmost position and 0### contains those with a 0 in that position. The total number of classes de ned by a relation is called its index. The order of a relation is the logarithm of its index with some chosen base. In a BBO problem, relations among the search space members are often introduced through di erent means, such as 1 An equivalence

relation is a relation that is re exive, symmetric, and transitive.

3

representation, operators, heuristics, and others. The above example of relations in binary sequence can be viewed as an example of relation in the sequence representation. In a sequence space of length `, there are 2` di erent equivalence relations. The search operators also de ne a set of relations by introducing a notion of neighborhood. For a given member in the search space, the search operators de ne a set of members that can be reached by one or several application of the operators. This introduces relations among the members. Heuristics identi es a subset of the search space as more promising than others often based on some domain speci c knowledge. Clearly this can be a source of relations. Relations can sometimes be introduced in a more direct manner. For example, Perttunen and Stuckman (1990) proposed a Bayesian optimization algorithm that divides the search space into Delaunay triangles. This classi cation directly imposes a certain relation among the members of the search space. The same goes for interval optimization (Ratschek & Voller, 1991), where the domain is divided into many intervals and knowledge about the problem is used to compute the likelihood of success in those intervals. As we see, relations are introduced by every search algorithm, either implicitly or explicitly. The role of relations in BBO is very fundamental and important. Relations divide the search space into di erent classes and the objective of sampling based BBO is to detect those classes that are most likely to contain the optimal solutions. To do so requires constructing a partial ordering among the classes de ned by a relation. The classes are evaluated using samples from the search domain and a class comparison statistic is used for comparing di erent classes. For a given class comparison statistic T and some number M , a relation is said to properly delineate the search space if the class containing the optimal solution is within the top M classes, when the set of all classes de ned by the relation are ordered using T . This basically means that if a relation satis es the delineation constraint then, given sucient samples, the relation will pick up the class containing the optimal solution within the top M ranked classes. If a relation does not satisfy this, then the relation leads to wrong decision and as a result success in nding the optimal solution is very unlikely. A particular relation may not satisfy the delineation constraint for di erent problems, di erent class comparison statistics, and di erent values of M . One relation may work for a particular case and may fail to do so for a di erent setting. Therefore, any algorithm that aspires to be applicable for a reasonably general class of problems, must search for appropriate relations. Determining whether or not a relation satis es this delineation constraint requires decision making in absence of complete knowledge. For a given relation space r , a BBO algorithm must identify the relations that properly delineate the search space with certain degree of reliability and accuracy. This requires comparing one relation with another using a relation comparison statistic and constructing a partial ordering among them. A BBO algorithm in SEARCH cannot be ecient if it needs to consider relations that divide the search space in classes, with the total number of classes growing exponentially with the problem dimension. For example, in an `-bit sequence representation, if there is a class of problems which requires considering the equivalence relations with (` ? 1) xed bits then there is a major problem. This relation divides the search space into 2`?1 classes and we cannot solve this problem in complexity polynomial in `. However, in BBO the ultimate objective is to identify the optimal solution which basically de nes a singleton class. The smaller the cardinality of the individual classes, the larger the index of the corresponding relation. So we need the higher order relations for nally identifying the optimal solution, but we cannot directly evaluate them since their index is large. The solution is to limit our capability and realize that we can only solve those problems which can be addressed using low order relations and when high order relations are decomposable to those low order relations. This means that the information about low order relations can be used to evaluate the higher order relations. Consider the following example. Let r0 be a relation that is logically equivalent to r1 ^ r2, where r1 and r2 are two di erent relations; the sign ^ denotes logical AND operation. If either of r1 4

F ’(Φ

j, i )

α F( Φ d

Φ[τ] , j, i Φ[τ] , k, i

k, i

)

Φ

Figure 2: Fitness distribution function of two classes Cj;i and Ck;i. or r2 was earlier found to properly delineate the search space, then the information about the classes that are found to be bad earlier can be used to eliminate some classes in r0 from further consideration. This process in SEARCH is called resolution. Resolution basically evaluates the relations of higher order using the information gathered by direct evaluation of lower order relations. The above description gives a brief informal overview of the SEARCH framework. As we saw, SEARCH addresses BBO on three distinct grounds: (1) relation space, (2) class space, and (3) sample space. Figure 1 shows this fundamental decomposition in SEARCH. The major components of SEARCH can be summarized as follows: 1. classi cation of the search space using relations; 2. sampling; 3. evaluation, ordering, and selection of better classes; 4. evaluation, ordering, and selection of better relations; 5. resolution. A detailed description of each of these processes can be found elsewhere (Kargupta, 1995a). In the following part of this section we consider the expression for sample complexity in SEARCH derived elsewhere (Kargupta, 1995a) and de ne the class of order-k delineable problems that can be eciently solved in SEARCH.

3.2 Sample complexity in SEARCH

For a given relation space, and an algorithm in SEARCH, it is possible to derive the bound on sample complexity for the desired quality and reliability of decision making. De ning an algorithm in SEARCH rst requires specifying class and relation comparison statistics. Although, most of the existing BBO algorithms do not explicitly de ne them, SEARCH does so in order to quantify and understand the role of decision making in the relation and class spaces. Kargupta (1995a) considered distribution free ordinal comparison statistics. In an ordinal comparison statistic, two distributions are compared on the basis of some chosen percentile. Figure 2 shows the cumulative distribution function F 0 and F of two arbitrary subsets Cj;i and Ck;i , respectively. Indices j , k represent the two classes de ned by some relation ri . When these two classes are compared on the basis of the 5

quantile, then we say Cj;i  Ck;i, since [ ];j;i  [ ];k;i; [ ];j;i and [ ];k;i are the solutions of F 0 (j;i ) = and F (k;i ) = , respectively. Let us de ne d = F ([ ];k;i) ? F ([ ];j;i):

The variable d de nes the zone of indi erence, which is basically the di erence in the percentile value of [ ];k;i and that of [ ];j;i computed from the same cdf F . Figure 2 clearly explains this de nition. We can quantify the decision making process using such ordinal class and relation comparison statistics. Let r be the given relation space and Sr  r be the set of relations needed to solve the given problem. We denote the index of a relation ri by Ni . De ne, Nmax = maxfNi j8ri 2 Sr g d = minfF ([ ];;i) ? F ([ ];j;i)j8j; 8ig: 0

where F ([ ];;i) is the cdf of the class containing the optimal solution. The index j varies over all the classes 0de ned by a relation ri. Index i varies over all the relations in r . If d is a constant such that d  d, that corresponds to the desired quality of decision making in the class space, the bound on overall success probability is, [(1 ? 2nH ( )( ? d ) n )(Nmax?Mmin ) ]kSr k qr  q

(2)

where H ( ) is the binary entropy function, H ( ) = ? log2 ? (1 ? ) log2 (1 ? ). From this we can derive the bound on overall sample complexity, 1   NmaxkSr k log(1 ? qqr kSr k(Nmax?Mmin) ) : (3) SC  ?d where q is the overall desired success probability in SEARCH and qr is the desired success probability in the relation space. Mmin is a constant that depends on the memory used by the algorithm. The

success probability in the relation space depends on the appropriate decision making and the statistic de ned for comparing relations with each other. Similar expression for the computational complexity in the relation space can be derived as shown elsewhere (Kargupta, 1995a). Inequality 3 presents the upper bound on the sample complexity in BBO. As we increase the success probability in the relation space, the overall success probability in the combined relation and class spaces increases. The sample complexity should therefore decrease as success probability in the relation space increases. This also shows that SC decreases with increase in qr . Note that the ratio   kS k(N 1 ?M ) q r max min approaches 1 in the limit as kS k(N r max ? Mmin) approaches in nity. Therefore, qr SC grows at most linearly with the maximum index value Nmax and the cardinality of the set Sr . Recall that d de nes the desired region of indi erence; in other words, it de nes a region in terms of percentile within which any solution will be acceptable. The sample complexity decreases as the d increases. Kargupta (1995a) also showed that when no relations are considered, this expression points out that the sample complexity will be of the order of the size of search space; in other words search will be no better than enumeration. For a given relation space and a class of problems that can be solved considering a bounded number of relations from that space, inequality 3 gives the bound on sample complexity for desired quality and reliability of the decision making. This expression is valid for any nite blackbox search space. The following section identi es the di erent components of BBO algorithms in SEARCH. 6

4 Blackbox Optimization Algorithm The SEARCH framework decomposed BBO into di erent components such as the relation space, class space, and the sample space. Searching in these spaces requires some fundamental tools, such as some comparison statistics and a perturbation operator for generating samples. In this section, we list them together and project a complete picture of what it means to be a BBO algorithm in SEARCH. 1. SEARCH views the solution domain through relations, classes, and samples. An algorithm in SEARCH should be provided with a set of relations r . Representation in genetic algorithms (GAs), perturbation operators of simulated annealing (SA), and the neighborhood heuristic in k-opt algorithm are some examples of di erent sources of relations. 2. SEARCH also needs explicit storage for processing relations, classes, and samples. The maximum possible values of Mr 2 and Mi determine the size of the memory for storing relations and classes respectively. In a genetic algorithm the population serves as the memory for all three of these spaces. In SA the evaluation of di erent classes de ned using the perturbation operators is distributed over time and only one sample is taken at a time. The state of the SA algorithm serves as the memory for the sample space. 3. Two statistic measures for comparing classes and relations are required. A selection operator is used for comparing classes in the simple GA. The simple GA does not really search for better relations. On the other hand, in simulated annealing, the Metropolis criterion is used for comparing two states; this can be viewed as a class comparison statistic. 4. A perturbation operator, P is required for generating samples. In GA, crossover and mutation generate new samples. SA makes use of a neighborhood generator for sample generation. 5. Accepting criterion of success probability, q , and qr are necessary. Almost every practical application of GA and SA either implicitly or explicitly makes use of an acceptance criterion for success. Neither simple GA nor SA actually searches for better relations. Neither of them has any explicit criterion like qr . 6. Required precision in solution quality, d is required. Again, in practice, both GA and SA somehow introduce the factor controlling the desired solution quality. In SEARCH, this is introduced in an non parametric way. Goldberg, Deb, and Clark (1993) and Holland (1975) quanti ed this using a parametric approach. In many algorithms, some of these parameters are not explicitly speci ed. For example, in random enumerative search, no importance is given on relations. However, even this kind of search can be shown as a special case of SEARCH. Kargupta (1995a) showed that the random enumerative search is essentially a search with one and only one relation in consideration that divides the search space into singleton classes. In GAs, the class comparison statistic is not explicitly speci ed. The selection operator implicitly de nes all the comparison statistics together. One of the main objectives of this whole e ort is to develop a more systematic approach toward the design and use of BBO algorithms by explicitly identifying di erent components of a BBO algorithm. SEARCH provides a common ground for developing new blackbox algorithms in the future. Regardless of the motivation and background, any blackbox optimization algorithm should clearly

M the relation space memory comes into the picture when the memory is not large enough to process all the relations in r 2 r,

7

###

f##

#f#

##f

ff#

f#f

#ff

fff

Figure 3: Hasse diagram representation of the set of equivalence relations ordered on the basis of the relation