On Expressiveness of Evolutionary Computation: Is EC Algorithmic?

Eugene Eberbach

Computer and Information Science Dept., University of Massachusetts Dartmouth, 285 Old Westport Road, North Dartmouth, MA 02747-2300, [email protected]

Abstract - Evolutionary Computation has traditionally been used for the solution of hard optimization problems. In the general case, solutions found by evolutionary algorithms are satisficing given current resources and constraints, but not necessarily optimal. Under some conditions evolutionary algorithms are guaranteed (in infinity) to find an optimal solution. However, evolutionary techniques are helpful not only for dealing with intractable problems. In this paper we demonstrate that EC is not restricted to algorithmic methods, and is more expressive than Turing Machines.

1 Introduction

Evolutionary Computation (EC) [1] is a relatively new approach to computer problem solving. EC is based on principles of natural selection using genetic methods which are fitness driven. Evolution is a two-step process of random variation and natural selection: variation creates diversity, which is heritable, and selection eliminates inappropriate individuals. Evolutionary computation has been invented as many as 10 times independently between 1953 and 1970 [6]. Evolutionary Computation offers a new approach to the solution of NP-hard problems. To do so, EC uses evolutionary algorithms with a problem-specific representation, variation and selection operators, a fitness measure, an initialization, and a termination condition. It turns out that there is no such thing as a best evolutionary algorithm, in the same way as there is no best universal algorithm for problem solving. EC consists of four main subareas: Genetic Algorithms, Genetic Programming, Evolution Strategies, and Evolutionary Programming. It can be understood as a probabilistic beam search directed by fitness optimization to solve hard optimization problems. EC is related both to random-restart hill climbing and to simulated annealing, i.e., to approaches which do not keep the entire search tree; but unlike hill climbing, it partially overcomes the problem of being trapped in local optima.

EC can also be understood as a multi-agent competitive search, where agents/genes compete not by maximization and minimization of their fitness functions (as in minimax); rather, the genes compete for survival indirectly, needing a fitness good enough to fit into the search beam and thus to participate in the next generation of the search. This differs from nature, where animals compete both indirectly and by direct interaction. In evolutionary computation (at least in its classical form, not in co-evolution), genes from the population coexist in "alternative universes" in parallel, but they are isolated and do not interact directly (they interact with the environment, which tests their fitness). In this paper we concentrate our attention on the expressiveness of Evolutionary Computation. In Section 2 we overview Turing Machines, algorithms, undecidability and intractability. In Section 3, Evolutionary Turing Machines are defined and some of their properties are investigated. In Section 4, we demonstrate that the EC paradigm is more expressive than Turing Machines. Section 5 describes how EC deals with intractable problems.

2 Turing Machines, Algorithms, Undecidable and Intractable Problems

Turing Machines and algorithms are two fundamental concepts of computer science and problem solving. Turing Machines describe the limits of algorithmic problem solving, and they laid the foundation of computer science. Evolutionary computation is an instance of problem solving; it uses special types of algorithms, evolutionary algorithms, thus the concepts of TM and EC are interrelated. The Turing Machine (TM) is considered a formal model of a (digital) computer running a particular program. The Turing Machine is the invention of Alan Turing, who introduced his a-machine in his 1936 paper [13] as a byproduct of showing the unsolvability of Hilbert's decision problem in mathematics, the problem of proving or disproving all mathematical statements.

The TM was supposed to model any human solving an arbitrary problem algorithmically, using mechanical methods. A TM has a finite control (with a finite number of states), an infinite tape of cells (each cell keeping one symbol), and a read/write head. Initially, the tape contains a finite-length string of symbols from the input alphabet. All other cells, extending infinitely to the left and right, are blank (they contain a special blank symbol B from the tape alphabet). A TM in some state reads the symbol under the head, writes a symbol from the tape alphabet, changes state, and moves the head one position left or right. The read/write tape serves as input, output and unbounded storage device. The abstract tape head is a marker: it marks the "current" cell, which is the only cell that can influence the move of the TM. After a finite number of moves the TM should stop, and the tape should contain the answer to the problem.
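For concreteness, the following is a minimal sketch (not from the paper) of a deterministic single-tape TM simulator in Python; the example machine, which decides whether its input over {0, 1} contains an even number of 1s, and the step bound are illustrative assumptions.

import random  # unused here, kept out; only the standard library dict/loop is needed

BLANK = "B"

def run_tm(transitions, start, accept, reject, tape_input, max_steps=10_000):
    """Simulate a deterministic single-tape TM.

    transitions: dict mapping (state, symbol) -> (new_state, write_symbol, move),
    where move is -1 (left) or +1 (right).
    Returns True if the machine halts in `accept`, False if in `reject`;
    raises if it exceeds max_steps (we cannot wait forever, since the halting
    problem is undecidable in general).
    """
    tape = dict(enumerate(tape_input))      # sparse tape, blank everywhere else
    state, head = start, 0
    for _ in range(max_steps):
        if state == accept:
            return True
        if state == reject:
            return False
        symbol = tape.get(head, BLANK)
        state, write, move = transitions[(state, symbol)]
        tape[head] = write
        head += move
    raise RuntimeError("no verdict within max_steps")

# Example machine: states "even"/"odd" track the parity of 1s seen so far.
T = {
    ("even", "0"): ("even", "0", +1),
    ("even", "1"): ("odd",  "1", +1),
    ("odd",  "0"): ("odd",  "0", +1),
    ("odd",  "1"): ("even", "1", +1),
    ("even", BLANK): ("accept", BLANK, +1),
    ("odd",  BLANK): ("reject", BLANK, +1),
}

print(run_tm(T, "even", "accept", "reject", "10110"))   # False (three 1s)
print(run_tm(T, "even", "accept", "reject", "1001"))    # True  (two 1s)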

It turns out that some problems cannot be solved by TMs, and some are solvable but require too many steps. The number of different languages (problems) over any alphabet of more than one symbol is not countable (justified by a diagonalization argument), whereas the number of all possible TMs is enumerable. Each TM represents one problem (language). Thus it is clear that there exist problems which cannot be mapped to (solved by) TMs. In this way Turing obtained his result about the unsolvability of the "halting problem" of the Universal Turing Machine representing all possible TMs. This corresponds directly to the unsolvability of Hilbert's Entscheidungsproblem, the main reason Turing introduced his a-machines.

An algorithm is one of the most fundamental concepts of mathematics and computer science, with roots in ancient Egypt, Greece, and Babylon and the attempt to "mechanize" computational problem solving. An algorithm should consist of a finite number of steps, each having a well-defined meaning. The general belief that every algorithm can be expressed in terms of a Turing machine is now known as the Church-Turing thesis. We use the simplicity of the TM model to prove formally that there are specific problems (languages) that the TM cannot solve [10]. Solving a problem is equivalent to deciding whether a string belongs to a language. A problem that cannot be solved by a computer (Turing machine) is called undecidable (TM-undecidable). The class of languages accepted by Turing machines is called the recursively enumerable (RE) languages. For RE languages, a TM can accept the strings in the language but cannot tell for certain that a string is not in the language. There are two classes of unsolvable languages (problems):

- recursively enumerable (RE) but not recursive: a TM can accept the strings in the language but cannot tell for certain that a string is not in the language (e.g., the language of the universal Turing machine, or Post's Correspondence Problem language). Such a language is accepted by some TM, but its complement is not RE.

- non-RE: no TM can even recognize the members of the language in the RE sense (e.g., the diagonalization language). Neither the language nor its complement is decidable.

Decidable problems have an algorithm, i.e., a TM that halts whether or not it accepts its input. Decidable problems are described by recursive languages. Algorithms, as we know them, are associated with the class of recursive languages, the subset of the recursively enumerable languages for which we can construct an accepting TM that always halts. For recursive languages, both the language and its complement are decidable.

We now bring our discussion of what can or cannot be computed down to the level of efficient versus inefficient computation. We focus on problems that are decidable, and ask which of them can be computed by TMs that run in an amount of time that is polynomial in the size of the input. The threshold between tractable (easy) and intractable (hard) lies between problems of polynomial and exponential complexity. Practical problems requiring polynomial time are almost always solvable in an amount of time that we can tolerate, while those that require exponential time generally cannot be solved except for small instances. The theory of "intractability" deals with techniques for showing that problems cannot be solved in polynomial time. Decidable problems can be either tractable or intractable. A problem that requires an exponential amount of time or memory is called intractable. A problem that requires a polynomial amount of time or memory is called easy or tractable.

3 Universal Evolutionary Algorithm and Evolutionary Turing Machines

A generic Evolutionary Algorithm can be described in the form of a functional equation working in a simple iterative loop in discrete time t [5]:

x[t + 1] = s(v(x[t]))

where
- x is a population under a given representation (e.g., fixed-length binary strings for GAs, finite state machines for EP, parse trees for GP, vectors of reals for ES),
- s is a selection operator (e.g., truncation, proportional, tournament),
- v is a variation operator (e.g., variants of mutation and crossover),
- x[0] is the initial population.
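As an illustration (not part of the original paper), the following minimal Python sketch instantiates x[t + 1] = s(v(x[t])) for a GA-style representation; the bit-flip mutation, truncation selection, population size and the OneMax fitness are assumptions chosen for the example.

import random

def evolve(fitness, n_bits=20, pop_size=30, generations=100):
    """Iterate x[t+1] = s(v(x[t])) with bit-flip mutation (v) and
    truncation selection (s) on fixed-length bit strings."""
    # x[0]: random initial population
    pop = [[random.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for t in range(generations):
        # v: variation, mutate each parent (one random bit flipped per offspring)
        offspring = []
        for parent in pop:
            child = parent[:]
            i = random.randrange(n_bits)
            child[i] ^= 1
            offspring.append(child)
        # s: selection by truncation, keep the best pop_size of parents + offspring
        pop = sorted(pop + offspring, key=fitness, reverse=True)[:pop_size]
    return max(pop, key=fitness)

# Example run: maximize the number of ones (the assumed OneMax toy objective)
best = evolve(fitness=sum)
print(sum(best), best)

Swapping the representation, the variation operator v, the selection operator s, or the fitness function yields sketches of the other EC subareas mentioned above.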

The above equation is applicable to the main subareas of EC: GAs, EP, ES, and GP. Sometimes only the order of variation and selection is reversed, i.e., selection is applied first and variation second. Variation and selection depend on the fitness function. In the above equation only the termination condition is omitted. The true termination condition is to find a population in which one of the individuals has the globally maximal fitness. Usually this is not possible to achieve (or verify), and EC terminates after a predefined number of iterations, or when no more improvements are observed over several generations. Of course, it is possible to conceive and implement more complex variants of evolutionary algorithms, for instance Reynolds' cultural algorithm, Michalewicz's Genocop algorithm, or the Ant Colony System [1]. Evolutionary algorithms evolve a population of solutions x, but they may be the subject of self-adaptation (as in ES) as well.

In general, every evolutionary algorithm EA can be encoded as an instance of a TM M operating on a population x, i.e., as a pair (M, x) consisting of a Turing machine M encoding the EA and its input x, which is a specific input of the Universal Turing Machine UTM. A Universal Evolutionary Algorithm then represents, within a Universal Turing Machine, the instances of Turing Machines corresponding to all possible evolutionary algorithms. In other words, the Universal Evolutionary Algorithm consists of all pairs (M, x), where M is a TM encoding of an evolutionary algorithm, and x is its input population.

We define an Evolutionary Turing Machine ETM as a (possibly infinite) series of Turing Machines M[t]: (M[t], x[t]) = [(M[0], x[0]), (M[1], x[1]), ...], where each M[t] represents (encodes) an evolutionary algorithm with population x[t], evolved in generations t = 0, 1, 2, .... The outcome of generation t = i, i = 0, 1, 2, ... is the pair (M[i+1], x[i+1]), and the goal (or halting) state of the ETM is represented by the optimum of the fitness performance measure f[t]. Note firstly that because the fitness function can be a subject of evolution as well (it is a part of the TM encoding M), evolution is an infinite process. Secondly, there is a question: are evolutionary algorithms a subset of all algorithms, or the other way around? We may have the (wrong) impression that evolutionary algorithms are a subset of all algorithms. However, it is the opposite: an ETM restricted to one generation only becomes a conventional Turing Machine TM. It is claimed that a unique feature of EC is that it operates on a population of solutions, and that the processes of variation and selection are probabilistic. However, Evolution Strategies may operate on populations of size 1, the selection process (e.g., by truncation) or the variation operators can be deterministic, and even the fitness function can be implicit rather than explicit (e.g., Holland's ECHO system). Thus such widely understood evolutionary algorithms are a superset of all algorithms (although it is correct that typically EAs operate on multiple search points, and many operators are probabilistic). This is consistent with the intuition that adaptive (evolutionary) processes are a superset of static (conventional) processes.
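To make the pair sequence (M[t], x[t]) concrete, here is a hedged toy sketch (not from the paper): the "machine" M[t] is reduced to a mutable parameter set, a self-adapted mutation step size in ES style, so the algorithm description itself changes from generation to generation alongside the population. The sphere objective, the population size and the 1/5th-rule-like update are illustrative assumptions.

import random

def etm_run(objective, dim=5, generations=50):
    """Toy 'ETM' trace: a list [(M[0], x[0]), (M[1], x[1]), ...] where
    M[t] stands in for the evolving algorithm description (here just a
    mutation step size) and x[t] is the population it operates on."""
    M = {"sigma": 1.0}                         # toy stand-in for the TM encoding M[t]
    x = [[random.uniform(-5, 5) for _ in range(dim)] for _ in range(4)]
    trace = [(dict(M), [v[:] for v in x])]
    for _ in range(generations):
        # variation: Gaussian mutation with the current machine's step size
        offspring = [[g + random.gauss(0, M["sigma"]) for g in parent] for parent in x]
        prev_best = min(objective(v) for v in x)
        # selection: keep the best 4 of parents + offspring (minimization)
        x = sorted(x + offspring, key=objective)[:4]
        # the 'machine' itself evolves: adapt the step size depending on progress
        M = {"sigma": M["sigma"] * (0.9 if objective(x[0]) < prev_best else 1.1)}
        trace.append((dict(M), [v[:] for v in x]))
    return trace

trace = etm_run(objective=lambda v: sum(g * g for g in v))   # assumed sphere objective
print(len(trace), trace[-1][0])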

We can define a Universal Evolutionary Turing Machine UETM as an ETM which takes as input the vector [(M[0], x[0]), (M[1], x[1]), ...] and behaves like M[t] with input x[t] for t = 0, 1, 2, .... Many properties derived for TMs carry over to ETMs. In particular, the Universal Turing Machine cannot solve the halting problem of an ETM.

Theorem 1 The halting problem (reaching the global optimum) of UETM is unsolvable by the Universal Turing Machine.

Proof: The UTM cannot solve the halting problem for ETMs with one generation only, thus it cannot solve it for multiple generations either.

Theorem 2 UETM cannot solve its own halting problem.

Proof: By the diagonalization argument, similar to that for the UTM, i.e., if such an ETM existed it should halt given itself as an input. We obtain the contradiction that UETM halts if it does not halt, and if it does not halt then it halts. Thus such an ETM cannot exist.

The two classical examples of TM-undecidable languages are the universal language and the diagonalization language (see, e.g., [10]). They allow us to decide about the unsolvability of many other problems (languages). In an analogous way we define the universal language and the diagonalization language for UETM. The universal language Lu of the Universal ETM (the Halting Problem for Evolutionary Turing Machines) is RE but not recursive. We define Lu for an ETM M[t] to be the set of inputs x[t] such that M[t] halts given input x[t], regardless of whether or not M[t] accepts x[t], t = 0, 1, 2, .... Then the halting problem for ETMs is the set of pairs (M[t], x[t]) such that x[t] is in Lu. The diagonalization language Ld is not a recursively enumerable (RE) language; that is, there is no Evolutionary Turing Machine that accepts Ld. The diagonalization language Ld is the set of strings xi[t] such that xi[t] is not in L(Mi[t]). That is, Ld consists of all strings x[t] such that the TM M[t] whose code is x[t] does not accept when given x[t] as input. Based on the undecidability of the universal and diagonalization languages, the typical way to prove that other problems (languages) are undecidable is to reduce Lu or Ld to the new problem. Then, because of the undecidability of Lu and Ld, the new problem is proven to be undecidable too. Reduction: if we have an algorithm to convert all instances of a problem P1 to instances of a problem P2 that have the same answer, then we say that P1 reduces to P2 (i.e., P1 becomes a special case/subset of P2):

Reduction of Lu to P: knowing that the universal language Lu is undecidable (i.e., not a recursive language), we can reduce Lu to another problem P to prove that there is no algorithm to solve P, regardless of whether or not P is RE. Reduction of Ld to P: this is only possible if P is not RE, so Ld cannot be used to show undecidability for those problems that are RE but not recursive. On the other hand, if we want to show that a problem is not RE, then only Ld can be used; Lu is useless, since it is RE. By analogy to the TM case, we can conclude that an ETM cannot decide the following problems (languages):

- Every nontrivial property P (a nonempty proper subset of RE) of the RE languages (an extension of Rice's Theorem [10] to ETMs). A property is trivial if it is either empty (i.e., satisfied by no language at all) or consists of all RE languages; otherwise it is nontrivial. Consequences of Rice's Theorem: we cannot decide whether an ETM accepts the empty language, a finite language, a regular language, a context-free language, etc.



- Post's Correspondence Problem (PCP) [10], which asks, given two lists of the same number of strings, whether we can pick a sequence of corresponding strings from the two lists and form the same string by concatenation (a bounded-search sketch follows this list). PCP is the principal technique for proving a variety of problems undecidable.

- Any nontrivial property that involves what the program does (rather than a lexical or syntactic property of the program itself). In particular, the problems of whether an ETM, after finding its solution, will print "Hello World!", will halt, will find a global optimum, is deadlock-free, or is partially or totally correct, are all undecidable. All these problems are reduced from PCP, and indirectly from Lu. Note in particular that the whole excitement about the "No Free Lunch Theorem" (NFL) (see, e.g., [1]) is overstated: the NFL would be exactly another example of a trivial property (using Rice's terminology). The No Free Lunch Theorem states that, over all possible fitness functions, all evolutionary algorithms/operators are equivalent on average. However, we are very rarely interested in all problems/fitness functions, but usually in their proper nonempty subsets, and not necessarily in their average values.
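As referenced in the PCP item above, the following hedged Python sketch performs a bounded breadth-first search for a PCP match; the depth bound is essential, because unrestricted PCP is undecidable and an unbounded search need not halt. The instance at the end is an assumed toy example.

from collections import deque

def pcp_search(A, B, max_depth=20):
    """Search for indices i1..ik with A[i1]+...+A[ik] == B[i1]+...+B[ik],
    using at most max_depth dominoes; returns the index list or None.
    The state is the unmatched overhang plus which side is ahead."""
    start = ("", "=")
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        (over, side), path = queue.popleft()
        if len(path) >= max_depth:
            continue
        for i, (a, b) in enumerate(zip(A, B)):
            if side == "A":                       # top string is ahead by `over`
                cmp_top, cmp_bot = over + a, b
            elif side == "B":                     # bottom string is ahead by `over`
                cmp_top, cmp_bot = a, over + b
            else:                                 # both strings currently equal
                cmp_top, cmp_bot = a, b
            if cmp_top.startswith(cmp_bot):
                rest = cmp_top[len(cmp_bot):]
                state = (rest, "A" if rest else "=")
            elif cmp_bot.startswith(cmp_top):
                state = (cmp_bot[len(cmp_top):], "B")
            else:
                continue                          # mismatch: domino cannot extend the match
            new_path = path + [i]
            if state == ("", "="):
                return new_path                   # equal nonempty concatenations found
            if state not in seen:
                seen.add(state)
                queue.append((state, new_path))
    return None

# Assumed toy instance: prints [0, 1], since "a" + "ab" == "aa" + "b" == "aab"
print(pcp_search(["a", "ab"], ["aa", "b"]))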

4 Is Evolutionary Computation algorithmic?

An algorithm should consist of a finite number of steps, each having a well-defined and implementable meaning.

We can suspect that Evolutionary Computation, being in general infinite, violates the classical definition of an algorithm, i.e., may lead to non-algorithmic computations. In fact, that suspicion is correct: an ETM can be understood as a special case of so-called Persistent Turing Machines, one of several models of computation more expressive than Turing Machines. In [4], several such models have been presented, including Turing's o-machines (TMs with oracles), c-machines (choice machines), u-machines (unorganized machines), cellular automata, neural networks [7], Interaction Machines [14, 15], Persistent Turing Machines (PTMs), the pi-calculus [12], and the $-calculus [2, 3]. In particular, an Evolutionary Turing Machine can be understood as a special case of Persistent Turing Machines [8, 9], which preserve the contents of the working tape for a new computation (in an ETM, the results of one generation are used in the next generation). Such models derive their higher-than-TM expressiveness using three principles:

1. Interaction: opening a closed model by interacting with the external world. The world can be in the form of a single entity or of multiple agents, which actively participate in the computation. The external component has to be either "smarter" than a TM, or there have to be infinitely many "dumb"/ordinary components. This approach is represented by Turing's c-machines and o-machines, Wegner's Interaction Machines [14], and by van Leeuwen's Site and Internet machines.

2. Evolution: allowing adaptation of the TM to a possibly smarter component or to an infinite sequence of components. This approach is used by the $-calculus [2] and by the evolutionary computation paradigm.

3. Infinity: releasing the restriction on boundedness of resources, i.e., allowing either an infinite initial configuration, infinitely many computing elements, infinite time, or infinite/uncountable alphabets. For example:

- Allowing an infinite initial configuration (persistence) is represented by cellular automata, Interaction Machines [14], Persistent Turing Machines [8] and the $-calculus [2] (a toy sketch of the persistence idea follows this list).

- Allowing infinitely many computing elements (infinity of operators) can be modeled by an infinite number of TM tapes, or by an infinite number of read/write heads, i.e., unbounded parallelism. The approach is represented by cellular automata, neural networks, random automata networks [7], the pi-calculus [12], and the $-calculus [2]. It should be used by models of massively parallel computers or of the Internet, where we do not put a restriction on the number of computing elements. Note that we should allow an infinite number of computers when modeling massively parallel computers and the Internet, because by the same argument as Turing allowed a (non-existing in nature) infinite tape in the TM, we should allow an infinite supply of (non-existing in nature) machines in models of parallel and distributed computation.

- Allowing infinite time is represented by reactive systems.

- Allowing uncountable alphabets is represented by analog computers, neural networks and hybrid automata.
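As a hedged illustration of the persistence idea referenced in the list above (a drastic simplification of Persistent Turing Machines [8], not their formal macrostep semantics), the sketch below keeps a work tape alive across successive computations, so a reply depends on the whole interaction history rather than on a single finite input; the class and method names are assumptions.

class PersistentMachine:
    """Toy illustration of persistence: each macrostep runs an ordinary
    (here trivially simple) computation, but the work tape survives between
    macrosteps, so the reply to an input depends on the interaction history,
    unlike a classical TM run once on a single finite input."""

    def __init__(self):
        self.worktape = []               # persists across computations

    def macrostep(self, token: str) -> str:
        self.worktape.append(token)      # history accumulates on the work tape
        # the 'computation': an answer based on everything seen so far
        return f"seen {len(self.worktape)} tokens, last={token}"

pm = PersistentMachine()
for tok in ["a", "b", "a"]:
    print(pm.macrostep(tok))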

Evolutionary Computation applies both the evolution and the infinity principle to gain its higher-than-TM expressiveness. We will show two ways in which EC solves the halting problem of the UTM, using either the infinity or the evolution principle.

Theorem 3 The halting problem for the Universal Turing Machine is solvable by an ETM using the infinity principle.

Proof: Through a recursive Gödel numbering, each instance (M, x) of a TM is encoded as a positive integer corresponding to one generation of the ETM (M[t], x[t]). We assume that ETM = [(M[0], x[0]), (M[1], x[1]), ...]. Fitness is defined to reach its optimum for halting instances. The ETM is evolved to the generation corresponding to the Gödel number, and then that configuration is checked as to whether it reached the optimum (corresponding to halting) or not (not halting). This may require checking all generations and an infinite time. But the ETM has an infinite number of generations, and this is consistent with the infinity principle. The diagonalization argument, which caused the problem (inconsistency) for a single UTM trying to decide about halting, is not a problem in this case, because instead of a single UTM we have an infinite series of TMs working in successive generations.

Theorem 4 The halting problem for the Universal Turing Machine is solvable by an ETM using the evolution principle.

Proof: The idea is quite simple. The TM is evolved (e.g., using a nonrecursive variation operator v) into a TM with an oracle (an o-machine). A TM with an oracle can solve the halting problem of the UTM. Note that this approach requires that the variation or selection operators are nonrecursive functions.

Theorems 3 and 4 represent the main result of the paper: that Evolutionary Computation is more expressive than TM, and may represent non-algorithmic computation. Note that an Evolutionary Turing Machine can solve the halting problem of the Turing Machine, but not its own halting problem.

The potential for higher expressiveness of EC is associated with the representation used to encode individual solutions. For example, it is difficult to expect higher expressiveness from GAs operating on fixed-length binary strings (there the search space is finite). However, cellular programming evolves cellular automata, ENN evolves neural networks, GP evolves unbounded parse trees, and ES evolves vectors of reals, operating on potentially infinite objects, which by the infinity principle can lead to higher expressiveness than TM. In practical applications, they are trimmed to finite solutions, because nobody so far has found an applicable way to evolve infinite objects.

5 EC approach to deal with intractability

Nobody so far has applied EC to find solutions of TM-unsolvable problems. Typically EC is used to solve TM-decidable polynomial (easy) and exponential (intractable) optimization problems. The important questions are whether EC will find all solutions, whether the solutions are optimal, and what the search cost associated with the evolution is. We say that Evolutionary Computation [3] is

- Complete if it reaches all its solutions.

- Optimal if the solutions found have optimal fitness.

- Totally optimal if both the solutions and the search (evolution) process have optimal fitness.

Evolutionary computation in the general case is incomplete (a probabilistic search can simply miss some solutions) and not optimal (if some solutions are missed, they might be exactly the optimal ones). Under some conditions EC can be complete and optimal: using elitist strategies together with variation operators able to explore the whole search space leads, in infinity, to the global optimum. However, evolutionary computation, even if optimal, is not totally optimal, i.e., it is computationally very expensive, with missing mechanisms for controlling search cost (which would allow scalability and on-line evolution, e.g., for evolutionary robotics). Total optimality is possible, but this requires composable fitness (something in the style of the $-calculus [2, 3]), appropriate variation operators (not missing any search points), and yes, an infinite time.
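The following minimal Python sketch (an assumption-laden illustration, not the paper's construction) shows the elitist ingredient mentioned above: the best individual always survives, so the best fitness never decreases, and with a mutation operator that can reach any point of the finite search space the global optimum is found in the limit of infinitely many generations. The bit-string representation, mutation rate and OneMax objective are assumptions.

import random

def elitist_ea(fitness, n_bits=16, pop_size=10, generations=200, p_mut=0.05):
    """Elitist EA on bit strings: parents and offspring compete, and the
    current best is always retained (elitism), so the best fitness found
    is monotone non-decreasing over generations."""
    pop = [[random.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        # variation: independent bit-flip mutation can reach any bit string,
        # which is the 'explore the whole search space' condition in the text
        offspring = [[g ^ (random.random() < p_mut) for g in p] for p in pop]
        pool = pop + offspring + [best]          # elitism: the best always competes
        pop = sorted(pool, key=fitness, reverse=True)[:pop_size]
        best = pop[0]
    return best

best = elitist_ea(fitness=sum)       # assumed OneMax objective
print(sum(best), best)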
anytime algorithms, $-calculus can be used to solve inNote that Evolutionary Turing Machine can solve the tractable problems. halting problem of Turing Machine, but not its own halting In [4] four methods to deal with TM intractability were problem. The potential for higher expressiveness of EC is asso- identi ed: ciated with the representation used to encode individual 1. Interaction: by getting feedback from the external solutions. For example, it is dicult to expect higher exworld. The world can be represented by an envipressiveness from GAs, operating on xed length binary ronment or other agents. Feedback can be in the

form of direct advice (represented by interactive al- References gorithms, or interactive Turing test for intelligence), or using some performance measure obtained by inter- [1] Back T., Fogel D.B., Michalewicz Z. (eds.), Handbook of Evolutionary Computation, Oxford Univeraction (measurement, observation) with environment sity Press, 1997. describing \goodness" of a solution and/or resources used to nd it. Performance measures are utilized by [2] Eberbach E., $-Calculus Bounded Rationality evolutionary algorithms, anytime algorithms, A* or = Process Algebra + Anytime Algorithms, in: Minimax search algorithms, dynamic programming, (ed.J.C.Misra) Applicable Mathematics: Its Perspec$-calculus. tives and Challenges, Narosa Publishing House, New Delhi, Mumbai, Calcutta, ISBN 81-7319-406-8, Part 2. Evolution: transforming problem to a simpler (less III: Comp.Sci. & Inf. Techn., Chapter 22, 2001, 213complex) or incomplete one. This approach is used by 220. structured programming, approximation algorithms, neural networks (performing approximate classi ca- [3] Eberbach E., Evolutionary Computation as a MultiAgent Search: A $-Calculus Perspective for its Comtion/regression), anytime algorithms, and $-calculus. pleteness and Optimality, Proc. 2001 Congress on In spite, of their name, evolutionary algorithms, in Evolutionary Computation CEC'2001, Seoul, Korea, general, do not simplify problem to solve (excluding, 2001, 823-830. perhaps, Genetic Programming with Automatically De ned Functions [11]). [4] Eberbach E., Wegner P., Beyond Turing Machines, Brown University Techn. Report, 2001. 3. Guessing: selecting randomly a path to nd a solution. This approach is used by nondeterministic, [5] Fogel D.B., Evolutionary Computation: Toward a probabilistic, randomized, ergodic, evolutionary alNew Philosophy of Machine Intelligence, IEEE Press, gorithms, simulated annealing, random restart hill 1995. climbing. The ability to guess is indeed a powerful [6] Fogel D.B. (ed.), Evolutionary Computation: The gift! [7] Fossil Record, IEEE Press, NY, 1998. 4. Parallelism: exploring all possible solutions simulta- [7] Garzon M., Models of Massive Parallelism: Analyneously by trading usually time complexity for space sis of Cellular Automata and Neural Networks, An complexity. This approach is represented by quantum EATCS series, Springer-Verlag, 1995. computing, DNA-based computing, neural networks, parallel computing on supercomputers or on Internet, [8] Goldin D., Persistent Turing Machines as a Model of Interactive Computation, FoIKS'00, Cottbus, Gerand $-calculus. many, 2000. In the solution of hard computation problems, Evo- [9] Goldin D., Smolka S., Wegner P., Turing Machines, lutionary Computation uses interaction principle ( tness Transition Systems, and Interaction, 8th Int'l Workmeasures a solution quality in a given environment), guessshop on Expressiveness in Concurrency, Aarlborg, ing principle (search points are selected randomly with Denmark, August 2001. bias towards tter individuals), and parallelism principle (a whole population of solutions (co-existing in parallel in [10] Hopcroft J.E., Motwani R., Ullman J.D., Introduction to Automata Theory, Languages, and Computathe same environment) is tested in one generation). tion, 2nd edition, Addison-Wesley, 2001.; [11] Koza J., Genetic Programming I, II, III, The MIT 6 Conclusions Press, 1992, 1994, 1999. In this paper, we demonstrated that Evolutionary Compu- [12] Milner R., Elements of Interaction, CACM, vol.36, no.1, Jan. 1993, 78-89. 
tation paradigm is more expressive than Turing Machines. For this purpose, we de ned an Evolutionary Turing Ma- [13] Turing A., On Computable Numbers, with an Applichine being the extension of the conventional Turing Macation to the Entscheidungsproblem, Proc. London chine. The higher expressiveness is obtained either evolvMath. Soc., 42-2, 1936, 230-265. ing in nite populations in in nite number of generations, or using nonrecursive variation operators. We presented [14] Wegner P., Why Interaction is More Powerful Than Algorithms, CACM, May 1997, vol.40, no.5, 81-91. also how Evolutionary Computation can and should deal with intractable problems to nd optimal solutions with- [15] Wegner P., Interactive Foundations of Computing, out or together with minimizing search cost of the evoluTheoretical Computer Science, 192, 1998, 315-351. tion process.