evolutionary computation - Nature

REVIEWS

EVOLUTIONARY COMPUTATION

James A. Foster

Evolution does not require DNA, or even living organisms. In computer science, the field known as ‘evolutionary computation’ uses evolution as an algorithmic tool, implementing random variation, reproduction and selection by altering and moving data within a computer. This harnesses the power of evolution as an alternative to the more traditional ways to design software or hardware. Research into evolutionary computation should be of interest to geneticists, as evolved programs often reveal properties — such as robustness and non-expressed DNA — that are analogous to many biological phenomena.

COMPUTATIONAL GENETICS

Initiative for Bioinformatics and Evolutionary Studies (IBEST), Department of Computer Science, University of Idaho, Moscow, Idaho 83844-1010, USA. e-mail: [email protected]


In nature, evolution uses basic principles to adapt populations to the particular challenges that are posed by their environments. These principles — selective reproduction of the fittest individuals among a set of randomly varied ones — can be exported outside the realm of biology. In the field of computer science, evolutionary computation (EC) incorporates the principles of biological evolution into algorithms that are used to solve large, complicated optimization and design problems. For example, EC might be used to search for piston width, fuel input diameter and other features of an internal combustion engine that maximize fuel efficiency for a given horsepower1; or to determine how much to invest in long and short positions in a stock market portfolio to maximize return and minimize risk2; or to find the largest number of nodes in a graph that form a completely connected subgraph3. EC has also been applied to various biological problems, such as multiple sequence alignment4–6, phylogenetic inferencing7, metabolic pathway inferencing8 and drug design9,10. EC can even use evolutionary principles to produce designs for computer hardware or software.

However, EC is not just a tool for other disciplines; it is also a fully fledged research area with its own questions. Many of these questions deal with technical implementation details that have no direct biological counterpart. In contrast to biological research, in which the object under study and the rules that govern it pre-exist, in EC the computer scientist decides how to organize the ‘genetic’ material, how selection takes place, what the population structures and reproduction constraints will be, and even where and how mutations occur. These design decisions dominate most EC research, particularly because the algorithm designer is often addressing a poorly understood problem.

Theoretical EC research also raises broader questions about evolution. How quickly do well-adapted individuals arise? What genomic organization strategies and evolutionary processes affect that process, and how? Which characteristics of evolved objects are necessary consequences of the evolutionary process? What population structures, selection mechanisms and evolutionary processes increase or decrease population diversity? These questions begin to resemble those that theoretical biologists ask about natural systems. EC might repay its debt to biology in part by using the computer itself as a model system with which to approach these questions.

In this article, I outline the principles of EC research and the large number of questions that can be addressed by using EC, with an emphasis on how it can help to solve some pressing theoretical as well as practical problems in biology. The field of functional genomics, for example, is likely to benefit from the application of EC, as the rapidly increasing wealth of genomic information demands ever more reliable algorithms to understand genomic organization and to predict protein structures.

| JUNE 2001 | VOLUME 2

www.nature.com/reviews/genetics

© 2001 Macmillan Magazines Ltd

[Figure 1 schematic. Panel labels: Generation t, Generation t + 1, Population, Mutate (*), Recombine, Evaluate fitness.]

Figure 1 | Evolutionary computation schematic. Evolutionary computation proceeds by transforming a population (box) of chromosomes (coloured lines). The chromosomes are genome-like data that represent potential solutions to a target problem. A chromosome is represented by whatever data structure the algorithm designer selects; this might be a sequence of bits, a sequence of real values, a collection of different data structures, or a tree, to name just a few possibilities. The fitness function measures the quality of the chromosomes for a target problem. Mutation (indicated here by asterisks) alters the selected chromosome. Recombination combines data from several locations in the population (either several chromosomes or several sites on the same chromosome). Recombination can occur either by one-point crossover, two-point crossover, or uniform crossover. Repeating the processes of selection, mutation and recombination creates the next generation of chromosomes (t + 1) from the current one (t).

FITNESS FUNCTION

A function that measures the quality of an individual with respect to the given target function, usually as a real value.

Taxonomy of evolutionary computation

Although several varieties of EC exist in the literature, they all function in essentially the same way: they work on collections of one or more chromosomes or ‘individuals’, which are grouped into ‘populations’. (Terms from the EC literature appear in quotation marks or as glossary terms when first used, and should not be confused with their biological counterparts.) The chromosomes are simply organized collections of data representing the units that are to be evolved (FIG. 1). These structures are analogous to genomes in nature, but might be implemented in any way that the algorithm designer decides is useful, and so might be much more complicated than a simple sequence of values. A FITNESS FUNCTION quantifies the degree to which chromosomes solve a given ‘target problem’. The process usually begins with randomly generated chromosomes, which by design are likely to have very low fitness values. EC transforms chromosomes by ‘selecting’ some at random with a probability distribution that is induced by their fitness values, then either alters some of them (known as ‘mutation’) or combines data from one or more of them (known as ‘recombination’). In nature, the mechanisms of mutation derive from the implementation of chromosomes as molecules. In EC, mutational mechanisms are built into the system a priori, and might be specifically designed to help solve a particular target problem. After choosing and potentially modifying individual ‘parents’, the resulting chromosomes proceed into the next ‘generation’ (FIG. 1).

Suppose the target problem is to infer the most likely phylogeny for a large collection of taxa — on the basis of their DNA sequences, for example. Chromosomes might be defined as different phylogenetic trees, and the fitness function could compute the probability that the observed taxa were produced by the tree in a chromosome. The initial population might consist of phylogenetic trees for the given taxa, with randomly determined topologies and assignments of taxa to leaves (terminal nodes). Such trees will probably be very poor evolutionary explanations for the observed taxa. Notice that the chromosomes in this case are not linear structures, like DNA molecules, but rather are branching structures. This is an example of the flexibility available to EC practitioners. Trees with higher likelihood scores (higher fitness) would tend to be chosen for reproduction, so that fitness would be correlated with reproductive opportunities by fiat. Two possible mutation mechanisms in this example would be to rotate individual branches or to swap leaves. Recombination might exchange the phylogenies for identical subsets of taxa between two trees. After iterating this process for several generations, the best resulting phylogenies will be expected to explain the relationship between the taxa very well. This process effectively searches the vast (infinite, in fact) space of potential phylogenies, using the individuals in any generation as points from which to continue explorations in the next generation (for an implementation of this example, see REF. 7).

For the purposes of discussion, I will divide the subspecies of EC into two broad classes: those that optimize the parameters for a single problem, and those that build descriptions of objects that themselves carry out computations. The first class evolves answers to a given question, whereas the second evolves processes that answer many related questions. The first group is sometimes collectively referred to as ‘evolutionary algorithms’ (EAs), because they use evolution as part of an algorithm to solve a specific problem11. I call the latter class ‘genetic programming’ (GP), because it uses ‘genetic’ manipulations to build computational objects. I further divide GP into applications that produce software, and those that design hardware (BOX 1).
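The generic cycle described in this section (random initial chromosomes, fitness-based selection, recombination, mutation, next generation) can be sketched in a few lines of Python. This is only an illustrative sketch, not an implementation from the article: the bit-string chromosome, the toy `count_ones` target problem, the function name `evolve` and all parameter values are assumptions chosen for brevity.

```python
import random

def evolve(fitness, length=20, pop_size=30, generations=40,
           mutation_rate=0.02, crossover_rate=0.7, seed=0):
    """Evolve bit-string chromosomes; return the fittest one ever seen."""
    rng = random.Random(seed)
    # Generation 0: randomly generated chromosomes.
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        # Selection: parents are drawn with probability proportional to fitness
        # (a small constant keeps the weights valid if every fitness is zero).
        weights = [fitness(c) + 1e-9 for c in population]
        parents = rng.choices(population, weights=weights, k=pop_size)
        next_generation = []
        for i in range(0, pop_size, 2):
            a, b = parents[i][:], parents[(i + 1) % pop_size][:]
            # Recombination: one-point crossover between the two parents.
            if rng.random() < crossover_rate:
                cut = rng.randrange(1, length)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            # Mutation: flip each bit independently with a small probability.
            for child in (a, b):
                for j in range(length):
                    if rng.random() < mutation_rate:
                        child[j] ^= 1
                next_generation.append(child)
        population = next_generation[:pop_size]
        best = max(population + [best], key=fitness)
    return best

def count_ones(chromosome):
    """Toy target problem (an assumption): fitness is the number of 1 bits."""
    return sum(chromosome)
```

Calling `evolve(count_ones)` returns a list of 20 bits. Swapping in a different fitness function or chromosome type changes the target problem without changing the loop, which is the flexibility the text attributes to the algorithm designer.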

NATURE REVIEWS | GENETICS


DATA STRUCTURE

An organized collection of data, such as a list, tree, or array.

GENE

A field in a chromosome, usually associated with a single parameter (allele) in the target problem.

OPERATORS

Processes that act on chromosomes, including selection, mutation and recombination.

Evolutionary algorithms

EC is often applied to optimize parameters for a specific target problem. In the example above, the EC searches for the phylogenetic tree topology and branch lengths that maximize the probability of correctly describing relationships between a given set of DNA sequences7. In every case, however, an EC represents the parameters or features of interest as a DATA STRUCTURE in the computer and evolves them, measuring fitness as progress towards solving the target problem. The literature often divides this sort of algorithm into three broad categories: genetic algorithms (GAs), evolutionary programming (EP) and evolution strategies (ES).

FINITE (STATE) AUTOMATA

A finite state machine is one that takes a sequence of input ‘letters’ (symbols, inputs), and begins in a ‘start state’, then repeatedly reads one input letter, optionally outputs a letter and moves to a new state.

Box 1 | Taxonomy of evolutionary computing approaches

The taxonomy presented here is purely an organizational device for introducing different approaches to evolutionary computation, as there is no generally accepted taxonomy in the literature. It does not indicate the pedigree of various techniques, many of which appeared before the modern terminology was in place12–16. It is conventional, although perhaps confusing, to use the term ‘genetic programming’ to define both a broad class of EC approach and one of its subspecies.

Evolutionary computation
• Evolutionary algorithm (EA): produces solutions to a specific target problem.
  • Genetic algorithm (GA): evolves alleles with discrete values; varies chromosomes by recombination; selects parents according to their fitness, which is calculated using probabilistic functions.
  • Evolutionary programming (EP): relies on specialized mutations in chromosomes, rather than recombination (but see text).
  • Evolution strategies (ES): evolves alleles with real values; is self-adapting.
• Genetic programming (GP): designs computational objects that solve many related target problems.
  • Genetic programming (GP): designs computer software.
  • Evolvable hardware (EH): designs or configures computer hardware.

Genetic algorithms. GAs evolve discrete values to particular instances of a problem — such as the number of nodes in a graph, the amount of investment in a stock, or the topology of a phylogenetic tree. GAs are usually associated with the early work of Holland12, although essentially the same type of algorithm existed much earlier13. Chromosomes in GA systems typically have several fields (called GENES) that might contain specific sets of values (called ‘alleles’) that in turn represent the parameters to be optimized. Mutation in GAs might be as simple as changing a bit in the chromosome (the computational equivalent of generating a new allele), or it might involve an arbitrarily complicated alteration of one allele into another. Recombination in GAs occurs by selecting and swapping sets of genes from each parent, usually by simply cutting two sequences and exchanging the resulting fragments. ‘One-point crossover’, ‘two-point crossover’, and ‘uniform crossover’ cut the parents at a single point, two points, or at multiple random points, respectively (FIG. 1). It is less clear how to combine information if data structures are more complicated. For example, suppose one wanted to use GAs to discover potential phylogenies7, to predict protein structures14, or to infer metabolic pathways8. It would be natural to use trees or graphs rather than sequences. For such chromosomes special recombination OPERATORS would be required.

In GAs, chromosomes are chosen stochastically for replication into the next generation, with a probability distribution that depends directly or indirectly on their fitness values. There are several algorithms for selecting these parents. The most straightforward strategy, sometimes called ‘fitness proportional selection’, is to scale fitness values to a range from zero to one, and choose chromosomes according to those probabilities. The probabilities that govern whether a chromosome will move to the next generation can be modified — this is often necessary to enhance the selective pressure when fitness values are very similar. A totally different approach, which works very well in practice, is to randomly choose several chromosomes, then to select the fittest of these to be parents. This strategy, known as ‘tournament selection’, is computationally very efficient and still proportionally allocates reproductive opportunities according to the fitness of the chromosomes15. In any case, mutation and recombination only take place in parents that have been selected for reproduction, and are also applied stochastically.

Evolutionary programming. EP systems tend to rely more on inducing variation in chromosomes than on recombination between them. In GA, the chromosomes that are made to evolve are the individuals within a reproductive population, that is, a species. By contrast, EP implementations model the behavioural or phenotypic adaptation of species themselves — which by definition cannot undergo sexual recombination16,17. The practical effect of this is that EP systems often rely on special-purpose methods for inducing variation in chromosomes, which are tailored for particular problems.
These mutation operators are usually applied to every individual in the population. EP systems also show a rich variety of recombination operators, for example to average the numerical values in several genes when searching for a value that optimizes some function, or to select alleles that occur in the majority of individuals in the population. Historically, Lawrence J. Fogel first used EP to produce FINITE AUTOMATA as a step towards evolving artificial intelligence18, essentially using the automata as embodiments of strategies for responding to external stimuli. EP does not seek to emulate specific genetic operators, as observed in nature (as do GAs). For example, Fogel’s implementation blended information from different genes of multiple chromosomes — rather than selecting one of two regions of different chromosomes, Fogel blended the two regions to generate a new, third allele. In this case, the progeny were built from the most frequent states that appear in the parent finite automata. This is an interesting alternative to the usual recombination operators in GAs. The term ‘evolutionary programming’ is now broadly applied to include far more than finite automata19.
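The crossover and parent-selection operators named in the genetic algorithms paragraphs above can be sketched as follows. This is a minimal illustration under stated assumptions: chromosomes are Python lists of equal length (at least three genes for the two-point case), fitness values are non-negative with a positive sum, and every function name and the tournament size of 3 are hypothetical choices rather than anything taken from the article.

```python
import random

def one_point_crossover(a, b, rng):
    """Cut both parents at one random point and exchange the tails."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def two_point_crossover(a, b, rng):
    """Cut both parents at two random points and exchange the middle segment."""
    i, j = sorted(rng.sample(range(1, len(a)), 2))
    return a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]

def uniform_crossover(a, b, rng):
    """Decide independently at every position which parent each child copies."""
    c, d = [], []
    for x, y in zip(a, b):
        if rng.random() < 0.5:
            x, y = y, x
        c.append(x)
        d.append(y)
    return c, d

def fitness_proportional_selection(population, fitnesses, rng):
    """Scale fitness values to probabilities that sum to one, then sample."""
    total = sum(fitnesses)
    probabilities = [f / total for f in fitnesses]
    return rng.choices(population, weights=probabilities, k=1)[0]

def tournament_selection(population, fitnesses, rng, size=3):
    """Pick `size` chromosomes at random and return the fittest of them."""
    contenders = rng.sample(range(len(population)), size)
    return population[max(contenders, key=lambda i: fitnesses[i])]
```

Tournament selection only compares fitness values within a small sample, so it needs no global scaling step; that is one plausible reading of why the text calls it computationally efficient.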





Evolution strategies. ESs were invented to solve technical optimization problems. ES numbers are usually real numbers rather than discrete data structures. This sort of parameter optimization problem is common in engineering applications19, to which ES was first applied. ES individuals are described both by ‘problem-specific variables’, which are optimized to solve the target problem, and ‘strategy parameters’, which modify the behaviour of the evolutionary algorithm itself. The term ‘strategy parameter’ is given to genes that affect the evolutionary process for a particular individual, usually by specifying a probability distribution or a rate for random processes. (Strategy parameters are sometimes said to be a distinguishing characteristic of ES, but have in fact been used in the other forms of EC for some time20–22.) A simple strategy parameter might consist of two real numbers that represent the mean and standard deviation of the amount by which a gene (a real number) will change when it is mutated (assuming a normal distribution). For example, an ES approach to finding the optimal branch lengths for a given phylogenetic tree topology for n taxa might encode a triplet of values for each branch. The first value in each triplet is an actual branch length, and the other two are the standard deviations and means for a normal distribution from which mutation values (which will alter the length) will be drawn. The first values are problem-specific variables and the latter two are strategy parameters, which mutate according to a normal distribution with a mean value of zero and a standard deviation value of one. All parameters evolve. This SELF-ADAPTATION, in which the genotype adapts to alter the evolutionary process itself, could be an interesting model for mutation and recombination rate heterogeneity in biological systems.

SELF-ADAPTATION

The ability of an individual to alter the evolutionary processes that affect its genotype through the use of strategy parameters.

Genetic programming

Box 2 | Example of genetic programming

[Figure: a maze containing the robot, and the example programs drawn as trees with the nodes While wall ahead, Do, then, Turn right, Turn left and Move.]

Consider the problem of evolving a program to direct a robot through the maze shown. In this maze, an arrowhead represents the robot and its orientation. Such programs accept input data from sensors, such as wall detectors, and might use instructions such as (Move), (Turn right), (Turn left), (While wall ahead, do X, then do Y), or (Do X, then do Y). The output of the program is the movement of the robot. A program to spin clockwise when blocked, then move two steps forward would be: (While wall ahead (Turn right), then (Do (Move), then (Move))). Mutations of this program simply transform one instruction into another, for example by replacing a (Turn right) with a (Turn left), shown in green type. Recombination might select and exchange random subtrees, which are well-formed programs in their own right, from two individuals. In the above example, ((Do (Move), then (Move))) in blue type, is a tree with three nodes (the leaves are both (Move) instructions and the root is a (Do X, then do Y)), which is a fully formed program even though it is a small part of the larger program. It can be replaced with any other program, for example by (Turn left), and can be moved into any other program. In this example, the result of these two operations does an about-face. The tree in the above example can also be replaced to produce a correct program. Notice that neither program advances the robot far, so both have low fitness.

Sometimes the target application of EC is to find a black box that manipulates arbitrary sets of parameters, rather than to find a single optimal parameter set. Towards this end, GP uses evolution to get the computer to write its own programs, or to design other computers. For example, one GP approach to searching the amino-acid sequence of a protein for transmembrane regions would be to discover a program or device that works for any amino-acid sequence. The non-GP approach would identify the target regions only for a single amino-acid sequence23,24, and would begin a new search for features of transmembrane motifs in each new sequence. This distinction between evolving a problem solver and solving a particular problem therefore distinguishes GP in our taxonomy (BOX 1) from other types of EC. (There is, in fact, no accepted term for the class of EC approaches that produce computational objects. My use of the term encompasses systems that evolve traditional software, physical computing devices and abstract models of computations — such as finite automata18, artificial neural nets25 and expert systems26–28 — which are not discussed here.) Evolving programs. GP chromosomes are often interpreted as computer programs24,29. The fitness of a program can be measured by running it on selected inputs, and by validating its performance on unseen data. This is essentially how software engineers everywhere test their products. For evolution to work on programs, it must be possible to randomly generate an initial population of programs, and to induce variations and combine several program fragments without crashing the computer that is running the GP. This is non-trivial, but can be achieved through at least three general strategies: by using a special-purpose programming language, by using machine language, or by evolving programs indirectly. It is also possible to use evolution-friendly operators. 
The first approach is to code the individuals in a carefully chosen programming language that is compatible with mutation and recombination. There are many types of programming language. Most familiar languages, such as C, C++, Perl and Java, have a very rigid syntax, which is why they are so brittle. Misplacing a single semicolon leads to disaster. But some programming languages, such as Scheme and Lisp, exclusively build programs by compositionally combining smaller ones — in much the same way that one can build large mathematical expressions by assembling smaller ones. With this sort of language, all program fragments are fully formed programs that can stand alone, and that can therefore be inserted into other programs almost at random.
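The compositional property described above, in which every fragment is itself a complete program, can be illustrated with the robot program from BOX 2. The sketch below is an assumption-laden toy: it stores a program as nested Python tuples of the form `(operator, X, Y)` with instruction names as leaf strings, and the helper names `paths`, `get`, `replace`, `crossover` and `show` are hypothetical.

```python
import random

# (While wall ahead (Turn right), then (Do (Move), then (Move))) from BOX 2.
PROGRAM = ("While", "Turn right", ("Do", "Move", "Move"))

def paths(tree, prefix=()):
    """Yield the path (a tuple of child indices) to every subtree."""
    yield prefix
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from paths(child, prefix + (i,))

def get(tree, path):
    """Return the subtree found by following `path` from the root."""
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, new):
    """Return a copy of `tree` with the subtree at `path` replaced by `new`."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], new),) + tree[i + 1:]

def crossover(a, b, rng):
    """Exchange one randomly chosen subtree of `a` with one of `b`."""
    pa = rng.choice(list(paths(a)))
    pb = rng.choice(list(paths(b)))
    return replace(a, pa, get(b, pb)), replace(b, pb, get(a, pa))

def show(tree):
    """Render a tree in the parenthesized notation used in BOX 2."""
    if isinstance(tree, str):
        return f"({tree})"
    op, x, y = tree
    if op == "While":
        return f"(While wall ahead {show(x)}, then {show(y)})"
    return f"(Do {show(x)}, then {show(y)})"

# Point mutation: (Turn right) becomes (Turn left).
mutant = replace(PROGRAM, (1,), "Turn left")
# Subtree replacement: the (Do ...) subtree is swapped for (Turn left).
about_face = replace(mutant, (2,), "Turn left")
```

Because any subtree can stand alone, `crossover` never has to repair its offspring: whichever two subtrees are exchanged, both results are still well-formed programs in this representation.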



Box 3 | Example of grammatical evolution

A grammatical evolution approach to the robot controller problem presented in BOX 2 might use the grammar in green. This is a set of rules that specify how the character S is to be transformed into elements of the programming language for the robot. The grammar is fixed beforehand, and only the chromosomes evolve. Given this grammar, the chromosome that is composed of alleles 4,3,5,1,1,5,4 produces the program from BOX 2 by replacing occurrences of the variable S one at a time from left to right, using the grammar rule indicated by each allele. The sequence of transformations showing this is in blue. Notice that the last two genes are unexpressed, because no more instances of S remain to be re-written. In this case, the initial S is the embryonic ‘program’. Grammatical evolution has been expanded to include multiple variables (such as S) and alleles, the values of which are not restricted by the number of grammar rules. An ontogenetic approach would be similar, but with more complicated transformations.

Chromosome: 4 3 5 1 1 5 4

Grammar
1. S → (Move)
2. S → (Turn left)
3. S → (Turn right)
4. S → (While wall (S) then (S) )
5. S → (Do (S) then (S) )

Rule         Result
(initially)  S
4            (While wall (S) then (S) )
3            (While wall ( (Turn right) ) then (S) )
5            (While wall ( (Turn right) ) then ( (Do (S) then (S) ) ) )
1            (While wall ( (Turn right) ) then ( (Do ( (Move) ) then (S) ) ) )
1            (While wall ( (Turn right) ) then ( (Do ( (Move) ) then ( (Move) ) ) ) )
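The decoding procedure in BOX 3 is mechanical enough to sketch directly. The code below is a minimal illustration, assuming the five grammar rules of the box written without the extra spaces; the names `GRAMMAR` and `decode` are hypothetical, and the simple string replacement works only because the capital letter S occurs nowhere else in the rule bodies.

```python
# Grammar rules from BOX 3; S is the only variable.
GRAMMAR = {
    1: "(Move)",
    2: "(Turn left)",
    3: "(Turn right)",
    4: "(While wall (S) then (S))",
    5: "(Do (S) then (S))",
}

def decode(chromosome, grammar=GRAMMAR, start="S"):
    """Rewrite the leftmost S once per allele; return (program, genes used).

    Alleles that remain after the last S has been rewritten are unexpressed.
    If the chromosome runs out first, the returned program still contains S
    and is therefore incomplete.
    """
    program = start
    used = 0
    for allele in chromosome:
        if "S" not in program:
            break
        program = program.replace("S", grammar[allele], 1)
        used += 1
    return program, used

program, used = decode([4, 3, 5, 1, 1, 5, 4])
unexpressed = 7 - used
```

For the chromosome in the box, five genes are consumed and the final two are never read, which matches the unexpressed genes that the box points out.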

For this reason, most GP chromosomes are written in a special-purpose Lisp-like programming language, which has been tailored for the problem at hand. In these cases, chromosomes are trees (or nested parentheses), with functions for nodes. The functions themselves are problem specific (see BOX 2 for an example).

The second approach is to evolve programs in so-called MACHINE CODE. This is a machine-friendly format tailored to a particular device, such as a Pentium processor30 or the Java Virtual Machine. Machine code is always a sequence of ones and zeros, so simple GA mutation and recombination operators work. But because the individuals are already in a form that the computer understands, fitness evaluations execute at machine speeds, which provides a dramatic speed-up over other GP methods. Of course, one must take care lest recombination or mutation produce invalid machine instructions, as these will crash the computer.

In the third approach, chromosomes are recipes for building programs that solve the target problem, but are not programs themselves. There are two ways to implement this indirect approach, one borrowed from linguistics and one from developmental biology. The linguistic approach, known as ‘grammatical evolution’31, is illustrated in BOX 3. In the biologically inspired approach, known as ‘developmental’ or ‘ontogenetic’ GP, a chromosome contains instructions that describe a sequence of steps that transform a fixed EMBRYONIC program skeleton into a full program32. For example, to evolve a circuit one might begin with a wire. The chromosome might describe transformations that replace the wire with two wires in parallel, or with a capacitor and a wire, or with a resistor and two wires. The end result would be a complete circuit, built from the embryonic wire by a sequence of transformations that were evolved. So, what evolves is not a program, but the developmental processes for building a program.

EMBRYONIC INDIVIDUAL

A simple individual which is transformed into a more complex one by applying instructions evolved in a grammatical or developmental genetic program.

Evolvable hardware. Finally, some GP applications skip software altogether, and directly evolve descriptions of computing devices (real or simulated). In this approach, which is called evolvable hardware (EH), individuals often represent analog or digital circuits32.

MACHINE CODE

Instructions for a particular computer, expressed in directly machine-readable form.


These circuits are interesting in their own right, but EH has also led to biological applications and has raised biologically relevant questions (see below). Analog circuits are electrical devices built from transistors, resistors, inductors and similar parts. They accept continuously varying inputs, rather than simple ones and zeros, and produce (usually) continuously varying outputs. For example, a noise filter that removes static from a radio signal is an example of an analog device. It works by removing low- and high-frequency signals, but passing the middle-frequency signals at which the sounds of our favourite jazz station sit. Another example of an analog circuit would be an electrical thermometer, which responds to ambient temperature changes by altering its output voltage33. Digital devices are built from logic circuits that in turn are often built from transistors. These circuits output discrete values, usually represented as zeros and ones. Examples that have been evolved include circuits

Figure 2 | Field programmable gate array — an example of a sorting network. Some field programmable gate arrays (FPGAs) are a grid of cells (boxes here) each of which can implement any Boolean function. Lines represent the connections between cells. Lines on the left are inputs to the device, and those on the right are outputs. In this case, the two functions are “or” (red cells), which pass a 1 on the right-hand side if either left-hand side has a 1; and “and” (blue cells), which pass a 1 on the right-hand side only when both left-hand sides are 1. Neutral boxes are other (irrelevant) functions. This particular FPGA is an example of a sorting network (mentioned in the text), which correctly sorts any four-bit input.
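To make the caption concrete, the sketch below builds a four-bit sorting network from paired “and” and “or” cells and scores it on all 16 possible inputs. The exact wiring of the figure cannot be recovered from the text, so the comparator list here is an assumption (a standard five-comparator network for four inputs), and the names `COMPARATORS`, `sort_bits` and `fitness` are hypothetical.

```python
from itertools import product

# Assumed wiring: each pair (i, j) feeds wires i and j into one "and" cell
# and one "or" cell. This is a standard 5-comparator network for 4 inputs,
# not necessarily the layout shown in the figure.
COMPARATORS = [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]

def sort_bits(bits, comparators=COMPARATORS):
    """Pass four bits through the network and return the output wires."""
    wires = list(bits)
    for i, j in comparators:
        # The "and" cell yields the smaller bit, the "or" cell the larger one.
        wires[i], wires[j] = wires[i] & wires[j], wires[i] | wires[j]
    return wires

def fitness(comparators):
    """Fraction of the 16 possible four-bit inputs that come out sorted."""
    cases = list(product((0, 1), repeat=4))
    correct = sum(sort_bits(c, comparators) == sorted(c) for c in cases)
    return correct / len(cases)
```

A fitness function of this kind is one way an evolved circuit could be scored in simulation; evaluating it after deleting entries from the comparator list gives a rough feel for how sensitive a design is to disabled connections.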




[Figure 3 diagram: chromosomes 1 to n, each holding a result-producing program and automatically defined functions ADF A, ADF B and ADF C. Labels: Use A, Use B, Repeat.]

Figure 3 | Example of automatically defined functions. Complex problems can be solved by computers by using techniques that create large and complex problem-solving programs from smaller building blocks. This multipart program might consist of a main result-producing program and one or more reusable, hierarchical subroutines, called automatically defined functions (ADFs), which are encoded in the chromosome. To evolve programs that identify transmembrane regions in an amino-acid sequence, it might be useful to have a subroutine (B) that measures hydrophobicity (in practice, this is usually built into the programming language itself). A second useful subroutine (A) might call the hydrophobicity routine repeatedly, with the number of calls determined by selective pressure. Result-producing programs will tend to converge on those that recognize transmembrane regions, and these are likely to be accurate because of their use of helpful ADFs. In this example, chromosome 1 contains ADFs A and B, and also contains an unused ADF C (each chromosome contains its own ADFs, which it might or might not use). During a run, genetic programming evolves the main result-producing program, the various subroutines, and their parameters (for example, how often they are called).

BOOLEAN OPERATOR

Logical operators, such as AND or OR, that form expressions with binary values, denoted ‘true’ and ‘false’, or 1 and 0.

REPRESENTATION PROBLEM

Decision on how to organize data in individuals, and the choice of variation and recombination operators that work well with that organization.

for adding or multiplying numbers34,35, computing BOOLEAN functions36, sorting input numbers37, or enhancing images33,38,39.

Few EH researchers actually build the electrical devices they evolve to evaluate their fitness. This would be very expensive and time consuming, especially in the early generations when the devices are randomly generated and usually have very low fitness. Rather, most researchers rely on simulations for fitness evaluations. However, special-purpose hardware exists to allow researchers to implement chromosomes as they are evolving. A ‘field programmable gate array’ (FPGA), for example, is a silicon-based chip that looks much like a microprocessor. Unlike a microprocessor, the way in which an FPGA works can be altered while it is running. A common FPGA design has a large grid of circuits, which can implement any Boolean function when directed to do so, and that can be connected in almost arbitrary ways (FIG. 2). To change the FPGA from a chip that does image recognition to one that implements a thermometer, for example, a sequence of ones and zeros can be loaded into the memory on the chip. This ‘configuration string’ directly alters the functions that the individual grid cells carry out and how these cells are connected to each other. The whole reconfiguration process happens in milliseconds, which allows the EH practitioner to rapidly instantiate each circuit design in a population and evaluate the fitness of the actual electronics.

Hardware evolved in this way might take advantage of peculiarities in the physics of individual FPGA chips, so that manually disabling grid cells that are completely unconnected from the circuit can alter the behaviour of an evolved circuit40. Evolution is very good at finding completely unexpected uses for its raw materials, and at finding surprising and unexpected solutions to the problems that one poses for it. EC practitioners are often surprised by their own programs. Natural evolution probably takes advantage of the physical properties of biological molecules in a similar way — one challenge is to understand which physical properties are important and why.

Interestingly, disabling individual connections is less likely to significantly degrade the performance of a circuit when it is evolved than when it is designed by traditional techniques. This robustness effect has been shown in sorting circuits37, and it has been shown that the effect is not a simple consequence of evolved circuits being larger than traditional ones41. This also seems to be a general property in nature. For example, biochemical networks, such as those that define bacterial chemotaxis, are insensitive to surprisingly large variations in their constituent chemical pathways42. Perhaps the EC phenomenon and the natural one share a common explanation, although I know of no adequate current theory to provide one.

Representations

In nature, genomes represent proteins using three-nucleotide words from a four-character alphabet. Similarly, evolutionary computation needs an underlying genotypic ‘language’ with which to represent evolving individuals. But in EC there is no required a priori genetic code. In general, the EC practitioner must find a way to encode the parameters that are being evolved with the EC into chromosomes in such a way that the EC can efficiently converge to populations with good solutions. This is known as the REPRESENTATION PROBLEM. Care must also be taken to incorporate mutation or recombination operators that will not impede progress when applied to the chromosomes.

In theory, the aim of EC is to evolve individual chromosomes towards a very high fitness. To achieve this aim, the representation must find ways of keeping good alleles together. Suppose that two genes represent related parameters in the target problem, such as the branch lengths for two closely related taxa in the phylogeny inference example, and that the data for each of three taxa is arranged on a chromosome in a linear fashion, in the order A, B, C. If one uses an EC with recombination operators that cut two parents and exchange fragments, then such genes should sit close to each other on the chromosome. Otherwise, this recombination is likely to separate two good alleles.

‘No free lunch’ (NFL) theorems43,44 prove that there is no single solution that is sure to work well for all problems, and any choice of representations and operators that works well with one class of problems is certain to work poorly with another45. However, ‘all problems’ in this context is a vast domain, and includes problems for which there is no solution at all. It is unclear how




[Figure 4 panels: original parent, with subtree to be replaced in green; randomly generated tree; result of headless chicken crossover.]

Figure 4 | Example of headless chicken crossover. The chromosome resulting from mutation and recombination in BOX 2 represents a very poor robot controller for this maze: (While wall ahead (Turn left), then (Turn left)). Headless chicken crossover (HCC) replaces a randomly selected fragment of this program with a randomly generated one. For example, HCC might replace the second (Turn left) with (Do (Move), then (Do (Turn right), then (Move))) — assuming that to have been randomly generated. The resulting tree moves the robot further through the maze, and so has a higher fitness. Hill climbing in this example would test every possible change in the original tree, and find that replacing the second (Turn left) with (Move) will advance the robot, thus improving its fitness, and would therefore make that substitution.

informative the NFL theorems are for more restricted classes of problems, such as those of practical interest.

One obvious way out of this conundrum is to design ways to encode information into chromosomes that evolve with the chromosomes themselves. This was surely the case in natural evolution, because the genetic code itself had to arise at some point. With GP, for example, the data on the chromosomes might be segregated into two types of subroutine, both of which are dynamically evolving: those that solve the target problem and those ‘automatically defined functions’ (ADFs)23. ADFs are small programs that larger programs use repeatedly (FIG. 3). A similar approach uses a population of ADFs that are separate from the main population46,47. Many GP applications show improved performance when ADFs are used, including the application of GP to the problem of identifying transmembrane regions23.

Operators

MACROMUTATION

Any large change in alleles that does not involve recombination.

HILL CLIMBING

Varying a chromosome by searching explicitly for small changes that improve the fitness of the individual, and then making those changes.

PERMUTATION

An ordered arrangement of distinct items.

EVOLUTIONARY ROBOTICS

Using evolutionary computation to design robots or robot controllers.

INTRON

Data in chromosomes that do not contribute to fitness.


EC operators are the processes that transform chromosomes, and include mutation and recombination (these are treated separately in EC). As we have seen, self-adaptation randomly varies genes that are associated with the target problem according to a distribution that is determined by separate, evolving strategy parameters. In GA and GP, point mutations, a cut-point-induced crossover and branch swapping in tree structures usually suffice. In this section, I present some unusual alternative operators.

Sometimes, special-purpose MACROMUTATIONS work better than the standard ones. For example, ‘headless chicken crossover’ removes a randomly selected fragment of a chromosome and replaces it with randomly generated alleles (FIG. 4)48,49. This is effectively carrying out a normal crossover with a randomly generated parent. Hybrid operators, which are not typical of evolution at all, also often work well. For example, HILL CLIMBING examines all available alleles for a single gene, and replaces the original allele with the one that results in the best individual50 (FIG. 4). Sometimes, standard operators will not work. For example, when the individual represents a PERMUTATION, special crossover and mutation operators are used to preserve the property that every element in the parents remains in the children, or the children will not themselves be permutations51.

Another unusual operator is ‘learning’, in which individuals adapt to their environment at the phenotypic rather than the genotypic level. EVOLUTIONARY ROBOTICS52 frequently uses learning, because robots typically improve their performance by interacting with their environments and incorporating feedback from those interactions into their future behaviour25. In EC, learning might interact either directly or indirectly with evolution. The direct approach, known as Lamarckian evolution (for obvious reasons), is to allow individuals to alter their chromosomes before they replicate, encoding what they have learned into their genotype. In EC, Lamarckian evolution is a simple matter of programming and it can be very effective53. Indirectly, the Baldwin effect, which might also operate in nature54, causes genotypes with sufficient plasticity to drift towards fixed alleles that encode beneficial learned characters in static environments, in effect hard-wiring what would otherwise be learned55.

It is also possible to design operators that themselves evolve. For example, probabilities of crossover and mutation might be encoded directly in the chromosome, as in self-adapting approaches. Explicitly adaptive operators also exist, for example where the genome carries both a chromosome and a description of recombination or mutation hot spots56,57, or dominance relationships in polyploid representations58 (see REF. 51 for a discussion; curiously, most EC representations are haploid). Also, so-called ‘homologous crossover’ (which has nothing to do with homology in the usual sense of related by common descent)59,60 restricts crossover to locations on one parent that are similar by some metric to a randomly chosen location on the other parent.
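Two of the operators described in this section, headless chicken crossover and hill climbing, can be sketched for a fixed-length bit-string chromosome. This is a minimal illustration only: the bit-string representation, the toy `one_max` fitness function and all function names are assumptions chosen for the example and are not taken from refs 48–50, which also treat tree-structured programs.

```python
import random

def headless_chicken_crossover(parent, rng):
    """Replace a randomly chosen fragment of `parent` with random alleles,
    i.e. an ordinary two-point crossover with a randomly generated parent."""
    n = len(parent)
    random_parent = [rng.randint(0, 1) for _ in range(n)]
    i, j = sorted(rng.sample(range(n + 1), 2))   # fragment is parent[i:j], never empty
    return parent[:i] + random_parent[i:j] + parent[j:]

def hill_climb_gene(chromosome, gene, fitness, alleles=(0, 1)):
    """Examine every available allele for one gene and return the variant
    with the best fitness (the original allele is among the candidates)."""
    candidates = []
    for allele in alleles:
        candidate = list(chromosome)
        candidate[gene] = allele
        candidates.append(candidate)
    return max(candidates, key=fitness)

def one_max(chromosome):
    """Toy fitness (an assumption for this sketch): the number of 1s."""
    return sum(chromosome)

rng = random.Random(0)
parent = [0, 1, 0, 0, 1, 0, 0, 0]
child = headless_chicken_crossover(parent, rng)   # random macromutation
improved = hill_climb_gene(parent, 0, one_max)    # deterministic local search
print(child, improved)
```

The contrast between the two is the point: the first makes a large undirected change, whereas the second evaluates fitness for each alternative before committing, which costs one fitness evaluation per allele tried.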
Even without such efforts, the evolutionary process itself moderates mutation and recombination. For example, EC populations tend to ‘converge’ on similarly constituted chromosomes. When this happens, recombination produces children that are identical to their parents. So, the effective rate of recombination decreases as the genetic diversity of a population decreases. More intriguingly, in a phenomenon known as ‘code bloat’, or ‘survival of the fattest’, evolving programs often tend to grow by accumulating ‘junk’ code that does nothing (called INTRONS in the EC literature). This is a very general phenomenon, which happens whenever variable length chromosomes are present, unless there is a sufficient selective advantage for smaller genomes (see below). The rate of growth in code size is uncorrelated with improvements in fitness, so that even after the population has converged, chromosomes continue to grow. There are three reasons for bloat, but the dominant one is prophylactic: variations within the ‘ineffective’ regions do not harm the fitness of the chromosome61, but variation elsewhere usually does. So, bloat reduces the effective rates of recombination and mutation. Perhaps something akin to the prophylactic aspect of GP bloat induces selective advantages in nature, too.
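The observation that recombination loses its effect as a population converges can be checked with a small simulation. The sketch below estimates the fraction of one-point crossovers whose child is identical to one of its two parents, for a random population and for a nearly converged one. The population size, chromosome length and function names are illustrative assumptions.

```python
import random

def one_point_crossover(a, b, rng):
    """Head of parent a joined to the tail of parent b."""
    cut = rng.randint(1, len(a) - 1)      # cut strictly inside the chromosome
    return a[:cut] + b[cut:]

def ineffective_fraction(population, trials, rng):
    """Fraction of crossovers that merely reproduce one of the two parents."""
    same = 0
    for _ in range(trials):
        a, b = rng.sample(population, 2)
        child = one_point_crossover(a, b, rng)
        if child == a or child == b:
            same += 1
    return same / trials

rng = random.Random(1)
length = 20

# A diverse population: independent random bit strings.
diverse = [[rng.randint(0, 1) for _ in range(length)] for _ in range(50)]

# A nearly converged population: copies of one chromosome, each carrying
# a single random point mutation.
template = [rng.randint(0, 1) for _ in range(length)]
converged = []
for _ in range(50):
    c = list(template)
    c[rng.randrange(length)] ^= 1
    converged.append(c)

frac_diverse = ineffective_fraction(diverse, 2000, rng)
frac_converged = ineffective_fraction(converged, 2000, rng)
print("diverse:", frac_diverse, "converged:", frac_converged)
```

In a fully converged population every crossover returns a copy of a parent, so the effective recombination rate falls to zero even though the nominal crossover probability is unchanged.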





Box 4 | Selected sources for further information

Books
• Bäck, T., Fogel, D. B. & Michalewicz, Z. Handbook of Evolutionary Computation (Oxford Univ. Press, New York, 1997).
• Banzhaf, W., Nordin, P., Keller, R. E. & Francone, F. D. Genetic Programming — An Introduction; On the Automatic Evolution of Computer Programs and its Applications (Morgan Kaufmann, San Francisco, 1998).
• Bäck, T. Evolutionary Algorithms in Theory and Practice: Evolutionary Strategies, Evolutionary Programming, Genetic Algorithms (Oxford Univ. Press, New York, 1996).
• Mitchell, M. An Introduction to Genetic Algorithms (MIT Press, Cambridge, Massachusetts, 1996).

Journals
• Genetic Programming and Evolvable Machines, Kluwer.
• Evolutionary Computation, MIT Press.
• IEEE Transactions on Evolutionary Computation, IEEE Press.

Conference proceedings
These conference proceedings are fully refereed, and are excellent resources for current research.
• Genetic and Evolutionary Computing Conference (GECCO)
• Congress of Evolutionary Computation (CEC)
• European Conference on Genetic Programming (EuroGP)
• Parallel Problem Solving From Nature (PPSN)
• Foundations of Genetic Algorithms (FOGA)

PARSIMONY PRESSURE, which has been used in EC applications for many years, is the only known reliable antidote to bloat62. For example, changing the EC to detect and remove ‘ineffective’ code does not work, because evolution will find a way to insert useless code that the editing process cannot detect63. Perhaps this EC research might help to explain why genome sizes are so variable in nature, and why the smallest genomes tend to be those in which replicative efficiency is a strong selective advantage. There are many other areas in which EC researchers have either implemented or analysed processes that are of interest to biologists. There is a large body of work on the rates at which diversity is lost in populations during the execution of an EC, particularly when the population converges on an individual that represents a suboptimal solution to the target problem (so-called ‘premature convergence’)64. In EC, there are techniques for preserving diversity with ‘speciation’65. There has also been considerable work with co-evolving populations66. There are more biologically inspired algorithmic techniques that use evolutionary ideas, including ‘ant systems’ and artificial immune systems67,68. For more information, consult the selected sources in BOX 4.
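Returning to parsimony pressure: one simple way to apply it is to subtract a size penalty from the raw fitness before selection. The sketch below shows that idea under stated assumptions. The linear form of the penalty, the coefficient `alpha` and the three example programs are hypothetical and are not taken from ref. 62.

```python
def penalized_fitness(raw_fitness, size, alpha=0.001):
    """Raw fitness minus a penalty proportional to chromosome size."""
    return raw_fitness - alpha * size

# (raw fitness, program size) for three hypothetical evolved programs.
programs = {
    "compact": (0.90, 20),
    "bloated": (0.90, 400),            # same behaviour, padded with ineffective code
    "better_but_large": (0.97, 150),
}

def rank(programs, alpha):
    """Program names ordered from best to worst penalized fitness."""
    return sorted(
        programs,
        key=lambda name: penalized_fitness(*programs[name], alpha=alpha),
        reverse=True,
    )

for alpha in (0.0001, 0.001):
    print(alpha, rank(programs, alpha))
```

With any positive `alpha`, the bloated program ranks below the compact one that behaves identically. The size of `alpha` then decides how much raw fitness the search will give up for smaller programs, so it has to be tuned to the problem.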

Concluding thoughts

PARSIMONY PRESSURE

Selective fitness advantage for smaller genomes.

One tantalizingly unanswered question is ‘When should one use (or avoid) EC?’ In practice, EC has worked for a remarkable number and variety of problems, including many of interest to biologists. The literature has few, if any, descriptions of failed attempts to use EC, although that is certainly a result of selective publishing. There is little theoretical guidance on EC limitations, but it is clear in practice that one must choose and design operators and representations carefully. EC seems to be a good first choice when one has a problem that involves a very large search space with some structure in the parameters that a search algorithm can exploit, but when the exact nature of the structure is poorly understood. This contrasts with more traditional stochastic techniques, which transform difficult problems into tractable ones by making (usually) unrealistic simplifying assumptions. Often, EC will find solutions that, when examined carefully, lead one to understand the problem better and thereby to design more traditional algorithms, although this second step is rarely taken.

The biological sciences are full of such problems. The current wealth of genomic information, which will increase markedly, provides wonderful opportunities for applying EC. Data mining for genes and regulatory regions seems to be fairly well understood, and in any case is driven by an industry that will not wait for new algorithms. But EC techniques are likely to be useful for untangling questions about genomic organization. They might also have a role in developing tools for discovering protein structures, and for revealing the webs of interactions that make up gene regulation or metabolic pathways.

In the short term, EC applications are likely to continue to be dominated by traditional engineering problems, as most practitioners are trained as engineers or computer scientists. As more biological questions become more widely known, they will increasingly attract the attention of EC researchers. However, this practical question is really a special case of the more general one: What are the principles of information organization and what are the processes by which evolution generates objects that are well adapted to their environments?
General answers to this question rely on a deep theoretical understanding of evolutionary processes, which we do not have at present. This is the largest single challenge for EC researchers, and it is obviously a concern shared by theoretical biologists. Current EC theory attempts to explain formally and precisely why the evolutionary process works, and to predict when it will not work (see REFS 11,43,69–75 for a full survey). It seems entirely reasonable to suppose that biologists and EC researchers have confronted similar issues, and that answers and techniques from one discipline might help colleagues from the other.

Links

FURTHER INFORMATION Tutorial: What is GP? | Genetic programming | Easy guide to EC | Genetic Programming and Evolvable Machines, Kluwer | Evolutionary Computation, MIT Press | IEEE Transactions on Evolutionary Computation | Handbook of Evolutionary Computation | Genetic Programming — An Introduction; On the Automatic Evolution of Computer Programs and its Applications | Evolutionary Algorithms in Theory and Practice: Evolutionary Strategies, Evolutionary Programming, Genetic Algorithms | An Introduction to Genetic Algorithms | James Foster’s lab





1. Danielson, W. F., Foster, J. A. & Frincke, D. in Proc. Int. Conf. Evolutionary Computing (eds Fogel, D. B. & Angeline, P. J.) 259–264 (IEEE Press, Piscataway, New Jersey, 1998).
2. Shoaf, J. S. & Foster, J. A. in Proc. 1996 Annu. Meeting of the Decision Sciences Institute Vol. 2, 571–573 (Decision Sciences Institute, Orlando, Florida, 1996).
3. Marconi, J. & Foster, J. A. in Proc. Int. Conf. Evolutionary Computing (eds Fogel, D. B. & Angeline, P. J.) 650–655 (IEEE Press, Piscataway, New Jersey, 1998).
4. Notredame, C. & Higgins, D. G. SAGA: sequence alignment by genetic algorithm. Nucleic Acids Res. 24, 1515–1524 (1996).
5. Chellapilla, K. & Fogel, G. B. in Proc. Congr. Evolutionary Computation (eds Angeline, P. J. & Porto, V. W.) 445–452 (IEEE Press, Piscataway, New Jersey, 1999).
6. Zhang, C. & Wong, A. K. C. Toward efficient multiple molecular sequence alignment: a system of genetic algorithm and dynamic programming. IEEE Trans. Syst. Man Cybernet. B: Cybernet. 27, 918–932 (1997).
7. Lewis, P. O. A genetic algorithm for maximum-likelihood phylogeny inference using nucleotide sequence data. Mol. Biol. Evol. 15, 277–283 (1998).
8. Koza, J. R., Mydlowec, W., Lanza, G., Yu, J. & Keane, M. A. in Pacific Symp. Biocomputing 2001 (eds Altman, R. B., Dunker, A. K., Hunker, L., Lauderdale, K. & Klein, T. E.) 434–445 (World Scientific, Singapore, 2001).
9. Goh, G. K.-M. & Foster, J. A. in Proc. Genetic and Evolutionary Computation Conf. (GECCO-2000) (eds Whitley, D. et al.) 27–33 (Morgan Kaufmann, San Francisco, California, 2000).
10. Gehlhaar, D. et al. Molecular recognition of the inhibitor AG-1343 by HIV-1 protease: conformationally flexible docking by evolutionary programming. Chem. Biol. 2, 317–324 (1995).
11. Bäck, T. Evolutionary Algorithms in Theory and Practice: Evolutionary Strategies, Evolutionary Programming, Genetic Algorithms (Oxford Univ. Press, New York, 1996). Survey of all evolutionary computation types, with mathematical characterization of their properties.
12. Holland, J. H. Adaptation in Natural and Artificial Systems (MIT Press, Cambridge, Massachusetts, 1975). Readable introduction to genetic algorithms, with interesting additional comments on artificial ecosystems.
13. Fraser, A. S. Simulation of genetic systems by automatic digital computers. I. Introduction. Aust. J. Biol. Sci. 10, 484–491 (1957).
14. Krasnogor, N., Hart, W. E., Smith, J. & Pelta, D. A. in Proc. Genetic and Evolutionary Computation Conf. (eds Banzhaf, W. et al.) 1596–1601 (Morgan Kaufmann, San Francisco, California, 1999).
15. Blickle, T. in Handbook of Evolutionary Computation (eds Bäck, T., Fogel, D. B. & Michalewicz, Z.) C2. 3: 1–4 (Oxford Univ. Press, New York, 1997).
16. Fogel, D. B. Evolutionary Computation: Towards a New Philosophy of Machine Intelligence (IEEE Press, Piscataway, New Jersey, 1999).
17. Fogel, D. B. in Proc. 2nd Annu. Conf. Evolutionary Programming (eds Fogel, D. B. & Atmar, W.) 23–29 (IEEE Press, Piscataway, New Jersey, 1993).
18. Fogel, L. J., Owens, A. & Walsh, M. Artificial Intelligence Through Simulated Evolution (John Wiley & Sons, New York, 1966).
19. Bäck, T., Hammel, U. & Schwefel, H.-P. Evolutionary computation: comments on the history and current state. IEEE Trans. Evol. Comput. 1, 3–17 (1997). Thorough presentation of main types of evolutionary computation (EC), with a complete discussion of EC history.
20. Reed, J., Toombs, R. & Barricelli, N. A. Simulation of biological evolution and machine learning. I. Selection of self-reproducing numeric patterns by data processing machines, effects of hereditary control, mutation and type crossing. J. Theor. Biol. 17, 319–342 (1967).
21. Rosenberg, R. Simulation of Genetic Populations with Biochemical Properties (Univ. of Michigan, Ann Arbor, 1967).
22. Fogel, D. B., Fogel, L. J. & Atmar, J. W. in Proc. 25th Asilomar Conf. Signals, Systems and Computers (ed. Chen, R. R.) 540–545 (Pacific Grove, California, 1991).
23. Koza, J. R. Genetic Programming. II. Automatic Discovery of Reusable Programs (MIT Press, Cambridge, Massachusetts, 1994).
24. Koza, J. R. Genetic Programming: On the Programming of Computers by Means of Natural Selection (MIT Press, Cambridge, Massachusetts, 1992).
25. Mayley, G. Landscapes, learning costs and genetic assimilation. Evol. Comput. 4, 213–234 (1996).
26. Smith, R. E. in Handbook of Evolutionary Computation (eds Bäck, T., Fogel, D. B. & Michalewicz, Z.) B1. 5: 6–11 (Oxford Univ. Press, New York, 1997).
27. Holland, J. H. Adaptation in Natural and Artificial Systems (MIT Press, Cambridge, Massachusetts, 1992).
28. Russo, M. Genetic fuzzy learning. IEEE Trans. Evol. Comput. 4, 259–273 (2000).
29. Banzhaf, W., Nordin, P., Keller, R. E. & Francone, F. D. Genetic Programming — An Introduction; On the Automatic Evolution of Computer Programs and its Applications (Morgan Kaufmann, San Francisco, California, 1998). Textbook on genetic programming, with a survey of many applications.
30. Foster, J. A. Discipulus: the first commercial genetic programming system. J. Genetic Programming Evolvable Hardware 2, 201–203 (2001).
31. Ryan, C., Collins, J. J. & O’Neill, M. in Proc. 1st European Workshop on Genetic Programming (eds Banzhaf, W., Poli, R., Schoenauer, M. & Fogarty, T. C.) 83–95 (Springer, New York, 1998).
32. Koza, J. R., Bennett, F. H., Andre, D. & Keane, M. A. Genetic Programming. III. Darwinian Invention and Problem Solving (Morgan Kaufmann, San Francisco, California, 1999).
33. Koza, J. R., Bennett, F. H., Andre, D., Keane, M. A. & Dunlap, F. Automated synthesis of analog electrical circuits by means of genetic programming. IEEE Trans. Evol. Comput. 1, 109–128 (1997).
34. Miller, J. F., Job, D. & Vassilev, V. K. Principles in the evolutionary design of digital circuits. Part I. Genetic Programming Evolvable Machines 1, 7–35 (2000).
35. Miller, J. F., Job, D. & Vassilev, V. K. Principles in the evolutionary design of digital circuits. Part II. Genetic Programming Evolvable Machines 1, 259–288 (2000).
36. Miller, J. F. in Proc. Genetic and Evolutionary Computation Conf. (eds Banzhaf, W. et al.) 1135–1142 (Morgan Kaufmann, San Francisco, California, 1999).
37. Masner, J., Cavalieri, J., Frenzel, J. & Foster, J. A. in Proc. NASA/DoD Workshop on Evolvable Hardware (eds Stoica, A., Keymenlen, D. & John, J.) 255–261 (IEEE Press, Piscataway, New Jersey, 1999).
38. Thompson, A. in Genetic Programming 1996: Proc. 1st Annu. Conf. (eds Koza, J. R., Goldberg, D. E., Fogel, D. B. & Riolo, R. L.) 444–452 (Morgan Kaufmann, San Francisco, California, 1996).
39. Dumoulin, J., McGrew, S., Frenzel, J. & Foster, J. A. in Proc. Int. Workshop on Evolvable Image and Digital Signal Processing (ed. Cagnoni, S.) 1–11 (Springer, New York, 2000).
40. Thompson, A., Harvey, I. & Husbands, P. in Towards Evolvable Hardware; The Evolutionary Engineering Approach (eds Sanchez, E. & Tomassini, M.) 136–165 (Springer, Berlin, 1996).
41. Masner, J., Cavalieri, J., Frenzel, J. & Foster, J. A. in Proc. NASA/DoD Workshop on Evolvable Hardware (eds Stoica, A., Keymenlen, D. & John, J.) 81–90 (IEEE Press, Piscataway, New Jersey, 2000).
42. Barkai, N. & Leibler, S. Robustness in simple biochemical networks. Nature 387 (1997).
43. Wolpert, D. H. & Macready, W. G. No free lunch theorems for optimization. IEEE Trans. Evol. Comput. 1, 67–82 (1997).
44. Culberson, J. C. On the futility of blind search: an algorithmic view of ‘no free lunch’. Evol. Comput. 6, 109–127 (1998). Readable presentation of ‘no free lunch’ theorems, an important part of evolutionary computation theory.
45. Whitely, D. in Proc. Genetic and Evolutionary Computation Conf. (eds Banzhaf, W. et al.) 833–839 (Morgan Kaufmann, San Francisco, California, 1999).
46. Angeline, P. J. & Pollack, J. B. in Artificial Life III (ed. Langton, C.) 55–71 (Addison-Wesley Longman, Reading, Massachusetts, 1994).
47. Rosca, J. P. & Ballard, D. H. in Advances in Genetic Programming 2 (eds Angeline, P. J. & Kinnear, K. E.) 177–202 (MIT Press, Cambridge, Massachusetts, 1996).
48. Jones, T. in Proc. 6th Int. Conf. Genetic Algorithms (ed. Eshelman, L. J.) 73–80 (Morgan Kaufmann, San Francisco, California, 1995).
49. Angeline, P. J. in Genetic Programming 1997: Proc. 2nd Annu. Conf. (eds Koza, J. R. et al.) 9–17 (Stanford Univ., California, 1997).
50. O’Reilly, U.-M. & Oppacher, F. in Parallel Problem Solving from Nature (PPSN III) (eds Davidor, Y., Schwefel, H.-P. & Männer, R.) 397–406 (Springer, New York, 1994).
51. Goldberg, D. Genetic Algorithms in Search, Optimization and Machine Learning (Addison–Wesley, Reading, Massachusetts, 1989).
52. Lipson, H. & Pollack, J. B. Automatic design and manufacture of artificial lifeforms. Nature 406, 974–978 (2000).
53. Whitley, D. L., Gordon, V. S. & Mathias, K. E. in Parallel Problem Solving from Nature (eds Davidor, Y., Schwefel, H.-P. & Männer, R.) 6–15 (Springer, Berlin, 1994).
54. Turney, P., Whitely, D. & Anderson, R. Evolution, learning, and instinct: 100 years of the Baldwin effect. Evol. Comput. 4, iv–viii (1996).
55. Hinton, G. E. & Nowlan, S. J. How learning can guide evolution. Complex Syst. 1, 495–502 (1987).
56. Schaffer, J. D. & Morishima, A. in Genetic Algorithms and their Applications: Proc. 2nd Int. Conf. Genetic Algorithms (ed. Grefenstette, J. J.) 36–40 (Morgan Kaufmann, San Francisco, California, 1987).
57. Louis, S. J. & Rawlins, G. J. E. in Proc. 4th Int. Conf. Genetic Algorithms (eds Belew, R. K. & Booker, L. B.) 53–60 (Morgan Kaufmann, San Mateo, California, 1991).
58. Hadad, B. S. & Eick, C. F. in Evolutionary Programming VI (eds Angeline, P. J., Reynolds, R. G., McDonnell, J. R. & Eberhart, R.) 223–234 (Springer, New York, 1997).
59. Francone, F. D., Conrads, M., Banzhaf, W. & Nordin, P. in Proc. Genetic and Evolutionary Computation Conf. (eds Banzhaf, W. et al.) 1021–1026 (Orlando, Florida, 1999).
60. Langdon, W. B. Size fair and homologous tree genetic programming crossovers. Genetic Programming Evolvable Machines 1, 95–119 (2000).
61. Langdon, W. B., Soule, T., Poli, R. & Foster, J. A. in Advances in Genetic Programming 3 (eds Spector, L., Langdon, W. B., O’Reilly, U.-M. & Angeline, P. J.) 163–190 (MIT Press, Cambridge, Massachusetts, 1999).
62. Soule, T. & Foster, J. A. Effects of code growth and parsimony pressure on populations in genetic programming. Evol. Comput. 6, 293–309 (1998).
63. Soule, T., Foster, J. A. & Dickinson, J. in Genetic Programming 1996: Proc. 1st Annu. Conf. (eds Koza, J. R., Goldberg, D. E., Fogel, D. B. & Riolo, R. L.) 215–223 (Morgan Kaufmann, San Francisco, California, 1996).
64. Leung, K.-S., Duan, Q.-H., Xu, Z.-B. & Wong, C. K. A new model of simulated evolution computation — convergence analysis and specifications. IEEE Trans. Evol. Comput. 5, 3–16 (2001).
65. Deb, K. & Spears, W. M. in Handbook of Evolutionary Computation (eds Bäck, T., Fogel, D. B. & Michalewicz, Z.) C6. 2: 1–5 (Oxford Univ. Press, New York, 1997).
66. Paredis, J. in Evolutionary Computation. 2. Advanced Algorithms and Operators (eds Bäck, T., Fogel, D. B. & Michalewicz, Z.) 224–238 (Institute of Physics Publishers, Bristol, UK, 2000).
67. Dorigo, M. & Caro, G. D. in New Ideas in Optimization (eds Corne, D., Dorigo, M. & Glover, F.) 11–32 (McGraw–Hill, London, 1999).
68. Hofmeyr, S. A. & Forrest, S. in Proc. Genetic and Evolutionary Computation Conf. (eds Banzhaf, W. et al.) 1289–1296 (Orlando, Florida, 1999).
69. Vose, M. D. The Simple Genetic Algorithm: Foundations and Theory (MIT Press, Cambridge, Massachusetts, 1999). A thorough presentation of current genetic algorithm theory.
70. Jones, T. Computer Science. Thesis, Univ. of New Mexico, Albuquerque, New Mexico (1995).
71. Poli, R. & Langdon, W. B. Schema theory for genetic programming with one-point crossover and point mutation. Evol. Comput. 6, 231–252 (1998).
72. Heckendorn, R. B. & Whitley, D. Walsh functions and predicting problem complexity. Evol. Comput. 7, 69–101 (1999).
73. Altenberg, L. in Foundations of Genetic Algorithms III (eds Whitley, L. D. & Vose, M. D.) 23–49 (Morgan Kaufmann, San Francisco, California, 1995).
74. Fogel, D. B. & Ghozeil, A. The schema theorem and the misallocation of trials in genetic algorithms. Inform. Sci. 122, 93–119 (2000).
75. Macready, W. G. & Wolpert, D. H. Bandit problems and the exploration/exploitation trade-off. IEEE Trans. Evol. Comput. 2, 2–22 (1998).

Acknowledgements

This work was supported in part by the National Science Foundation and the National Institutes of Health. The author is particularly grateful for detailed comments from, and discussions with, D. Fogel, J. Koza, R. Heckendorn and H. Wichman, and the anonymous referees.