Graph Based Genetic Algorithms

Daniel Ashlock Iowa State University Department of Mathematics Ames, Iowa, 50010 [email protected]

Mark Smucker Firefly Network One Broadway, 6th Floor Cambridge, MA 02142 [email protected]

John Walker Iowa State University Mechanical Engineering Department Ames, Iowa, 50010 [email protected]

Abstract- Genetic algorithms use crossover to blend pairs of putative solutions to a problem in hopes of creating novel solutions. At its best, crossover takes distinct good features from each of the two structures involved in the crossover. This creates a conflict: progress results from crossing over distinct types of structures, but such crossover produces new structures that are like their parents, reducing the diversity on which successful crossover depends. In this paper we describe and test genetic algorithms that use a combinatorial graph to limit the choice of crossover partner. This gives a computationally cheap method of picking a level of tradeoff between heterogeneous crossover (crossover between genetically distinct individuals) and preservation of population diversity. Statistics for estimating the degree to which a given graphical population structure favors population diversity or heterogeneous crossover are given. These statistics are computed for ten example graphs. These graphs are then used as population structures for genetic algorithms on three test problems: a trivial string evolver, the plus-one-recall-store (PORS) test suite for genetic programming [3, 4], and simple string controllers for Astro Teller’s Tartarus problem [13].

1 Introduction

In nature we find constraints such as geography or mutual infertility imposed on an organism’s ability to sexually reproduce with other organisms. In the Simple Genetic Algorithm (SGA) [5] the only constraint on reproduction is that more fit individuals have a higher probability of being selected for reproduction. In nature, individuals who are separated by great distances have a very low probability of reproducing with each other, no matter what their respective fitnesses may be. Within higher order species one also finds cultural constraints on the probability of two individuals reproducing, and in almost all “intelligent” animals the individuals in the population select their partners.

One of the central problems of population genetics is explaining why natural populations do not suffer from loss of diversity even though standard mathematical models show that diversity should vanish rapidly. Sewall Wright addressed this problem with his theory of isolation by distance [16]. Kimura and Crow [6] examine the rate at which different graphical population structures lose their genetic diversity under simple reproduction without selection.

Similarly, one of the fundamental problems in genetic algorithms is maintaining useful diversity in the population as the algorithm progresses. Various approaches to preventing diversity loss have been tried. These include using a high mutation rate, reducing the fitness of organisms in proportion to the number of organisms representing similar solutions (niche specialization), directly rejecting duplicate solutions, and imposing a geography upon the population [1, 10]. We expand on this last notion by imposing geographical structures coded as combinatorial graphs on the simple genetic algorithm.

In Section 2 we give the necessary mathematical definitions and three invariants that estimate the ability to permit heterogeneous crossover (crossover between genetically distinct individuals), the ability to preserve population diversity, and an aggregate measure of both these qualities, geometrically averaged. In Section 3 we define graph based genetic algorithms and describe the three problems we use to test them. Section 4 gives the precise design of the experiments performed and their outcomes. Section 5 gives conclusions and discusses the experimental results. Section 6 discusses future directions for this work.

2 Mathematical Background

A combinatorial graph or graph, G, is a set V(G) of vertices and a set E(G) of edges, where E(G) is a subset of the unordered pairs that can be drawn from V(G). Two vertices of the graph are neighbors if they are members of the same edge. For an introduction to graph theory, we refer the reader to [14]. We will term a graph used to constrain mating in a population the population structure. The general strategy is to use the graph to specify the geography on which a population lives, permitting mating only between neighbors, and to find graphs that preserve diversity without hindering progress due to heterogeneous crossover.

We use a nonstandard operation on graphs that generates a valuable substructure, simplexification. Simplexification at a vertex v replaces v with a cluster of vertices, one for each neighbor of v, so that all the new vertices are neighbors of one another and each is a neighbor of exactly one of v’s former neighbors. Simplexification of a vertex with four neighbors is shown in Figure 1. Simplexification creates small groups of vertices that are closely coupled to one another but less closely coupled to the rest of the graph. This creates an analog of biological refuges in the graphical connection topology.

Figure 1: Simplexification of a vertex with four neighbors

A coloring of a graph is a mapping, f : V(G) → C, where C is a set of colors. In this research the different colors will represent distinct genotypes and hence each possible population corresponds to a coloring f. We will measure the diversity of a population by the entropy of the frequency distribution of colors. More precisely, if there are n members of a population and k genotypes (or colors), with a_i members of the population having genotype i, 1 ≤ i ≤ k, then the entropy E(f) of the population f is

\[ E(f) = \sum_{a_i \neq 0} \frac{a_i}{n} \cdot \log_2\left(\frac{n}{a_i}\right). \tag{1} \]

This is the Shannon entropy of the distribution of genotypes and is one standard measure of population diversity used in conservation biology. If our only goal were to preserve the diversity then we would generate a random population, inherently richly diverse, and allow no mating to take place. Since we want to permit the population to evolve, we must permit some mating. Connection topologies that preserve diversity do so by allowing only limited mating to take place.

In Ackley and Littman [1] an evolving population is situated on a grid of processors in a multiprocessor machine. The authors observe that the useful crossover-based computation takes place in areas where different types of creatures are adjacent. Ackley and Littman’s geography quickly self-organizes into genetically homogeneous cells with useful crossover-based computation taking place mostly at the cell boundaries where dissimilar creatures can mate. Motivated by this example we approximate the useful crossover-based computation taking place in a graphically structured population f by the edge, Q(f), of the population, defined to be:

\[ Q(f) = \frac{|\{\{u, v\} \in E(G) : f(u) \neq f(v)\}|}{|E(G)|}. \tag{2} \]
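As a concrete illustration of Equations 1 and 2, the following Python sketch computes the entropy and the edge of a coloring. The representation is an assumption made for the example, not something taken from the experiments: a coloring is a dictionary from vertex to color, and a graph is a list of unordered vertex pairs. The function names `entropy` and `edge` are likewise hypothetical.

```python
import math
from collections import Counter


def entropy(coloring):
    """Shannon entropy (base 2) of the genotype frequencies, as in Equation 1."""
    n = len(coloring)
    counts = Counter(coloring.values())  # a_i for every genotype that occurs
    return sum((a / n) * math.log2(n / a) for a in counts.values())


def edge(edges, coloring):
    """Fraction of edges joining vertices of different colors, as in Equation 2."""
    differing = sum(1 for u, v in edges if coloring[u] != coloring[v])
    return differing / len(edges)


# Illustrative data: a cycle on four vertices (an assumption for the example).
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
all_distinct = {0: "a", 1: "b", 2: "c", 3: "d"}
two_blocks = {0: "a", 1: "a", 2: "b", 3: "b"}

print(entropy(all_distinct), edge(cycle_edges, all_distinct))
print(entropy(two_blocks), edge(cycle_edges, two_blocks))
```

With every vertex holding a distinct genotype both quantities are at their maximum for this graph; once the cycle has split into two homogeneous blocks, only the two boundary edges still join different genotypes.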

The edge of a population on a graph is the fraction of potential matings that are between organisms with different genotypes. Intuitively this fraction should be closely related to the fraction of heterogeneous crossover events.

The neutral selection behavior of a population is the way a population changes in the absence of any selective pressure. We approximate the average neutral selection behavior of edge and entropy for a population structure (graph) by repeating the following computation many times and averaging the results. Each vertex is initially assigned a distinct integer value (this is a population with maximal edge and entropy). We repeatedly generate mating events that consist of picking a vertex uniformly at random and copying onto it the integer currently on one of its neighbors, also selected uniformly at random. As this happens we compute the edge and entropy of the population as a function of the number of mating events. The averages approximate the way a population structure G permits populations living on it to evolve their entropy and edge under neutral selection. For a population structure G, denote the neutral selection entropy function NE_G(t) and the neutral edge function NQ_G(t), where t is the number of mating events.

A high edge value in a population usually implies that it is losing entropy (diversity) quickly. This is because crossover events between distinct types of parents often replace one of them with a child less distinct from the surviving parent. From this we see that these two qualities of a population structure, that it maintain high entropy and high edge over time, conflict with one another. We define the following summary statistics and compare them to performance on test problems, since it is problem dependent which measure is a more useful predictor of performance in a genetic algorithm using a given population structure. For all three statistics we take the time weighted average of the quality of interest, as preserving edge or entropy later in evolution is more impressive than preserving it early on. The entropy preservation of a population structure G is:

\[ EP(G) = \int t \cdot \frac{NE_G(t)}{\log_2(|V(G)|)} \, dt. \tag{3} \]

The edge preservation of a population structure G is

\[ QP(G) = \int t \cdot NQ_G(t) \, dt. \tag{4} \]

The mean quality of a population structure G is

\[ MQ(G) = \int t \cdot \sqrt{\frac{NE_G(t)}{\log_2(|V(G)|)} \cdot NQ_G(t)} \, dt. \tag{5} \]

Division by log2(|V(G)|) normalizes entropy to have a maximum value of 1 no matter the number of vertices, |V(G)|, in the population structure. A logical question to ask is “even if preserving edge and entropy is good, why would the neutral selection versions matter?” There are two answers. First, we will present experiments to test their utility. Second, many evolving populations reach a point with little fitness variation. Neutral selection behavior may be a good model for this regime.

2.1 Desirability of Edge and Entropy Preservation

Edge and entropy are statistical surrogates for qualities that conventional wisdom believes worth preserving in a genetic algorithm. The entropy statistic is a surrogate for population diversity, borrowed from population biology. From an optimization perspective, a diverse population is one not yet trapped in a local optimum, or one scattered widely within an optimum and so more likely to escape it. It is clear that diversity preservation is valuable in problems with multimodal fitness landscapes. If we depart from optimization and examine adaptation there is an even more pressing need to preserve diversity. An algorithm with agents undergoing adaptation to changing conditions, be it other agents or a varying figure of merit imposed by the experimenter, can be viewed as having a moving fitness landscape. A picture of such a fitness landscape might be like the surface of an ocean on a choppy day. With a diverse population there are more likely to be individuals on or near what is presently a high spot.

The need for a high value of the edge preservation statistic is less clear, but may be seen by examining a situation in which crossover is of no worth. One such situation is a population in which most of the diversity is gone and individuals are very much like one another. Here, the typical crossover produces two children that look like their parents. This happens because the parents are identical or differ in only one location, with crossover deciding which child is a copy of which parent but generating no new structures. In this case any effort spent manipulating structures to perform crossover is wasted. If we want crossover to be between genes that are different, then the preservation of edge at least works in the direction of keeping the rate of such crossovers relatively high over time.

Imagine that we have a graph with a high mean quality (Equation 5). Then something intrinsic in the topology of the graph keeps both edge and entropy at or above some minimal level over time; this is implicit in the definition of mean quality. Individuals with distinct genotypes are crossing over fairly often and no one genotype is coming to dominate the population too rapidly. This should then improve performance on problems where heterogeneous crossover and diversity preservation are valuable.
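The neutral selection computation and the summary statistics of Equations 3, 4, and 5 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the code used for the experiments: the integrals are replaced by sums over mating events, the time horizon `events` and the number of repetitions `runs` are free parameters that the text does not specify, the geometric mean in the mean quality is taken pointwise inside the sum, and the function names and the adjacency-dictionary representation (with integer vertex labels) are hypothetical.

```python
import math
import random
from collections import Counter


def neutral_selection_curves(neighbors, events, runs, seed=0):
    """Average entropy and edge after t = 0..events neutral mating events."""
    rng = random.Random(seed)
    vertices = list(neighbors)
    edges = [(u, v) for u in vertices for v in neighbors[u] if u < v]
    n = len(vertices)
    ne = [0.0] * (events + 1)  # estimate of NE_G(t)
    nq = [0.0] * (events + 1)  # estimate of NQ_G(t)
    for _ in range(runs):
        color = {v: v for v in vertices}  # every vertex starts with a distinct value
        for t in range(events + 1):
            counts = Counter(color.values())
            ne[t] += sum(a / n * math.log2(n / a) for a in counts.values()) / runs
            nq[t] += sum(color[u] != color[v] for u, v in edges) / len(edges) / runs
            # Neutral mating event: copy a random neighbor's value onto a random vertex.
            v = rng.choice(vertices)
            color[v] = color[rng.choice(neighbors[v])]
    return ne, nq


def summary_statistics(ne, nq, n_vertices):
    """Discrete stand-ins for EP(G), QP(G) and MQ(G) over the simulated horizon."""
    norm = math.log2(n_vertices)
    ep = sum(t * e / norm for t, e in enumerate(ne))
    qp = sum(t * q for t, q in enumerate(nq))
    mq = sum(t * math.sqrt(e / norm * q) for t, (e, q) in enumerate(zip(ne, nq)))
    return ep, qp, mq


# Illustrative run on a small cycle (an assumption; the paper's graphs have 512 vertices).
cycle = {i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}
ne_curve, nq_curve = neutral_selection_curves(cycle, events=50, runs=5, seed=1)
print(summary_statistics(ne_curve, nq_curve, len(cycle)))
```

Because the statistics weight each time step by t, their magnitudes depend strongly on the chosen horizon, so values produced by this sketch are only comparable between graphs simulated with the same `events` setting.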

3 Definition of Graph Based Genetic Algorithms

A graph based genetic algorithm is a genetic algorithm run with a graphical population structure. We assume the reader to be familiar with the standard terminology of genetic algorithms as given in [5]. One individual is placed on each vertex of the graph. A type of steady state genetic algorithm [11, 12, 15] is used where evolution proceeds by individual mating events. A mating event is performed as follows. Pick a vertex v of the graph uniformly at random. A neighbor of v is chosen for mating with a fitness bias. Crossover and probabilistic mutation are used to produce a single new individual, which may or may not be used to replace the individual on vertex v. The details of how the neighbor is picked for mating and how to decide if the new individual replaces the individual on vertex v are called the local mating rule of the graph based genetic algorithm. In this research the local mating rule will pick neighbors in direct proportion to their fitness (roulette selection) and let the new individual replace the old if it is at least as fit as the individual it replaces. We call this local mating rule locally elite roulette mating.

We report results on graphical genetic algorithms for three problems on ten graphs. See West [14] for the definitions of all of the following graphs except the last. These graphs are all available by electronic mail, as lists of neighbors, from the first author. The graphs are as follows: the complete graph K512, the generalized Petersen graphs P256,1, P256,3, P256,17, the 4-neighbor tori T4,128, T8,64, T16,32, the cycle C512, the 9-dimensional hypercube H9, and a graph Z derived by the following procedure. Start with the complete bipartite graph K4,4. Simplexify every vertex (see Section 2). Repeat simplexification twice more. The graph K4,4 and the first step of this process are shown in Figure 2. All of these graphs contain 512 vertices.

The complete graph was chosen as a baseline - a graph based genetic algorithm using the complete graph simulates a fairly standard genetic algorithm. The other graphs were chosen because either (i) they are the connection topology of a popular type of multiprocessor machine or (ii) they added to the diversity of the graphs tested as measured by our summary statistics. The summary statistics for these ten graphs are given in Figure 3 together with the graphs listed in increasing order for each statistic.

Figure 2: K4,4 and the result of simplexifying every vertex of K4,4.
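The construction of the graph Z can be checked with a short Python sketch of simplexification. The representation (a dictionary mapping each vertex to a list of its neighbors), the vertex labels, and the function names `simplexify`, `simplexify_all`, `complete_bipartite` and `build_z` are assumptions made for this illustration. The sketch assumes a simple graph with symmetric neighbor lists.

```python
def simplexify(neighbors, v):
    """Return a copy of the graph with v replaced by a clique, one new vertex per neighbor of v."""
    old = neighbors[v]
    cluster = {u: (v, u) for u in old}  # new vertex (v, u) is attached to former neighbor u
    result = {}
    for x, adj in neighbors.items():
        if x != v:
            # Former neighbors of v now point at their own cluster vertex instead.
            result[x] = [cluster[x] if y == v else y for y in adj]
    for u, new in cluster.items():
        # Each cluster vertex sees its former neighbor and every other cluster vertex.
        result[new] = [u] + [w for w in cluster.values() if w != new]
    return result


def simplexify_all(neighbors):
    """Simplexify every vertex present in the graph at the time of the call."""
    graph = neighbors
    for v in list(neighbors):
        graph = simplexify(graph, v)
    return graph


def complete_bipartite(m, n):
    left = [("a", i) for i in range(m)]
    right = [("b", j) for j in range(n)]
    graph = {u: list(right) for u in left}
    graph.update({w: list(left) for w in right})
    return graph


def build_z():
    """K4,4 with every vertex simplexified, three rounds in total."""
    graph = complete_bipartite(4, 4)
    for _ in range(3):
        graph = simplexify_all(graph)
    return graph


z = build_z()
print(len(z), {len(adj) for adj in z.values()})
```

Every vertex of K4,4 has four neighbors, so each round multiplies the vertex count by four (8, 32, 128, 512) while keeping every vertex at four neighbors, which agrees with the statement that Z has 512 vertices.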

G          EP(G)     QP(G)     MQ(G)
K512       79281     41396     152588
P256,1     69518     177679    27408
P256,3     87384     128892    59312
P256,17    103432    84733     126898
T4,128     81037     140117    47044
T8,64      94873     103610    86881
T16,32     102961    86249     123394
C512       59703     213902    16829
H9         11957     7550      19005
Z          98851     127435    76807

Graphs in increasing order of each statistic:
EP(G): H9 < C512 < P256,1 < K512 < T4,128 < P256,3 < T8,64 < Z < T16,32 < P256,17
QP(G): H9 < K512 < P256,17 < T16,32 < T8,64 < Z < P256,3 < T4,128 < P256,1 < C512
MQ(G): C512 < H9 < P256,1 < T4,128 < P256,3 < Z < T8,64 < T16,32 < P256,17 < K512

Figure 3: Summary statistics for graphical connection topologies and graphs ordered by the statistics in increasing order.
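A single mating event under locally elite roulette mating, as defined in Section 3, can be sketched in Python as follows. The sketch borrows the bit-string representation, two point crossover and single-position mutation of the trivial string evolver purely for illustration. Several details are assumptions not fixed by the text: which of the two crossover children is kept, the mutation probability `mutation_rate`, and the uniform fallback used when every neighbor has fitness zero. The names `fitness` and `mating_event` are hypothetical.

```python
import random


def fitness(bits):
    """Number of ones in the string."""
    return sum(bits)


def mating_event(population, neighbors, rng, mutation_rate=0.5):
    """Perform one mating event in place on a dict mapping vertex -> bit list."""
    v = rng.choice(list(population))  # vertex chosen uniformly at random
    adj = neighbors[v]
    weights = [fitness(population[u]) for u in adj]
    if sum(weights) > 0:
        mate = rng.choices(adj, weights=weights, k=1)[0]  # roulette selection
    else:
        mate = rng.choice(adj)  # assumption: uniform choice when all weights are zero
    a, b = population[v], population[mate]
    # Two point crossover; keeping the child with a's ends is an assumption.
    i, j = sorted(rng.sample(range(len(a) + 1), 2))
    child = a[:i] + b[i:j] + a[j:]
    if rng.random() < mutation_rate:
        k = rng.randrange(len(child))
        child = child[:k] + [1 - child[k]] + child[k + 1:]  # flip one position
    # Locally elite replacement: keep the child only if it is at least as fit.
    if fitness(child) >= fitness(a):
        population[v] = child


# Illustrative run on a 16-cycle with twenty character strings.
rng = random.Random(3)
ring = {i: [(i - 1) % 16, (i + 1) % 16] for i in range(16)}
population = {i: [rng.randint(0, 1) for _ in range(20)] for i in range(16)}
for _ in range(2000):
    mating_event(population, ring, rng)
print(max(fitness(s) for s in population.values()))
```

Because replacement happens only when the child is at least as fit as the individual on the chosen vertex, the fitness stored at any given vertex never decreases, which is the sense in which the rule is locally elite.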

4 Experimental Design and Results

For each test problem we give a definition of a successful individual. For each graph we run a graph based genetic algorithm 200 times, saving the number of mating events until the first successful individual appears in each run. We treat the fraction of populations that contain a successful individual after k mating events as a Bernoulli variable with probability p_k of success. For each graph we compute the minimum number of mating events k for which half the populations associated with that graph contain a successful individual. For each other graph we then use a normal approximation to the binomial distribution to construct a 95% confidence interval for p_k. Other graphs are judged significantly worse as population structures for the test problem if this confidence interval’s upper bound is less than 0.5. A table of these upper bounds is supplied with each experiment. For details of the method of constructing this type of confidence interval see Larsen and Marx [9].

4.1 Trivial String Evolver Experiment

The first experiment operates on twenty character binary strings. The initial population consists of strings chosen uniformly at random and the fitness of a string is the number of ones in the string. This problem is easy, high-dimensional, and unimodal. A successful individual in this experiment is one composed entirely of ones. The graph based genetic algorithm uses two point crossover, and mutation consists of changing one character, selected uniformly at random, from zero to one or one to zero. The confidence interval upper bounds for this experiment are given in Figure 4.

4.2 Plus-One-Recall-Store Experiments

This experiment tests the value of graph based genetic algorithms for genetic programming [8, 7]. The test problem, called the plus-one-recall-store (PORS) efficient node use problem, is to find parse trees in a very limited language that, when executed, generate the largest integer result possible given a fixed maximum number of parse tree nodes. The language has two operations, integer addition and a store that places its argument in an external memory location, and two terminals, the integer 1 and recall from an external memory. This test problem is described in detail in [3, 4]. The difficulty of the PORS-efficient node use problem varies very strongly according to the upper bound on the number of nodes modulo three. We ran experiments on n = 15 nodes, the hardest of the three classes.

In this set of experiments the initial population was made of randomly generated trees with n nodes. A successful individual was defined to be a tree that produces the largest possible number (these numbers are computed in [4]). Crossover was performed by the usual subtree exchange [8]. If this produced a tree with more than n nodes then a subtree of the root node iteratively replaced the root node until the tree had fewer than n nodes. We term this operation chopping. Mutation was performed by replacing a subtree chosen uniformly at random with a new random subtree of the same size, with probability 0.2 for each new tree produced. The confidence interval upper bounds for this experiment are given in Figure 5.

4.3 Tartarus String Experiments

The Tartarus problem was first posed by Astro Teller [13]. A small autonomous robot is placed in a six-by-six grid.
The robot occupies one grid square and may turn right, turn left, or attempt to advance during each time step. Six boxes, each filling a grid square, are placed away from the edges of the grid with no two-by-two region containing four boxes. (The robot heading, position, and box positions at the start of a test define a “board”.) The robot may advance into an empty square or push one box, but two boxes or a wall stop the robot. The robot’s task is to push boxes into corners (worth two points) or against a wall (worth one point). The robot is given eighty time steps and then its score is assessed for a given starting configuration. Teller and we both sum fitnesses over 40 boards to approximate the quality of a robot controller.

Research on baselines for the Tartarus problem reported in [2] indicates that non-reactive controllers with memory, implemented as strings of moves, can obtain fitness comparable to that obtained by Teller’s reactive GP-tree controllers with memory. In this experiment we use strings of sixteen moves (left, right, forward) which the algorithm cycles through five times to obtain 80 moves. The termination requirement, i.e. the definition of a successful individual, is modified to require the average per-board fitness of the entire population to be greater than 3.0 (about 60% of the maximum known fitness for string controllers of this type) for forty boards. The requirement that success is demonstrated over long time scales is necessary due to the stochasticity of the fitness function. The set of forty boards used in a fitness evaluation is generated uniformly at random. Every 256 mating events (half the population size) the set of forty test boards is replaced with a new, randomly generated set of forty. The fitness test thus samples the space of possible boards in a manner consistent with a steady state genetic algorithm. We use two point crossover of the strings of moves and a probabilistic mutation that changes between zero and three positions in the string to a new randomly generated action. The confidence interval upper bounds for this experiment are given in Figure 6.

          H9      C512    Z       K512    P256,17 P256,3  P256,1  T16,32  T4,128  T8,64
H9        .       *0.16   *0.40   *0.29   *0.34   *0.36   *0.43   *0.50   *0.45   *0.48
C512      0.99    .       0.94    0.69    0.92    0.92    0.95    0.97    0.95    0.97
Z         0.76    *0.26   .       *0.38   *0.48   0.55    0.61    0.69    0.58    0.65
K512      0.95    *0.45   0.84    .       0.79    0.83    0.84    0.90    0.85    0.90
P256,17   0.83    *0.29   0.64    *0.43   .       0.63    0.71    0.75    0.67    0.74
P256,3    0.78    *0.27   0.60    *0.38   *0.50   .       0.64    0.70    0.61    0.68
P256,1    0.72    *0.24   0.55    *0.36   *0.47   0.51    .       0.65    0.54    0.62
T16,32    0.64    *0.17   *0.45   *0.32   *0.39   *0.43   *0.48   .       *0.47   0.54
T4,128    0.74    *0.25   0.55    *0.37   *0.47   0.54    0.59    0.68    .       0.64
T8,64     0.68    *0.20   *0.50   *0.34   *0.42   *0.45   0.52    0.61    0.51    .
* Values less than 0.5 indicate the graph indexing the column performed significantly worse than the graph indexing the row.

Figure 4: Confidence interval upper bounds for trivial string evolver experiment.

          H9      C512    Z       K512    P256,17 P256,3  P256,1  T16,32  T4,128  T8,64
H9        .       0.59    0.67    0.65    0.65    0.64    0.75    0.67    0.64    0.66
C512      0.51    .       0.62    0.59    0.63    0.59    0.72    0.65    0.60    0.61
Z         *0.47   0.52    .       0.52    0.58    0.54    0.64    0.59    0.56    0.55
K512      0.51    0.55    0.61    .       0.61    0.57    0.69    0.63    0.59    0.59
P256,17   *0.47   0.51    0.56    0.51    .       0.53    0.63    0.58    0.56    0.54
P256,3    *0.49   0.54    0.60    0.55    0.59    .       0.68    0.62    0.57    0.57
P256,1    *0.42   *0.46   *0.50   *0.42   0.52    *0.49   .       *0.49   *0.48   *0.49
T16,32    *0.46   *0.50   0.55    *0.49   0.56    0.52    0.62    .       0.54    0.52
T4,128    *0.47   0.53    0.59    0.53    0.58    0.55    0.67    0.61    .       0.57
T8,64     *0.47   0.52    0.58    0.53    0.58    0.55    0.65    0.60    0.56    .
* Values less than 0.5 indicate the graph indexing the column performed significantly worse than the graph indexing the row.

Figure 5: Confidence interval upper bounds for the PORS-efficient node use experiment on n = 15 node trees.
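The significance test described at the start of Section 4 can be sketched in Python as follows. This is an illustrative reading of the procedure rather than the code used to produce the figures: the function names are hypothetical, run outcomes are assumed to be stored as lists of mating-event counts (with `math.inf` standing for a run that never succeeds), and the value z = 1.96 assumes a two-sided 95% interval.

```python
import math


def median_success_time(times):
    """Smallest k at which at least half of the runs contain a successful individual."""
    ordered = sorted(times)
    return ordered[(len(ordered) + 1) // 2 - 1]


def upper_bound(times, k, z=1.96):
    """Upper end of the normal-approximation confidence interval for p_k."""
    n = len(times)
    p = sum(t <= k for t in times) / n  # observed fraction of runs successful by k events
    return p + z * math.sqrt(p * (1 - p) / n)


def significantly_worse(row_times, column_times):
    """True if the column graph is judged significantly worse than the row graph."""
    k = median_success_time(row_times)
    return upper_bound(column_times, k) < 0.5


# Illustrative data only: 200 synthetic success times per graph.
fast_graph = list(range(1, 201))
slow_graph = list(range(150, 350))
print(median_success_time(fast_graph))
print(round(upper_bound(slow_graph, median_success_time(fast_graph)), 2))
print(significantly_worse(fast_graph, slow_graph))
```

With 200 runs the half-width of the interval at p = 0.5 is about 0.07, so a column graph must have had clearly fewer than half of its runs succeed by the row graph's median time before it is judged significantly worse.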

5 Conclusions

The first analysis we performed on the significance data in Figures 4, 5, and 6 was to partially order the graphs by relative performance advantage. In the trivial string evolver experiment 36 of the 81 pairs of graphs exhibited a statistically significant difference in performance. The partial order segregates the graphs nicely as follows: H9 ≥ T16,32, T8,64 ≥ P256,1, T4,128, P256,3, Z ≥ P256,17 ≥ K512 ≥ C512. Figure 7 shows this relationship graphically with an arrow indicating statistically significant dominance.

Figure 7: Graph dominance figure for trivial strings.

          H9      C512    Z       K512    P256,17 P256,3  P256,1  T16,32  T4,128  T8,64
H9        .       *0.01   *0.14   0.74    *0.20   *0.21   *0.10   *0.33   *0.29   *0.33
C512      1.00    .       1.00    1.00    1.00    0.99    0.99    1.00    1.00    1.00
Z         0.99    *0.10   .       1.00    0.65    0.61    0.54    0.81    0.84    0.83
K512      *0.40   *0.00   *0.07   .       *0.06   *0.12   *0.01   *0.18   *0.17   *0.19
P256,17   0.97    *0.05   *0.47   0.98    .       0.55    *0.42   0.70    0.71    0.74
P256,3    0.98    *0.07   0.51    0.99    0.62    .       *0.48   0.76    0.78    0.80
P256,1    1.00    *0.12   0.62    1.00    0.72    0.69    .       0.86    0.87    0.85
T16,32    0.90    *0.04   *0.38   0.95    *0.45   *0.42   *0.28   .       0.61    0.60
T4,128    0.90    *0.04   *0.38   0.95    *0.45   *0.42   *0.28   0.60    .       0.60
T8,64     0.90    *0.04   *0.38   0.95    *0.45   *0.42   *0.28   0.60    0.61    .
* Values less than 0.5 indicate the graph indexing the column performed significantly worse than the graph indexing the row.

Figure 6: Confidence interval upper bounds for the string based Tartarus controller experiments.

In the PORS experiment only 15 of 81 pairs of graphs exhibited a performance advantage and the partial order indicated unambiguous dominance by P256,1 but otherwise gave a somewhat ragged ranking of the graphs: P256,1 ≥ P256,17 ≥ T4,128, T8,64, T16,32, P256,3 ≥ K512, H9, C512. Figure 8 shows this relationship graphically with an arrow indicating statistically significant dominance.

Figure 8: Graph dominance figure for PORS.

The Tartarus experiment yielded 39 of 81 pairs of graphs with a statistically significant performance advantage. The partial order gave a firm segregation of the graphs in the following fashion: K512 ≥ H9 ≥ T4,128, T8,64, T16,32 ≥ P256,3, P256,17 ≥ P256,1, Z ≥ C512. Figure 9 shows this relationship graphically with an arrow indicating statistically significant dominance.

Figure 9: Graph dominance figure for Tartarus.

Comparing the partial orders of graphs by relative performance with the summary statistics in Figure 3, we find that preservation of diversity as estimated by EP(G) had little predictive value, that the mean quality statistic MQ(G) had little predictive value, but that edge preservation was somewhat predictive of success in the trivial string evolver and strongly predicted success in the Tartarus experiments. Either our measure of diversity preservation is not a good one or diversity preservation is not an issue for the test problems and population sizes we have presented. The graph-theoretic intuition of the authors is that more subtle measures of graphical connectivity may be the key predictor of performance. In any case we have some support for Ackley and Littman’s hypothesis that having lots of boundary area between regions with distinct genotypes is good.

To finish our conclusions on a positive note, we can report a very strong effect from changing population structures. While we at present have little of worth to say about which population structures are good for what problem (the sum total of our general results is that C512 is bad), we see that (i) changing the population structure has a significant effect on solution time and (ii) different population structures are good for different problems. Note, for example, the positions of the nine-hypercube H9 and the complete graph K512 in each of the three test problems.

6 Discussion and Future Work

An important point to make about the graph based genetic algorithm is that when use of a given graphical connection topology helps, it does so with little run time cost. The cost of locating the good graph is paid before the experimenter runs the genetic algorithm. Only the overhead for maintaining the list of graphical connections is charged to the graph based genetic algorithm’s run time. The crop of experiments reported here shows that graphical connection topologies can help but gives only a modest amount of help in choosing them (the edge preservation statistic, Equation 4, has a limited, empirical correlation with performance on two of three test problems). Knowing that the choice of connection topology is important, it remains to figure out good predictors of what graphical topologies will help with which sorts of problems.

In addition to trying graph based genetic algorithms on a wider variety of problems (and on harder problems) there are a number of other issues that need to be examined. It is intuitive that the local mating rule is important and we have tested exactly one, highly elitist, local mating rule. Experiments for well-mixed populations that speak to optimum allocation of breeding trials, mutation rates, and crossover rates need to be repeated in the graph based context to see if the change in population structure modifies results. The experimental results from this paper and the ordering of graphs they induce are suggestive but far from conclusive: a much larger number of graphs should be studied. Researchers wishing to have the graph based genetic algorithm code used in this paper as a starting point should contact the first author.

Another, more difficult, direction for future research involves locating good graphical topologies. If a good predictor of performance of graphical population structures is available for a class of problems then a genetic algorithm or other optimization method could be used to locate good graphs. It also might be possible to vary the local structure of the graph and so locate good connection topologies in an online manner during the execution of a genetic algorithm. This second suggested direction is a meta-algorithm, a notoriously tricky and time-consuming kind of effort.

Acknowledgments The authors would like to thank the Iowa Center for Emerging Manufacturing Technology for their hardware and logistical support of this research. We also would like to thank the referees for their proofreading and helpful suggestions.

Bibliography

[1] D. L. Ackley and M. L. Littman. A case for distributed Lamarckian evolution. Working Paper, Cognitive Science Research Group, Bellcore, New Jersey, 1992.
[2] Dan Ashlock and Mark Joenks. ISAc lists, a different representation for program induction. In Genetic Programming 98, proceedings of the third annual genetic programming conference, pages 3–10, San Francisco, 1998. Morgan Kaufmann.
[3] Dan Ashlock and James I. Lathrop. An arithmetic test suite for genetic programming. ISU Applied Mathematics Report AM98-1, 1998.
[4] Dan Ashlock and James I. Lathrop. A fully characterized test suite for genetic programming. In Evolutionary Programming VII, pages 537–546, New York, 1998. Springer.
[5] David E. Goldberg. Genetic Algorithms in Search, Optimization, and Machine Learning. Addison-Wesley Publishing Company, Inc., Reading, MA, 1989.
[6] Motoo Kimura and James Crow. On the maximum avoidance of inbreeding. Genetical Research, 4:399–415, 1963.
[7] Kenneth Kinnear. Advances in Genetic Programming. The MIT Press, Cambridge, MA, 1994.
[8] John R. Koza. Genetic Programming. The MIT Press, Cambridge, MA, 1992.
[9] Richard J. Larsen and Morris L. Marx, editors. An introduction to mathematical statistics and its applications. Prentice-Hall Inc., 1981.
[10] Heinz Mühlenbein. Darwin’s continent cycle theory and its simulation by the prisoner’s dilemma. Complex Systems, 5:459–478, 1991.
[11] Craig Reynolds. An evolved, vision-based behavioral model of coordinated group motion. In Jean-Arcady Meyer, Herbert L. Roitblat, and Stewart Wilson, editors, From Animals to Animats 2, pages 384–392. MIT Press, 1992.
[12] Gilbert Syswerda. A study of reproduction in generational and steady state genetic algorithms. In Foundations of Genetic Algorithms, pages 94–101. Morgan Kaufmann, 1991.
[13] Astro Teller. The evolution of mental models. In Kenneth Kinnear, editor, Advances in Genetic Programming, chapter 9. The MIT Press, 1994.
[14] Douglas B. West. Introduction to Graph Theory. Prentice Hall, Upper Saddle River, NJ 07458, 1996.
[15] Darrell Whitley. The genitor algorithm and selection pressure: why rank based allocation of reproductive trials is best. In Proceedings of the 3rd ICGA, pages 116–121. Morgan Kaufmann, 1989.
[16] Sewall Wright. Evolution. University of Chicago Press, 1986. Edited and with introductory materials by William B. Provine.

Mark Smucker Firefly Network One Broadway, 6th Floor Cambridge, MA 02142 [email protected]

Abstract- Genetic algorithms use crossover to blend pairs of putative solutions to a problem in hopes of creating novel solutions. At its best, crossover takes distinct good features from each of the two structures involved in the crossover. This creates a conflict: progress results from crossing over distinct types of structures but such crossover produces new structures that are like their parents, reducing the diversity on which successful crossover depends. In this paper we describe and test genetic algorithms that use a combinatorial graph to limit choice of crossover partner. This gives a computationally cheap method of picking a level of tradeoff between having heterogeneous crossover (crossover between genetically distinct individuals) and preservation of population diversity. Statistics for estimating the degree to which a given graphical population structure favors population diversity or heterogeneous crossover are given. These statistics are computed for ten example graphs. These graphs are then used as population structures for genetic algorithms of three test problems: a trivial string evolver, the plusone-recall-store (PORS) test suite for genetic programming [3, 4], and simple string controllers for Astro Teller’s Tartarus problem. [13]

John Walker Iowa State University Mechanical Engineering Department Ames, Iowa, 50010 [email protected]

1 Introduction
In nature we find constraints such as geography or mutual infertility imposed on an organism’s ability to sexually reproduce with other organisms. In the Simple Genetic Algorithm (SGA) [5] the only constraint on reproduction is that more fit individuals have a higher probability of being selected for reproduction. In nature, individuals who are separated by great distances, no matter what their respective fitnesses may be, have a very low probability of reproducing with each other. Within higher order species one also finds cultural constraints on the probability of two individuals reproducing, and in almost all “intelligent” animals the individuals in the population select their partner.

One of the central problems of population genetics is explaining why natural populations do not suffer from loss of diversity even though standard mathematical models show that diversity should vanish rapidly. Sewall Wright proposed his theory of isolation by distance [16]. Kimura and Crow [6] examine the rate at which different graphical population structures lose their genetic diversity under simple reproduction without selection. Similarly, one of the fundamental problems in genetic algorithms is maintaining useful diversity in the population as the algorithm progresses. Various approaches to preventing diversity loss have been tried. These include using a high mutation rate, reducing the fitness of organisms in proportion to the number of organisms representing similar solutions (niche specialization), directly rejecting duplicate solutions, and imposing a geography upon the population [1, 10]. We expand on this last notion by imposing geographical structures coded as combinatorial graphs on the simple genetic algorithm.

In Section 2 we give the necessary mathematical definitions and three invariants that estimate the ability to permit heterogeneous crossover (crossover between genetically distinct individuals), the ability to preserve population diversity, and an aggregate measure of both these qualities, geometrically averaged. In Section 3 we define graph based genetic algorithms and describe the three problems we use to test them. Section 4 gives the precise design of the experiments performed and their outcomes. Section 5 gives conclusions and discusses the experimental results. Section 6 discusses future directions for this work.

2 Mathematical Background
A combinatorial graph or graph, G, is a set V(G) of vertices and a set E(G) of edges, where E(G) is a subset of the unordered pairs that can be drawn from V(G). Two vertices of the graph are neighbors if they are members of the same edge. For an introduction to graph theory, we refer the reader to [14]. We will term a graph used to constrain mating in a population the population structure. The general strategy is to use the graph to specify the geography on which a population lives, permitting mating only between neighbors, and to find graphs that preserve diversity without hindering progress due to heterogeneous crossover.

We use a nonstandard operation on graphs that generates a valuable substructure, simplexification. Simplexification at a vertex v replaces v with a cluster of vertices, one for each neighbor of v, so that all the new vertices are neighbors of one another and each is a neighbor of exactly one of v’s former neighbors. Simplexification of a vertex with four neighbors is shown in Figure 1. Simplexification creates small groups of vertices that are closely coupled to one another but less closely coupled to the rest of the graph. This creates an analog of biological refuges in the graphical connection topology.

Figure 1: Simplexification of a vertex with four neighbors.

A coloring of a graph is a mapping, f : V(G) → C, where C is a set of colors. In this research the different colors will represent distinct genotypes and hence each possible population corresponds to a coloring f. We will measure the diversity of a population by the entropy of the frequency distribution of colors. More precisely, if there are n members of a population and k genotypes (or colors), with a_i members of the population having genotype i, 1 ≤ i ≤ k, then the entropy E(f) of the population f is

$$E(f) = \sum_{a_i \neq 0} \frac{a_i}{n} \cdot \log_2\left(\frac{n}{a_i}\right). \tag{1}$$

This is the Shannon entropy of the distribution of genotypes and is one standard measure of population diversity used in conservation biology. If our only goal were to preserve the diversity then we would generate a random population, inherently richly diverse, and allow no mating to take place. Since we want to permit the population to evolve, we must permit some mating. Connection topologies that preserve diversity do so by allowing only limited mating to take place.

In Ackley and Littman [1] an evolving population is situated on a grid of processors in a multiprocessor machine. The authors observe that the useful crossover-based computation takes place in areas where different types of creatures are adjacent. Ackley and Littman’s geography quickly self-organizes into genetically homogeneous cells, with useful crossover-based computation taking place mostly at the cell boundaries where dissimilar creatures can mate. Motivated by this example we approximate the useful crossover-based computation taking place in a graphically structured population f by the edge, Q(f), of the population, defined to be:

$$Q(f) = \frac{|\{\{u, v\} \in E(G) : f(u) \neq f(v)\}|}{|E(G)|}. \tag{2}$$
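As an illustration, Equations 1 and 2 can be computed directly from a coloring. The following Python sketch is for exposition only and is not the code used in the experiments; the function names `entropy` and `edge`, the dictionary representation of a coloring, and the four-vertex example are all assumptions made here.

```python
import math
from collections import Counter


def entropy(coloring):
    """Shannon entropy (Equation 1) of a coloring given as {vertex: genotype}."""
    n = len(coloring)
    counts = Counter(coloring.values())
    # Counter only stores genotypes that occur, so every a_i here is nonzero.
    return sum((a / n) * math.log2(n / a) for a in counts.values())


def edge(edges, coloring):
    """Edge (Equation 2): fraction of graph edges joining unlike genotypes."""
    unlike = sum(1 for u, v in edges if coloring[u] != coloring[v])
    return unlike / len(edges)


# Hypothetical example: the 4-cycle 0-1-2-3-0 populated by two genotypes.
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
blocks = {0: "a", 1: "a", 2: "b", 3: "b"}       # like genotypes adjacent
alternating = {0: "a", 1: "b", 2: "a", 3: "b"}  # unlike genotypes adjacent
```

In this example both colorings have the same genotype frequencies and therefore the same entropy, but `edge(cycle_edges, blocks)` and `edge(cycle_edges, alternating)` differ: entropy ignores where genotypes sit on the graph, while edge depends on it.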

The edge of a population on a graph is the fraction of potential matings that are between organisms with different genotypes. Intuitively this fraction should be closely related to the fraction of heterogeneous crossover events.

The neutral selection behavior of a population is the way a population changes in the absence of any selective pressure. We approximate the average neutral selection behavior of edge and entropy for a population structure (graph) by repeating the following computation many times and averaging the results. Each vertex is initially assigned a distinct integer value (this is a population with maximal edge and entropy). We repeatedly generate mating events that consist of picking a vertex uniformly at random and copying onto it the integer currently on one of its neighbors, also selected uniformly at random. As this happens we compute the edge and entropy of the population as a function of the number of mating events. The averages approximate the way a population structure G permits populations living on it to evolve their entropy and edge under neutral selection. For a population structure G, denote the neutral selection entropy function NE_G(t) and the neutral edge function NQ_G(t), where t is the number of mating events.

A high edge value in a population usually implies that it is losing entropy (diversity) quickly. This is because crossover events between distinct types of parents often replace one of them with a child less distinct from the surviving parent. From this we see that these two qualities of a population structure, that it maintain high entropy and high edge over time, conflict with one another. We define the following summary statistics and compare them to performance on test problems, since it is problem dependent which measure is a more useful predictor of performance in a genetic algorithm using a given population structure. For all three statistics we take the time weighted average of the quality of interest, as preserving edge or entropy later in evolution is more impressive than preserving it early on. The entropy preservation of a population structure G is:

$$EP(G) = \int t \cdot \frac{NE_G(t)}{\log_2(|V(G)|)} \, dt. \tag{3}$$

The edge preservation of a population structure G is

$$QP(G) = \int t \cdot NQ_G(t) \, dt. \tag{4}$$

The mean quality of a population structure G is

$$MQ(G) = \int t \cdot \sqrt{\frac{NE_G(t)}{\log_2(|V(G)|)} \cdot NQ_G(t)} \, dt. \tag{5}$$
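The neutral selection computation and the three summary statistics can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the procedure used to produce the values reported later: the integrals of Equations 3 to 5 are replaced by sums over mating events with unit step, and the horizon `events`, the number of `runs`, and all function names are hypothetical choices, since the text does not state the limits of integration or the number of repetitions.

```python
import math
import random
from collections import Counter


def neutral_run(neighbors, events, rng):
    """One neutral-selection run; returns (normalized entropy, edge) per event."""
    n = len(neighbors)
    vertices = list(neighbors)
    edges = {frozenset((u, v)) for u in neighbors for v in neighbors[u]}
    color = {v: i for i, v in enumerate(vertices)}  # distinct integer per vertex
    history = []
    for _ in range(events):
        v = rng.choice(vertices)                    # vertex chosen uniformly
        color[v] = color[rng.choice(neighbors[v])]  # copy a uniformly chosen neighbor
        counts = Counter(color.values())
        ent = sum((a / n) * math.log2(n / a) for a in counts.values())
        unlike = sum(1 for e in edges if len({color[x] for x in e}) == 2)
        history.append((ent / math.log2(n), unlike / len(edges)))
    return history


def summary_statistics(neighbors, events, runs, seed=0):
    """Discrete approximations of EP(G), QP(G) and MQ(G)."""
    rng = random.Random(seed)
    ne = [0.0] * events  # averaged normalized entropy, NE_G(t) / log2 |V(G)|
    nq = [0.0] * events  # averaged edge, NQ_G(t)
    for _ in range(runs):
        for t, (e, q) in enumerate(neutral_run(neighbors, events, rng)):
            ne[t] += e / runs
            nq[t] += q / runs
    # Assumption: each integral becomes a sum over t = 1..events.
    ep = sum((t + 1) * ne[t] for t in range(events))
    qp = sum((t + 1) * nq[t] for t in range(events))
    mq = sum((t + 1) * math.sqrt(ne[t] * nq[t]) for t in range(events))
    return ep, qp, mq
```

Here `neighbors` maps each vertex to a list of its neighbors, for example `{i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}` for an eight-vertex cycle. Recomputing entropy and edge from scratch after every event is simple but slow; an incremental update would be preferable for graphs with 512 vertices.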

Division by log2(|V(G)|) normalizes entropy to have a maximum value of 1 no matter the number of vertices, |V(G)|, in the population structure. A logical question to ask is “even if preserving edge and entropy is good, why would the neutral selection versions matter?” There are two answers. First, we will present experiments to test their utility. Second, many evolving populations reach a point with little fitness variation. Neutral selection behavior may be a good model for this regime.

2.1 Desirability of Edge and Entropy Preservation
Edge and entropy are statistical surrogates for qualities that conventional wisdom holds to be worth preserving in a genetic algorithm. The entropy statistic is a surrogate for population diversity, borrowed from population biology. From an optimization perspective, a diverse population is one that is either not yet trapped in a local optimum or is scattered widely within an optimum and so more likely to escape it. It is clear that diversity preservation is valuable in problems with multimodal fitness landscapes. If we depart from optimization and examine adaptation there is an even more pressing need to preserve diversity. An algorithm with agents undergoing adaptation to changing conditions, be it other agents or a varying figure of merit imposed by the experimenter, can be viewed as having a moving fitness landscape. A picture of such a fitness landscape might be like the surface of an ocean on a choppy day. With a diverse population there are more likely to be individuals on or near what is presently a high spot.

The need for a high value of the edge preservation statistic is less clear, but may be seen by examining a situation in which crossover is of no worth. One such situation is a population in which most of the diversity is gone and individuals are very much like one another. Here, the typical crossover produces two children that look like their parents. This happens because the parents are identical or differ in only one location, with crossover deciding which child is a copy of which parent, but generating no new structures. In this case any effort spent manipulating structures to perform crossover is wasted. If we want crossover to be between genes that are different, then the preservation of edge is at least a step in the direction of keeping the rate of such crossovers relatively high over time.

Imagine that we have a graph with a high mean quality (Equation 5). Then something intrinsic in the topology of the graph keeps both edge and entropy at or above some minimal level over time; this is implicit in the definition of mean quality. Individuals with distinct genotypes are crossing over fairly often and no one genotype is coming to dominate the population too rapidly. This should then improve performance on problems where heterogeneous crossover and diversity preservation are valuable.

3 Definition of Graph Based Genetic Algorithms
A graph based genetic algorithm is a genetic algorithm run with a graphical population structure. We assume the reader to be familiar with the standard terminology of genetic algorithms as given in [5]. One individual is placed on each vertex of the graph. A type of steady state genetic algorithm [11, 12, 15] is used where evolution proceeds by individual mating events. A mating event is performed as follows. Pick a vertex v of the graph uniformly at random. A neighbor of v is chosen for mating with a fitness bias. Crossover and probabilistic mutation are used to produce a single new individual which may or may not be used to replace the individual on vertex v. The details of how the neighbor is picked for mating and how to decide if the new individual replaces the individual on vertex v are called the local mating rule of the graph based genetic algorithm. In this research the local mating rule will pick neighbors in direct proportion to their fitness (roulette selection) and let the new individual replace the old if it is at least as fit as the individual it replaces. We call this local mating rule locally elite roulette mating.

We report results on graphical genetic algorithms for three problems on ten graphs. See West [14] for definitions of all of the following graphs except the last. These graphs are all available by electronic mail, as lists of neighbors, from the first author. The graphs are as follows: the complete graph K512, the generalized Petersen graphs P256,1, P256,3, P256,17, the 4-neighbor tori T4,128, T8,64, T16,32, the cycle C512, the 9-dimensional hypercube H9, and a graph Z derived by the following procedure. Start with the complete bipartite graph K4,4. Simplexify every vertex (see Section 2). Repeat simplexification twice more. The graph K4,4 and the first step of this process are shown in Figure 2.

Figure 2: K4,4 and the result of simplexifying every vertex of K4,4.

All of these graphs contain 512 vertices. The complete graph was chosen as a baseline: a graph based genetic algorithm using the complete graph simulates a fairly standard genetic algorithm. The other graphs were chosen because either (i) they are the connection topology of a popular type of multiprocessor machine or (ii) they added to the diversity of the graphs tested as measured by our summary statistics. The summary statistics for these ten graphs are given in Figure 3 together with the graphs listed in increasing order for each statistic.
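A single mating event under locally elite roulette mating can be sketched as below. This Python fragment is illustrative only and makes several assumptions that the text leaves open: individuals are lists of bits as in the string evolver of Section 4.1, the child takes its outer segments from the individual on v, mutation is applied with a hypothetical probability `mutation_rate`, and a uniform choice is used when every neighbor has fitness zero. All function and parameter names are invented for the sketch.

```python
import random


def two_point_crossover(a, b, rng):
    """Child keeps a's characters outside [i, j) and takes b's inside."""
    i, j = sorted(rng.sample(range(len(a) + 1), 2))
    return a[:i] + b[i:j] + a[j:]


def mutate(s, rng):
    """Flip one uniformly chosen bit of the bit list s."""
    k = rng.randrange(len(s))
    return s[:k] + [1 - s[k]] + s[k + 1:]


def mating_event(neighbors, population, fitness, rng, mutation_rate=0.5):
    """One steady-state mating event; returns the vertex that was visited."""
    v = rng.choice(list(neighbors))                # vertex chosen uniformly
    nbrs = neighbors[v]
    weights = [fitness(population[u]) for u in nbrs]
    if sum(weights) > 0:
        mate = rng.choices(nbrs, weights=weights, k=1)[0]  # roulette selection
    else:
        mate = rng.choice(nbrs)                    # assumption: all fitnesses zero
    child = two_point_crossover(population[v], population[mate], rng)
    if rng.random() < mutation_rate:               # probabilistic mutation
        child = mutate(child, rng)
    if fitness(child) >= fitness(population[v]):   # local elitism
        population[v] = child
    return v
```

Because replacement happens only when the child is at least as fit as the current occupant, the fitness stored at any vertex never decreases, which is the sense in which the rule is locally elite.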

G         EP(G)    QP(G)    MQ(G)
K512      79281    41396    152588
P256,1    69518    177679   27408
P256,3    87384    128892   59312
P256,17   103432   84733    126898
T4,128    81037    140117   47044
T8,64     94873    103610   86881
T16,32    102961   86249    123394
C512      59703    213902   16829
H9        11957    7550     19005
Z         98851    127435   76807

Graphs in increasing order of each statistic:
EP(G): H9; C512; P256,1; K512; T4,128; P256,3; T8,64; Z; T16,32; P256,17
QP(G): H9; K512; P256,17; T16,32; T8,64; Z; P256,3; T4,128; P256,1; C512
MQ(G): C512; H9; P256,1; T4,128; P256,3; Z; T8,64; T16,32; P256,17; K512

Figure 3: Summary statistics for graphical connection topologies and graphs ordered by the statistics in increasing order.
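The construction of the graph Z in Section 3 can be checked with a short sketch of simplexification. The Python below is an illustration under our own conventions, not the authors' code: graphs are dictionaries mapping each vertex to a set of neighbors, each new vertex is labeled by the hypothetical pair `(v, w)` of the replaced vertex and the former neighbor it stays attached to, and the function names are invented here.

```python
def simplexify(adj, v):
    """Return a new graph in which vertex v is replaced by a clique."""
    new = {u: set(ns) for u, ns in adj.items() if u != v}
    clones = {w: (v, w) for w in adj[v]}  # one new vertex per former neighbor
    for w, c in clones.items():
        new[w].discard(v)
        new[w].add(c)                     # c is adjacent to exactly one old neighbor
        new[c] = {w} | {d for d in clones.values() if d != c}
    return new


def simplexify_all(adj):
    """Simplexify every vertex that is present when the call starts."""
    for v in list(adj):
        adj = simplexify(adj, v)
    return adj


def complete_bipartite(m, n):
    """Adjacency sets of K_{m,n} on vertices 0..m+n-1."""
    left, right = range(m), range(m, m + n)
    adj = {u: set(right) for u in left}
    adj.update({w: set(left) for w in right})
    return adj


def build_z():
    """K4,4 with every vertex simplexified, three times in total."""
    g = complete_bipartite(4, 4)
    for _ in range(3):
        g = simplexify_all(g)
    return g
```

Each round replaces every vertex of degree four by four vertices of degree four, so the vertex count grows from 8 to 32, 128 and finally 512, which agrees with the statement that all ten graphs have 512 vertices.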

4 Experimental Design and Results
For each test problem we give a definition of a successful individual. For each graph we run a graph based genetic algorithm 200 times, saving the number of mating events until the first successful individual appears in each run. We treat the fraction of populations that contain a successful individual after k mating events as a Bernoulli variable with probability p_k of success. For each graph we compute the minimum number of mating events k for which half the populations associated with that graph contain a successful individual. For each other graph we then use a normal approximation to the binomial distribution to construct a 95% confidence interval for p_k. Other graphs are judged significantly worse as population structures for the test problem if this confidence interval’s upper bound is less than 0.5. A table of these upper bounds is supplied with each experiment. For details of the method of constructing this type of confidence interval see Larsen and Marx [9].

4.1 Trivial String Evolver Experiment
The first experiment operates on twenty character binary strings. The initial population consists of strings chosen uniformly at random and the fitness of a string is the number of ones in the string. This problem is easy, high-dimensional, and unimodal. A successful individual in this experiment is one composed entirely of ones. The graph based genetic algorithm uses two point crossover, and mutation consists of changing one character, selected uniformly at random, from zero to one or one to zero. The confidence interval upper bounds for this experiment are given in Figure 4.

4.2 Plus-One-Recall-Store Experiments
This experiment tests the value of graph based genetic algorithms for genetic programming [7, 8]. The test problem, called the plus-one-recall-store (PORS) efficient node use problem, is to find parse trees in a very limited language that, when executed, generate the largest integer result possible given a fixed maximum number of parse tree nodes. The language has two operations, integer addition and a store that places its argument in an external memory location, and two terminals, the integer 1 and recall from an external memory. This test problem is described in detail in [3, 4]. The difficulty of the PORS efficient node use problem varies very strongly according to the upper bound on the number of nodes modulo three. We ran experiments on n = 15 nodes, the hardest of the three classes. In this set of experiments the initial population was made of randomly generated trees with n nodes. A successful individual was defined to be a tree that produces the largest possible number (these numbers are computed in [4]). Crossover was performed by the usual subtree exchange [8]. If this produced a tree with more than n nodes then a subtree of the root node iteratively replaced the root node until the tree had fewer than n nodes. We term this operation chopping. Mutation was performed by replacing a subtree chosen uniformly at random with a new random subtree of the same size, with probability 0.2 for each new tree produced. The confidence interval upper bounds for this experiment are given in Figure 5.

4.3 Tartarus String Experiments
The Tartarus problem was first posed by Astro Teller [13]. A small autonomous robot is placed in a six-by-six grid.
The robot occupies one grid square and may turn right, turn left, or attempt to advance during each time step. Six boxes, each filling a grid square, are placed away from the edges of the grid with no two-by-two region containing four boxes. (The robot heading, position, and box positions at the start of a test define a “board”.) The robot may advance into an empty square or push one box, but two boxes or a wall stop the robot. The robot’s task is to push boxes into corners (worth two points) or against a wall (worth one point). The robot is given eighty time steps and then its score is assessed for a given starting configuration. Teller and we both sum fitnesses over 40 boards to approximate the quality of a robot controller.

Research on baselines for the Tartarus problem reported in [2] indicates that non-reactive controllers with memory, implemented as strings of moves, can obtain fitness comparable to that obtained by Teller’s reactive GP-tree controllers with memory. In this experiment we use strings of sixteen moves (left, right, forward) which the algorithm cycles through five times to obtain 80 moves. The termination requirement, i.e. the definition of a successful individual, is modified to require the average per-board fitness of the entire population to be greater than 3.0 (about 60% of the maximum known fitness for string controllers of this type) for forty boards. The requirement that success is demonstrated over long time scales is necessary due to the stochasticity of the fitness function. The set of forty boards used in a fitness evaluation is generated uniformly at random. Every 256 mating events (half the population size) the set of forty test boards is replaced with a new, randomly generated set of forty. The fitness test thus samples the space of possible boards in a manner consistent with a steady state genetic algorithm. We use two point crossover of the strings of moves and a probabilistic mutation that changes between zero and three positions in the string to a new randomly generated action. The confidence interval upper bounds for this experiment are given in Figure 6.

          H9     C512   Z      K512   P256,17 P256,3 P256,1 T16,32 T4,128 T8,64
H9        .      *0.16  *0.40  *0.29  *0.34   *0.36  *0.43  *0.50  *0.45  *0.48
C512      0.99   .      0.94   0.69   0.92    0.92   0.95   0.97   0.95   0.97
Z         0.76   *0.26  .      *0.38  *0.48   0.55   0.61   0.69   0.58   0.65
K512      0.95   *0.45  0.84   .      0.79    0.83   0.84   0.90   0.85   0.90
P256,17   0.83   *0.29  0.64   *0.43  .       0.63   0.71   0.75   0.67   0.74
P256,3    0.78   *0.27  0.60   *0.38  *0.50   .      0.64   0.70   0.61   0.68
P256,1    0.72   *0.24  0.55   *0.36  *0.47   0.51   .      0.65   0.54   0.62
T16,32    0.64   *0.17  *0.45  *0.32  *0.39   *0.43  *0.48  .      *0.47  0.54
T4,128    0.74   *0.25  0.55   *0.37  *0.47   0.54   0.59   0.68   .      0.64
T8,64     0.68   *0.20  *0.50  *0.34  *0.42   *0.45  0.52   0.61   0.51   .
* Values less than 0.5 indicate the graph indexing the column performed significantly worse than the graph indexing the row.

Figure 4: Confidence interval upper bounds for trivial string evolver experiment.

          H9     C512   Z      K512   P256,17 P256,3 P256,1 T16,32 T4,128 T8,64
H9        .      0.59   0.67   0.65   0.65    0.64   0.75   0.67   0.64   0.66
C512      0.51   .      0.62   0.59   0.63    0.59   0.72   0.65   0.60   0.61
Z         *0.47  0.52   .      0.52   0.58    0.54   0.64   0.59   0.56   0.55
K512      0.51   0.55   0.61   .      0.61    0.57   0.69   0.63   0.59   0.59
P256,17   *0.47  0.51   0.56   0.51   .       0.53   0.63   0.58   0.56   0.54
P256,3    *0.49  0.54   0.60   0.55   0.59    .      0.68   0.62   0.57   0.57
P256,1    *0.42  *0.46  *0.50  *0.42  0.52    *0.49  .      *0.49  *0.48  *0.49
T16,32    *0.46  *0.50  0.55   *0.49  0.56    0.52   0.62   .      0.54   0.52
T4,128    *0.47  0.53   0.59   0.53   0.58    0.55   0.67   0.61   .      0.57
T8,64     *0.47  0.52   0.58   0.53   0.58    0.55   0.65   0.60   0.56   .
* Values less than 0.5 indicate the graph indexing the column performed significantly worse than the graph indexing the row.

Figure 5: Confidence interval upper bounds for the PORS-efficient node use experiment on n = 15 node trees.
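The significance test described at the start of Section 4 can be sketched as follows. This Python fragment is a hedged illustration rather than the authors' procedure: it assumes a two-sided 95% interval with the textbook value z = 1.96, assumes that runs which never succeed are recorded as `math.inf`, and uses hypothetical function names. See Larsen and Marx [9] for the construction actually used.

```python
import math


def median_time(times):
    """Smallest k by which at least half of the runs have succeeded."""
    ordered = sorted(times)
    return ordered[(len(ordered) + 1) // 2 - 1]


def upper_bound(times_other, k, z=1.96):
    """Upper end of the normal-approximation interval for p_k on another graph."""
    n = len(times_other)
    p_hat = sum(1 for t in times_other if t <= k) / n  # fraction solved by event k
    return p_hat + z * math.sqrt(p_hat * (1.0 - p_hat) / n)


def significantly_worse(times_row, times_column):
    """True if the column graph is judged worse than the row graph."""
    k = median_time(times_row)
    return upper_bound(times_column, k) < 0.5
```

With 200 runs per graph, a graph that has solved exactly half of its runs by event k yields an upper bound of roughly 0.57, so a column graph is flagged only when its observed success fraction at k falls clearly below one half.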

5 Conclusions
The first analysis we performed on the significance data in Figures 4, 5, and 6 was to partially order the graphs by relative performance advantage. In the trivial string evolver experiment 36 of the 81 pairs of graphs exhibited a statistically significant difference in performance. The partial order segregates the graphs nicely as follows: H9 ≥ T16,32, T8,64 ≥ P256,1, T4,128, P256,3, Z ≥ P256,17 ≥ K512 ≥ C512. Figure 7 shows this relationship graphically with an arrow indicating statistically significant dominance.

Figure 7: Graph dominance figure for trivial strings.

          H9     C512   Z      K512   P256,17 P256,3 P256,1 T16,32 T4,128 T8,64
H9        .      *0.01  *0.14  0.74   *0.20   *0.21  *0.10  *0.33  *0.29  *0.33
C512      1.00   .      1.00   1.00   1.00    0.99   0.99   1.00   1.00   1.00
Z         0.99   *0.10  .      1.00   0.65    0.61   0.54   0.81   0.84   0.83
K512      *0.40  *0.00  *0.07  .      *0.06   *0.12  *0.01  *0.18  *0.17  *0.19
P256,17   0.97   *0.05  *0.47  0.98   .       0.55   *0.42  0.70   0.71   0.74
P256,3    0.98   *0.07  0.51   0.99   0.62    .      *0.48  0.76   0.78   0.80
P256,1    1.00   *0.12  0.62   1.00   0.72    0.69   .      0.86   0.87   0.85
T16,32    0.90   *0.04  *0.38  0.95   *0.45   *0.42  *0.28  .      0.61   0.60
T4,128    0.90   *0.04  *0.38  0.95   *0.45   *0.42  *0.28  0.60   .      0.60
T8,64     0.90   *0.04  *0.38  0.95   *0.45   *0.42  *0.28  0.60   0.61   .
* Values less than 0.5 indicate the graph indexing the column performed significantly worse than the graph indexing the row.

Figure 6: Confidence interval upper bounds for the string based Tartarus controller experiments.

In the PORS experiment only 15 of 81 pairs of graphs exhibited a performance advantage and the partial order indicated unambiguous dominance by P256,1 but otherwise gave a somewhat ragged ranking of the graphs: P256,1 ≥ P256,17 ≥ T4,128, T8,64, T16,32, P256,3 ≥ K512, H9, C512. Figure 8 shows this relationship graphically with an arrow indicating statistically significant dominance.

Figure 8: Graph dominance figure for PORS.

The Tartarus experiment yielded 39 of 81 pairs of graphs with a statistically significant performance advantage. The partial order gave a firm segregation of the graphs in the following fashion: K512 ≥ H9 ≥ T4,128, T8,64, T16,32 ≥ P256,3, P256,17 ≥ P256,1, Z ≥ C512. Figure 9 shows this relationship graphically with an arrow indicating statistically significant dominance.

Figure 9: Graph dominance figure for Tartarus.

Comparing the partial orders of graphs by relative performance with the summary statistics in Figure 3 we find that preservation of diversity as estimated by EP(G) had little predictive value, that the mean quality statistic MQ(G) had little predictive value, but that edge preservation was somewhat predictive of success in the trivial string evolver and strongly predicted success in the Tartarus experiments. Either our measure of diversity preservation is not a good one or diversity preservation is not an issue for the test problems and population sizes we have presented. The graph-theoretic intuition of the authors is that more subtle measures of graphical connectivity may be the key predictor of performance. In any case we have some support for Ackley and Littman’s hypothesis that having lots of boundary area between regions with distinct genotypes is good.

To finish our conclusions on a positive note, we can report a very strong effect from changing population structures. While we at present have little of worth to say about which population structures are good for what problem (the sum total of our general results is that C512 is bad), we see that (i) changing the population structure has a significant effect on solution time and (ii) different population structures are good for different problems. Note, for example, the positions of the nine-hypercube H9 and the complete graph K512 in each of the three test problems.

6 Discussion and Future Work
An important point to make about the graph based genetic algorithm is that when use of a given graphical connection topology helps, it does so with little run time cost. The cost of locating the good graph is paid before the experimenter runs the genetic algorithm. Only the overhead for maintaining the list of graphical connections is charged to the graph based genetic algorithm’s run time. The crop of experiments reported here shows that graphical connection topologies can help but gives only a modest amount of help in choosing them (the edge preservation statistic, Equation 4, has a limited, empirical correlation with performance on two of three test problems). Knowing that the choice of connection topology is important, it remains to figure out good predictors of what graphical topologies will help with which sorts of problems.

In addition to trying graph based genetic algorithms on a wider variety of problems (and on harder problems) there are a number of other issues that need to be examined. It is intuitive that the local mating rule is important and we have tested exactly one, highly elitist, local mating rule. Experiments for well-mixed populations that speak to optimum allocation of breeding trials, mutation rates, and crossover rates need to be repeated in the graph based context to see if the change in population structure modifies results. The experimental results from this paper and the ordering of graphs they induce are suggestive and far from conclusive: a much larger number of graphs should be studied. Researchers wishing to use the graph based genetic algorithm code from this paper as a starting point should contact the first author.

Another, more difficult, direction for future research involves locating good graphical topologies. If a good predictor of performance of graphical population structures is available for a class of problems then a genetic algorithm or other optimization method could be used to locate good graphs. It also might be possible to vary the local structure of the graph and so locate good connection topologies in an online manner during the execution of a genetic algorithm. This second suggested direction is a meta-algorithm, a notoriously tricky and time-consuming effort.

Acknowledgments
The authors would like to thank the Iowa Center for Emerging Manufacturing Technology for their hardware and logistical support of this research. We also would like to thank the referees for their proofreading and helpful suggestions.

Bibliography
[1] D. L. Ackley and M. L. Littman. A case for distributed Lamarckian evolution. Working Paper, Cognitive Science Research Group, Bellcore, New Jersey, 1992.
[2] Dan Ashlock and Mark Joenks. ISAc lists, a different representation for program induction. In Genetic Programming 98, Proceedings of the Third Annual Genetic Programming Conference, pages 3–10, San Francisco, 1998. Morgan Kaufmann.
[3] Dan Ashlock and James I. Lathrop. An arithmetic test suite for genetic programming. ISU Applied Mathematics Report AM98-1, 1998.
[4] Dan Ashlock and James I. Lathrop. A fully characterized test suite for genetic programming. In Evolutionary Programming VII, pages 537–546, New York, 1998. Springer.
[5] David E. Goldberg. Genetic Algorithms in Search, Optimization, and Machine Learning. Addison-Wesley Publishing Company, Inc., Reading, MA, 1989.
[6] Motoo Kimura and James Crow. On the maximum avoidance of inbreeding. Genetical Research, 4:399–415, 1963.
[7] Kenneth Kinnear. Advances in Genetic Programming. The MIT Press, Cambridge, MA, 1994.
[8] John R. Koza. Genetic Programming. The MIT Press, Cambridge, MA, 1992.
[9] Richard J. Larsen and Morris L. Marx, editors. An Introduction to Mathematical Statistics and Its Applications. Prentice-Hall Inc., 1981.
[10] Heinz Mühlenbein. Darwin’s continent cycle theory and its simulation by the prisoner’s dilemma. Complex Systems, 5:459–478, 1991.
[11] Craig Reynolds. An evolved, vision-based behavioral model of coordinated group motion. In Jean-Arcady Meyer, Herbert L. Roitblat, and Stewart Wilson, editors, From Animals to Animats 2, pages 384–392. MIT Press, 1992.
[12] Gilbert Syswerda. A study of reproduction in generational and steady state genetic algorithms. In Foundations of Genetic Algorithms, pages 94–101. Morgan Kaufmann, 1991.
[13] Astro Teller. The evolution of mental models. In Kenneth Kinnear, editor, Advances in Genetic Programming, chapter 9. The MIT Press, 1994.
[14] Douglas B. West. Introduction to Graph Theory. Prentice Hall, Upper Saddle River, NJ 07458, 1996.
[15] Darrel Whitley. The genitor algorithm and selection pressure: why rank based allocation of reproductive trials is best. In Proceedings of the 3rd ICGA, pages 116–121. Morgan Kaufmann, 1989.
[16] Sewall Wright. Evolution. University of Chicago Press, 1986. Edited and with introductory materials by William B. Provine.