Parallel Genetic Algorithms and Sequencing Optimisation

Mini-project report

Mariusz Nowostawski
[email protected]
Supervisor: Dr. William B. Langdon
School of Computer Science, The University of Birmingham, UK
Date: 23 May 1998
Revised version

Abstract

Genetic Algorithms are powerful search techniques that are used to solve problems in many disciplines. Parallel Genetic Algorithms promise gains in performance and scalability, and can be easily implemented and applied to a number of problems. We define and discuss a parallel genetic algorithm for sequencing problems, based on the dynamic demes paradigm. The main motivation for this project is to prepare and evaluate a parallel GA for the optimisation of sequencing scheduling problems represented as permutations. There is also a description of a general purpose parallel GA library developed on the basis of the parallel GA algorithm with dynamic demes. A number of tests and a promising application to the South Wales problem show that this approach can be successfully applied to scheduling problems.


This work is dedicated to my brother, sister and to my love, Ula.

Contents

1 Introduction
2 Parallel Genetic Algorithms
  2.1 Genetic Algorithms
  2.2 Parallel Processing
    2.2.1 General information
    2.2.2 Heterogeneous computer networks
    2.2.3 Distributed parallel technologies
    2.2.4 Distributed object technologies
    2.2.5 Technology for Parallel Genetic Algorithms
  2.3 Genetic Algorithm Distribution
    2.3.1 Global parallelisation
    2.3.2 Coarse grained
    2.3.3 Fine grained
    2.3.4 Hybrid methods
    2.3.5 Combined method
3 Dynamic Demes
  3.1 Ideas, purpose & features
  3.2 General description
  3.3 Processes overview
  3.4 MPGA library
4 Evaluation and Tests
  4.1 Sequencing problems
  4.2 GA operators comparison
    4.2.1 Selection
    4.2.2 Crossover
    4.2.3 Mutation
    4.2.4 Dynamic Demes evaluation
  4.3 Parallel timing
  4.4 South Wales National Grid problem
  4.5 Other tests
5 Conclusions
6 Future work
  6.1 MPGA extensions
  6.2 Combinatorial GA problems
A MPGA User Guide (short version)

List of Figures

1  Dynamic demes control flow.
2  Some of the sequential crossovers.
3  CX crossover, best fitness.
4  CX crossover, average fitness.
5  CX crossover, general performance.
6  PMX and MPMX comparison (m=0.01, c=0.95).
7  SP mutation, general performance.
8  Dynamic demes, quality comparison.
9  PVM process timing details (XPVM).
10 MPGA time performance.
11 Dynamic demes, general performance.

1 Introduction

This is preliminary work on sequencing scheduling problems using a parallel genetic algorithm. The research was carried out concurrently in two different areas.

The first aim of this work is to investigate different genetic strategies for scheduling and timetabling problems. The GA approach is based on a permutation representation of some units, usually referred to as direct coding. There is a comparison of different combinatorial operators, and tests with different probability factors. New modifications to an existing crossover operator are proposed, called MPMX (Modified Partially Matched Crossover), together with modifications of a simple combinatorial mutation (swap of two genes), called Element Position Mutation (EP Mutation) and Swap neighbouring Pair Mutation (SP Mutation).

The second aim is to investigate different methods of parallelising GAs. A parallel algorithm with dynamic demes is proposed, which is promising in the sense that, for almost no loss of quality, we gain the scalability and flexibility of parallel processing of a single population of individuals.

The main objective of the implementation was to develop a scalable and fault-tolerant parallel algorithm which can be applied to scheduling problems. A functionally complete library has been developed, based on PVM and the dynamic demes paradigm, with implementations of the existing and new GA operators. The library is publicly available. On the basis of the library, tests were done with simple "toy" problems and with real problems like the National Grid problem. The main investigation was done on sequencing problems, with attention paid to the permutation representation.

2 Parallel Genetic Algorithms

2.1 Genetic Algorithms

Genetic Algorithms are powerful general purpose optimisation tools which model the principles of evolution (Holland 1992). They operate on a population of coded solutions which are selected according to their quality, and then used as the basis for a new generation of solutions, found by combining some features of the current individuals or by direct random changes within the genotype (Goldberg 1989a), (Goldberg 1989b). In this work attention is concentrated on direct coding without a phenotype level, with genes represented as a permutation.
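To make this cycle concrete, the sketch below shows one generational step of a simple GA on bit strings. This is illustrative only, not MPGA code; the toy fitness, binary tournament selection and one-point crossover are our own choices.

    // One generational cycle of a simple GA on bit strings (sketch).
    #include <numeric>
    #include <random>
    #include <vector>

    using Genotype = std::vector<int>;                   // 0/1 genes
    struct Individual { Genotype genes; double fitness; };

    double evaluate(const Genotype& g) {                 // toy fitness: number of ones
        return std::accumulate(g.begin(), g.end(), 0.0);
    }

    int select(const std::vector<Individual>& pop, std::mt19937& rng) {
        std::uniform_int_distribution<int> pick(0, (int)pop.size() - 1);
        int a = pick(rng), b = pick(rng);                // binary tournament
        return pop[a].fitness >= pop[b].fitness ? a : b;
    }

    void step(std::vector<Individual>& pop, double pMut, std::mt19937& rng) {
        std::uniform_real_distribution<double> u(0.0, 1.0);
        std::uniform_int_distribution<int> cut(1, (int)pop[0].genes.size() - 1);
        std::vector<Individual> next;
        while (next.size() < pop.size()) {
            const Genotype& a = pop[select(pop, rng)].genes;
            const Genotype& b = pop[select(pop, rng)].genes;
            Genotype child(a.begin(), a.begin() + cut(rng));   // one-point crossover
            child.insert(child.end(), b.begin() + child.size(), b.end());
            for (int& gene : child)
                if (u(rng) < pMut) gene ^= 1;                  // point mutation
            next.push_back({child, evaluate(child)});
        }
        pop.swap(next);
    }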

2.2 Parallel Processing

2.2.1 General information

Parallel processing is not a new idea. It was established as a computational method not long after the first sequential machines appeared on the market. The von Neumann approach, based on a sequence of actions which run one after another, represents the classical sequential method of programming machines. The parallel extension is based on concurrency and is better represented as a directed graph than as a single sequence or tree of actions. Parallel processing is a straightforward and efficient method of solving difficult and complicated problems by dividing them into small tasks which can be solved concurrently. This method of processing is likely to become very popular in the near future; even now we can observe growing interest in parallel processing. It is used both in academic and research centres and in commercial applications. The reasons are the need for very fast processing, lower costs, scalability and performance.

Parallel processing is classified into two categories: SIMD and MIMD. Another possible point of view is heterogeneity: parallel processing can then be divided into two sub-areas, massively parallel processors (MPP) and distributed computing. MPPs are currently the most advanced and powerful digital machines in the world. Such machines contain from several to several thousand processors (processing elements, PEs) in one uniform device, connected to gigabytes of memory. MPPs can offer very large computational capacity, and they are used to solve so-called "hard problems", like weather forecasting, climate modelling, chemical drug modelling etc. (Kozielski and Szczerbiski 1994), (Almasi and Gottlieb 1989).

One direction of parallel processing is parallel distributed computing. Briefly speaking, the main idea is to use a set of machines within a network to work together on one particular problem. The idea of computers connected together to share resources is very popular, and it has become something of a phenomenon of current computing technology. The infrastructure for distributed processing is huge: almost every company has its own computer network, which can be used for intensive computations in idle time. In the case of distributed parallel processing, an end-user in fact uses existing infrastructure and processing resources, for which one does not have to pay additionally. Those resources are usually not fully used under normal conditions (e.g. at night, PCs used for word-processing, workstations after working hours etc.) and thus there is a great opportunity to use them as computational resources.

Generally there are two main programming models for applying parallelism: shared memory and message passing. In the first case, each PE (processing element) can access the same part of memory independently, and thereby exchange information. It does not really matter how this is implemented: it can be the same physical or virtual uniform address space, the memory can be local, or it can be a distributed shared-memory model. The important thing is that the program is executed similarly to sequential processing, with direct memory access. In the second case each piece of information has to be sent explicitly, and information exchange and processing control are done by means of messages (Hwang and Briggs 1985). Designing a parallel algorithm is not an easy task; however, there is a collection of existing parallel versions of popular algorithms which can help when transforming sequential programs into parallel versions (Gibbons and Rytter 1988).

2.2.2 Heterogeneous computer networks

This research concentrates on distributed processing across heterogeneous computer networks. Many large computational problems can be solved more cost-effectively by using the aggregate power and memory of many computers. However, this also raises specific problems and difficulties. In an MPP every PE has the same speed, resources, kind of software, platform and architecture. In a heterogeneous computer network the resources can vary. One can characterise different aspects of heterogeneity:

- Architecture
- Data format
- Processing speed
- Machine load
- Network load

A set of machines in the network can contain many different architectures. Each of those architectures has its own optimal parallel programming paradigm, binary format, and resource allocation subsystem. Packages for parallel distributed processing have to support conversions and compatibility at the lower level. It is very convenient if the package supports dynamic load balancing, process monitoring, and dynamic process spawning (for fault tolerance). In the following section we will discuss different existing technologies for this purpose.

2.2.3 Distributed parallel technologies

PVM (Parallel Virtual Machine). This is a software package that permits a heterogeneous collection of Unix computers, hooked together by a network, to be used as a single large parallel computer. The software is very portable. The source, which is available free through netlib (http://www.netlib.org/pvm3/index.html), can be compiled on almost every Unix operating system (Geist, Kohl and Papadopoulos 1996). The individual computers may be shared- or local-memory multiprocessors, vector supercomputers, specialised graphics engines, standard PCs or scalar workstations, interconnected by a variety of networks, such as Ethernet, FDDI, etc. PVM support software executes on each machine in a user-configurable pool, and presents a unified, general, and powerful computational environment for concurrent applications. The library itself is written in C, and user programs can be written in C, C++, Fortran or Java (through native methods). User programs are given access to PVM through calls to PVM library routines for functions such as process initiation, message transmission and reception, and synchronisation via barriers or rendezvous. Users may optionally control the execution location of specific application components. The PVM system transparently handles message routing, data conversion between incompatible architectures, and other tasks that are necessary for operation in a heterogeneous network environment. PVM is particularly effective for heterogeneous applications that exploit the specific strengths of individual machines on a network. As a loosely coupled concurrent supercomputer environment, PVM is a viable scientific computing platform. There is also a graphical analyser and debugging tool, called XPVM (Kohl and Geist 1996).
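To give a flavour of the programming model, here is a minimal master-side sketch of a PVM 3 exchange. It is our example, not part of the report's library, and it assumes an executable called "worker" has been installed where pvmd can find it.

    /* Minimal PVM 3 master: spawn one worker, send an int, read the reply. */
    #include <stdio.h>
    #include "pvm3.h"

    int main(void) {
        int tid;                            /* task id of the spawned worker */
        int n = 42, result;
        pvm_spawn("worker", NULL, PvmTaskDefault, "", 1, &tid);
        pvm_initsend(PvmDataDefault);       /* XDR encoding: portable across */
        pvm_pkint(&n, 1, 1);                /* heterogeneous architectures   */
        pvm_send(tid, 1);                   /* message tag 1: the request    */
        pvm_recv(tid, 2);                   /* block on tag 2: the reply     */
        pvm_upkint(&result, 1, 1);
        printf("worker returned %d\n", result);
        pvm_exit();
        return 0;
    }
    /* The worker mirrors this: pvm_recv(pvm_parent(), 1); pvm_upkint(...);
       compute; pvm_initsend(PvmDataDefault); pvm_pkint(...);
       pvm_send(pvm_parent(), 2); pvm_exit(). */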

MPI (Message Passing Interface). This is a standard for a message passing architecture, which was specified in April 1994. The goal of MPI, simply stated, is to provide a widely used standard for writing message-passing programs. As such, the interface attempts to establish a practical, portable, efficient, and flexible standard for message passing. The standard is the outcome of a project aimed at developing the syntax and semantics of message-passing libraries. It was necessary to introduce such a specification for programmers and designers developing parallel software for MPPs. The greatest advantage of message passing is portability. MPI is not an integrated development environment for parallel software; there is no initialising layer (process initialisation), nor a virtual machine. MPI should rather be treated as an implementation of one of the application layers, namely the layer responsible for message passing. The main advantages of establishing a message-passing standard are portability and ease of use. In a distributed memory communication environment, in which the higher level routines and/or abstractions are built upon lower level message passing routines, the benefits of standardisation are particularly apparent. Furthermore, the definition of a message passing standard provides vendors with a clearly defined base set of routines that they can implement efficiently, or in some cases provide hardware support for, thereby enhancing scalability.

2.2.4 Distributed object technologies

DCOM (Distributed Component Object Model) is the distributed extension to COM (Component Object Model) (WWW 1995) that builds an object remote procedure call (ORPC) layer on top of RPC (Remote Procedure Call) (WWW 1998a) to support remote objects. A COM server can create object instances of multiple object classes. A COM object can support multiple interfaces, each representing a different view or behaviour of the object. An interface consists of a set of functionally related methods. A COM client interacts with a COM object by acquiring a pointer to one of the object's interfaces and invoking methods through that pointer, as if the object resided in the client's address space. Since the specification is at the binary level, it allows integration of binary components possibly written in different programming languages such as C++, Java and Visual Basic. With early versions of COM, objects were accessible only to clients residing on the same machine. The enhanced version of COM, called DCOM, allows COM objects to be accessed across a network.

CORBA (Common Object Request Broker Architecture) is a distributed object framework proposed by a consortium of over 700 companies called the Object Management Group (OMG). The core of the CORBA architecture is the Object Request Broker (ORB), which acts as the object bus over which objects transparently interact with other objects located locally or remotely (WWW 1998b). A CORBA object is represented to the outside world by an interface with a set of methods. A particular instance of an object is identified by an object reference. The client of a CORBA object acquires its object reference and uses it as a handle to make method calls, as if the object were located in the client's address space. The ORB is responsible for all the mechanisms required to find the object's implementation, prepare it to receive the request, communicate the request to it, and carry the reply (if any) back to the client. The object implementation interacts with the ORB through either an Object Adapter (OA) or the ORB interface.

RMI (Remote Method Invocation) is part of Java from JDK 1.1. It supports seamless remote invocation of objects in different virtual machines, with callbacks from servers to applets. Moreover, RMI integrates the distributed object model into the Java language in a natural way, while retaining most of the Java object semantics and keeping the differences between the distributed and the local Java object models apparent. RMI enables developers of distributed Java applications to treat remote objects and their methods very much like normal Java objects. Java RMI brings a new level of functionality to distributed programs, with features like distributed, automatic management of objects and passing objects themselves from machine to machine over the network (http://www.javasoft.com).

The object-oriented frameworks provide a client-server type of communication. In DCOM and CORBA, to request a service a client invokes a method implemented by a remote object, which acts as the server in the client-server model. The service provided by the server is encapsulated as an object, and the interface of an object is described in an Interface Definition Language (IDL). The interfaces defined in an IDL file serve as a contract between a server and its clients. Clients interact with a server by invoking methods described in the IDL. The actual object implementation is hidden from the client. Some object-oriented programming features are present at the IDL level, such as data encapsulation, polymorphism and single inheritance. CORBA also supports multiple inheritance at the IDL level, but DCOM does not; instead, the notion of an object having multiple interfaces is used to achieve a similar purpose in DCOM. CORBA IDL can also specify exceptions. In both DCOM and CORBA, the interactions between a client process and an object server are implemented as object-oriented RPC-style communications.

2.2.5 Technology for Parallel Genetic Algorithms

All the technologies described above can be successfully used as the core mechanism in designing a distributed parallel genetic algorithm. However, a GA as such can be implemented at a very low level, and it is rather inefficient to decompose a GA into a client-server-like algorithm. On the other hand, a parallel virtual machine can release the designer from many tasks which would otherwise have to be implemented from scratch with the other techniques. This covers mechanisms like process monitoring, load balancing, fault tolerance, and debugging during the development stage. We have decided to implement the parallel library on the basis of PVM (Parallel Virtual Machine), because of the following features:

- good portability (over 40 different architectures are supported)
- reliability and monitoring incorporated within PVM
- process initialisation and control
- simple load balancing
- easy implementation of dynamic load balancing possible
- the advantages of C and C++

PVM allows the executables to be run concurrently on uniprocessors, multiprocessors, and workstation networks. PVM also supports process management with the facilities of a virtual machine (dynamic resource balancing).

2.3 Genetic Algorithm Distribution

Nowadays Parallel Genetic Algorithms (PGAs) are often studied, but the history of parallel implementations of GAs is not long. Researchers have carried out mainly empirical work, and a lack of theory can be observed. We can find many different PGA approaches in the literature, but it is impossible to compare them, mainly because they are designed for different tasks, use different operators and run in different environments. Cantu-Paz (1995) proposed a categorisation of the known parallelisation techniques. The first technique is the global parallelisation approach. This means that all genetic operators and the evaluation of all individuals are explicitly parallelised. Such an implementation is easy; it does not require any modification of the classical GA. The coarse grained PGA is another approach. We can call a process coarse grained if the ratio of the time needed for computation to the time needed for communication between processors is high. Such parallelism requires a division of the population into some number of demes (subpopulations). Demes are separated from one another (`geographic isolation'), and individuals compete only within a deme. An additional operator called migration is introduced: from time to time, some individuals are moved from one deme to another. If individuals can migrate to any other deme, this is referred to as an island model. If individuals can migrate not to any deme, but only to a neighbouring one, we say that it is a stepping stone model. Fine grained PGAs require a large number of processors, because the population is divided into a large number of small demes, each deme is processed separately, and communication happens through migration.

2.3.1 Global parallelisation

In this type of parallel GA there is only one population, as in the serial GA, but the evaluation of individuals and the genetic operators are parallelised explicitly. Since there is only one population, selection considers all the individuals and every individual has a chance to mate with any other, i.e. there is random mating, and therefore the behaviour of the algorithm remains unchanged. This method is relatively easy to implement, and a significant speedup can be expected if the communication cost does not dominate the computation cost (Cantu-Paz 1997). With heterogeneous workstation networks and big populations there is a classical bottleneck effect: the whole process has to wait for the last, slowest individual. Only after that can the selection operator be applied.

2.3.2 Coarse grained

This is a more sophisticated idea in parallel GAs. In this case the population of the GA is divided into multiple subpopulations or demes that evolve in isolation from each other most of the time, but exchange individuals occasionally. This exchange of individuals is called migration and it is controlled by several parameters. Coarse grained parallel GAs introduce fundamental changes in the operation of the GA and behave differently from simple GAs. Sometimes coarse grained parallel GAs are known as distributed GAs, because they are usually implemented on distributed memory MIMD computers. Coarse grained GAs are also known as island parallel GAs, because in population genetics there is a model for the structure of a population that considers relatively isolated demes, called the island model.

Since the size of the demes is smaller than the population used by a serial GA, we would expect the parallel GA to converge faster. However, when we compare the performance of the serial and the parallel algorithms, we must also consider the quality of the solutions found in each case. It is true that a smaller deme will converge faster, but it is also true that the quality of the solution might be poorer. The coarse grained approach is well suited to heterogeneous networks; however, there is a problem with the quality of the solution. It is difficult to estimate what kind of migration one should use and what the influence of processing several subpopulations is, and for some problems (sequencing problems) there is a high risk that within a short time a subpopulation becomes very uniform, so the parallel GA search will not be as effective as one could expect. An additional problem is scalability: if one has 5 machines, it is fine to use a coarse grained model with 5 subpopulations; however, if one has 100 machines available, it is difficult to scale the size and number of subpopulations up efficiently.

2.3.3 Fine grained

Fine grained parallel GAs partition the population into a large number of very small subpopulations; indeed, the ideal case is to have just one individual for every processing element (PE) available. This model is suited to massively parallel computers (MPPs), but it can be implemented on any multiprocessor. The main disadvantage of this model on a network of workstations is the communication cost, which is extremely high. Another problem with the fine grained approach is that in heterogeneous networks there is a bottleneck effect, because the already processed individuals have to wait for the slower ones. This method can be successfully used within uniform environments, or on parallel machines.

2.3.4 Hybrid methods

The last way to parallelise GAs uses some combination of the first three. We call this class of algorithms hybrid parallel GAs. Combining parallelisation techniques results in algorithms that combine the benefits of their components and promise better performance than any of the components alone.

It is important to emphasise that while the global parallelisation method does not affect the behaviour of the algorithm, the fine and coarse grained methods introduce fundamental changes in the way the GA works. For example, in the global method the selection operator takes into account the entire population, but in the other parallel GAs selection is local to each deme. Also, in the methods that divide the population it is only possible to mate with a subset of individuals in the deme, whereas in the global model it is possible to mate with any other individual.

2.3.5 Combined method

Following the discussion above, we have decided to propose a hybrid method based on the idea of dynamic demes. The method was first announced in (Kwasnicka and Nowostawski 1997). Some of its advantages are:

- high scalability and flexibility (from global parallelism to the fine grained method)
- fault tolerance
- possible dynamic load balancing
- easy monitoring with on-line evaluation
- a single population (a subpopulation model is also possible)

The idea was developed, implemented and evaluated. The algorithm and implementation details are explained in the next section. The selection and mating operators are applied to subpopulations (demes), which is similar to the coarse and fine grained paradigms. But those subpopulations are created dynamically after each processing cycle, and the demes as such are not fixed.

3 Dynamic Demes

3.1 Ideas, purpose & features

To overcome the disadvantages of pure fine or coarse grained parallel genetic algorithm paradigms, one has to look for a new approach. Our algorithm is based on the dynamic demes paradigm. This method is a combination of global parallelism (the algorithm can work as a simple fitness-distributed GA) with a coarse grained GA with subpopulations. There is, however, no migration operator as such, because the whole population is treated during the evolution as a single collection of individuals. From the parallel point of view the dynamic demes approach can be classified in the MIMD category (Flynn's classification), as an asynchronous message-passing algorithm. The main idea is to cut down the waiting time for the last (slowest) individuals by dynamically splitting the population into demes, which can then be processed without delay. This gives more efficiency in terms of processing speed. The algorithm is fully scalable: starting from global parallelism with fitness-processing distribution, one can scale the algorithm up to a fine grained version, with a few individuals within each deme.

3.2 General description

Each individual is represented by a separate process. Within this process the fitness calculation and mutation are done fully concurrently. Each individual is capable of doing crossover with another individual, provided it has received the other individual's Process ID (PID), or the other individual's genotype. This operation is also done fully concurrently within the population. There is a deme manager (MASTER), which is responsible for selection and mating within one deme. For each deme there exists a separate MASTER process. This process does selection and sends the appropriate partner IDs to some individuals. There is also one process (possibly more) responsible for load balancing: each individual, after fitness evaluation and mutation, notifies this special process (COUNTER), and the COUNTER then gives a proper MASTER ID to this individual. The last process within the system is called SORTER. This process only listens to individuals finishing their evaluation, takes their genotype and fitness, and prepares the log files (output). The SORTER is also responsible for stopping the whole search process once the end conditions are achieved (a sufficient solution was found, a fixed number of individuals were processed, etc.).

[Figure 1: Dynamic demes control flow. MANAGER, COUNTER and SORTER are unique processes in the system; MASTER and SLAVE processes are multiple. Arrows show the communication flow.]

As shown in Figure 1, there is a lot of communication going on while the algorithm works. There is an efficiency barrier, expressed by the condition:

$$ T_{counter} \le t_{mc} + t_{individ} $$

where:

- $T_{counter}$ -- time of processing one full cycle by the COUNTER
- $t_{mc}$ -- time for one cycle of a MASTER
- $t_{individ}$ -- time for one cycle of a SLAVE (individual)
- $N_{id}$ -- number of individuals
- $N_d$ -- number of demes
- $n_{id}$ -- number of individuals in one deme
- $t_{send}$ -- time for sending a message
- $t_{sel}$ -- time for doing selection within a deme

Generally,

$$ n_{id} = \frac{N_{id}}{N_d}. $$

For the COUNTER, the total time of processing one cycle is:

$$ T_{counter} = N_{id} \cdot t_{send}. $$

For a MASTER, the total time of processing one cycle is:

$$ t_{mc} = n_{id} \cdot t_{send} + t_{sel}. $$

For a SLAVE (individual), the total time of processing one cycle is:

$$ t_{individ} = t_{fitness} + t_{mutation} + t_{crossover} + 3 \cdot t_{send}. $$

From the first condition we have:

$$ \frac{(n_{id} + 3) \cdot t_{send} + t_{fitness} + t_{mutation} + t_{crossover} + t_{sel}}{N_{id} \cdot t_{send}} \ge 1. $$

Because

$$ \frac{(n_{id} + 3) \cdot t_{send}}{N_{id} \cdot t_{send}} < 1, $$

it follows that

$$ \frac{t_{GeneticOperators}}{N_{id} \cdot t_{send}} \gtrsim 1, \qquad \text{i.e.} \qquad t_{GeneticOperators} \gtrsim N_{id} \cdot t_{send}, $$

where $t_{GeneticOperators}$ is the time for applying all genetic operators once.

The problem is that the times in the last relation depend on the hardware, network speed, machine load and network load. The most important implication, however, is that for GAs with very simple operators, especially a cheap fitness evaluation, the dynamic demes approach will generally not behave as well as it does for computationally intensive GAs. One has to realise that a theoretical estimation of efficiency is difficult, since it depends on such independent factors as network load. Performance simulations are needed to determine the best design and configuration.
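As a rough numerical illustration (our numbers; $t_{send}$ is assumed, purely for the sake of example, to be about 1 ms on a local network): with the configuration of Section 4.3, $N_{id} = 8$, so

$$ N_{id} \cdot t_{send} \approx 8 \times 0.001\,\mathrm{s} = 0.008\,\mathrm{s}, $$

and a fitness evaluation of about 0.03 s already satisfies $t_{GeneticOperators} \gtrsim N_{id} \cdot t_{send}$; the 100-times-longer evaluation used later in Section 4.3 satisfies it with a wide margin.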


3.3 Processes overview

MANAGER. This is a unique process in the system. It is the initialisation element, which starts the whole genetic process. Its main tasks are the initialisation of the SORTER and COUNTER, and the initialisation of all SLAVE processes. The schema of the process is as follows:

1. start the SORTER
2. start the COUNTER
3. initialise n SLAVE processes (depending on the configuration)
4. wait until the SORTER shuts down the whole process

COUNTER. Like the MANAGER and SORTER, it is unique in the whole system. In the initial phase it is responsible for starting all MASTER processes; then it controls the dynamic management of SLAVE demes and MASTER load. The schema of the process is as follows:

1. initialise all MASTER processes, and make a list of them
2. wait for a message
3. if it is from an individual, give it the task identifier of the currently waiting MASTER
4. check whether the current MASTER has got enough individuals; if so, set the next MASTER as current
5. go to 2

SLAVE. This is a multiple process; the number of initialised processes is given by the configuration. Together with the MASTER, the SLAVE is the core of the genetic algorithm. It is responsible for initialisation, mutation and crossover, and it calculates the fitness function for each individual. On completing its task, the SLAVE sends the fitness to the SORTER, then notifies the COUNTER, from which it will get a MASTER task identifier.

MASTER. This process is responsible for the selection of individuals for reproduction. Like the SLAVE, the MASTER is a multiple process. It is initiated by the COUNTER, then waits for SLAVE processes to be ready for selection. The MASTER performs selection and identifies partners for the crossover of individuals. It also controls the process of creating the new generation and, after completing one generation, it waits for further individuals and demes to be processed.

SORTER. This is the final element in the GA process. Its main task is controlling and reporting. It analyses the whole population on-line and creates the text output file, which can be processed with the pgen tool to produce gnuplot-compatible diagrams. It monitors the current population state and, when the algorithm should stop, passes a QUIT message to the MANAGER and the COUNTER.
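The COUNTER schema maps naturally onto a single PVM receive loop. The sketch below is ours, not MPGA source code; the message tags and variable names are invented for illustration:

    /* Sketch of the COUNTER dispatch loop (tags and names illustrative). */
    #include "pvm3.h"

    #define TAG_READY  10   /* SLAVE -> COUNTER: "finished, assign me"     */
    #define TAG_ASSIGN 11   /* COUNTER -> SLAVE: tid of the current MASTER */

    void counter_loop(int *masterTid, int numMasters, int demeSize) {
        int current = 0, assigned = 0;
        for (;;) {
            int bufid = pvm_recv(-1, TAG_READY);        /* from any SLAVE */
            int bytes, tag, slaveTid;
            pvm_bufinfo(bufid, &bytes, &tag, &slaveTid);
            pvm_initsend(PvmDataDefault);
            pvm_pkint(&masterTid[current], 1, 1);       /* current deme manager */
            pvm_send(slaveTid, TAG_ASSIGN);
            if (++assigned == demeSize) {               /* deme full: switch to */
                assigned = 0;                           /* the next MASTER      */
                current = (current + 1) % numMasters;
            }
        }
    }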


3.4 MPGA library

The dynamic demes parallel GA approach was implemented as a general purpose library. The base GA structures were taken from the QGAME package written by Laura Dekker. The parallelism is provided by the PVM library. See the MPGA User Guide in Appendix A for details. The library was designed and tested in a Unix environment, with some Unix-specific features (Stevenson 1990).

4 Evaluation and Tests

4.1 Sequencing problems

There is extensive research in the area of sequencing optimisation, and one can apply a sequencing GA optimiser to many problems. There are a number of examples of timetable optimisation with direct representation (Burke, Elliman and Weare n.d.), (Abramson and Abela 1992), TSP analysis (Dzubera and Whitley n.d.), and a set partitioning problem (Levine 1994).

Following different studies of highly constrained combinatorial problems (Asveren and Molitor 1996), we defined a test problem. In our example the permutation representation is defined as follows: we have a number of different units which should be sorted; the relative position between elements is important, and the absolute position plays a role as well. The problem is defined so that we have a string of units s = (1, 2, ..., n) and each unit is present in the string once and only once. The order of the numbers within the string has to be set according to a fixed pattern string. The rule tested is that the desired pattern string has the structure of the ordered string, i.e. 1, 2, 3, ..., n. The fitness function is simply the distance between the given string and the correct one. If the given string has the form $a_1, a_2, a_3, \ldots, a_n$, and the correct pattern string has the form $\varphi_1, \varphi_2, \varphi_3, \ldots, \varphi_n$, the fitness value is given by:

$$ F = \sum_i |a_i - \varphi_i| $$

Taking this into consideration we can see that the search space has a number of local optima, at fitness values 2, 4, 6, 8 and so on, and one and only one global optimum with value 0. As stated before, both absolute and relative position are important. We will refer to this problem as strEval. All diagrams are based on minimising the fitness function, and the global optimum is at level 0. The problem is very similar to many sequencing problems, like the National Grid problem, timetabling or the TSP, when the representation is direct and based on a permutation of units.

To test the efficiency and performance of different operators, we set the population size to 8. With a genotype consisting of 8 units, we have a search space of size 8!. With such a small population size, we can ensure a better comparison between different GA operators, especially those which converge to local optima. Probability factors apply to the whole genotype, except for mutation, where the probability refers to a single gene.
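In code, this fitness is only a few lines; a sketch (ours, assuming 1-based gene values held in a plain vector):

    // strEval fitness sketch: distance of a permutation from the ordered
    // pattern 1, 2, ..., n. Lower is better; the global optimum is 0.
    #include <cstdlib>
    #include <vector>

    int strEval(const std::vector<int>& genes) {
        int f = 0;
        for (std::size_t i = 0; i < genes.size(); ++i)
            f += std::abs(genes[i] - (int)(i + 1));
        return f;
    }

For example, strEval on (2, 1, 3, 4, 5, 6, 7, 8) returns 2: a local optimum one swap away from the global optimum.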


4.2 GA operators comparison

4.2.1 Selection

On considering selection, one can see that there is no big difference between a sequencing GA and other GAs. The selection operator can be chosen without any constraints, and this operator has little influence on the general performance and dynamics of a sequencing GA in comparison to classical floating point coding. In all our experiments we used Roulette-Wheel Selection.
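For completeness, a minimal roulette-wheel sketch (our code; since fitness is minimised here, we assume the selection weights have already been derived from fitness, e.g. $w_i = f_{max} - f_i$):

    // Roulette-wheel selection sketch: returns an index drawn with
    // probability proportional to its non-negative weight.
    #include <numeric>
    #include <random>
    #include <vector>

    int rouletteSelect(const std::vector<double>& w, std::mt19937& rng) {
        double total = std::accumulate(w.begin(), w.end(), 0.0);
        std::uniform_real_distribution<double> u(0.0, total);
        double r = u(rng), acc = 0.0;
        for (std::size_t i = 0; i < w.size(); ++i) {
            acc += w[i];
            if (r <= acc) return (int)i;
        }
        return (int)w.size() - 1;    // guard against rounding at the end
    }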

[Figure 2: Some of the sequential crossovers. Worked examples of the PMX, CX and OX crossover operators on the parents A = (1 2 3 4 5 6 7 8) and B = (4 3 2 7 6 5 8 1).]

4.2.2 Crossover

A number of different crossover operators were tested; some of them are presented in Figure 2. We noticed that when absolute position matters, the performance of PMX is a little better than the others (CX, OX), but the difference is not big. However, in comparison to a simple one-point crossover with repair (referred to as Combinatorial Crossover), PMX, OX and CX performed much better. All of them change little (or nothing) once the population becomes uniform; on the other hand, they play a very important role at the beginning of the search, when the population is characterised by wide variety. Experiments showed that even if the crossover factor is very high, there is almost no risk of getting caught in local optima. Generally, the higher the crossover rate used, the better the performance achieved.

A small modification to the PMX operator is proposed. It is based on the observation that even if two chromosomes are completely different, if the matching section is big, the PMX operator does nothing; the probability of change decreases as the matching section size increases. The MPMX operator is a straightforward modification of PMX: the matching section has a preset maximum size equal to half of the chromosome length. This gives much better performance in the early stage, with almost no change in performance after the population has stabilised. As shown in Figure 6, it has a rather strong influence on the population behaviour.
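For reference, a compact PMX sketch (our code, for permutations of 1..n; MPMX differs only in capping the length of the matching section at half the chromosome when the cut points lo and hi are drawn):

    // PMX sketch: build one child from parents a and b, copying the
    // matching section [lo, hi] from a and mapping conflicts through b.
    #include <vector>

    std::vector<int> pmxChild(const std::vector<int>& a,
                              const std::vector<int>& b,
                              int lo, int hi) {
        int n = (int)a.size();
        std::vector<int> child(n, 0), posInA(n + 1);
        for (int i = 0; i < n; ++i) posInA[a[i]] = i;     // value -> index in a
        for (int i = lo; i <= hi; ++i) child[i] = a[i];   // copy section from a
        for (int i = 0; i < n; ++i) {
            if (i >= lo && i <= hi) continue;
            int v = b[i];                                 // take from b, then
            while (lo <= posInA[v] && posInA[v] <= hi)    // follow the mapping
                v = b[posInA[v]];                         // while v collides
            child[i] = v;
        }
        return child;                                     // always a permutation
    }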

[Figure 3: CX crossover, best fitness. Best fitness vs. number of individuals processed, for c=0.1, 0.6 and 0.95 with m=0.01, and for c=0.6 with m=0.1 and m=0.001.]

[Figure 4: CX crossover, average fitness. Average fitness vs. number of individuals processed, for the same settings as in Figure 3.]

[Figure 5: CX crossover, general performance (c=0.6, m=0.01). Worst, average worst, average, average best and best fitness vs. number of individuals processed.]

MPMX quickly adapts to the changes made by mutation, and gives more diversity to the population. There is a significant difference (a bigger distance) between the best and average curves for PMX and MPMX. (In the diagrams, m represents the probability of mutation of one gene, and c represents the probability of the crossover operator being applied.)

4.2.3 Mutation

Mutation, like crossover, is a very sensitive operator in sequencing problems. For relative order optimisation we defined and tested the so-called Element Position (EP) mutation. In this operator, one randomly takes one of the genes from the chromosome and puts it back at a random position. Because this operator often influences the whole genotype, it is too disruptive for absolute-location-based chromosomes; with EP mutation the probability of mutation should be applied to the whole chromosome rather than to a single gene. Even a simple swap of two genes seems to be too disruptive for absolute locations. We therefore defined a second mutation operator called Swap Pair (SP) Mutation. It is a simple modification of the swap of two genes (referred to as Combinatorial Mutation), such that only neighbouring genes may be swapped. As shown in Figure 7, this approach seems to be very promising in absolute position-based optimisation. The results were obtained with m=0.01 and c=0.06.
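Both operators are tiny to implement; here is an SP mutation sketch (ours, using the per-gene probability interpretation described above):

    // Swap Pair (SP) mutation sketch: with probability pGene per position,
    // swap a gene with its right-hand neighbour.
    #include <random>
    #include <utility>
    #include <vector>

    void spMutate(std::vector<int>& g, double pGene, std::mt19937& rng) {
        std::uniform_real_distribution<double> u(0.0, 1.0);
        for (std::size_t i = 0; i + 1 < g.size(); ++i)
            if (u(rng) < pGene) std::swap(g[i], g[i + 1]);
    }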

4.2.4 Dynamic Demes evaluation

This evaluation was done to compare the general influence of the dynamic demes algorithm on the population dynamics with that of a traditional single-population-based GA.

[Figure 6: PMX and MPMX comparison (m=0.01, c=0.95). Average and average-minimum fitness vs. number of individuals processed, for PMX and MPMX.]

[Figure 7: SP mutation, general performance. Combinatorial mutation vs. Swap Pair mutation: average, average-best and average-worst fitness vs. number of individuals processed.]

As the dynamic demes algorithm can be scaled from a single population up to the fine grained paradigm, we ran tests with different configurations applied to the same problem. As shown in Figure 8, the performance gained by the coarse grained approach (3 demes of 4 individuals each) is quite comparable with that gained by the single-population algorithm with distributed fitness evaluation. As a fine grained configuration we used demes with only two individuals each. The results show that this can strongly influence the population dynamics: because selection is always done on a very small fraction of individuals, the GA cannot converge to an existing sub-optimum within the population, and the search process is no longer stable and convergent.

4.3 Parallel timing

The main purpose of parallelising a GA is to increase the speed and performance of the search. For the tests we took the simple permutation problem referred to as strEval, described in the previous section. The test was based on measuring the time to evaluate 20000 individuals as a function of the number of workstations employed. The population size is fixed and equal to 8 individuals; thus the run consists of 14 processes (8 SLAVEs, 3 MASTERs, 1 COUNTER, 1 SORTER and 1 MANAGER). As shown in Figure 10, the increase in speed is much slower than one could expect, but its quality is quite good (stable and linear up to the saturation level). The positive thing is that there is a consistent increase in speed up to the total number of processes, where the curves saturate. This means that standard GA problems, with population sizes in the range of a few hundred, can easily be scaled with the hardware resources, with high flexibility.

The traditional single-machine algorithm could achieve a processing speed of only about 64 individuals per second. With five machines, at about 5% utilisation on each, the same algorithm could process up to 130 individuals per second, and with the optimal configuration of 14 machines it reached 165 individuals per second. In contrast, the single-population algorithm with distributed fitness evaluation on 14 machines was capable of achieving only 130 individuals per second.(1)

The speedup is rather poor because of the communication costs. We ran additional tests with a more computationally intensive fitness evaluation. In the previous case the fitness evaluation itself took about 0.03 s; in the following case the fitness evaluation was 100 times longer. A perfectly sequential algorithm would take 60,000 s to evaluate 20000 individuals (no additional time, pure fitness evaluation), which gives a processing speed of 0.33 individuals per second. With distributed fitness evaluation on 14 machines we reached 2.63 individuals per second, which gives a 7.9 speedup. With the dynamic demes approach, also on 14 machines, the results are: (2 demes) 2.83 individuals per second, an 8.5 speedup; (4 demes) 3.53 individuals per second, a 10.6 speedup.
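These speedup figures follow directly from the stated rates, taking the perfectly sequential rate as the baseline:

$$ S_{\mathrm{distr.\,fitness}} = \frac{2.63}{20000/60000} \approx 7.9, \qquad S_{\mathrm{4\ demes}} = \frac{3.53}{20000/60000} \approx 10.6 $$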

(1) Tests were done on the network of DEC Alphas in the School of Computer Science, University of Birmingham (6 x DEC Alpha 233 MHz, 160 MB RAM; 8 x DEC Alpha 233 MHz, 64 MB RAM), for the strEval fitness function with a permutation of 8 units.


[Figure 8: Dynamic demes, quality comparison. Two panels of average and minimal fitness vs. number of individuals processed: single population vs. fine grained (top), and single population vs. coarse grained (bottom).]

Figure 9: PVM process timing details (XPVM).

[Figure 10: MPGA time performance. Time [sec] to evaluate the test run vs. number of workstations used (average and minimum).]

[Figure 11: Dynamic demes, general performance. Time of evaluation vs. number of workstations used, for a single population, 2 demes and 4 demes.]

With a more intensive fitness evaluation, the proposed algorithm performs very well and is efficient enough to be used as an alternative to sequential algorithms. The reader should be aware that the population size used in the experiments is very small (8 individuals). In other cases, with bigger populations, it is possible to use more demes and thereby increase the speedup even further.

4.4 South Wales National Grid problem

In England and Wales the electrical power transmission network is known as the National Grid. It is a highly interconnected network carrying large power flows, owned and operated by the National Grid Company Plc. The problem is to maintain all lines in a specified period of time, at least cost, considering plant safety and security of supply. The task of planning maintenance is a complex constrained optimisation scheduling problem. The schedule is constrained to ensure that all plant remains within its capacity, and the cost of replacement generation throughout the duration of the plan is minimised (Langdon 1995).

There are different possible representations for this problem. Langdon (1997) describes a problem of four nodes, called the Four-node problem, in which a greedy optimiser is used together with a GA: the final schedule is not prepared by the GA itself but by the greedy optimiser, and the GA searches only for the best possible list of nodes to be given to the greedy optimiser. In our tests we concentrated on a representation based on the permutation of the lines to be maintained. There is a sequence of lines, and the position within the string represents the appropriate week for maintenance.

For this four-node problem, the new operators with the dynamic demes approach behave very efficiently, and the global solution was found very quickly. As a fitness function for the four-node problem we used a package developed by W. B. Langdon; a different, improved package was used for evaluating the South Wales fitness. Unfortunately, a sufficient number of tests could not be done with the South Wales problem, but the preliminary results are quite promising. With direct coding, MPMX Crossover together with SP Mutation performed better than the other configurations with direct coding presented in Gordon (1996). A greater number of tests is needed, together with some code improvements in applying the South Wales fitness evaluation and a comparison with the greedy optimiser.

4.5 Other tests

Regretfully, because of time constraints it was not possible to apply the developed technique to other sequencing problems like the TSP, university timetabling, and similar scheduling problems. Those tests will probably be done in the near future.

5 Conclusions

The dynamic demes approach seems to be an efficient and scalable parallel paradigm for GAs. The quality of solutions gained with the more fine grained configurations is much worse than that gained with coarse grained attempts. However, the results obtained with the coarse grained version are comparable with those achieved by a single population, and the speedup gained by the coarse grained version is quite promising. The implementation of MPGA allows the user to scale the algorithm to the particular problem and configuration of the genetic algorithm (number of individuals within the population, time of mutation, time of crossover and time of fitness evaluation). The algorithm itself can also easily be scaled to the hardware configuration.

PVM is a very good environment for developing distributed parallel software. Its high flexibility and support for a virtual machine allow one to build efficient and fault tolerant software packages. It is also advantageous that one can use the speed and efficiency of the C (C++) programming language. There are some drawbacks, like intensive network usage, high communication costs (time) (see Figure 9) and a difficult development stage (difficulty in debugging, lack of good debugging tools). However, this difficulty is connected with the general difficulty of designing asynchronous message-passing software, rather than with the tools used for the purpose.

As far as experiments with GAs are concerned, there is a significant difference between normal GA problems and those based on permutations of genes. The work presents some ideas and hints for dealing with scheduling problems based on a permutation representation. The general conclusion of the project is that one has to consider the parameters of the GA very carefully, because the risk of getting stuck in local optima is much higher than with normal floating point coding. A sequencing GA is very sensitive, so it can easily jump around the search space, and the important thing is to apply operators which introduce an appropriate amount of noise (e.g. the too-noisy Combinatorial Mutation can be less efficient than SP Mutation).

For sequencing problems with direct coding, small improvements to existing genetic operators were made. Modified Partially Matched Crossover with Swap neighbouring Pair Mutation seems to work more efficiently than PMX and Combinatorial Crossover. Traditional operators are sometimes too disruptive, and there is a risk that with huge search spaces they will converge to local sub-optima. The analysis developed here can easily be extended to a number of sequencing optimisation or scheduling problems. A mathematical model of GAs for sequencing problems is anticipated, together with some theoretical modelling of different GA operators applied to such problems, based on probability and entropy theory.

6 Future work

6.1 MPGA extensions

The main extension to the MPGA library would be a graphical visualiser for representing the population dynamics within the parallel GA. It would also be beneficial to develop 'probe' software which could estimate the most efficient dynamic demes configuration for a given fitness function (GA operators in general) and for a given hardware configuration. There is a very big area for improvement in dynamic load balancing, which is not implemented but which could significantly improve the parallel GA. There is an evident improvement in terms of speed in comparison to a simple distributed-fitness GA, but there is still a lack of a proper, realistic theoretical analysis method for estimating the expected timing differences. Developing a proper theoretical analysis would be a challenge and a big advantage.

6.2 Combinatorial GA problems

The work on sequencing problems can be extended by testing the approach on different optimisation problems, such as scheduling of a university timetable using the new modified operators. It is also possible to specify more domain-dependent operators, which should work much more effectively on different sequencing problems. It would also be very useful to estimate the difference between small and big problems. The tests in this work were mainly done using rather simple fitness functions and short genotypes (the South Wales genotype length was 52 units). It seems that the developed estimations will work for longer genotypes as well, but this should be tested and evaluated. It would be beneficial to develop new methods of comparing different genetic operators used with sequencing problems. The improvements made to the mutation and crossover operators show that there is still a big field for research. A better understanding of sequencing problems and of the implications of a proper representation can lead to more efficient methods of solving different scheduling problems, such as university timetabling problems or the National Grid problem.


A MPGA User Guide (short version)

Statistics:
- Multi-purpose, publicly accessible parallel GA library
- Source code developed: 248 kB, 38 files
- Executables: 5 for MPGA, 2 additional tools, one script for multi-runs, gnuplot-friendly outputs
- Time of development: 3 months (MPGA)

Parallel Genetic Algorithm library
(working name: Cirrus)

Author: Mariusz Nowostawski
The University of Birmingham, UK
School of Computer Science
1998 February-May

The MPGA library is based on PVM (by the PVM Project Group) and the QGame package (by Laura Dekker)
Web site: http://www.cs.bham.ac.uk/~mxn/cirrus

*******************
* MPGA user guide *
*******************

1. Getting started
   1.1. Requirements
   1.2. Installation
   1.3. Features
2. Files hierarchy
3. How to prepare your own GA
   3.1 GA operators
   3.2 Genotype
   3.3 Fitness function
   3.4 Population management
4. Running the algorithm
   4.1 Makefile structure
   4.2 How to compile


   4.3 Debugging information
5. Comments

1. Getting started
===================

1.1. Requirements
------------------

To use the library you only need to have the PVM virtual machine
software installed and properly configured (!). PVM can be obtained
from http://www.netlib.org/pvm3
The main PVM Web Page: http://www.epm.ornl.gov/pvm

The whole MPGA package is written in C++, thus you need to have a C++
compiler installed. The MPGA library was tested under PVM 3.4, with
the cxx compiler on DEC Alpha machines (OSF1 V4.0 564 alpha). You can
safely use the GNU compiler instead.

1.2. Installation
-----------------

To install the software you need to download the latest version of
the MPGA library from the Cirrus Home Page
(http://www.cs.bham.ac.uk/~mxn/cirrus). After successfully
downloading the gzipped tar archive, you should use

    gzip -cd MPGA.tar.gz | tar xvf -

to extract the source files. After that, you will need to set up
Makefile.aimk properly, by changing the appropriate directory
entries.

The library itself will be compiled and linked with your own GA
operators at the compilation stage. All executables are produced by a
single compilation. The whole library should be recompiled after
changes within the GA, or after changes of configuration.

There are some additional tools apart from the main algorithm, which
should be compiled separately (once only). This includes the Plot and
Average generators. To compile them, go into the tools subdirectory
and run aimk, which will produce two executables for your platform:
PGen and Avr.

IMPORTANT: The whole Makefile system is based on the Architecture
Independent Make, so you need to have it installed in your system. It
is a part of the PVM library. All compilation should be done with
aimk instead of make, because it will produce the appropriate
directory structure for the Parallel Virtual Machine.

because it will produce appropriate directory structure for Parallel Virtual Machine.

1.3. Features
-------------

MPGA, current version 0.3, supports many features for genetic
algorithms. It is a straightforward framework with a collection of
objects for different GA strategies, making full use of the
distributed environment.

* Supported parallel features: - Single population with distributed fitness and mutation evaluation - Single population with dynamic demes - Distributed subpopulations with migration operator

* Supported GA features:
  * initialisation:
    - Random Initialisation
    - Combinatorial Initialisation
  * selection:
    - Roulette Wheel
    - Truncated Selection
    - Tournament Selection
    - Linear Selection
    - Remainder Stochastic
    - Window Select
  * mutation:
    - Basic Mutation (changes the value of one gene)
    - Combinatorial Mutation (swaps two genes; a stand-alone sketch of
      this operator follows the list)
    - One Element Mutation, EPM (takes one element and puts it back
      into the genotype at a random position, shifting the remaining
      genes)
    - Swap Pair Mutation, SP (similar to Combinatorial Mutation, but
      swaps only neighbours)
  * crossover:
    - Holland Crossover
    - Two Point Crossover
    - Uniform Crossover
    - Combinatorial Crossover (one point with correction)
    - PMX (Partially Matched Crossover)
    - CX (Cycle Crossover)
    - OX (Order Crossover)
    - MPMX (Modified PMX)
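To make the operator list concrete, below is a minimal stand-alone
sketch of the Combinatorial Mutation for permutation genotypes. The
Genotype alias and the function signature are assumptions made for
illustration only; they are not the MPGA operator interface.

    #include <cstddef>
    #include <random>
    #include <utility>
    #include <vector>

    // A permutation of integers stands in for the library genotype.
    using Genotype = std::vector<int>;

    // Combinatorial Mutation: pick two distinct positions and swap
    // their genes, so the permutation property is preserved.
    void combinatorialMutation(Genotype& g, std::mt19937& rng) {
        if (g.size() < 2) return;                     // nothing to swap
        std::uniform_int_distribution<std::size_t> pick(0, g.size() - 1);
        std::size_t i = pick(rng);
        std::size_t j = pick(rng);
        while (j == i) j = pick(rng);                 // distinct positions
        std::swap(g[i], g[j]);
    }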

2. Files overview
=================
All the files in the main directory are part of the library. The files
inside strEval are the classes of an example string evaluation (a
permutation of integers). The strEval implementation can easily be
adapted to the user's needs, and can also be used as a test fitness
function when setting up the library and the configuration.

There are two additional tools for the library. The first, called
pgen, produces a gnuplot-format file from the output of the algorithm
(from the SORTER output). The second, called avr, takes the output of
pgen and evaluates the mean over a collection of files. Input files
for avr should have the names 0, 1, 2, 3 ...

MPGA/         - source files for the library
MPGA/strEval  - demo fitness evaluation
MPGA/tools    - additional applications

3. How to prepare your own GA
=============================

3.1 GA operators
----------------
If you plan to use the predefined GA operators of the library, you
should put the appropriate object initialisation in the Slave.cpp
source file. There is only one place where the Individual object is
initialised, and that is also where the appropriate GA operators
(which are separate objects) should be initialised; a sketch of the
idea follows. If you plan to use your own GA operators, the best
solution is to extend the MPGA library with the new operators and use
them as a part of the library.
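The fragment below only illustrates the shape of such an
initialisation. All class and method names (CSelection, CCrossover,
CMutation, CIndividual, setOperators) are hypothetical stand-ins
declared here so the sketch compiles on its own; they are not the real
MPGA declarations.

    // Stand-ins for the library classes -- illustrative only.
    struct CSelection  { /* e.g. Tournament Selection   */ };
    struct CCrossover  { /* e.g. PMX                    */ };
    struct CMutation   { /* e.g. Combinatorial Mutation */ };

    struct CIndividual {
        CSelection* sel; CCrossover* cx; CMutation* mut;
        // The operators are separate objects attached to the individual.
        void setOperators(CSelection* s, CCrossover* c, CMutation* m) {
            sel = s; cx = c; mut = m;
        }
    };

    int main() {
        // The one place (in Slave.cpp) where the Individual is created
        // is also where the operator objects should be initialised.
        static CSelection selection;
        static CCrossover crossover;
        static CMutation  mutation;
        CIndividual individual;
        individual.setOperators(&selection, &crossover, &mutation);
        return 0;
    }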

3.2 Genotype
------------
The library supports several standard genotypes, such as a string of
symbols, a string of real numbers, and a string of integers. If you
would like to use another type of gene, you are welcome to define a
new class which inherits from the CGene class. Then you have to
provide proper definitions of the methods for getting, setting, and
sending genes, and that's it.
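As a rough illustration, a new gene type might look like the sketch
below. CGene itself comes from the library, but the virtual method
names and signatures shown here (get, set, send) are assumptions made
only for this example, with a stand-in base class so the sketch
compiles on its own.

    // Stand-in for the library's CGene base class -- assumed interface.
    class CGene {
    public:
        virtual ~CGene() {}
        virtual double get() const = 0;    // take the gene value
        virtual void   set(double v) = 0;  // set the gene value
        virtual void   send() const = 0;   // pack the gene for sending
    };

    // A hypothetical new gene type holding a boolean value.
    class CBoolGene : public CGene {
    public:
        CBoolGene() : value(false) {}
        double get() const   { return value ? 1.0 : 0.0; }
        void   set(double v) { value = (v != 0.0); }
        void   send() const  { /* e.g. pack with pvm_pkint() */ }
    private:
        bool value;
    };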


3.3 Fitness function
--------------------
Follow the demo strEval program and you will see that setting up a new
fitness function for MPGA is very easy. All you have to do is
implement a function which takes an Individual object as an argument
and evaluates its fitness using the methods of the passed individual.
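A minimal sketch of the idea is given below. The Individual class here
is a stand-in so the example compiles on its own; the real MPGA
accessor names may differ (getGene and length are assumptions).

    #include <cstddef>
    #include <cstdlib>
    #include <vector>

    // Stand-in for the library's Individual -- assumed interface.
    class Individual {
    public:
        explicit Individual(const std::vector<int>& g) : genes(g) {}
        int getGene(std::size_t i) const { return genes[i]; }
        std::size_t length() const       { return genes.size(); }
    private:
        std::vector<int> genes;
    };

    // Example fitness: reward permutations close to sorted order.
    // The function only reads the individual through its methods.
    double evaluateFitness(const Individual& ind) {
        double error = 0.0;
        for (std::size_t i = 0; i < ind.length(); ++i)
            error += std::abs(ind.getGene(i) - static_cast<int>(i));
        return 1.0 / (1.0 + error);   // 1.0 when perfectly sorted
    }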

3.4 Population management
-------------------------
In the current version the whole population management is done within
the mqgame.h header file. You should properly set all the necessary
constants there, both those responsible for the GA (probability
factors, problem type, population size etc.) and those for the
parallel configuration (number of demes, size of demes, SORTER output
size).
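The fragment below only illustrates the kind of settings involved;
apart from the debugging constants XPVM_DEBUG and DEBUG (mentioned in
section 4.3), the constant names and values are made up for this
example and do not reproduce the real mqgame.h.

    /* Hypothetical fragment of mqgame.h -- illustrative names only. */

    /* GA configuration */
    #define POPULATION_SIZE  200     /* individuals in total         */
    #define CROSSOVER_PROB   0.7     /* probability factors          */
    #define MUTATION_PROB    0.05

    /* Parallel configuration */
    #define NUM_DEMES        8       /* number of demes              */
    #define DEME_SIZE        25      /* size of each deme            */
    #define SORTER_SIZE      10      /* SORTER output size           */

    /* Debugging constants (section 4.3) -- uncomment to enable:     */
    /* #define XPVM_DEBUG */
    /* #define DEBUG      */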

4. Running the algorithm
========================

4.1 Makefile structure
----------------------
It is very likely that the only things you will have to change are:

  SDIR  - source directory with the MPGA source files
  BDIR  - binaries directory, where the executables can be found by pvmd
  OBJ_F - put here the appropriate object file with the fitness evaluation
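For example, the relevant part of Makefile.aimk might look like the
fragment below; the paths and the object file name are placeholders
for your own installation, not the shipped defaults.

    # Placeholder values -- adjust for your own installation.
    SDIR  = $(HOME)/MPGA                    # MPGA source files
    BDIR  = $(HOME)/pvm3/bin/$(PVM_ARCH)    # where pvmd finds executables
    OBJ_F = strEval.o                       # fitness evaluation object file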

4.2 How to compile
------------------
The compilation process is pretty simple. Using the Architecture
Independent Make script, one can easily compile the same source files
on different architectures. Type:

    aimk CLEAN

to get rid of the object files, and then:

    aimk

to create all the executables at once.

4.3 Debugging information
-------------------------


Because the MPGA library is based on PVM, debugging is a little bit
tricky. All output from all your executables except MANAGER goes to
the /tmp/pvml.* file, so you should inspect this file during the
debugging stage. There are two debugging constants, XPVM_DEBUG and
DEBUG; uncomment them in the mqgame.h file (see the fragment in
section 3.4).

5. Comments
===========
Please feel free to send any feedback, comments or bug reports to
[email protected] or [email protected]

The next version of the library will contain:
  (*) better configuration management (e.g. no need to recompile the
      whole library after a single change of the crossover probability)
  (*) a population dynamics visualiser
  (*) even more GA operators
