
Evolutionary Neural Networks for Time Series Prediction based on L-system and DNA Coding Method

Ki-Youl Lee, Dong-Wook Lee
Dept. of Control and Instrumentation Engineering, Chung-Ang University
221, Hukseuk-Dong, DongJak-Ku, Seoul 156-756, Korea
(azure, dwlee)@ms.cau.ac.kr

Kee-Bo Sim
School of Electrical and Electronic Engineering, Chung-Ang University
221, Hukseuk-Dong, DongJak-Ku, Seoul 156-756, Korea
[email protected]

Abstract- In this paper, we propose a method of constructing neural networks using bio-inspired emergent and evolutionary concepts. The method is an algorithm based on the characteristics of biological DNA and the growth of plants. Specifically, we propose a DNA coding method for the production rules of an L-system. The L-system is based on the so-called parallel rewriting mechanism, and the DNA coding method places no limitation on expressing the production rules of the L-system. Evolutionary algorithms, motivated by Darwinian natural selection, are population-based search methods whose performance depends heavily on the representation of the solution space. To verify the effectiveness of our scheme, we apply it to one-step-ahead prediction of the Mackey-Glass time series and sunspot data.

1 Introduction
Many computational models imitate nature. Evolution, development and learning are the three levels of organization in biology. Representative evolutionary models are Genetic Algorithms (GA), Evolutionary Strategies (ES), Genetic Programming (GP), Evolutionary Programming (EP) and Co-Evolutionary Algorithms (CEA). Models of development include Cellular Automata (CA), the L-system and the bio-morph model. The typical learning model is the neural network.

The success of neural networks in information processing tasks depends mostly on well-designed network structures. In general, the structure has to be defined before a learning algorithm is executed, and in practice determining the structure takes most of the effort in adapting a neural system to a particular task. In recent research, the evolutionary approach has been used to optimize both the structure and the parameters of neural networks, with the resulting networks usually trained by back propagation. In that case, however, the computational cost can be so high that genetic algorithms become impractical except for optimizing small topologies.

Many researchers take a developmental and evolutionary approach to designing neural networks. Boers ([boers93]), Gruau ([gruaru93]) et al. proposed design methods for neural networks based on the L-system and genetic algorithms. De Garis ([garis94]) has studied an artificial brain based on cellular automata and genetic programming. Sugisaka ([sugi98]) has also developed an artificial brain. To develop larger and more complex systems, we also use development and evolution. In our system, the mapping from genotype to phenotype uses a developmental model, the L-system: the chromosome is mapped to the set of production rules of the L-system. We propose a new encoding method based on biological DNA encoding, an algorithm that incorporates characteristics of biological DNA. The DNA coding method ([yomo96]) has several advantages; for example, it is well suited to representing rules, and it performs well when the chromosome is long.

Section 2 describes the background of our neural network: the DNA coding method, the L-system and time series prediction. Section 3 explains the construction of our neural network. Section 4 shows the experimental results of time series prediction for Mackey-Glass data ([mac77]) and sunspot data.

2 Background
2.1 DNA coding method
2.1.1 Biological DNA
DNA (deoxyribonucleic acid) is the inherent code of natural living things: the genetic code from which the characteristics of an individual emerge. Biological DNA consists of nucleotides whose bases are Adenine (A), Thymine (T; Uracil (U) in RNA), Guanine (G) and Cytosine (C). DNA is made up of two complementary chains of nucleotide bases (A, G, C and T). The linear sequence of these bases provides information content via a genetic code in which three consecutive bases specify one amino acid. A messenger RNA (ribonucleic acid; see footnote 1) is first synthesized from DNA. Each amino acid is specified by a particular combination of three nucleotides, called a codon. Codons are allocated sequentially in the mRNA. There are 64 codons, but they correspond to only 20 kinds of amino acid. The allocations of amino acids make proteins, and proteins make up cells. Translation of mRNA starts at AUG and ends at UGA (or UAA, UAG). Each codon represents either one amino acid or a termination sequence.

Footnote 1: mRNA (messenger RNA) is a copy of a gene. It acts as a photocopy of a gene by having a sequence complementary to one strand of the DNA and identical to the other strand. The mRNA acts as a busboy that carries the information stored in the DNA in the nucleus to the cytoplasm, where the ribosome can make it into protein.

Table 1: The Genetic Code (rows: first base; columns: second base; right-hand column: third base)

      U          C          A          G
U     UUU Phe    UCU Ser    UAU Tyr    UGU Cys     U
      UUC Phe    UCC Ser    UAC Tyr    UGC Cys     C
      UUA Leu    UCA Ser    UAA Stop   UGA Stop    A
      UUG Leu    UCG Ser    UAG Stop   UGG Trp     G
C     CUU Leu    CCU Pro    CAU His    CGU Arg     U
      CUC Leu    CCC Pro    CAC His    CGC Arg     C
      CUA Leu    CCA Pro    CAA Gln    CGA Arg     A
      CUG Leu    CCG Pro    CAG Gln    CGG Arg     G
A     AUU Ile    ACU Thr    AAU Asn    AGU Ser     U
      AUC Ile    ACC Thr    AAC Asn    AGC Ser     C
      AUA Ile    ACA Thr    AAA Lys    AGA Arg     A
      AUG Met    ACG Thr    AAG Lys    AGG Arg     G
G     GUU Val    GCU Ala    GAU Asp    GGU Gly     U
      GUC Val    GCC Ala    GAC Asp    GGC Gly     C
      GUA Val    GCA Ala    GAA Glu    GGA Gly     A
      GUG Val    GCG Ala    GAG Glu    GGG Gly     G

*U (uracil) is replaced by T (thymine) in DNA.
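The translation process described in this subsection (start at AUG, read one codon at a time, stop at UGA, UAA or UAG) can be sketched in a few lines of Python. This is only an illustration using the standard genetic code; the function name `translate` and the sample sequence are chosen here and do not come from the paper.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, listed in the order UUU, UUC, UUA, UUG, UCU, ...
NAMES = (
    "Phe Phe Leu Leu Ser Ser Ser Ser Tyr Tyr Stop Stop Cys Cys Stop Trp "
    "Leu Leu Leu Leu Pro Pro Pro Pro His His Gln Gln Arg Arg Arg Arg "
    "Ile Ile Ile Met Thr Thr Thr Thr Asn Asn Lys Lys Ser Ser Arg Arg "
    "Val Val Val Val Ala Ala Ala Ala Asp Asp Glu Glu Gly Gly Gly Gly"
).split()
GENETIC_CODE = {
    "".join(codon): name
    for codon, name in zip(product(BASES, repeat=3), NAMES)
}


def translate(mrna: str) -> list:
    """Translate from the first AUG up to (not including) the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return []
    protein = []
    # Step through complete codons only.
    for i in range(start, len(mrna) - 2, 3):
        amino = GENETIC_CODE[mrna[i:i + 3]]
        if amino == "Stop":
            break
        protein.append(amino)
    return protein


# Hypothetical sample sequence, chosen only for illustration.
print(translate("GGAUGUUUGCAUGACC"))
```

Note that the sample sequence contains a second AUG that overlaps the first reading; scanning from every start codon instead of only the first one would expose such overlapping readings.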

Figure 1 shows the general structure of the genetic code. In general, genes are composed of a regulatory region and a coding region. When the state of the cell satisfies the condition that stimulates the regulatory region, RNA translates the coding region into protein. This regulatory-region and coding-region structure of the DNA code corresponds to the general representation of a rule such as "If A then B": the antecedent (A) of the rule corresponds to the regulatory region and the consequent (B) to the coding region. In the development model, the representation of a production rule is similar to this structure.

Figure 1: General structure of genetic coding

2.1.2 DNA Coding Method
Yomohiro et al. ([yomo96]) proposed a new coding method based on biological DNA. The chromosome is written with four symbols, like the four bases of nucleotides, rather than with the binary representation commonly used in GA. As in the biological mechanism, it is translated codon by codon, where a codon is three nucleotides.

This method has redundancy in the chromosome and overlapping of genes, because start codons and stop codons occur at arbitrary positions and the interpretations of genes may overlap. The crossover point is not constrained, so the length of the chromosome is variable. With a prepared decoding table, Yomohiro's method is well suited to representing rules. The DNA coding method has the following features.
- Variable chromosome length (floating representation)
- No constraint on the crossover point
- Redundancy and overlapping in the interpretation of the chromosome
- Easy knowledge representation

Wu ([wu96]) demonstrated the effectiveness of the floating representation by schema analysis in GA. DNA coding is a floating representation, because no fixed location carries a special meaning. The floating representation has the following characteristics.
- It is very effective when the chromosome is long.
- Diversity of the population is high.
- It has good parallel search ability and recombination ability.

Figure 2: Overlapping gene and translation of genetic code

Figure 2 shows the overlapping of DNA code. Analysis starts at a start codon (ATG), and each gene is the sequence of amino acids given by the translated codons.

2.2 L-system
2.2.1 Simple L-system

The L-system proposed by Aristid Lindenmayer ([lind68], [lind87]) is a mathematical model of the development and growth of plants, and it can be considered a special class of fractal. The L-system is based on the so-called parallel rewriting mechanism. A useful application of the L-system is modeling plants in computer graphics, and it is used in theoretical biology for describing and simulating natural growth processes. This makes the L-system especially suitable for describing fractal structures, cell divisions in multi-cellular organisms, or flowering stages of herbaceous plants.

The 0L-system (0 means that productions use no context; the systems considered here are also deterministic) is the simplest type of L-system. Formally, a 0L-system can be defined as a triple

$$G = (\Sigma, P, \alpha)$$

- $\Sigma$: the finite set of symbols (the alphabet). ex) $\Sigma = \{a, b, c\}$
- $\Sigma^*$: the set of all finite strings over the alphabet $\Sigma$. ex) $\Sigma^* = \{a, b, c, ab, bc, ca, abc, aaabca, \ldots\}$
- $\alpha$: the initial start word, commonly referred to as the axiom, which is an element of $\Sigma^*$. ex) $\alpha = a$
- $P$: productions or rewrite rules. The structure-preserving mapping $P: \Sigma \to \Sigma^*$ is defined by a set of 0L-system rules. ex) production rules
  p1: a → ab
  p2: b → a
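As a minimal sketch of parallel rewriting, the following Python function replaces every symbol of a word in one pass, so that all productions are applied simultaneously. The function name `rewrite` and the dictionary encoding of the rules are illustrative choices, not part of the formal definition. Symbols without a rule are copied unchanged.

```python
def rewrite(word: str, rules: dict, steps: int = 1) -> str:
    """Apply the production rules to every symbol of the word in parallel."""
    for _ in range(steps):
        # Each symbol is looked up independently; a missing rule means identity.
        word = "".join(rules.get(symbol, symbol) for symbol in word)
    return word


# Example productions from the definition: p1: a -> ab, p2: b -> a
rules = {"a": "ab", "b": "a"}
for n in range(5):
    print(n, rewrite("a", rules, n))
```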

For example, consider the following L-system, which models a simple growth process:

- $\Sigma = \{A, B, C\}$
- $\alpha = ABC$
- $P$ (production rules)
  p1: A → BA
  p2: B → CB
  p3: C → AC

Applying the rules p1, p2 and p3 in parallel, first to the axiom $\alpha$ and then repeatedly to the resulting strings, generates the following sequence:

- $\alpha$: ABC
- P1: BACBAC
- P2: CBBAACCBBAAC
- P3: ACCBCBBABAACAC...

Here P1, P2 and P3 denote the strings after the first, second and third rewriting steps. The left-hand side of a production rule (the part before the arrow) describes the sub-string to be replaced, and the right-hand side describes the string that replaces it. If no production rule is given for a character, the character is replaced with itself.

2.2.2 Bracketed L-system
The simple L-system alone cannot model real-life plants, so a new representation is needed for modeling them. Consider an L-system that has $\alpha = F$ and p: F → F[+F]F[-F]F. We associate F with drawing a line, - with taking a left turn and + with taking a right turn. The angle for + and - is variable and is denoted by $\delta$. In addition, two new symbols are introduced:
- [ : remember the current position and angle
- ] : restore the last stored position and angle

Figure 3: F → F[+F]F[-F]F, $\delta = 23.5^\circ$

With these new symbols ([, ]), more realistic drawings can be obtained. Figure 3 shows the string created from the axiom after three rewriting steps.
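The drawing interpretation of a bracketed string can be sketched as a small turtle interpreter. The sketch below assumes a unit step length, a start at the origin pointing upward, and a default angle of 23.5 degrees; these choices, and the names `interpret` and `rewrite`, are illustrative assumptions rather than details given in the paper.

```python
import math


def rewrite(word, rules, steps):
    """Parallel rewriting; symbols without a rule are kept as they are."""
    for _ in range(steps):
        word = "".join(rules.get(s, s) for s in word)
    return word


def interpret(word, delta_deg=23.5, step=1.0):
    """Turn a bracketed L-system string into a list of line segments."""
    x, y, heading = 0.0, 0.0, 90.0  # assumed start: origin, pointing up
    stack = []
    segments = []
    for symbol in word:
        if symbol == "F":      # draw a line
            nx = x + step * math.cos(math.radians(heading))
            ny = y + step * math.sin(math.radians(heading))
            segments.append(((x, y), (nx, ny)))
            x, y = nx, ny
        elif symbol == "+":    # right turn
            heading -= delta_deg
        elif symbol == "-":    # left turn
            heading += delta_deg
        elif symbol == "[":    # remember position and angle
            stack.append((x, y, heading))
        elif symbol == "]":    # restore last stored position and angle
            x, y, heading = stack.pop()
    return segments


word = rewrite("F", {"F": "F[+F]F[-F]F"}, 3)
segments = interpret(word)
print(len(word), len(segments))
```

The returned segments can be passed to any plotting tool; the stack is what lets a branch end and drawing resume from the point where the branch started.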

2.2.3 Context-sensitive L-system
An extension of the L-system, so-called context, is needed to model information exchange between neighboring cells. Context also makes it possible to model more plants. Context can be on the left, on the right, or on both sides of a certain sub-string. A production rule in a context-sensitive L-system has the following form:

$$L < P > R \to S$$

P (also called the predecessor) and S (the successor) are what we earlier called the left-hand side and the right-hand side of a production rule. L and R (the left and right context, respectively) may be absent. Commonly, an L-system without context is called a 0L-system. If all production rules have one-sided context, it is called a 1L-system, and a 2L-system has production rules with two-sided context. A production rule with left and right context L and R can replace P by S only if P is preceded by L and followed by R. If two production rules apply to a certain character, one with and one without context, the one with context is used.

2.3 Time Series Prediction
The problem of time series prediction is this: given past values of a series $x$, one must find a function $f$ that predicts its future values. The past values can be considered as a vector,

$$\vec{x}(t) = (x(t), x(t-1), \ldots, x(t-m)) \tag{1}$$

The future value $x(t+\tau)$ is estimated by a function of the previous values, $\tilde{f}(\vec{x}(t))$. In this paper we consider short-term prediction with $\tau = 1$ and $m = 19$, i.e., we predict the value $x(t+1)$ at time $t+1$ from the input

$$\vec{x}(t) = (x(t), x(t-1), \ldots, x(t-19)) \tag{2}$$

Table 2: Translation table of codon

The predictive accuracy of a model is evaluated by the normalized mean squared error (NMSE), defined as follows:

$$E = \frac{1}{N \cdot \mathrm{var}} \sum_{t=1}^{N} \left| x(t+1) - \tilde{f}(\vec{x}(t)) \right|^2 \tag{3}$$

$$\mathrm{var} = \frac{1}{N} \sum_{t=1}^{N} \left( x(t+1) - \frac{1}{N} \sum_{t'=1}^{N} x(t'+1) \right)^2 \tag{4}$$

The problem of time series prediction is then reduced to finding the predictor $\tilde{f}(\vec{x}(t))$ that minimizes the NMSE value $E$.
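The input vectors and the error measure of this section can be sketched in Python as follows. This is a sketch under stated assumptions: the helper names `delay_vectors` and `nmse` and the parameter name `m` (the number of delayed values, 19 in this paper) are chosen here for illustration, and the predictor itself is left out.

```python
from typing import List, Sequence, Tuple


def delay_vectors(series: Sequence[float], m: int = 19) -> List[Tuple[List[float], float]]:
    """Pair each input (x(t), x(t-1), ..., x(t-m)) with its target x(t+1)."""
    pairs = []
    for t in range(m, len(series) - 1):
        inputs = [series[t - k] for k in range(m + 1)]
        pairs.append((inputs, series[t + 1]))
    return pairs


def nmse(targets: Sequence[float], predictions: Sequence[float]) -> float:
    """Normalized mean squared error: squared error divided by N times the variance."""
    n = len(targets)
    mean = sum(targets) / n
    var = sum((x - mean) ** 2 for x in targets) / n
    squared_error = sum((x - p) ** 2 for x, p in zip(targets, predictions))
    return squared_error / (n * var)


# A predictor that always outputs the mean of the targets scores exactly 1,
# so values of E well below 1 indicate a useful predictor.
targets = [1.0, 2.0, 3.0]
print(nmse(targets, [2.0, 2.0, 2.0]))
```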

3 Construction of Neural Network
In this research we combine four methods that have their origin in biology:
- Genetic Algorithms
- DNA coding method
- L-system
- Neural Networks
Our goal is to design a method that automatically searches for neural network architectures.

3.1 DNA Coding Method for Neural Network
The chromosome of the proposed neural network is a DNA code consisting of A, G, T and C. Table 2 is the translation table from DNA to production rules. In the first step, the initial DNA code is generated at random. The production rules are made from the amino acids translated from the DNA code. Translation starts from a start codon (ATG) in the DNA code. The first codon is translated to the name of a node. The second codon is translated to the connecting range (C/R). A connecting range (x, y) determines, within the string (the array of nodes), which nodes the current node connects to: the current node links to the nodes between the xth and the yth. If a node within that range is named by a comma (','), the link to the next node is not made. The third codon is the bias. The weights of each node are translated from the 4th to the 8th codons.

Amino acid:        Leu Arg Ser Thr Ala Gly Val Pro Stop Ile Tyr Gln Phe Asp Cys Asn Glu His Lys Trp Met
Number of codons:  6 6 6 4 4 4 4 4 3 3 2 2 2 2 2 2 2 2 2 1 1
Node's name:       A B C D A B C D A B C D , , , , , , C D
C/R:               1,1 2,2 3,3 1,2 1,3 1,4 2,3 2,4 3,4 4,4 1,1 2,2 3,3 1,2 1,3 1,4 2,3 2,4 3,4 4,4

Figure 4: Structure of Node

The weight and the bias are calculated by equation 5, which gives values bounded from -3.2 to 3.1 at intervals of 0.1.

$$\mathrm{Weight} = \frac{(\mathrm{DNA}_1 \times 4^2 + \mathrm{DNA}_2 \times 4^1 + \mathrm{DNA}_3 \times 4^0) - 32}{10} \tag{5}$$

where $\mathrm{DNA}_i$ is the value of the i-th nucleotide of the codon, with A=0, G=1, T=2 and C=3. This completes one node. The DNA code is translated in the same way, node after node, until a stop codon is reached. The first codon is the predecessor and the remainder is the successor; the resulting string is one rule. The other rules are found in the same way.

Figure 5: Translation of DNA code for Production Rules

A rule translated in this way is like a production rule of a context-free L-system. If several rules have the same predecessor, only the first one is used. Repeating the rewriting for a predetermined number of steps yields a final string, from which the neural network is built.
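The decoding of a weight or bias from one codon (equation 5) can be checked with a short Python sketch. The function name `codon_to_weight` is chosen here for illustration; the nucleotide values A=0, G=1, T=2, C=3 are the ones stated in the text.

```python
NUCLEOTIDE_VALUE = {"A": 0, "G": 1, "T": 2, "C": 3}


def codon_to_weight(codon: str) -> float:
    """Read a three-nucleotide codon as a base-4 number, then shift and scale it."""
    if len(codon) != 3:
        raise ValueError("a codon has exactly three nucleotides")
    d1, d2, d3 = (NUCLEOTIDE_VALUE[n] for n in codon)
    return (d1 * 4 ** 2 + d2 * 4 ** 1 + d3 * 4 ** 0 - 32) / 10


# Smallest and largest codon values give the bounds of the weight range.
print(codon_to_weight("AAA"), codon_to_weight("CCC"))
```

Because a codon takes 64 distinct values, the 64 possible weights are spaced 0.1 apart across the range.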

The neural network has one or more input and output nodes, but a usable network must have a suitable number of input and output nodes. The input values are past data and the output value is the predicted value. To evolve neural networks by genetic algorithms, we have to select suitable networks. A suitable network is one that has the predetermined number of input and output nodes, and a high-performing network is one with low error in time series prediction. The error is the difference between the original data and the output of the predictor. The selection method in our GA mixes ranking selection with the $(\mu+\lambda)$ selection of ES. In ranking selection, the population is sorted according to objective function value. In $(\mu+\lambda)$ selection, $\mu$ individuals are selected from among the $\mu$ parents and $\lambda$ offspring. Through the GA, the prediction performance of the neural networks improves.
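The $(\mu+\lambda)$ step can be sketched as pooling parents and offspring, ranking the pool by fitness and keeping the best $\mu$. The sketch below shows only this pooling and truncation; how the paper combines it with ranking selection in detail is not specified, and the function name `mu_plus_lambda` and the toy fitness are illustrative assumptions.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def mu_plus_lambda(
    parents: Sequence[T],
    offspring: Sequence[T],
    fitness: Callable[[T], float],
    mu: int,
) -> List[T]:
    """Keep the mu fittest individuals from the union of parents and offspring."""
    pool = list(parents) + list(offspring)
    ranked = sorted(pool, key=fitness, reverse=True)  # best first
    return ranked[:mu]


# Toy example: individuals are numbers, and fitness rewards closeness to 10.
survivors = mu_plus_lambda(
    parents=[1, 5, 9],
    offspring=[10, 2, 8, 11],
    fitness=lambda x: -abs(x - 10),
    mu=3,
)
print(survivors)
```

Since parents compete with their offspring, the best individual found so far is never lost from one generation to the next.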

Figure 6: Process of Evolving Neural Network

3.2 Example of Neural Network
Suppose that five rules are created from the DNA code, and that two of them have the predecessor 'A'. In that case only the first of the two is used, so the third rule is eliminated from the set of rules.

p1 : A → B(1,1)C(2,4)
p2 : B → B(2,3)
p3 : A → A(2,3)B(3)
p4 : D → A(1,1)
p5 : C → C(1,2)A(2,2)

The string is created from the axiom after three rewriting steps following the rules above.

- Axiom : A
- P1 : B(1,1)C(2,4)
- P2 : B(2,3)C(1,2)A(2,2)
- P3 : B(2,3)C(1,2)A(2,2)B(1,1)C(2,4)

The string may contain unavailable nodes, which are eliminated when the network is organized. Figure 7 shows the neural network created from the string for solving the XOR problem.

Figure 7: Organization of Neural Network

4 Prediction results
4.1 The Mackey-Glass equation
To validate our method, we use Mackey-Glass ([mac77]) chaotic data as test data. The equation of the Mackey-Glass time series is as follows:

$$\frac{dx(t)}{dt} = \frac{a\,x(t-\tau)}{1 + x^{\gamma}(t-\tau)} - b\,x(t) \tag{6}$$

The variables are chosen to be $a = 0.2$, $b = 0.1$, $\gamma = 10$ and $\tau$ equal to 30. The fitness function of the neural network is equation 7:

$$\mathrm{Fitness} = e^{-\beta E} \tag{7}$$

where $E$ is the normalized mean squared error (NMSE) and $\beta = 2$. Figure 8 shows the result of Mackey-Glass time series prediction. The number of learning data is 250 and the test set is 50.

Figure 8: Predicted Mackey-Glass Data (— Ideal, ··· Predict)

Figure 9 shows the transition of fitness in our simulation: the best individual converges fast and the other individuals follow it.
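For readers who want to reproduce a Mackey-Glass series with the stated parameters (0.2, 0.1, exponent 10, delay 30), the following Python sketch integrates the delay equation with forward Euler steps. The paper does not say how its data were generated, so the integration scheme, the step size `dt`, the constant initial history `x0` and the function name `mackey_glass` are all assumptions made here for illustration.

```python
def mackey_glass(n_samples, a=0.2, b=0.1, exponent=10, tau=30, dt=1.0, x0=1.2):
    """Integrate the Mackey-Glass delay equation with forward Euler steps."""
    delay_steps = int(round(tau / dt))
    history = [x0] * (delay_steps + 1)  # assumed constant initial history
    for _ in range(n_samples):
        x_t = history[-1]
        x_delayed = history[-1 - delay_steps]  # x(t - tau)
        dx = a * x_delayed / (1.0 + x_delayed ** exponent) - b * x_t
        history.append(x_t + dt * dx)
    return history[delay_steps + 1:]


# Split as in the experiment: 250 training points and 50 test points.
series = mackey_glass(300)
train, test = series[:250], series[250:]
print(len(train), len(test))
```

In practice one would usually discard an initial transient and use a smaller step size for accuracy; both are left out to keep the sketch short.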

Figure 9: Transition of Fitness

4.2 The Sunspot data
Sunspots are dark spots, some as large as 80,000 km in diameter, that move across the surface of the sun, contracting and expanding as they go. They are regions of lower temperature than their surroundings. These strange and powerful phenomena are known as sunspots. The first set of experiments was conducted on Wolf's sunspot series recorded during 1700-1988. Only past observed data exist for sunspots, so the prediction method uses only past data. The data set was partitioned into a training set covering 1719-1918 (200 points) and a test set covering 1919-1968 (50 points). The other parameters are the same as for the Mackey-Glass data prediction.

Figure 10: Predicted Sunspot Data (— Ideal, ··· Predict)

Table 3 shows the parameters used in the time series prediction simulations of the Mackey-Glass time series and the sunspot data. The parameters of the sunspot prediction problem are the same except for the size of the training data set and the number of generations.

Table 3: Parameter of Simulation

                         Mackey-Glass       Sunspot data
Population               50 (50+150)        50 (50+150)
Initial String Length    300                300
Crossover method         one point          one point
Crossover Prob.          0.9                0.9
Mutation Prob.           0.3                0.3
Selection                Ranking & (μ+λ)    Ranking & (μ+λ)
Generation               300                500
No. of input node        5 ~ 19             5 ~ 19
No. of output node       1                  1
Training data            250                200
Test data                50                 50
Range of output          -2 ~ 2             -2 ~ 2 (×100)

5 Conclusions
In this paper, we proposed a new method of constructing neural networks. These evolutionary neural networks are based on the concepts of development and evolution. To make evolutionary neural networks, we use the DNA coding method, the L-system and GA. We make diverse production rules of the L-system from DNA coding of short length, and the neural networks are made from strings based on the production rules.

Artificial neural networks obtain their computational ability from the interconnection of artificial neurons, which are the simple components of the networks. In conventional neural networks, the weight between neurons is an important parameter for the behavior of the network, because it is the objective of learning. Accordingly, we find both the weights and the architecture through evolutionary algorithms. A DNA code of short length makes diverse production rules, and the neural networks consist of strings made by the production rules of the L-system. In future work, we plan to make use of context-sensitive and bracketed L-systems. We are also studying an extension to Genetic Programming. We will apply the method to a variety of real-world problems (e.g., financial forecasting, stock forecasting).

Bibliography
[yomo96] Yomohiro, T. Uchikawa, Y. "Effect of New Mechanism of Development from Artificial DNA and discovery of Fuzzy Control Rules," Proc. of IIZUKA '96, pp. 498-501, 1996

[mac77] Mackey, M.C. and Glass, L. "Oscillation and Chaos in Physiological Control Systems," Science 1977, p.287
[boers93] Boers, E.J.W., Kuiper, H., Happel, B.L.M. and Kuyper, S. "Designing Modular Artificial Neural Networks," Proc. of Computer Science in the Netherlands, pp. 87-96, 1993
[garis94] Garis, H.D. "CAM-BRAIN: The Genetic Programming of an Artificial Brain Which Grows/Evolves at Electronic Speed in a Cellular Automata Machine," Proc. of The First Int'l Conf. on Evolutionary Computation, vol. 1, pp. 337-339, 1993
[sugi98] Sugisaka, M. "Design of an artificial brain for robots," Proc. of The Third Int'l Symp. on Artificial Life and Robotics, pp. (I-1)-(I-11), 1998
[gruaru93] Gruau, F. and Whitley, D. "Adding Learning to the Cellular Development of Neural Networks: Evolution and the Baldwin Effect," Evolutionary Computation, vol. 1, no. 3, pp. 213-233, 1993
[wu96] Wu, A.S. and Lindsay, R.K. "A Comparison of the Fixed and Floating Building Block Representation in Genetic Algorithms," Evolutionary Computation, vol. 4, no. 2, pp. 169-193, 1996
[goldberg89] Goldberg, D.E. "Genetic Algorithms in Search, Optimization, and Machine Learning," Addison Wesley Publishing Company, 1989
[lind68] Lindenmayer, A. "Mathematical Models for Cellular Interaction in Development, Part I, II," Journal of Theoretical Biology, vol. 18, pp. 280-315, 1968
[lind87] Lindenmayer, A. "Developmental Models of Multicellular Organisms: A Computer Graphics Perspective," Artificial Life VI, pp. 221-249, 1987
[jacob95] Jacob, C. "Modeling Growth with L-systems & Mathematica," Mathematica in Education and Research, vol. 4, no. 4, TELOS Springer, pp. 12-19, 1995
[lee99] Lee, D.W. and Sim, K.B. "Evolving Chaotic Neural Systems for Time Series Prediction," Congress on Evolutionary Computation, pp. 310-316, 1999
[casd89] Casdagli, M. "Nonlinear Prediction of Chaotic Time Series," Physica D, vol. 35, pp. 335-356, 1989