Evolving Systems https://doi.org/10.1007/s12530-018-9221-4

ORIGINAL PAPER

ACFLN: artificial chemical functional link network for prediction of stock market index

S. C. Nayak¹ · B. B. Misra² · H. S. Behera³

¹ Department of Computer Science and Engineering, Kommuri Pratap Reddy Institute of Technology, Hyderabad 500088, India
² Department of Information Technology, Silicon Institute of Technology, Bhubaneswar 751024, India
³ Department of Computer Science Engineering and Information Technology, Veer Surendra Sai University of Technology, Burla 768018, India

Received: 19 April 2017 / Accepted: 27 February 2018
© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Abstract
The uncertainty and complexity associated with stock data make the exact determination of future prices impossible. Successful prediction of a stock's future price requires an efficient prediction system. This paper proposes an artificial chemical reaction optimization based functional link network, termed ACFLN, for stock market forecasting. The efficiency of the proposed model has been evaluated by forecasting five real stock market indices: BSE, DJIA, NASDAQ, TAIEX and FTSE. Different experiments are conducted to evaluate the performance of the proposed model, such as forecasting the stock price 1 day ahead, 1 week ahead, and 1 month ahead. Data are obtained for all the working days in a year, and the above experiments are conducted on each data set. Simulation studies reveal that the proposed model achieves better forecasting accuracies than the others.

Keywords Stock market forecasting · Functional link artificial neural networks · Artificial chemical reaction optimization · Back propagation neural network

1 Introduction

Stock data is noisy, uncertain and nonlinear; hence drawing inferences and making precise predictions of stocks is a critical and challenging matter. It has drawn the attention of researchers and financial managers for the last few decades. The stock market behaves very much like a random walk process due to the influence of the uncertainties involved, and its serial correlation is economically and statistically
insignificant. Further, nonlinearities, high volatility, discontinuities, fluctuations in other stock markets, political influences, macro-economic factors, and individual psychology (Oh and Kim 2002; Wang 2003) make stock market prediction even more complex. Hence, effective and accurate forecasting models are highly desirable for investors' financial decision making as well as for predicting market behavior. In recent years, many new methods for modeling and forecasting the stock market have been developed, including both linear and nonlinear models. A reasonably accurate prediction of stock movement may possibly yield high financial benefits (Booth et al. 2014; Al-Hmouz et al. 2015; Barak and Modarres 2015; Ballings et al. 2015).

For many decades linear models have been the basis of traditional statistical forecasting in financial engineering. The Box–Jenkins autoregressive moving average (ARMA) model has been widely used for time series forecasting (Liu et al. 2009). Moving average (MA), auto-regressive integrated moving average (ARIMA), auto-regressive conditional heteroscedastic (ARCH), and generalized ARCH (GARCH) models have gained wide acceptance among these statistical methods (Box and Jenkins 1976). They have been successfully applied to different engineering, economic and social applications. Over the years, the random walk (RW) has been the most dominant statistical linear model in




the domain of financial forecasting. The simple RW model (Zhang 2003) and its various modifications (Ghazali et al. 2009) have been extensively used for exchange rate forecasting. A combination methodology that attempts to benefit from the strengths of both ANN and RW was proposed in the literature to achieve better forecasting accuracies (Adhikari and Agrawal 2014). However, those models fail to capture the nonlinearity associated with financial time series data, and these statistical methods therefore do not perform very well in predicting the stock market. Generally, financial time series data is asymmetric, showing unexpected outbursts at irregular time intervals as well as periods of high and low volatility. Petrică et al. (2016) investigated the limitations of the ARIMA model in financial and monetary economics using the behavior of the BET Index and EUR/RON exchange rates; they observed fat tails and volatility clustering in the financial time series and demonstrated the inability of the ARIMA model to capture them.

Recently, artificial neural networks (ANNs) have shown their promise in time series forecasting problems with their nonlinear modeling capability. Some efforts have been made toward combining ARIMA and ANN, aiming to capture different forms of relationship in the time series data, and a number of such studies have accumulated in the literature (Zhang 2003; Khashei and Bijari 2010, 2011). In order to take advantage of the unique strengths of ARIMA and ANN in linear and nonlinear modeling, Zhang (2003) proposed a hybrid methodology for time series forecasting with improved prediction accuracy. A novel hybridization of ANNs and ARIMA models for time series forecasting was proposed by Khashei and Bijari (2011). Their model used the unique capability of ARIMA models in linear modeling to identify and magnify the existing linear structure in the data; a multilayer perceptron was then used to capture the underlying data generating process and predict the future. Empirical results with three well-known real data sets indicate that their model can be an effective way to obtain more general and more accurate models. A hybrid ANN–GARCH model has been applied to forecast gold price volatility; the results show an overall improvement in forecasting using ANN–GARCH compared to a GARCH method alone (Kristjanpoller et al. 2015).

The popular nonlinear models used for financial forecasting include artificial neural networks (ANN), support vector machines (SVM), Bayesian networks, fuzzy system models, etc. Among these methods, ANNs have drawn noteworthy attention from researchers in forecasting stock market behavior. The ANN is one of the important approaches in machine learning. ANNs are software constructs intended to mimic the way the human brain learns and can emulate human behavior to resolve


nonlinear problems. These characteristics have made them extensively used in predicting complicated systems. Neural networks are broadly used in medical applications such as image/signal processing, pattern and statistical classification, and modeling the dynamic nature of biological systems. Due to their superior learning and approximation capabilities, ANNs have been successfully applied to a wide range of forecasting problems such as exchange rates, credit scoring, business failure, bankruptcy, interest rates, stock returns, stock market indices, portfolio management, and option and futures prices. ANNs are considered an effective modeling procedure when the mapping from the input to the output contains both regularities and exceptions, which is exactly the way the stock market behaves; this is particularly useful in financial engineering applications where much is assumed and little is known about the nature of the processes determining asset prices. Neural networks can discover nonlinear relationships in the input data set without a priori assumptions about the relation between the input and the output. They also allow adaptive adjustment of the model and nonlinear description of the problems. These advantages of ANNs attract researchers to develop ANN based forecasting models for stock market prediction.

However, ANN based forecasting models suffer from computational complexity, as they require many layers and a large number of neurons in each layer of the network (Dutta et al. 2006). They also suffer from limitations such as being a black box technique, overfitting, and getting trapped in local minima (Fu-Yuan 2008). To circumvent these limitations, ANNs with less complex architectures, such as functional link artificial neural networks, higher order neural networks, and hybrid models, have been developed. Higher order neural networks have fast learning properties, stronger approximation, higher fault tolerance capability and powerful mapping with a single layer of trainable weights. Higher order terms can increase the information capacity of the network; this representational power can help solve complex nonlinear problems with small networks while maintaining fast convergence capabilities. Pao (1989) revealed that the functional link artificial neural network (FLANN) may be conveniently used for function approximation with a faster convergence rate and smaller computational load compared to multi-layer neural structures (Pao 1989; Pao and Takefuji 1992). The functional expansion effectively increases the dimensionality of the input signal, and hence the hyperplanes generated afford greater discrimination capability in the input pattern space. This is the motivation for adopting FLANN as the base model in the


present research work in comparison to other soft computing models.

Gradient based methods are among the most commonly used error minimization methods for training back propagation networks. Back propagation neural networks, particularly the multilayer perceptron (MLP), have many shortcomings such as a slow learning rate, more computational overhead, larger memory requirements, getting stuck in local minima, and greater randomness, all of which affect the predicted stock price. These shortcomings push researchers toward developing hybrid models that combine linear and nonlinear models. ANNs and their hybridizations with other soft computing techniques have been successfully applied to potential corporate finance applications and found to be appropriate. Hybrid models combining nonlinear models such as ANNs with evolutionary soft computing techniques such as particle swarm optimization (PSO), genetic algorithms (GA) and other nature- and bio-inspired search techniques have shown better accuracies. Over the decades, a number of forecasting models based on soft computing techniques such as ANNs, fuzzy logic and their hybridizations have been applied to stock index forecasting. Several nature-inspired population-based algorithms such as GA, PSO, differential evolution (DE), and evolutionary algorithms (EA) have shown promising ability as learning algorithms for forecasting; however, their performance may vary from one stock market to another. A method for time series forecasting using a deep belief net (DBN) composed of two restricted Boltzmann machines (RBMs) was proposed by Kuremoto et al. (2014); the structure of the RBMs was optimized by PSO, and its superiority to conventional ANN models was confirmed. Choosing a suitable optimization technique for solving a particular problem involves considerable trial and error. Another application of GA for choosing the optimal parameters of ANN based models was proposed by Nayak et al. (2017), where the authors employed the hybrid model to explore virtual data positions in the financial time series and incorporated them to enhance forecasting accuracy. Chakravarty and Dash (2012) proposed an integrated FLANN and interval type-2 fuzzy logic system (FLIT2FNS) for prediction of three stock market indices, S&P 500, BSE, and DJIA, 1 day, 1 week, and 1 month ahead. BP and PSO were used to train the model. The performance of their model was compared with an integrated FLANN and type-1 FLS and with a local linear wavelet neural network (LLWNN) model, and was found superior. The average mean absolute percentage error obtained from the PSO trained FLIT2FNS was 0.32% for 1-day-ahead prediction over all three data sets.

Though several evolutionary learning techniques have been used for financial forecasting, their efficiency is

characterized by the fine-tuning of some learning parameters. In order to search for the global optimum solution, an algorithm requires appropriate selection of parameters, which makes its use difficult. Hence, an optimization technique requiring fewer parameters as well as good approximation capability is the preferred choice for better forecasting accuracy. This is the motivation behind choosing artificial chemical reaction optimization as the training algorithm for obtaining an optimal FLANN based forecasting model. The details of ACRO are discussed in Sect. 3.

Artificial chemical reaction optimization (ACRO) is a well-known recent metaheuristic proposed by Lam and Li (2010). It is an evolutionary optimization technique inspired by the nature of chemical reactions. This optimization method does not need a local search method to refine the search, and includes both local and global search capability. Unlike other optimization techniques, ACRO does not require many parameters to be specified at the beginning; defining only the number of initial reactants is adequate for implementation. As the initial reactants are distributed over the feasible global search region, optimal solutions can be obtained within a modest number of iterations, and hence a significant reduction in computational time is achieved. ACRO has been successfully used to solve many complex problems in recent years and has been found to outperform many other evolutionary population based algorithms. This is the reason for choosing ACRO for constructing evolving FLANN structures in this work.

The objective of this research work is to develop a forecasting model that can be applied to global stock market data with improved forecasting accuracies. In order to fill the need for a good training algorithm, a natural chemical reaction inspired metaheuristic has been chosen for optimizing the parameters of FLANN, and the model is termed the artificial chemical functional link network (ACFLN). Here, the problem of finding an optimal parameter set to train the model can be seen as a search problem in the space of all promising parameters. A molecule/individual in ACRO represents a possible FLANN architecture; hence a reactant pool is considered a set of potential FLANN models. The fitness of the best and average individual (i.e. FLANN) in each iteration improves towards a global optimum. The fitness is obtained from the absolute difference between the target and the estimated output: the lower the fitness value of an individual, the better fit the FLANN is for the next iteration. In this way the networks go through an evolving process, and finally the optimal network is achieved. The prediction of short term (1-day-ahead), medium term (1-week-ahead) and long term (1-month-ahead) closing prices of BSE, DJIA, NASDAQ, TAIEX and FTSE has been carried out for the period 01 January 2003 to


12 September 2016. The sliding window technique has been used to select the training patterns for the network, as an alternative to dividing the whole data set into training and test sets. Instead of normalizing the whole data set before training, we normalize the current training window (a sketch of this windowing scheme follows the list below). Also, for each current training pattern, the previously optimized weight set is reused adaptively, and hence there is a significant reduction in training time. The suitability of ACFLN can be claimed on the basis of the following:

• The base model is computationally less complex, i.e. FLANN is a single layer network.
• The optimization technique, i.e. ACRO, is characterized by fewer tuning parameters and faster convergence capability.
• The optimal model is achieved through an evolving process.
• Minimal data is used for training, i.e. few input neurons with minimal patterns presented in each epoch.
• Adaptive models are used.
• To establish the unbiased nature as well as adaptability to changing environments, without much deviation in prediction capability, the daily closing prices are used for a period of nearly 14 years (01 January 2003 to 12 September 2016).
• To ascertain the performance of the suggested model, the daily closing prices of different stock markets across the globe are considered for this study.
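To make the windowing concrete, the following is a minimal Python sketch of the scheme described above; the window length of 5 and the min–max scaling are illustrative assumptions, since the paper does not fix these details here.

```python
import numpy as np

def sliding_windows(prices, window=5):
    # Build (pattern, target) pairs with a sliding window; each window
    # is normalized on its own rather than against the whole series.
    X, y = [], []
    for t in range(len(prices) - window):
        win = np.asarray(prices[t:t + window], dtype=float)
        target = float(prices[t + window])          # next closing price
        lo, hi = win.min(), win.max()
        scale = hi - lo if hi > lo else 1.0
        X.append((win - lo) / scale)                # normalize current window only
        y.append((target - lo) / scale)
    return np.array(X), np.array(y)
```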

The rest of the paper is organized as follows. Section 2 covers work related to stock index forecasting. Section 3 describes the background of artificial chemical reaction optimization. Section 4 describes the architecture of the proposed ACFLN forecasting model as well as seven comparative models: two popular statistical models (RW and ARIMA), two ANN based models (BPNN and RBFNN), and three hybrid models (FLANN-GA, MLP-GA, and MLP-CRO). Section 5 presents the results and discussion. Finally, Sect. 6 gives the concluding remarks, followed by the list of references.

2 Related work

During the last two decades there has been tremendous development in the area of soft computing methodologies, which include artificial neural networks, evolutionary algorithms and fuzzy systems. This improvement in computational intelligence capabilities has enhanced the modeling of complex, dynamic, nonlinear systems. The ANN has recently been applied to many areas such as data mining, stock market analysis, medicine and many other


fields. Kung and Yu (2008) adopted the GM(1, 1) model to predict the rates of return of nine major index futures in the American and Eurasian markets and compared its performance with GARCH/TGARCH; the findings reveal that the latter models perform better in terms of forecasting capability. Hassan and Nath (2005) successfully applied the hidden Markov model (HMM) to predicting future events and observed that HMMs are encouraging and offer a new paradigm for stock market forecasting. Gurusen et al. (2011) evaluated the effectiveness of neural network models such as the MLP, the dynamic ANN (DAN2) and hybrid neural networks using generalized GARCH in stock market prediction; their findings suggest that the DAN2 model is superior to the others. Majhi et al. (2009a, b) used bacterial foraging optimization (BFO) and adaptive BFO (ABFO) for stock market index prediction and observed that both models are computationally more efficient, prediction wise more accurate, and faster to converge than other evolutionary computing models such as GA and PSO based models. Aboueldahab and Fakhreldin (2011) proposed a new hybrid GA/PSO model with a perturbation term inspired by the passive congregation biological mechanism to overcome the problem of local search restriction in standard hybrid models.

Support vector machines (SVM) are an innovative approach to constructing learning machines that minimize generalization error. SVMs are based on very simple and intuitive concepts, and their foundations stem from statistical learning theory. They are free of the optimization headaches of neural networks because they present a convex programming problem and guarantee finding a global solution. An SVM with a new kernel function called the Gaussian radial basis polynomial function (GRPF) was proposed by Zanaty (2012) for improving classification accuracy. A comparative analysis of SVMs versus the multilayer perceptron for data classification was presented to verify the effectiveness of the proposed kernel function; support vector machines with the new kernel were observed to achieve better accuracy than multilayer networks, especially on high dimensional data sets. SVM, least squares SVM (LSSVM) and relevance vector machine (RVM) models have been employed for prediction of liquefaction susceptibility of soil and produced the best prediction performance (Samui 2014).

Pao (1989) proposed a simplified single layer ANN model called FLANN, which maps the nonlinearity of the input–output relationship by functional expansion of the input patterns. It is a class of higher order neural networks (HONN) that utilizes higher order combinations of inputs (Pao 1989; Pao and Takefuji 1992), introducing the property of expanding the input space into a higher dimensional space without the hidden units of an ANN. The FLANN is basically a single layer network in which the need for hidden layers has


been removed by incorporating functional expansion of the input pattern. The functional expansion effectively increases the dimensionality of the input vector, and hence the hyperplanes generated by the FLANN provide greater discrimination capability in the input pattern space. The performance of FLANN models has been examined in several research works. In order to develop an intelligent model of the capacitive pressure sensor (CPS) involving less computational complexity, three different polynomials, Chebyshev, Legendre and power series, have been employed in the FLANN (Patra and Bos 2000); the FLANN offers a computational advantage over a multilayer perceptron for similar performance in modeling the CPS. The FLANN has also been applied to financial forecasting and proved to be computationally efficient (Patra et al. 1999a). The research works in Patra et al. (2006) and Majhi et al. (2009a) proposed a trigonometric FLANN using LMS and RLS for both short and long term forecasting, concluding that a FLANN based stock market prediction model is an effective approach, both computationally and performance wise, to foresee market levels in the short and medium term future. There have been several applications of FLANN, including pattern classification and recognition (Mishra and Dehuri 2007; Mishra et al. 2008; Dehuri and Cho 2010a, b; Majhi et al. 2009b), system identification and control (Majhi et al. 2012; Purwar et al. 2007), functional approximation (Patra et al. 1999b; Lee and Jeng 1998), and digital communications channel equalization (Yang and Tseng 1996). The trigonometric expansion is chosen in many research works because such expansion based models have been shown to provide improved performance for various applications (Majhi et al. 2009a; Patra et al. 1994, 1999a, 1999d).

Optimization is one of the cornerstones of science and engineering. Most problems can be formulated as optimization, ranging from power generation scheduling in electrical engineering (AlRashidi and EI-Hawary 2009) and DNA sequencing in biomedical science (Shin et al. 2005) to stock market trend prediction (Yu et al. 2009). In the past few decades, the field of nature-inspired optimization techniques such as GA, memetic algorithms (MA), ant colony optimization (ACO), PSO, differential evolution (DE), and harmony search (HS) has grown incredibly fast. Many of them are inspired by biological processes, varying in scale from the genetic level, e.g. GA, MA, and DE, to the creature level, e.g. ACO and PSO. Unlike the others, HS is motivated by the phenomenon of human activity in composing music. These algorithms have been successful in solving many different kinds of optimization problems. A new hybrid model based on a heuristic optimization methodology (HS or GA) and ANN, to improve stock market forecasting performance in statistical and financial terms, was proposed by Göçken et al. (2016). With the development of the hybrid ANN models, the authors showed that structuring the ANN becomes

easy to implement, because the proposed models have great capability in variable selection and in determining the number of neurons in the hidden layer. In order to select the most relevant technical indicators, they first set 45 predetermined variables, and at the end of the analysis 26 and 23 variables were identified as non-redundant by the GA and HS models, respectively. In this way the complexity of variable selection is reduced to almost half. In addition, determining the optimal number of neurons in the hidden layer eliminates the overfitting or underfitting problems of ANN models.

Lam and Li (2010) proposed a new metaheuristic for optimization inspired by the nature of chemical reactions, called artificial chemical reaction optimization (ACRO). In a short period of time, ACRO has been applied to solve many problems successfully, outperforming many existing evolutionary algorithms. There are applications of ACRO to multiple-sequence alignment, data mining, classification rule discovery and some benchmark functions, where its efficiency has been demonstrated (Alatas 2011, 2012). A complete guideline to help readers implement ACRO for their optimization problems can be found in the tutorial introduced in Lam and Li (2012), which summarizes its basic characteristics as well as the applications reported in the literature. A real coded version of chemical reaction optimization (RCCRO) was proposed in Lam et al. (2012) to solve continuous optimization problems, together with an adaptive scheme for performance improvement; the performance of RCCRO was compared with a large number of techniques on a set of standard continuous benchmark functions, and the results show the suitability of ACRO for solving problems in the continuous domain. Chemical reaction optimization has also been applied successfully to population transition in peer-to-peer live streaming (Lam et al. 2010), where it was employed to maximize the probability of universal streaming by manipulating the population transition probability matrix; the simulation results show that it outperforms many commonly used strategies for this problem. Minimizing the number of coding links of a network for a given target transmission rate, in order to improve network efficiency, is an NP-hard problem; Pan et al. (2011) adopted chemical reaction optimization to develop an algorithm for this complex problem and found that the ACRO based framework outperforms other existing algorithms. ACRO has also been successfully applied to the grid scheduling problem (Xu et al. 2010) and the task scheduling problem (Xu et al. 2011a); the authors compared several versions of chemical reaction optimization with algorithms such as genetic algorithms, simulated annealing, threshold accepting and particle swarm optimization, and found chemical reaction optimization to be superior. ACRO has also been used to replace back propagation based training of an ANN for classification problems (Yu et al. 2011). The simulation results show




that the ACRO based ANN outperforms other optimization techniques such as GA, SVM, etc. Allocating available channels to unlicensed users in order to maximize the utility of radio channels is a complex task; an algorithm based on ACRO was developed by Lam and Li (2010) to solve this radio spectrum allocation problem, and it outperforms other evolutionary techniques. Two different types of chemical reaction optimization, canonical CRO and super molecule based CRO (S-CRO), were proposed in Xu et al. (2011b) for the problem of stock portfolio selection, suggesting that S-CRO is promising in handling stock portfolio optimization. A special type of higher order neural network, the Pi-Sigma neural network (PSNN), was trained with ACRO to form a novel CRO-PSNN model (Nayak et al. 2015a, b); its performance was tested on various benchmark data sets and found superior to PSNN, GA-PSNN and PSO-PSNN. A new chemical reaction optimization with greedy strategy algorithm (CROG) was proposed in Truong et al. (2013) to solve the 0–1 knapsack problem; the article designed a new repair function integrating a greedy strategy and random selection to repair infeasible solutions. The experimental results show the superiority of CROG over GA, ACO and the quantum-inspired evolutionary algorithm.

However, ACRO based forecasting models utilizing global data sets are lacking. It is found from the above literature study that there are few applications of ACRO in data mining, and particularly in the domain of financial forecasting; motivated by the capability of ACRO, only a few applications toward forecasting stock prices were found (Nayak et al. 2013, 2015, 2017). In this study we aim to fill this research gap through the incorporation of ACRO with a higher order neural network to forecast the movements of five fast growing global stock indices.

3 Artificial chemical reaction optimization

Chemical reaction optimization, proposed by Lam and Li (2010), is a population based metaheuristic inspired by natural chemical reactions. The concept loosely couples mathematical optimization techniques with properties of chemical reactions. A chemical reaction is a natural process of transforming unstable chemical substances (reactants/molecules) into stable ones. A chemical reaction starts with some unstable molecules with excessive energy. The molecules interact with each other through a sequence of elementary reactions, producing some intermediate chemical products. At the end, they are converted into those with the minimum energy required to support their existence. The energy associated with a molecule is called enthalpy (for minimization problems) and/or entropy (for maximization problems). During a chemical reaction this energy changes with the change in the intra-molecular structure of a reactant and becomes stable at one point. Most reactions can occur in both forward and backward directions, i.e. they are reversible. Reactions may be monomolecular or bimolecular, depending on the number of reactants taking part. These properties are embedded in ACRO to solve optimization problems. The ACRO algorithm begins with a set of initial reactants in a solution. Reactants are then consumed and produced via chemical reactions. The algorithm terminates when the termination criterion is met, similar to the state when no more reactions can take place (an inert solution). According to the above concept, the overall process of ACRO is represented in Fig. 1.

Fig. 1 Process of artificial chemical reaction optimization

3.1 Problem and algorithm parameter initialization

The optimization problem is specified as follows: minimize f(x) subject to x_j ∈ X_j, j = 1, 2, …, N, where f(x) is an objective function, x is the set of decision variables x_j, N is the number of decision variables, and X_j is the set of possible values for each decision variable; x_j^min and x_j^max are the lower and upper bounds of the jth decision variable for real-valued encoding. The problem may require different types of encoding, such as binary, real, or permutation.

3.2 Setting the initial reactants and evaluation

In this step the initial reactants are evenly initialized in the feasible search space. The uniform population method proposed in Karci and Arslan (2002), Karci and Alatas (2006) and Karci (2007) for initial population generation can be used for creating the initial reactants. In general, all vectors in a space can be obtained as a linear combination of the elements of the base set. If one of the elements in the base set is absent, the dimension corresponding to this element may vanish. That is why it is important that the initial reactants include reactants holding each element of the base set. Considering the regularity case and the base set, the initial reactants must be regular and must also hold the base set. Generating initial reactants based on a divide-and-generate paradigm is a method to produce reactants of good quality.

3.3 Applying chemical reactions

Different chemical reactions are applied as search operators, similar to the crossover and mutation operators in GA. Based on the number of reactants taking part in a chemical reaction, reactions are divided into two categories: monomolecular (one reactant takes part) and bimolecular (two reactants take part). The monomolecular reactions (Redox1 and Decomposition) assist in intensification, while the bimolecular reactions (Synthesis, Redox2 and Displacement) offer the effect of diversification. ACRO does not attempt to capture every detail of a chemical reaction; rather, it loosely couples chemical reactions with the optimization technique. Descriptions of these operators can be found in the article by Nayak et al. (2015a, b).

3.4 Reactants update

In this step a chemical equilibrium test is performed. If the newly generated reactants give a better objective function value, the new reactant set is included and the worse reactant is excluded, analogous to reversible chemical reactions. The reactant updating processes are described in the algorithms for the respective reactions (Nayak et al. 2015a, b).

3.5 Termination criterion check

ACRO terminates when the termination criterion (e.g. a maximum number of iterations) has been met. Otherwise, Steps 3.3 and 3.4 are repeated.
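A compact Python sketch of this overall loop (Sects. 3.1–3.5) follows. The two reaction operators shown are simple stand-ins (a bit flip and a one-point crossover) for the five operators named in Sect. 3.3, not the authors' exact formulations.

```python
import random

def flip(bits, i):
    # Return a copy of the reactant with bit i flipped.
    return bits[:i] + [1 - bits[i]] + bits[i + 1:]

def mono_reaction(r):
    # Stand-in for the monomolecular operators (Redox1, Decomposition).
    return flip(r, random.randrange(len(r)))

def bi_reaction(r1, r2):
    # Stand-in for the bimolecular operators (Synthesis, Redox2, Displacement).
    cut = random.randrange(1, len(r1))
    return [r1[:cut] + r2[cut:], r2[:cut] + r1[cut:]]

def acro(enthalpy, pool, iterations=200):
    # High-level ACRO loop: `enthalpy` is the objective, lower is better.
    pool = [list(r) for r in pool]                    # Sect. 3.2: initial reactants
    for _ in range(iterations):                       # Sect. 3.5: termination check
        if random.random() < 0.5:                     # monomolecular reaction
            old = [random.choice(pool)]
            new = [mono_reaction(old[0])]
        else:                                         # bimolecular reaction
            old = random.sample(pool, 2)
            new = bi_reaction(old[0], old[1])
        # Sect. 3.4: chemical equilibrium test (reversible reaction)
        if min(map(enthalpy, new)) < min(map(enthalpy, old)):
            for o in old:
                pool.remove(o)
            pool.extend(new)
    return min(pool, key=enthalpy)

# Usage example: minimize the number of 1-bits in a 16-bit reactant.
best = acro(sum, [[random.randint(0, 1) for _ in range(16)] for _ in range(10)])
```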

4 Forecasting models

This section describes the architecture and specification of the forecasting models considered. First, it briefly describes the most widely used statistical models, RW and ARIMA. Then it describes the three ANNs used as base models for developing hybrid models: an MLP, a radial basis function neural network (RBFNN), and the traditional trigonometric FLANN. Finally, the proposed ACFLN approach is described.

4.1 Statistical models

4.1.1 Random walk

The simple RW model assumes that the most recent observation is the best guide to the immediate forecast, i.e. the information about the future is already contained in the available data (Zhang 2003; Adhikari and Agrawal 2014). Here, each prediction is assumed to be the sum of the most recent observation and a random error term. If y_t and y_{t−1} are the current and previous observations of the time series respectively, a simple RW model can be represented mathematically as:

y_t = y_{t−1} + ε_t    (1)

where ε_t is a random error term that must satisfy the independent and identically distributed (i.i.d.) property. In this experiment


the i.i.d. pseudorandom normal variables ε_t ∼ N(0, σ²) are generated, with σ² being the variance of the sample data set. The RW is a subclass of ARIMA, i.e. ARIMA(0, 1, 0).

4.1.2 ARIMA

ARIMA models are probably the most extensively used statistical models for financial time series forecasting. They are commonly referred to as Box–Jenkins models after the pioneering work of Box and Jenkins (1976). The model is based on the hypothesis that the underlying time series can be generated from a linear combination of a predefined number of past observations and a random white noise term. Mathematically the model is represented as follows:

φ(S)(1 − S)^d y_t = θ(S) ε_t    (2)

where φ(S) = 1 − Σ_{i=1}^{p} φ_i S^i and θ(S) = 1 + Σ_{j=1}^{q} θ_j S^j. The parameter p is the number of autoregressive terms, q is the number of moving average terms, and d is the degree of differencing. The term ε_t is the random error term, satisfying the i.i.d. property, and y_t represents the actual observations of the time series. More generally these models are referred to as ARIMA(p, d, q) models. The appropriate model parameters in this experiment were determined following the Box–Jenkins model building specifications (Box and Jenkins 1976).
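A minimal Python sketch of these two statistical baselines is given below, assuming statsmodels is available; the ARIMA order shown is illustrative only, since the paper selects (p, d, q) via the Box–Jenkins methodology.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA  # assumed available

def rw_forecast(y, rng=None):
    # Eq. (1): each forecast is the previous observation plus an
    # i.i.d. N(0, sigma^2) error, sigma^2 being the sample variance.
    rng = rng or np.random.default_rng(0)
    eps = rng.normal(0.0, np.std(y), size=len(y) - 1)
    return np.asarray(y[:-1], dtype=float) + eps

def arima_forecast(y, order=(1, 1, 1), steps=1):
    # ARIMA(p, d, q) per Eq. (2); the order (1, 1, 1) is an assumption.
    return ARIMA(y, order=order).fit().forecast(steps=steps)
```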

4.2 ANN based models

4.2.1 Multi layer perceptron

The MLP is one of the most widely implemented neural network topologies in different fields of research. The back propagation rule propagates the errors through the network and allows adaptation of the hidden neurons. The feed forward neural network model considered here consists of one hidden layer only and a single output unit to estimate the closing index prices. The neurons in the input layer use a linear transfer function; the neurons in the hidden layer and output layer use the sigmoidal function:

y_out = 1 / (1 + e^{−λ y_in})    (3)

where y_out is the output of the neuron, λ is the sigmoidal gain and y_in is the input to the neuron. Let there be m neurons in the hidden layer. Since there are n input values in an input vector, the number of neurons in the input layer is equal to n. The first layer corresponds to the problem input variables, with one node per input variable. The second layer is useful in capturing non-linear relationships among variables. At each neuron j in the hidden layer, the weighted output z_j is calculated using Eq. 4:

z_j = f(B_j + Σ_{i=1}^{n} V_ij × X_i)    (4)

where X_i is the ith input value, V_ij is the synaptic weight between the ith input neuron and the jth hidden neuron, B_j is the bias value, and f is the sigmoidal activation function. The output y at the single output neuron is calculated using Eq. 5:

y = f(B_0 + Σ_{j=1}^{m} W_j × z_j)    (5)

where W_j is the synaptic weight from the jth hidden neuron to the output neuron, z_j is the output of the jth hidden neuron, and B_0 is the output bias. This output y is compared to the desired output, and the total sum of squared errors is calculated using Eq. 6:

E = (1/2) Σ_i (t_i − y_i)²    (6)

where t_i is the target signal for the ith training pattern and y_i is the estimated output for the ith pattern. The mean squared error over all training patterns is calculated and propagated back to train the MLP model. The weights and other parameter values are adjusted by the gradient descent rule. Because of gradient descent learning, such networks are characterized by problems like slow convergence and getting trapped in local minima, which may affect the prediction capabilities of the model.
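The forward pass of Eqs. (3)–(6) can be sketched in a few lines of Python; the vectorized array shapes below are an assumed layout, not taken from the paper.

```python
import numpy as np

def sigmoid(x, lam=1.0):
    # Eq. (3): 1 / (1 + exp(-lambda * y_in))
    return 1.0 / (1.0 + np.exp(-lam * x))

def mlp_forward(X, V, B, W, B0):
    # X: (n,) inputs; V: (n, m) input-to-hidden weights; B: (m,) hidden
    # biases; W: (m,) hidden-to-output weights; B0: scalar output bias.
    z = sigmoid(B + X @ V)          # Eq. (4), all hidden units at once
    return sigmoid(B0 + z @ W)      # Eq. (5)

def total_sse(targets, outputs):
    # Eq. (6): total sum of squared errors over the training patterns
    return 0.5 * np.sum((np.asarray(targets) - np.asarray(outputs)) ** 2)
```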

4.2.2 Radial basis function neural network (RBFNN)

The radial basis function neural network (RBFNN) can be used for approximating functions and recognizing patterns. The RBFNN is a two layered network: each hidden unit implements a radial activation function, and each output neuron implements a weighted sum of the hidden units' outputs. This network is a special class of neural network in which the activation of a hidden neuron is determined by the distance between the input vector and a prototype vector. Prototype vectors refer to the centers of clusters formed by the patterns, or vectors, in the input space. The interconnections between the hidden and output layers are made through weighted connections W_i. The output layer, a summation unit, supplies the response of the network to the outside world. In the input layer, the number of input neurons is determined by the input signals that connect the network to the environment. The hidden layer consists of a set of kernel units that carry out a nonlinear transformation from the input space to the hidden space. Two parameters, the center and the width, are associated with each RBF node; the centers are determined during RBF training. Selecting a suitable number of basis functions is an important issue for the RBFNN, since the number of basis functions controls the approximation and generalization ability of the network. Some commonly used kernel functions are the Gaussian function, cubic function, linear function, and generalized multiquadric function. We used the Gaussian function, represented in Eq. 7:

φ_i(x) = exp(−‖x − μ_i‖² / (2σ_i²))    (7)

where ‖⋯‖ represents the Euclidean norm, x is the input vector, μ_i is the center, σ_i is the spread and φ_i(x) is the output of the ith hidden node. The output of the RBF network is calculated as in Eq. 8:

y = f(x) = Σ_{k=1}^{N} w_k φ_k(‖x − c_k‖)    (8)

where y is the network output, x is an input vector, w = [w_1, w_2, …, w_N]^T is the weight vector of the output layer, N is the number of hidden neurons, φ_k(⋅) is the kth basis function, and c_k = (c_k1, c_k2, …, c_km)^T is the center vector for the kth node, with m the number of inputs.

4.2.3 FLANN

The FLANN introduces higher-order effects through nonlinear functional transforms via links rather than at nodes. For designing the network, a functional expansion unit first expands each input attribute of the input data. The simple trigonometric basis functions of sine and cosine are used here to expand the original input value into higher dimensions. An input value x_i is expanded into several terms using the trigonometric expansion functions:

c1(x_i) = x_i,
c2(x_i) = sin(x_i),
c3(x_i) = cos(x_i),
c4(x_i) = sin(πx_i),
c5(x_i) = cos(πx_i),
c6(x_i) = sin(2πx_i),
c7(x_i) = cos(2πx_i).

The attributes of each input pattern are passed through the functional expansion unit. The sum of the output signals of the functional expansion units, multiplied by the synaptic weight values, is passed to the sigmoidal activation function of the output unit. The estimated output is compared with the target output and an error signal is obtained; this error signal is used to train the model using the gradient descent technique. Figure 2 shows the architecture of the FLANN model developed for this experimental work.

Fig. 2 Architecture of traditional FLANN based forecasting model
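The expansion of one input pattern by c1–c7 can be written directly, as in the following Python sketch; it maps each of the n inputs to 7 terms, so the expanded pattern has N = 7n values.

```python
import numpy as np

def trig_expand(x):
    # Trigonometric functional expansion c1..c7 of one input pattern.
    x = np.asarray(x, dtype=float)
    terms = [x,
             np.sin(x), np.cos(x),
             np.sin(np.pi * x), np.cos(np.pi * x),
             np.sin(2 * np.pi * x), np.cos(2 * np.pi * x)]
    return np.concatenate(terms)
```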

4.3 Proposed ACFLN

This model employs a traditional trigonometric FLANN as the base model, trained with the binary encoded ACRO technique. The model generates a set of weight and bias vectors representing a set of FLANN models. The input vector along with the weight and bias values is passed to the respective FLANN model. ACRO alone handles the set of FLANNs and updates the weights simultaneously. The absolute difference between the target y and the estimated output ŷ is treated as the fitness of the respective FLANN model; a lower fitness value marks a better fit individual for selection into the next generation by ACRO. The algorithm starts by assigning values for the termination criterion (here, the number of iterations), the length of a reactant, and ReacNum, the total number of reactants in a population. The uniform population generation method is used for generating the initial reactants. A string of binary values is considered an atom, and a set of such atoms represents one reactant, which corresponds to a weight and bias set of the ACFLN. The length of a reactant depends on the number of input signals selected and the number of functional expansions used (i.e. number of inputs × number of basis functions); the number of neurons in the input layer is likewise equal to the number of input signals × the number of basis functions. The reaction processes are simulated in the iteration phase. The reactions can be monomolecular (Redox1, Decomposition) or bimolecular (Redox2, Synthesis, Displacement). The reactants are updated by applying reversible reactions, equivalent to the selection process in evolutionary algorithms. The fitness of a reactant is the enthalpy associated with it; the mean absolute error is considered the fitness function of a reactant in this experimentation, and the lower the enthalpy, the better the reactant. The architecture of the proposed ACFLN model is represented in Fig. 3.

Fig. 3 Architecture of proposed ACFLN forecasting model

The estimated outputs of the proposed ACFLN are calculated as follows. Let X(n) = {x_i, x_{i+1}, …, x_n} (n = number of input signals) be an input closing price series. This financial time series is nonlinearly expanded by the seven trigonometric basis expansion functions and represented as X_expanded(N), where N = n × 7. Given the input X_expanded(N), the model produces an output ŷ(n), computed by Eq. 9:

ŷ(n) = X_expanded(N) × W(n) + b    (9)

where b represents the weighted bias input and W(n) is the weight vector for the nth pattern. This output is then passed through a nonlinear transformation function, here the sigmoid activation, to produce the estimated output y(n), as given in Eq. 10:

y(n) = 1 / (1 + e^{−ŷ(n)})    (10)

The error signal e(n) is calculated as the absolute difference between the desired response and the estimated output of the model:

e(n) = |d(n) − y(n)|    (11)

The high level training algorithm for the ACFLN model is described by Algorithm 1.
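A hedged Python sketch of this fitness evaluation (Eqs. 9–11) is given below; it reuses the trig_expand sketch from Sect. 4.2.3 and assumes a reactant has already been decoded to real-valued weights plus a bias.

```python
import numpy as np

def acfln_output(x, w, b):
    # Eq. (9): weighted sum of the expanded inputs plus the bias,
    # followed by Eq. (10): the sigmoid transformation.
    y_hat = trig_expand(x) @ w + b
    return 1.0 / (1.0 + np.exp(-y_hat))

def enthalpy(reactant, patterns, targets):
    # Fitness of one reactant: the mean of the absolute errors
    # e(n) = |d(n) - y(n)| of Eq. (11) over all training patterns.
    w, b = reactant[:-1], reactant[-1]
    errors = [abs(d - acfln_output(x, w, b)) for x, d in zip(patterns, targets)]
    return float(np.mean(errors))
```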


The different chemical operators applied to the reactants were discussed in the previous section. The pseudocode for the ApplyChemicalReaction procedure is presented in Algorithm 2.
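Since Algorithms 2 and 3 are not reproduced here, the following Python sketch shows one plausible way a binary reactant could be decoded into FLANN weights before the enthalpy evaluation; the bit width and weight range are assumptions, as the paper only states that reactants are binary encoded.

```python
def decode_reactant(bits, n_weights, bits_per_weight=16, lo=-1.0, hi=1.0):
    # Decode a binary reactant (one atom = a fixed-length bit string per
    # weight) into real-valued FLANN weights and a bias. The 16-bit width
    # and the [-1, 1] range are illustrative assumptions.
    vals = []
    for k in range(n_weights + 1):                   # +1 for the bias
        chunk = bits[k * bits_per_weight:(k + 1) * bits_per_weight]
        as_int = int("".join(map(str, chunk)), 2)
        frac = as_int / (2 ** bits_per_weight - 1)   # map to [0, 1]
        vals.append(lo + frac * (hi - lo))           # then to [lo, hi]
    return vals[:-1], vals[-1]                       # (weights, bias)
```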

The enthalpy calculation for each reactant in the reactant set R is described by Algorithm 3. As mentioned earlier, binary encoded reactants are used for ACRO.

5 Experimental results and discussion

This section first describes the data collection and the data used for experimentation. The stationarity as well as the statistical description of the financial time series are presented next. The performance metrics, input selection for the forecasting models, data preprocessing, result analysis and statistical significance tests are then discussed.

5.1 Data collection and description

The daily closing index prices of five stock markets, BSE, DJIA, NASDAQ, TAIEX and FTSE, are used for this experiment. The daily closing index prices are collected for the period from 01 January 2003 to 12 September 2016 for all five stock markets and plotted in Fig. 4. The descriptive statistics of the daily closing prices are summarized in Table 1. The positive skewness values of the closing prices observed in Table 1 imply that the BSE, DJIA, and NASDAQ data sets are spread out more toward the right; these positive skewness values suggest investment opportunities in these markets. As an example, the histograms of daily returns of TAIEX and BSE are presented in Fig. 5. It can be observed that the peaks of the histograms are much higher than those of the corresponding normal distribution. The BSE stock data is slightly skewed to the right, and the TAIEX stock data slightly to the left. The kurtosis analysis implies that the stock prices of DJIA and NASDAQ are more outlier prone, whereas all the other financial time series are less outlier prone. Also, from the Jarque–Bera test statistics, it can be observed that all the stock price data sets are non-normally distributed. It is worth mentioning that stationarity of a financial time series is an important feature required to make statements about the future in order to obtain valid forecasts. Therefore, the first step is to investigate the time series


Fig. 4 Daily closing indices of the five financial time series (panels: NASDAQ, BSE, DJIA, FTSE, TAIEX)

Table 1  Descriptive statistics of closing prices for the five stocks

Stock index   Minimum       Maximum       Mean          Std. deviation   Skewness   Kurtosis   Jarque–Bera statistic
BSE           792.1800      1.1024e+004   4.6235e+003   2.6947e+003      0.1154     1.7908     236.0430 (h = 1)
DJIA          6.5471e+003   1.7138e+004   1.1400e+004   2.1801e+003      0.6644     3.0512     253.8134 (h = 1)
NASDAQ        1.1141e+003   4.5982e+003   2.3858e+003   709.7888         1.0392     4.0027     764.3663 (h = 1)
FTSE          3287          6.8785e+003   5.4165e+003   836.2381         −0.2837    2.1378     158.4568 (h = 1)
TAIEX         3.4463e+003   1.0202e+004   6.9835e+003   1.4846e+003      −0.1776    2.0465     159.9786 (h = 1)

Fig. 5 Histogram of daily returns of the BSE (left panel) and TAIEX (right panel) against the theoretical normal distribution


for the presence of a unit root, to determine whether the analyzed series is stationary or not. We conducted the stationarity test using two well-known unit root tests, the Augmented Dickey–Fuller (ADF) test (1979) and the Phillips–Perron (PP) test (1988). In the case of the standard Dickey–Fuller test, the hypotheses can be written as follows: H0: α = 0 (the series has a unit root, so it is non-stationary) against the alternative Halt: α < 0 (the series is stationary).
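A minimal sketch of such a unit-root check, assuming statsmodels is available (the series shown is a synthetic stand-in for any of the five indices):

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

closing_prices = np.cumsum(np.random.default_rng(0).normal(size=500))  # demo series
stat, pvalue, *_ = adfuller(closing_prices)
print(f"ADF statistic = {stat:.4f}, p-value = {pvalue:.4f}")
if pvalue > 0.05:
    print("Fail to reject H0: the series appears non-stationary.")
```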


AverageRank(RW) − AverageRank(ACFLN) = 7.85 − 1.27 = 6.68 > 0.6142 (CD value)
AverageRank(RW) − AverageRank(RBFNN) = 7.85 − 4.4 = 3.45 > 0.6142 (CD value)
AverageRank(RW) − AverageRank(MLP-CRO) = 7.85 − 2.87 = 4.98 > 0.6142 (CD value)
AverageRank(RW) − AverageRank(MLP-GA) = 7.85 − 4.1 = 3.75 > 0.6142 (CD value)

All the models satisfy the Nemenyi post hoc test for significant difference. Therefore these models can be treated as consistent models. These models may not always perform


best always but their results will be competitive with other better performing models. From the above re-ranking values presented in Table 6, it can be observed that ACFLN model produced superior performance measures, obtained rank 1, and shows consistent performance through all the stock indices. For one-week-ahead forecasting it�is found�that the Fried∑� 12 × 2.622 + man statistic for Table  8, 𝛘2F = [4×8×9)] ) 1.222 + 4.052 + 2.82 + 4.352 + 5.852 + 7.92 + 7.152 −

Let the forecast errors defined as: eit = ŷ it − yt , i = 1, 2 . Let the ( loss ) function associated with the forecast is defined as: g eit = ||e2 it || and between the two ( the ) loss ( differential ) forecasts is: dt = g e1t − g e2t . The null hypothesis and the alternative ( ) are defined as: H0 ∶ E dt = 0 ∀t , that ( the ) two forecasts have the same accuracy versus Halt ∶ E dt ≠ 0 , that the two forecast have different levels of accuracy. The DM statistic is defined as:

[3 × 4 × (8 + 1)] = 3.8030 is greater than the F-distribution (4−1)×3.8030 FF = 4(8−1)−3.8030 = 0.4715 , therefore the null-hypothesis is

DM = �

rejected. Further Nemenyi post hoc test is conducted for pair wise comparison. It was found that all models satisfy the Nemenyi post hoc test for significant difference. From Tables 7 and 8, it is also observed that average performance of ACFLN is better in comparison to FLANNGA, RBFNN, MLP-CRO, MLP-GA, BPNN, and statistical models for 1-week-ahead prediction also. Similarly, it can be observed that�the Friedman � ∑ �statistic 12 × 2.122 + 𝛘2F = [4×8×9)] for Table  10, ) 1.222 + 4.62 + 2.92 + 4.222 + 5.952 + 7.852 + 7.152 −

[3 × 4 × (8 + 1)] = 3.8962 is greater than the F- distribu(4−1)×3.8962 = 0.4848 , therefore the null-hypothetionFF = 4(8−1)−3.8962 sis is rejected. Further Nemenyi post hoc test is conducted for pair wise comparison and found that all models satisfy the Nemenyi post hoc test for significant difference. On analyzing the results of 1-month-ahead forecasting at Table 10, it can be revealed that the performance metric values obtained by ACFLN is better in comparison to all hybrid, ANN based, and statistical models.
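A sketch of the Friedman test on hypothetical error measures, assuming scipy is available (the paper compares eight models; three models over four data sets are shown here for brevity, with made-up values):

```python
from scipy.stats import friedmanchisquare

# Hypothetical error measures of three models on four data sets.
model_a = [0.015, 0.020, 0.018, 0.022]
model_b = [0.041, 0.037, 0.046, 0.039]
model_c = [0.065, 0.070, 0.061, 0.068]
stat, pvalue = friedmanchisquare(model_a, model_b, model_c)
print(f"Friedman chi-square = {stat:.4f}, p = {pvalue:.4f}")
```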

5.5.2 Diebold–Mariano (DM) test

The Diebold–Mariano test (Diebold and Mariano 1995; Harvey et al. 1997) is a pairwise comparison used when two or more time series models are available for forecasting a particular variable of interest. Let the actual time series be {y_t; t = 1, …, T} and the two forecasts be {ŷ_1t; t = 1, …, T} and {ŷ_2t; t = 1, …, T} respectively. The objective is to test whether the forecasts are equally good or not.

Let the forecast errors be defined as e_it = ŷ_it − y_t, i = 1, 2. Let the loss function associated with a forecast be g(e_it) = |e_it²|, and the loss differential between the two forecasts d_t = g(e_1t) − g(e_2t). The null hypothesis and the alternative are defined as H0: E(d_t) = 0 for all t, i.e. the two forecasts have the same accuracy, versus Halt: E(d_t) ≠ 0, i.e. the two forecasts have different levels of accuracy. The DM statistic is defined as:

DM = d̄ / sqrt( [γ̂_d(0) + 2 Σ_{k=1}^{h−1} γ̂_d(k)] / T )    (20)

where d̄ is the sample mean of the loss differential, h is the forecast horizon, and γ̂_d(k) is an estimate of the auto-covariance of the loss differential γ_d(k) at lag k. The null hypothesis of no difference is rejected if the DM statistic falls outside the range −z_{α/2} to z_{α/2}, i.e. |DM| > z_{α/2}, where z_{α/2} is the upper z-value from the standard normal table corresponding to half of the desired level α of the test. Consider a significance level α = 0.05. Since this is a two-tailed test, the lower critical z-value (at 0.025) is −1.96 and the upper critical z-value (at 0.975) is 1.96; if the computed DM statistic falls outside the range −1.96 to 1.96, the null hypothesis of no difference is rejected. The computed DM statistics for 1-day-ahead prediction are summarized in Table 11. It can be observed that the computed DM statistic values lie outside the critical range. Similar observations have been recorded for 1-week-ahead and 1-month-ahead predictions, which supports rejection of the null hypothesis.
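A Python sketch of Eq. (20) with the squared-error loss used above:

```python
import numpy as np

def dm_statistic(e1, e2, h=1):
    # Diebold-Mariano statistic per Eq. (20); d_t = e1_t^2 - e2_t^2 is
    # the loss differential and gamma_hat its lag-k auto-covariance.
    d = np.asarray(e1, dtype=float) ** 2 - np.asarray(e2, dtype=float) ** 2
    T = len(d)
    d_bar = d.mean()
    def gamma_hat(k):
        return np.sum((d[k:] - d_bar) * (d[:T - k] - d_bar)) / T
    var = (gamma_hat(0) + 2 * sum(gamma_hat(k) for k in range(1, h))) / T
    return d_bar / np.sqrt(var)

# |DM| > 1.96 rejects equal accuracy at the 5% level (two-tailed).
```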

Table 11  Computed DM statistic values from 1-day-ahead forecasting (ACFLN versus each comparative model)

Indices   FLANN-GA   RBFNN     MLP-CRO   MLP-GA    BPNN      RW        ARIMA
BSE       2.0144     1.9849    −2.2458   2.0003    2.7143    3.2205    2.7742
DJIA      1.9888     2.5031    3.3507    −5.1282   −5.0262   −2.8874   2.0057
NASDAQ    2.5164     −2.8163   2.4531    3.1692    3.8816    3.7215    −2.9653
FTSE      −3.4602    2.2659    1.9879    2.5517    −3.8135   2.7862    −4.6866
TAIEX     −2.7422    2.2372    −2.1562   2.9676    −4.6899   −4.8206   3.4459

5.6 Further discussion

To find out the exact benefit of using ACFLN over the other models for different stock indices, the MAPE gain is evaluated as in Eq. (21):

MAPE gain = [(MAPE of existing model − MAPE of proposed model) / MAPE of existing model] × 100%    (21)
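Eq. (21) is straightforward to compute; for instance, using the average MAPE values discussed in this section (0.32% for the existing FLIT2FNS-PSO and 0.015% for ACFLN):

```python
def mape_gain(mape_existing, mape_proposed):
    # MAPE gain of the proposed model over an existing one, Eq. (21).
    return (mape_existing - mape_proposed) / mape_existing * 100.0

print(f"{mape_gain(0.32, 0.015):.1f}% gain")  # about 95.3%
```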


Fig. 6 MAPE gain of ACFLN over FLANN-GA forecasting model
Fig. 7 MAPE gain of ACFLN over RBFNN forecasting model
Fig. 8 MAPE gain of ACFLN over MLP-CRO forecasting model
Fig. 9 MAPE gain of ACFLN over MLP-GA forecasting model
Fig. 10 MAPE gain of ACFLN over BPNN forecasting model
Fig. 11 MAPE gain of ACFLN over RW forecasting model

The MAPE gain of ACFLN over FLANN-GA is presented in Fig. 4. From the figure it can be revealed that, maximum MAPE gain is obtained for BSE data, and minimum gain for TAIEX and NASDAQ data. Considering a particular stock data, it can be observed that the MAPE gain is significant for 1-week-ahead prediction. Similarly, the MAPE gain of ACFLN over FLANN-GA, RBFNN, MLP-CRO, MLPGA, BPNN, RW, and ARIMA forecasting model are shown by Figs. 6, 7, 8, 9, 10, 11 and 12 respectively.

13

Considering the analysis above, it can be concluded that ACFLN model shows better performance in comparison to other seven forecasting models. The average MAPE obtained from ACFLN model is 0.015% for 1-day-ahead prediction for all five datasets, whereas it was 0.32% for FLIT2FNSPSO for three datasets proposed in (Chakravarty and Dash 2012). The actual v/s estimated closing prices by ACFLN (1-day-ahead) are plotted and are presented by the Figs. 13, 14, 15, 16 and 17 for all data sets. For the sake of visibility


these prices are plotted for a time window consisting of data for 5 months only, i.e. from 15 Feb 2013 to 16 July 2013, for all data sets.

Fig. 13  Estimated vs. actual closing prices (1-day-ahead) by ACFLN for BSE

Fig. 14  Estimated vs. actual closing prices (1-day-ahead) by ACFLN for DJIA

Fig. 15  Estimated vs. actual closing prices (1-day-ahead) by ACFLN for NASDAQ

Fig. 16  Estimated vs. actual closing prices (1-day-ahead) by ACFLN for FTSE

Fig. 17  Estimated vs. actual closing prices (1-day-ahead) by ACFLN for TAIEX

(Each of Figs. 13–17 plots the normalized closing prices against the financial day over roughly 100 trading days; the images are not reproduced here.)

The performance of the models can be further compared in terms of computation time. The experiments were carried out on a system with an Intel Core i3 CPU at 2.27 GHz and 2.42 GB of memory; the programming environment was MATLAB 2009. The computation times (in seconds) for 1-day-ahead, 1-week-ahead and 1-month-ahead forecasting are summarized in Tables 12, 13 and 14, respectively. As the number of training patterns decreases from short-term to long-term prediction, the computation time also decreases. Comparing the computation times, it is observed that the proposed ACFLN-based forecasting model requires the minimum time among the ANN-based and hybrid forecasting models. The statistical models are simple to compute and hence have the lowest computation times; however, their forecasting performance is clearly inferior to that of ACFLN. Further, since identical training windows are used for all three time horizons, the time required by a model for a given stock index is similar across the horizons.

Table 12  Computation time (s) for 1-day-ahead forecasting

Model       BSE       DJIA      NASDAQ    FTSE      TAIEX     Average   Rank
MLP-GA      85.33     64.41     64.41     62.2      59.49     61.52     5
FLANN-GA    70.09     69.89     84.05     63.1      64.28     70.28     6
MLP-CRO     52.83     53.76     53.38     56.41     53.31     53.94     4
ACFLN       38.82     29.52     29.57     23.64     21.16     28.54     3
BPNN        102.28    93.68     96.34     96.34     103.31    98.39     8
RW          25.55     27.74     27.00     26.50     26.25     26.60     1
ARIMA       27.63     28.86     29.22     27.75     26.95     28.08     2
RBFNN       92.28     83.46     76.03     76.38     93.63     84.36     7

Table 13  Computation time (s) for 1-week-ahead forecasting

Model       BSE       DJIA      NASDAQ    FTSE      TAIEX     Average   Rank
MLP-GA      55.51     60.45     60.65     57.62     53.95     57.64     5
FLANN-GA    65.79     64.69     76.45     60.31     61.43     65.73     6
MLP-CRO     51.53     51.35     51.54     52.57     51.53     51.7      4
ACFLN       38.02     29.12     29.5      22.13     20.69     27.89     3
RW          24.80     27.05     26.43     26.11     24.00     25.68     1
ARIMA       26.88     28.14     27.75     26.05     26.18     27.00     2
BPNN        96.23     92.14     91.64     93.16     93.44     93.32     8
RBFNN       90.53     82.35     75.1      74.38     90.36     82.54     7

Table 14  Computation time (s) for 1-month-ahead forecasting

Model       BSE       DJIA      NASDAQ    FTSE      TAIEX     Average   Rank
MLP-GA      55.2      57.24     58.65     55.36     53.43     55.98     5
FLANN-GA    63.38     61.37     72.55     59.86     60.43     63.52     6
MLP-CRO     49.51     49.34     51.54     50.46     50.32     50.23     4
ACFLN       33.32     29.01     29.5      22.12     20.29     26.85     3
RW          24.65     27.02     25.33     26.00     25.02     25.60     1
ARIMA       25.55     27.25     27.05     25.45     25.36     26.13     2
BPNN        92.55     89.84     91.64     90.64     90.46     91.02     8
RBFNN       87.23     82.35     75.1      72.37     82.35     79.88     7

6 Conclusions

Effective and accurate stock trend prediction is a challenging task in financial forecasting due to the high nonlinearity associated with the stock market. This paper proposes a hybrid and adaptive model, called ACFLN, for financial forecasting; it is developed by hybridizing FLANN with ACRO. To develop the initial model, a large number of iterations is used at the first instance; subsequently, an adapted model is generated for each change in the training set with the help of very few iterations (see the sketch below). This substantially reduces the computation time while, in effect, training the model on a larger dataset than the actual training set alone.
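As an illustration of the adaptive scheme just described (a sketch under stated assumptions, not the authors' code), the hypothetical helper train_acfln(window, weights, iters) stands in for one ACRO training run of the FLANN model, and the iteration counts are illustrative placeholders:

```python
def adaptive_forecasting(windows, train_acfln, init_iters=500, adapt_iters=20):
    """Train an initial model with many iterations, then warm-start each
    subsequent (shifted) training window from the previous weights so that
    only a few refinement iterations are needed."""
    weights = train_acfln(windows[0], None, init_iters)      # initial model, many iterations
    models = [weights]
    for window in windows[1:]:                               # each updated training set
        weights = train_acfln(window, weights, adapt_iters)  # brief adaptation only
        models.append(weights)
    return models
```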

Stock index values from BSE, DJIA, NASDAQ, TAIEX and FTSE are collected for a period of over 13 years and 8 months, and the proposed model is applied to these data and tested for the quality of its predictions. Short-term, medium-term and long-term predictions are conducted with the proposed model to check whether it is helpful for all types of prediction. The prediction capability of this model is compared with that of FLANN-GA, MLP-GA, MLP-CRO, RBFNN, BPNN, RW, and ARIMA. From extensive numerical experimentation it is clearly established that ACFLN deals with the uncertainties associated with the stock market in an efficient manner. For all the stock indices it is observed that the overall prediction capability of the proposed ACFLN is much better than that of the other forecasting models considered.

Acknowledgements  The authors are very thankful to the reviewers and the chief editor for their constructive suggestions, which significantly improved the quality of this article.

Compliance with ethical standards

Conflict of interest  The authors declare that they have no conflict of interest.



References

Aboueldahab T, Fakhreldin Md (2011) Prediction of stock market indices using hybrid genetic algorithm/particle swarm optimization with perturbation term. In: International conference on swarm intelligence (ICSI 2011), Cergy, 14–15 June
Adhikari R, Agrawal RK (2014) A combination of artificial neural network and random walk models for financial time series forecasting. Neural Comput Appl 24:1441–1449. https://doi.org/10.1007/s00521-013-1386-y
Alatas B (2011) ACROA: artificial chemical reaction optimization algorithm for global optimization. Expert Syst Appl 38:13170–13180
Alatas B (2012) A novel chemistry based metaheuristic optimization method for mining of classification rules. Expert Syst Appl 39:11080–11088
Al-Hmouz R, Pedrycz W, Balamash A (2015) Description and prediction of time series: a general framework of granular computing. Expert Syst Appl 42:4830–4839. https://doi.org/10.1016/j.eswa.2015.01.060
AlRashidi M, El-Hawary M (2009) A survey of particle swarm optimization applications in electric power system. IEEE Trans Evol Comput 13(4):913–918
Ballings M et al (2015) Evaluating multiple classifiers for stock price direction prediction. Expert Syst Appl 42(20):7046–7056
Barak S, Modarres M (2015) Developing an approach to evaluate stocks by forecasting effective features with data mining methods. Expert Syst Appl 42:1325–1339. https://doi.org/10.1016/j.eswa.2014.09.026
Booth A, Gerding E, McGroarty F (2014) Automated trading with performance weighted random forests and seasonality. Expert Syst Appl 41:3651–3661. https://doi.org/10.1016/j.eswa.2013.12.009
Box GEP, Jenkins GM (1976) Time series analysis: forecasting and control. Holden-Day, San Francisco
Chakravarty S, Dash PK (2012) A PSO based integrated functional link net and interval type-2 fuzzy logic system for predicting stock market indices. Appl Soft Comput 12:931–941
Dehuri S, Cho SB (2010a) Evolutionarily optimized features in functional link neural network for classification. Expert Syst Appl 37:4379–4391
Dehuri S, Cho SB (2010b) A hybrid genetic based functional link artificial neural network with a statistical comparison of classifiers over multiple datasets. Neural Comput Appl 19:317–328
Demsar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7:1–30
Diebold FX, Mariano RS (1995) Comparing predictive accuracy. J Bus Econ Stat 13:253–263
Dorffner G (1996) Neural networks for time series processing. Neural Netw World 4/96:447–468
Dutta G, Jha P, Laha AK, Mohan N (2006) Artificial neural network models for forecasting stock price index in the Bombay stock exchange. J Emerg Mark Finance 5:3
Fu-Yuan H (2008) Integration of an improved particle swarm optimization algorithm and fuzzy neural network for Shanghai stock market prediction. In: Workshop on power electronics and intelligent transportation system, IEEE, pp 242–247
Ghazali R, Hussain AJ, Nawi NM, Mohamad B (2009) Non-stationary and stationary prediction of financial time series using dynamic ridge polynomial neural network. Neurocomputing 72:2359–2367
Göçken M, Özçalıcı M, Boru A, Dosdoğru AT (2016) Integrating metaheuristics and artificial neural networks for improved stock price prediction. Expert Syst Appl 44:320–331
Gurusen E, Kayakutlu G, Daim TU (2011) Using artificial neural network models in stock market index prediction. Expert Syst Appl 38:10389–10397
Harvey D, Leybourne S, Newbold P (1997) Testing the equality of prediction mean squared errors. Int J Forecast 13:281–291
Hassan MdR, Nath B (2005) Stock market forecasting using hidden Markov model: a new approach. In: Proceedings of the IEEE international conference on intelligent systems design and applications, Warsaw, Poland. https://doi.org/10.1109/ISDA.2005.85
Karci A (2007) Theory of sapling growing up algorithm. Lecture notes in computer science 31:450–460
Karci A, Alatas B (2006) Thinking capability of sapling growing up algorithm. In: IDEAL 2006, lecture notes in computer science, vol 4224. Springer, Berlin, pp 386–393
Karci A, Arslan A (2002) Uniform population in genetic algorithms. IU J Electr Electron 2(2):495–504
Khashei M, Bijari M (2010) An artificial neural network (p, d, q) model for time series forecasting. Expert Syst Appl 37:479–489
Khashei M, Bijari M (2011) A novel hybridization of artificial neural networks and ARIMA models for time series forecasting. Appl Soft Comput 11(2):2664–2675
Kristjanpoller W, Minutolo MC (2015) Gold price volatility: a forecasting approach using the artificial neural network–GARCH model. Expert Syst Appl 42(20):7245–7251
Kung V, Yu S (2008) Prediction of index futures returns and the analysis of financial spillovers: a comparison between GARCH and the grey theorem. Eur J Oper Res 186:1184–1200
Kuremoto T, Kimura S, Kobayashi K, Obayashi M (2014) Time series forecasting using a deep belief network with restricted Boltzmann machines. Neurocomputing 137(5):47–56
Lam AYS, Li VOK (2010a) Chemical-reaction-inspired metaheuristic for optimization. IEEE Trans Evol Comput 14(3):381–399
Lam AYS, Li VOK (2010b) Chemical reaction optimization for cognitive radio spectrum allocation. In: IEEE global telecommunications conference (GLOBECOM 2010), Miami, FL, USA, pp 1–5. https://doi.org/10.1109/GLOCOM.2010.5684065
Lam AYS, Li VOK (2012) Chemical reaction optimization: a tutorial. Memet Comput 4:3–17. https://doi.org/10.1007/s12293-012-0075-1
Lam AYS, Xu J, Li VOK (2010) Chemical reaction optimization for population transition in peer-to-peer live streaming. In: IEEE congress on evolutionary computation (CEC), Barcelona, 18–23 July 2010, pp 1–8
Lam AYS, Li VOK, Yu JJQ (2012) Real-coded chemical reaction optimization. IEEE Trans Evol Comput 16(3):339–353
Lee TT, Jeng JT (1998) The Chebyshev polynomial based unified model neural networks for function approximations. IEEE Trans Syst Man Cybern B 28:925–935
Liu H-C, Lee Y-H, Lee M-C (2009) Forecasting China stock markets volatility via GARCH models under skewed-GED distribution. J Money Invest Bank 7:5–14
Majhi R, Panda G, Sahoo G (2009a) Development and performance evaluation of FLANN based model for forecasting of stock markets. Expert Syst Appl 36:6800–6808
Majhi R, Panda G, Majhi B, Sahoo G (2009b) Efficient prediction of stock market indices using adaptive bacterial foraging optimization (ABFO) and BFO based techniques. Expert Syst Appl 36:10097–10104
Majhi R, Majhi B, Panda G (2012) Development and performance evaluation of neural network classifiers for Indian internet shoppers. Expert Syst Appl 39:2112–2118
Mishra BB, Dehuri S (2007) Functional link artificial neural network for classification task in data mining. J Comput Sci 3:948–955
Mishra BB, Dehuri S, Panda G, Dash PK (2008) Fuzzy swarm net (FSN) for classification in data mining. CSI J Comput Sci Eng 5:1–8
Nayak SC, Misra BB, Behera HS (2013) Hybridizing chemical reaction optimization and artificial neural network for stock future index forecasting. In: International conference on emerging trends and applications in computer science, IEEE, Shillong, India. https://doi.org/10.1109/ICETACS.2013.6691409
Nayak J, Naik B, Behera HS (2015a) A novel chemical reaction optimization based higher order neural network (CRO-HONN) for nonlinear classification. Ain Shams Eng J. https://doi.org/10.1016/j.asej.2014.12.013
Nayak SC, Misra BB, Behera HS (2015b) Artificial chemical reaction optimization of neural networks for efficient prediction of stock market indices. Ain Shams Eng J
Nayak SC, Misra BB, Behera HS (2017a) Efficient financial time series prediction with evolutionary virtual data position exploration. Neural Comput Appl 1–22. https://doi.org/10.1007/s00521-017-3061-1
Nayak SC, Misra BB, Behera HS (2017b) Artificial chemical reaction optimization based neural net for virtual data position exploration for efficient financial time series forecasting. Ain Shams Eng J
Oh KJ, Kim K-J (2002) Analyzing stock market tick data using piecewise nonlinear model. Expert Syst Appl 22:249–255
Pan B, Lam AYS, Li VOK (2011) Network coding optimization based on chemical reaction optimization. In: IEEE global telecommunications conference (GLOBECOM 2011), Kathmandu, Nepal, pp 1–5. https://doi.org/10.1109/GLOCOM.2011.6133697
Pao YH (1989) Adaptive pattern recognition and neural networks. Addison-Wesley, Reading
Pao YH, Takefuji Y (1992) Functional-link net computing: theory, system architecture, and functionalities. Computer 25:76–79
Patra JC, Bos AVD (2000) Modeling of an intelligent pressure sensor using functional link artificial neural networks. ISA Trans 39:15–27
Patra JC, Panda G, Baliarsingh R (1994) Artificial neural network based nonlinearity estimation of pressure sensors. IEEE Trans Instrum Meas 43(6):874–881
Patra JC, Pal RN, Baliarsingh R, Panda G (1999a) Nonlinear channel equalization for QAM signal constellation using artificial neural networks. IEEE Trans Syst Man Cybern B Cybern 29(2):262–271
Patra JC, Pal RN, Chatterji BN, Panda G (1999d) Identification of nonlinear dynamic systems using functional link artificial neural networks. IEEE Trans Syst Man Cybern B Cybern 29(2):254–262
Patra JC, Kim W, Meher PK, Ang EL (2006) Financial prediction of major indices using computationally efficient artificial neural networks. In: IJCNN, Vancouver, pp 2114–2120
Petrică A-C, Stancu S, Tindeche A (2016) Limitation of ARIMA models in financial and monetary economics. Theoret Appl Econ XXIII 4(609):19–42
Purwar S, Kar IN, Jha AN (2007) On-line system identification of complex systems using Chebyshev neural networks. Appl Soft Comput 7:364–372
Samui P (2014) Vector machine techniques for modeling of seismic liquefaction data. Ain Shams Eng J 5:355–360
Shin SY, Lee IH, Kim D, Zhang BT (2005) Multiobjective evolutionary optimization of DNA sequences for reliable DNA computing. IEEE Trans Evol Comput 9(2):143–158
Taieb SB, Bontempi G, Atiya AF, Sorjamaa A (2012) A review and comparison of strategies for multi-step ahead time series forecasting based on the NN5 forecasting competition. Expert Syst Appl 39:7067–7083
Truong TK, Li K, Xu Y (2013) Chemical reaction optimization with greedy strategy for the 0–1 knapsack problem. Appl Soft Comput 13:1774–1780
Wang Y (2003) Mining stock prices using fuzzy rough set system. Expert Syst Appl 24:13–23
Xu J, Lam AYS, Li VOK (2010) Chemical reaction optimization for grid scheduling problem. In: IEEE international conference on communications (ICC 2010), Cape Town, South Africa, pp 1–5. https://doi.org/10.1109/ICC.2010.5502406
Xu J, Lam AYS, Li VOK (2011a) Chemical reaction optimization for task scheduling in grid computing. IEEE Trans Parallel Distrib Syst 22(10):1624–1631
Xu J, Lam AYS, Li VOK (2011b) Portfolio selection using chemical reaction optimization. World Acad Sci Eng Technol 5:402–407
Yang SS, Tseng CS (1996) An orthonormal neural network for function approximation. IEEE Trans Syst Man Cybern 26:779–784
Yu L, Chen H, Wang S, Lai KK (2009) Evolving least square support vector machines for stock market trend mining. IEEE Trans Evol Comput 13(1):87–102
Yu JJQ, Lam AYS, Li VOK (2011) Evolutionary artificial neural network based on chemical reaction optimization. In: IEEE congress on evolutionary computation (CEC 2011), New Orleans, LA, USA, pp 2083–2090. https://doi.org/10.1109/CEC.2011.5949872
Zanaty EA (2012) Support vector machines (SVMs) versus multilayer perceptron (MLP) in data classification. Egypt Inf J 13:177–183
Zar JH (1999) More on dichotomous variables. In: Biostatistical analysis, 4th edn. Prentice Hall, Upper Saddle River, pp 516–565
Zhang GP (2003) Time series forecasting using a hybrid ARIMA and neural network model. Neurocomputing 50:159–175