Comparative Study of Artificial Neural Networks and Wavelet Artificial ...

Water Resour Manage (2014) 28:5297–5317 DOI 10.1007/s11269-014-0802-0

Comparative Study of Artificial Neural Networks and Wavelet Artificial Neural Networks for Groundwater Depth Data Forecasting with Various Curve Fractal Dimensions

Zhenfang He & Yaonan Zhang & Qingchun Guo & Xueru Zhao

Received: 26 May 2013 / Accepted: 29 September 2014 / Published online: 17 October 2014
© Springer Science+Business Media Dordrecht 2014

Abstract The objective of this study was to compare artificial neural networks (ANN) and wavelet artificial neural networks (WANN) for forecasting time-series groundwater depth (GWD) data with various curve fractal dimensions, and thereby to offer a better method of revealing the change characteristics of GWD. Time-series prediction based on ANN algorithms has fundamental difficulty capturing the details of data change when the changes in the GWD time series are complex. For this reason, wavelet analysis and fractal theory were linked to ANN models to predict GWD and analyse its change characteristics. The trend and random components were separated from the original time-series GWD using

Z. He · Y. Zhang (*) · X. Zhao
Cold and Arid Regions Environmental and Engineering Research Institute, Chinese Academy of Sciences, Lanzhou 730000, China
e-mail: [email protected]
Z. He e-mail: [email protected]
X. Zhao e-mail: [email protected]

Z. He
Liaocheng University, Liaocheng 252059, China

Z. He · Q. Guo
University of Chinese Academy of Sciences, Beijing 100049, China
Q. Guo e-mail: [email protected]

Q. Guo
Shaanxi Radio & TV University, Xi’an 710068, China

X. Zhao
Lanzhou University, Lanzhou 730000, China


Z. He et al.

wavelet methods. The fractal dimension is convenient for quantitatively describing the irregularity or randomness of time-series data. Three types of training algorithms for ANN and WANN models using a Mallat decomposition algorithm were investigated in a case study at three sites in the Ganzhou region of northwest China to find the model best suited to particular characteristics of time-series GWD data. The simulation results indicate that both WANN and ANN models with the Bayesian regularization algorithm reproduce GWD accurately at sites with smaller fractal dimensions. However, only WANN models are suitable for sites at which the fractal dimension of the wavelet decomposition detail components is larger. Prediction error is also greater when the fractal dimension is larger.

Keywords Artificial neural network · Wavelet artificial neural network · Groundwater depth · Training algorithms · Mallat decomposition algorithm · Fractal dimension

1 Introduction

The forecasting of groundwater depth (GWD) is very useful for the rational utilisation of water resources (Mohanty et al. 2010). However, in some regions it is difficult or expensive to measure the input parameters and determine the conditions of physical models (Li et al. 2006; Chen and Chau 2006). Empirical models, particularly empirical time-series models, require fewer data and can provide effective results without a detailed understanding of the physical process (Xu et al. 2014; Latt and Wittenberg 2014; Dariane and Karami 2014). Artificial neural networks (ANNs) and wavelet artificial neural networks (WANNs) are among the empirical models that can be considered good alternatives for predicting GWD with time-series data as the input (Rene et al. 2011; Wu et al. 2009; Cheng et al. 2005). In recent years, ANNs have been used as an alternative approach for the simulation of groundwater levels, the determination of groundwater salinity in island aquifers (Banerjee et al. 2011) and the prediction of pesticides in groundwater (Sahoo et al. 2006; Nayak et al. 2006). Chau et al. (2005) employed a genetic algorithm-based artificial neural network and an adaptive-network-based fuzzy inference system to forecast water levels from known water levels. In the past six to seven years, several researchers have investigated the use of WANNs in the hydrological forecasting of rainfall runoff, floods and monthly flow (Umut 2012; Cannas et al. 2006; Moosavi et al. 2013; Sehgal et al. 2014; Goyal 2014; Altunkaynak 2014). Many studies of GWD forecasting using ANN and WANN models require input-related variables (Coulibaly et al. 2001; Coppola et al. 2003; Feng et al. 2008; Chen et al. 2014; Ahmad et al. 2014). When data are limited, however, the use of multiple related variables as input parameters is difficult to achieve.

To resolve the problems of limited data and complexity, wavelet analysis and the fractal dimension were combined with the ANN and WANN models. Wavelet analysis and fractal theory are also widely used in hydrological research. Sang (2013), for example, reviewed and summarised the application of wavelet transform methods in hydrology in six respects, noting that these methods aid in quantifying the complexity of hydrologic time series and in time-series simulation and forecasting. Fractal analysis is used to aid the interpretation of rainfall time-series data,


and the fractal dimension is a convenient description of the irregularity or randomness inherent in rainfall time series (Breslin 1999). In the study reported herein, a WANN and the fractal dimension were combined to reveal the change characteristics of GWD, and quantitative criteria were put forward to determine which model (ANN or WANN) is most suitable for various fractal dimensions. To date, no published research has examined the fractal dimension of the approximate and detailed components of a wavelet decomposition as a means of revealing the change characteristics of GWD. In this study, three types of training algorithms for ANN and WANN models using a Mallat decomposition algorithm were investigated at three sites with different characteristics to find the model best suited to particular characteristics of GWD time-series data. Here, we discuss the modelling process and the accuracy of the different algorithms to assess their relative advantages and disadvantages for different fractal dimensions.

2 Study Area and Data Analysis

The study area is located in the Ganzhou region of the middle Heihe River Basin, a typical large inland river basin in the arid zone of northwest China (Fig. 1). The Ganzhou region lies in the middle section of the Hexi Corridor in Gansu Province, between 100°6′ and 100°52′ E longitude and 38°32′ and 39°24′ N latitude. GWD data were obtained from the city’s hydrology bureau. In this study, three typical wells with different variation characteristics were selected to illustrate the performance of three ANNs with different training algorithms and a WANN. The three wells are Xiaan, Shandanqiao (Shandan hereafter) and Liuquan, with mean annual GWDs of 0.712 m, 5.258 m and 2.959 m, respectively. Figure 2 shows that the Liuquan site has experienced the smallest GWD change, and the Xiaan site the largest. The ANN and WANN models in this study were constructed using only measured monthly time-series GWD data as the input to predict future GWD fluctuations.

3 Methodology and Model Design

3.1 ANN Models

ANNs are mathematical models that imitate the behavioural characteristics of animal neural networks and carry out parallel distributed information processing (Gaur et al. 2012; Kumar et al. 2012). A neural network is characterised by its architecture, learning algorithm and activation function. The multilayer perceptron trained with the back propagation (BP) algorithm (Basheer and Hajmeer 2000; Seckin et al. 2013; Kim et al. 2013) is the most popular network for hydrologic systems. The architecture of a BP neural network consists of an input layer, one or more hidden layers and an output layer, each comprising one or more artificial neurons (Fig. 3). In Fig. 3, x is the neuron input, y is the neuron output and w is the adjustable input weight; xi is the input from the i th neuron, Wji is the synaptic weight between neurons i and j, θ is the offset signal used to model the neuron excitation threshold, and u and f are the basis and activation functions, respectively.


Fig. 1 Location of observation sites in the study area

A neuron evaluates the weighted sum of its inputs, given by

$$u = \sum_{i} W_{ji}\, x_i + \theta \tag{1}$$

The neuron output is calculated by passing this biased weighted sum through an activation function, which computes its output as

$$y = f(u) \tag{2}$$
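The two equations above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the example weights and the choice of a tanh hidden unit with a linear output unit (mirroring the tansig-purelin pair mentioned later in the paper) are assumptions made for illustration only.

```python
import math


def neuron_output(inputs, weights, theta, activation=math.tanh):
    """Eq. (1) then Eq. (2): u = sum_i w_i * x_i + theta, y = f(u)."""
    u = sum(w * x for w, x in zip(weights, inputs)) + theta
    return activation(u)


def forward(inputs, hidden_weights, hidden_thetas, out_weights, out_theta):
    """One hidden layer of tanh neurons feeding a single linear output neuron.

    The layer sizes are whatever the weight lists imply (hypothetical values).
    """
    hidden = [
        neuron_output(inputs, w, t)
        for w, t in zip(hidden_weights, hidden_thetas)
    ]
    # Linear output unit: the activation is the identity function.
    return neuron_output(hidden, out_weights, out_theta, activation=lambda u: u)
```

In this sketch a 12-7-1 topology would correspond to 12 inputs, 7 entries in `hidden_weights` (each of length 12) and one output neuron.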

3.2 WANN Model

The Mallat pyramidal algorithm has been proposed for computing the discrete wavelet transform (DWT) coefficients (Mallat 1989). In this study, the DWT was used to analyse the hydrological sequences. The DWT of a time series f(t) is defined as Eq. (3) (Y. R. Satyaji Rao et al. 2011):


$$f(a, b) = \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty} f(t)\, \psi\!\left(\frac{t - b}{a}\right) dt \tag{3}$$

Fig. 2 GWD annual average curve of the typical wells in the Ganzhou region

where ψ(t) denotes the basic wavelet with effective length t, which is usually much shorter than the target time series f(t); a represents the scale or dilation factor that determines the characteristic frequency; and b is the translation in time. The main purpose of using a DWT technique is to reduce the complexity of the input data and the amount of correlated information between the decomposition components including

Fig. 3 The ANN structure and the structure of a single neuron


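a series of detailed CD1 and CD2 components and approximate CA2. The DWT decomposition can be applied iteratively to the approximation components to obtain components of lower resolution, which is known as multiresolution analysis. The mean correlation coefficient between CA2, CD1 and CD2 is smaller than 0.0013 at the three sites, which best fulfilled our purpose. The Mallat decomposition, Eqs. (4) to (6), and reconstruction, Eq. (7), are

$$V_n^{0} = f[n], \quad n \in \mathbb{N} \tag{4}$$

$$V_n^{j} = \sqrt{2} \sum_{k \in \mathbb{Z}} h(k - 2n)\, V_k^{j-1}, \quad j = 1, 2, \ldots, L \tag{5}$$

$$W_n^{j} = \sqrt{2} \sum_{k \in \mathbb{Z}} g(k - 2n)\, V_k^{j-1}, \quad j = 1, 2, \ldots, L \tag{6}$$

$$V_n^{j-1} = \sum_{k \in \mathbb{Z}} \left[ h(n - 2k)\, V_k^{j} + g(n - 2k)\, W_k^{j} \right] \tag{7}$$

where $V^{j}$ and $W^{j}$ are the approximation and detail coefficients at level j, h and g are the low-pass and high-pass filters associated with the wavelet, and L is the number of decomposition levels.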
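The decomposition and reconstruction recursions can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the study's code: it uses the two-tap Haar filters instead of the db4 wavelet selected in the paper, folds the normalisation constant into the filters h and g so that they are orthonormal, and treats the series as periodic at its boundary. All function names are hypothetical.

```python
import math

# Orthonormal Haar filters (assumption: the paper itself uses db4, and the
# normalisation factor is absorbed into the filter taps here).
H = [1 / math.sqrt(2), 1 / math.sqrt(2)]    # low-pass  -> approximation
G = [1 / math.sqrt(2), -1 / math.sqrt(2)]   # high-pass -> detail


def analysis_step(v):
    """One decomposition step: V^{j-1} -> (V^j, W^j), periodic boundary."""
    if len(v) % 2:
        raise ValueError("series length must be even")
    approx, detail = [], []
    for n in range(len(v) // 2):
        # Filter index m corresponds to k - 2n in the analysis recursions.
        approx.append(sum(H[m] * v[(2 * n + m) % len(v)] for m in range(len(H))))
        detail.append(sum(G[m] * v[(2 * n + m) % len(v)] for m in range(len(G))))
    return approx, detail


def synthesis_step(approx, detail):
    """One reconstruction step: (V^j, W^j) -> V^{j-1}."""
    size = 2 * len(approx)
    v = [0.0] * size
    for k in range(len(approx)):
        # Filter index m corresponds to n - 2k in the synthesis recursion.
        for m in range(len(H)):
            v[(2 * k + m) % size] += H[m] * approx[k] + G[m] * detail[k]
    return v


def decompose_two_levels(x):
    """Return (CA2, CD2, CD1) for a series whose length is a multiple of 4."""
    ca1, cd1 = analysis_step(x)
    ca2, cd2 = analysis_step(ca1)
    return ca2, cd2, cd1
```

Applying `synthesis_step` twice, first to (CA2, CD2) and then to the result together with CD1, recovers the original series, which is the multiresolution property the section relies on.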

3.3 The Fractal Dimension

Fractal geometry introduced a new definition of dimension in the 1980s: the fractal dimension, which can take non-integer values, is a basic quantity used to compare fractals (Mandelbrot et al. 1984; Mandelbrot 1982, 1989). The fractal dimension quantifies the complexity or irregularity of a fractal object, and is a measure of how much of the metric space in which the object lies it occupies (Xu 2005; Schuller et al. 2001). The higher an object’s fractal dimension, the more complex it is. The fractal dimension is usually calculated by the Hausdorff dimension, the line segment method, the box counting method or a similar approach (Takayasu 1990; Foroutan-pour et al. 1999). The box counting method is applicable to objects with or without self-similarity and is easy to compute, so we selected it to calculate the fractal dimension in this study. The object is covered by a sequence of boxes with side length ε ≤ L, where L is the maximal side length of the box in which the object can be embedded. Mathematically, D is defined as the limit

$$D = \lim_{\varepsilon \to 0} \frac{\ln N(\varepsilon)}{\ln (1/\varepsilon)} \tag{8}$$

where D is the fractal dimension, N(ε) is the number of boxes needed to cover the object and ε is the side length of the box. In practice, D is estimated as the slope of the straight line fitted to ln N(ε) against ln(1/ε), and it indicates the degree of complexity (Foroutan-pour et al. 1999). The linear regression equation (Buczkowski and Cartilier 1998) is

$$\ln N(\varepsilon) = D \ln (1/\varepsilon) + \ln m \tag{9}$$

where m is a constant and N(ε) counts the boxes that contain points of the object, so that

$$N(\varepsilon) \sim \varepsilon^{-D} \tag{10}$$
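As a quick sanity check on the slope interpretation (an illustrative aside, not part of the original derivation), consider two idealised objects embedded in the unit square and covered by boxes of side ε:

```latex
\begin{align*}
\text{straight segment:}\quad & N(\varepsilon) = \varepsilon^{-1}
  \;\Rightarrow\; D = \frac{\ln \varepsilon^{-1}}{\ln (1/\varepsilon)} = 1, \\
\text{filled square:}\quad & N(\varepsilon) = \varepsilon^{-2}
  \;\Rightarrow\; D = \frac{\ln \varepsilon^{-2}}{\ln (1/\varepsilon)} = 2 .
\end{align*}
```

A time-series curve drawn in the plane falls between these two cases, so its box-counting dimension lies between 1 and 2, with values nearer 2 indicating a more irregular curve.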


The box-counting algorithm has been widely used for time-series data, including rainfall, precipitation and stream flow (Mandelbrot 1982; Maciej and Kundzewicz 1997; Liu 2005). Breslin (1999) compared different methods and suggested that the box-counting algorithm is the best for time-series data, including any data set of limited size. Moreover, the results of different variants of the box-counting method showed consistent shifts (slightly larger or smaller) across different data sets. Consequently, the choice of variant does not affect comparisons of GWD fractal dimensions, and we selected the box-counting method in our study.

4 Analysis, Results, and Discussions

4.1 Determination of ANN and WANN Model Parameters

4.1.1 Division of Data

The data series were divided into a training set (January 1995 to December 2001) and a testing set (January 2001 to December 2004). The training data were further divided into 75 % for training and 25 % for validation. Cross-validation was used to check the generalisation capability of the models. Tables 1 and 2 also show that the models avoided overfitting, in which a model fits the training data but does not fit new data well.

4.1.2 Selection of Suitable Wavelet Function

In this study, the db4 wavelet function was selected for the data by comparing the wavelet functions shown in Fig. 4. The mother wavelet that best decomposes our data was chosen by taking into account the degree of similarity between the detailed components CD1 and CD2 and the approximate component CA2 after compression. The conventional correlation coefficient R was used to quantify the similarity of the components. The smallest R best fulfilled our purpose of analysing the change characteristics of the different GWD components. Quantitative calculation shows the various components to be independent of one another. We compared 21 different wavelet functions for use in the DWT. Figure 4 demonstrates that db4 is the best wavelet function for our data because it yields the lowest R. In accordance with the WANN model architecture in Fig. 5, the monthly average GWD time series was decomposed at two levels into a series of detailed components, CD1 and CD2, and an approximate component, CA2. These were then used as the input data of the ANN models, as the detailed and approximate components make different contributions to GWD forecasts through the connection weights. In Fig. 5, n runs from 1 to 12 and t from 1 to 3.

4.1.3 Determination of Network Topologies

The trial-and-error method was used to obtain the model’s optimal parameters.
The key model parameters in Tables 1 and 2 include the training algorithm and the number of neurons. The number of iterations and the transfer function are also parameters to which the results are sensitive. The number of iterations is 100, 25, 50 and 50 for ANN GDX, ANN BR, ANN LM and WANN BR, respectively. The tansig-purelin transfer function pair performs better than the others for the Shandan, Xiaan and Liuquan wells.

Table 1 Comparison between various training algorithms for Liuquan well

| Algorithm | ANN RMSE (training) | ANN RMSE (cross validation) | ANN RMSE (testing) | WANN RMSE (training) | WANN RMSE (cross validation) | WANN RMSE (testing) |
|---|---|---|---|---|---|---|
| trainbr | 0.2067 | 0.4166 | 0.3505 | 0.0434 | 0.4978 | 0.1505 |
| trainlm | 0.0896 | 0.4472 | 0.4323 | 0.0488 | 0.5306 | 0.2166 |
| traingdx | 0.1656 | 0.4682 | 0.443 | 0.0495 | 0.5892 | 0.2885 |
| traingd | 1.0842 | 0.7934 | 0.9257 | 0.7446 | 1.3639 | 0.9285 |
| traingdm | 0.4523 | 0.5121 | 0.4852 | 0.8201 | 1.164 | 0.953 |
| trainrp | 0.1663 | 0.5539 | 0.4431 | 0.1066 | 0.9221 | 0.4552 |
| traincgp | 0.1593 | 0.5266 | 0.4653 | 0.2137 | 0.664 | 0.4157 |
| traincgf | 0.3889 | 0.5175 | 0.4523 | 0.2245 | 0.7094 | 0.4906 |
| traincgb | 0.1654 | 0.5318 | 0.4756 | 0.2212 | 0.6077 | 0.3603 |
| trainscg | 0.1761 | 0.5312 | 0.4988 | 0.2505 | 0.6471 | 0.3906 |
| trainbfg | 0.1564 | 0.5252 | 0.4481 | 0.1824 | 0.74 | 0.3156 |
| trainoss | 0.1821 | 0.5101 | 0.4428 | 0.2121 | 0.7194 | 0.4341 |

Table 2 Comparison between some of the models (network topologies) for Liuquan well

| ANN model | RMSE (training) | RMSE (cross validation) | RMSE (testing) | WANN model | RMSE (training) | RMSE (cross validation) | RMSE (testing) |
|---|---|---|---|---|---|---|---|
| 12-1-1 | 0.2351 | 0.6264 | 0.3698 | 36-1-1 | 0.0452 | 0.8659 | 0.3751 |
| 12-4-1 | 0.2302 | 0.6021 | 0.3699 | 36-4-1 | 0.0495 | 0.4648 | 0.3554 |
| 12-5-1 | 0.2532 | 0.5997 | 0.3644 | 36-5-1 | 0.0478 | 0.4639 | 0.3513 |
| 12-6-1 | 0.2191 | 0.5864 | 0.3737 | 36-6-1 | 0.0434 | 0.4563 | 0.1505 |
| 12-7-1 | 0.2067 | 0.5846 | 0.3505 | 36-7-1 | 0.0818 | 0.4618 | 0.306 |
| 12-8-1 | 0.3787 | 0.5864 | 0.3865 | 36-8-1 | 0.0481 | 0.4935 | 0.3567 |
| 12-12-1 | 0.2267 | 0.5857 | 0.3622 | 36-12-1 | 0.0574 | 0.4559 | 0.3467 |
| 12-16-1 | 0.2255 | 0.5899 | 0.3682 | 36-16-1 | 0.278 | 0.4637 | 0.3166 |
| 12-20-1 | 0.3239 | 0.5849 | 0.3687 | 36-20-1 | 0.0548 | 0.4686 | 0.3521 |
| 12-4-6-1 | 0.2314 | 0.4954 | 0.3695 | 36-4-6-1 | 0.0498 | 0.5291 | 0.4422 |
| 12-6-6-1 | 0.2187 | 0.5899 | 0.3872 | 36-6-6-1 | 0.183 | 0.4853 | 0.2485 |
| 12-8-6-1 | 0.2362 | 0.5826 | 0.3658 | 36-8-6-1 | 0.0721 | 0.5099 | 0.4011 |
| 12-12-6-1 | 0.2335 | 0.5809 | 0.3694 | 36-18-6-1 | 0.0618 | 0.5532 | 0.3747 |
| 12-20-6-1 | 0.2942 | 0.5936 | 0.3648 | 36-20-6-1 | 0.1389 | 0.5186 | 0.2824 |
| 12-4-6-6-1 | 0.4389 | 0.5965 | 0.4903 | 36-4-6-6-1 | 0.4389 | 0.4885 | 0.49 |
| 12-6-6-6-1 | 0.4389 | 0.5928 | 0.4912 | 36-6-6-6-1 | 0.4389 | 0.5191 | 0.4901 |
| 12-8-6-6-1 | 0.4391 | 0.5641 | 0.4893 | 36-8-6-6-1 | 0.4389 | 0.5149 | 0.4901 |
| 12-12-6-6-1 | 0.2772 | 0.5761 | 0.3633 | 36-18-6-6-1 | 0.0454 | 0.4537 | 0.4223 |
| 12-20-6-6-1 | 0.4391 | 0.5621 | 0.4886 | 36-20-6-6-1 | 0.4389 | 0.5925 | 0.4901 |


Fig. 4 R of various wavelet functions between CA2, CD1 and CD2

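The input data were tested with lags of 1 to 12 months of GWD data. The lag-12 monthly inputs were finally selected after tuning the ANN and WANN models. The forecasting relation and the normalisation of the data are

$$X(t + s) = F\big(X(t), X(t-1), X(t-2), \ldots, X(t-q)\big) \tag{11}$$

$$Y = \frac{X - X_{\min}}{X_{\max} - X_{\min}} \tag{12}$$

where s is the lead time, q sets the number of lagged inputs and Y is the normalised value of X.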
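The input construction described here can be sketched in Python as below. This is a hedged illustration rather than the authors' procedure: the function names are hypothetical, and the paper does not state whether the minimum and maximum are taken over the whole record or the training period only, so the sketch simply uses whatever series it is given.

```python
def min_max_scale(series):
    """Eq. (12): map the values of a series linearly onto [0, 1]."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series is constant; min-max scaling is undefined")
    return [(x - lo) / (hi - lo) for x in series]


def make_lagged_samples(series, n_lags=12, lead=1):
    """Eq. (11): pair the last n_lags values up to time t with X(t + lead).

    Returns a list of (inputs, target) tuples in chronological order.
    """
    samples = []
    for t in range(n_lags - 1, len(series) - lead):
        window = series[t - n_lags + 1 : t + 1]   # X(t - n_lags + 1) .. X(t)
        samples.append((window, series[t + lead]))
    return samples
```

With `n_lags=12` and `lead` set to 1, 2 or 3, this produces the 12-input samples for the one-, two- and three-month-ahead forecasts listed in Table 3; for the WANN, the same windowing would be applied to each of CA2, CD1 and CD2, giving 36 inputs.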

Table 2 shows that, through trial and error, the network topologies 12-7-1 for the ANN and 36-6-1 for the WANN perform better than the others for the Liuquan well. The topologies 12-5-1 for the ANN and 36-7-1 for the WANN are feasible alternatives. These topologies have smaller RMSE values during the testing period. Similarly, the best network topologies for the Shandan well are 12-6-1 for the ANN and 36-7-1 for the WANN, and those for the Xiaan well are 12-7-1 for the ANN and 36-6-1 for the WANN. The training results show that the topologies with a single hidden layer perform best.

4.1.4 Determination of Training Algorithms

The trial-and-error method was also used to select the model’s optimal training algorithm, as shown in Table 1. Different kinds of training algorithms were selected for comparison at

Fig. 5 WANN Model Design


three sites. From the algorithms in Table 1, three improved ANN training algorithms, namely the gradient descent with momentum and adaptive learning rate back propagation (GDX) algorithm (traingdx), the Levenberg-Marquardt (LM) algorithm (trainlm) and the Bayesian regularisation (BR) algorithm (trainbr), were evaluated to identify the algorithm that performed best in predicting monthly GWD in the study area. Four sub-models (Table 3) were developed to forecast GWD at the three observation sites.



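The box-counting computation for a time-series curve is given by Eqs. (13) to (18):

$$S = \frac{L}{t_{size}} \tag{13}$$

$$t_{size} = 2^{(i-1)}, \quad i \in (1, \log_2 L + 1) \tag{14}$$

$$r = S \tag{15}$$

$$N(r_s) = \mathrm{INT}\!\left[\frac{\max(z_s)}{t_{size}}\right] - \mathrm{INT}\!\left[\frac{\min(z_s)}{t_{size}}\right] \tag{16}$$

$$N(r) = N(S_1) + N(S_2) + \cdots + N(S_S), \quad s \in (1, S) \tag{17}$$

$$\log_2 N(r) = D \log_2 (r) + \log_2 m \tag{18}$$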
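One possible reading of this box-counting procedure is sketched below in Python. Several points are assumptions rather than statements from the paper: z_s is taken to be the values of the series in segment s, amplitudes are rescaled to the range [0, L] so that boxes of side t_size are square, each segment also includes the first point of the next segment so the curve is unbroken, every segment is counted as needing at least one box, and the largest box size is capped so that at least two segments remain. The function names are hypothetical.

```python
import math


def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through the points (xs, ys)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def box_counting_dimension(z):
    """Estimate the fractal dimension D of a time-series curve z."""
    L = len(z)
    lo, hi = min(z), max(z)
    if hi == lo or L < 4:
        raise ValueError("need a non-constant series with at least 4 points")
    # Assumption: rescale amplitudes to [0, L] so the boxes are square.
    scaled = [(v - lo) / (hi - lo) * L for v in z]
    log_r, log_n = [], []
    tsize = 1
    while tsize <= L // 2:                 # keep at least two segments
        S = L // tsize                     # number of segments; r = S
        total = 0
        for s in range(S):
            # Include the next segment's first point so the curve is unbroken.
            seg = scaled[s * tsize : (s + 1) * tsize + 1]
            n_s = int(max(seg) / tsize) - int(min(seg) / tsize)
            total += max(n_s, 1)           # assumption: at least one box each
        log_r.append(math.log2(S))
        log_n.append(math.log2(total))
        tsize *= 2
    # The regression slope of log2 N(r) on log2 r is the estimate of D.
    return least_squares_slope(log_r, log_n)
```

Under these assumptions a straight line gives a value close to 1 and a series that alternates between its extremes gives a value close to 2, which brackets the range of values reported for the GWD components.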

Table 3 The design of ANN and WANN models for the three wells

| Model | Training algorithm | Lead time | Lag time | Sites |
|---|---|---|---|---|
| Model I | ANN GDX | X(T+1), X(T+2), X(T+3) | 12 | Liuquan, Xiaan, Shandan |
| Model II | ANN BR | X(T+1), X(T+2), X(T+3) | 12 | Liuquan, Xiaan, Shandan |
| Model III | ANN LM | X(T+1), X(T+2), X(T+3) | 12 | Liuquan, Xiaan, Shandan |
| Model IV | WANN BR | X(T+1), X(T+2), X(T+3) | 12 | Liuquan, Xiaan, Shandan |

4.1.5 Evaluation Criteria for Model Performance

Four statistical indicators were used to evaluate the effectiveness of the ANN and WANN models developed in this study. They are the correlation coefficient (R), mean error (ME), root mean square error (RMSE) and peak error percentage (EOP(%)), given by the following equations.

$$R = \frac{\sum \left(O_i - \bar{O}\right)\left(P_i - \bar{P}\right)}{\sqrt{\sum \left(O_i - \bar{O}\right)^2 \sum \left(P_i - \bar{P}\right)^2}} \tag{7}$$

$$ME = \frac{1}{N} \sum \left(O_i - P_i\right) \tag{8}$$

$$RMSE = \sqrt{\frac{\sum \left(O_i - P_i\right)^2}{N}} \tag{9}$$
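The evaluation criteria of this section can be computed with a few lines of Python. The sketch below is illustrative only: the function names are hypothetical, and the peak error follows the definition given in this section on the assumption that "peak" means the maximum value of each series.

```python
import math


def correlation(obs, pred):
    """Correlation coefficient R between observed and predicted values."""
    n = len(obs)
    mo, mp = sum(obs) / n, sum(pred) / n
    num = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    den = math.sqrt(
        sum((o - mo) ** 2 for o in obs) * sum((p - mp) ** 2 for p in pred)
    )
    return num / den


def mean_error(obs, pred):
    """Mean error ME: average of observed minus predicted."""
    return sum(o - p for o, p in zip(obs, pred)) / len(obs)


def rmse(obs, pred):
    """Root mean square error RMSE."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def peak_error_percent(obs, pred):
    """Peak error EOP(%): relative error of the predicted peak.

    Assumption: the peak is taken as the maximum of each series.
    """
    return (max(pred) - max(obs)) / max(obs) * 100.0
```

With these sign conventions, a positive ME means the model predicts values below the observations on average, while a positive EOP means the predicted peak exceeds the observed peak.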

In the foregoing equations, $O_i$ denotes the observed GWD value for time period i, $P_i$ denotes the predicted GWD value for time period i, $\bar{O}$ is the mean of the observed values, $\bar{P}$ is the mean of the predicted values and N is the number of observations. The percentage error of peak GWD, EOP(%), is defined as follows (Chou 2007):

$$E_{OP}(\%) = \frac{P_P - O_P}{O_P} \times 100\% \tag{10}$$

where $P_P$ denotes the peak of the predicted GWD, $O_P$ is the peak of the observed GWD and EOP is the relative error in the highest peak GWD.

4.2 Results and Discussion of Wavelet Analysis

We obtained the different scales and details of the time-series GWD by decomposing it with the wavelet transform at two levels. The time-series GWD data were divided into three parts after two-level decomposition and reconstruction via the DWT. The approximate component CA2 was obtained via a low-pass filter and reflects the overall trend of the original GWD series, whereas CD1 and CD2 are the detailed components obtained via a high-pass filter. Figure 6 also shows that CD2 extracted the cycle components of the original time-series GWD data and that CD1 extracted the random components and thus represents the irregularity and complexity of the GWD signals. The overall and cycle trends are easily simulated by an empirical model, but randomness is less easily predicted. In addition, the correlation coefficient between CA2, CD1 and CD2 is smaller than 0.005. Quantitative calculation thus shows the various components to be independent of one another. The trend and random components were used as input data of the ANN models because the detailed and approximate components make different contributions to forecasting GWD through their connection weights, according to Eq. (1). We selected the observed GWD series of the three sites (Xiaan, Liuquan and Shandan) over 120 months for decomposition and reconstruction, as shown in Fig. 6. The approximate component CA2 in Fig. 6 shows that the overall trend of the Shandan GWD changes irregularly, whereas that of the other two sites changes periodically. In addition, the overall trend of the Liuquan site is similar to that of the Xiaan site. The cycle and randomness characteristics reflected by CD2 and CD1 are not readily apparent from the curves.
Quantitative criteria are needed to determine the complexity of the GWD series. In this study, with reference to wavelet and fractal theory, we chose the fractal dimension of the approximate- and detailed-component time-series curves as our quantitative criterion. CD1 reflects the randomness of the GWD data. The aim of this research was to discover the


Fig. 6 Two-level decomposition and reconstruction of time-series GWD at the three sites (Xiaan, Liuquan and Shandan). Approximation CA2 is the approximate part of the time-series GWD; details CD1 and CD2 are its detail parts

relation between the randomness of these data and the difficulty of predicting GWD with empirical models.

4.3 Results and Discussion of Fractal Dimension

The fractal dimension provides a convenient description of a GWD time series, and describes the irregularity or randomness of the series data. Its combination with the time-series curves offers a better method of revealing the change characteristics of GWD. Figure 7 (a, b, c) shows the regression lines corresponding to Table 4 for the three sites, that is, the estimation of the fractal dimensions for the detailed and approximate components of the time-series curves of the (a) Xiaan, (b) Liuquan and (c) Shandan sites; the slope of each line is the fractal dimension. The line and left


Fig. 7 Regression lines of log2 r against log2 N(r) used to calculate the fractal dimension by the box-counting method. Panels (a), (b) and (c) show the fractal dimension of the wavelet decomposition approximate part CA2 and detail parts CD1 and CD2 at the Xiaan, Liuquan and Shandan sites, respectively

axis correspond to the number of box (grid) counting points, and the bottom axis to the side length of the box (grid). The results confirm that the fractal dimension of the approximate curve, CA2, is smaller than that of CD1 and CD2 at all three sites. The fractal dimension of CD2 is also smaller than the corresponding fractal dimension of CD1. These results are consistent with the principle that the approximate component CA2 describes the overall trend of GWD, whereas the detailed component CD2 describes its cycle trend and the detailed component CD1 its randomness. At the same time, the results demonstrate the fractal dimension’s ability to explain the complexity of time-series GWD as a quantitative criterion. The detailed components changed irregularly and in a complex fashion in this study. Table 4 shows that the fractal dimensions of CA2 at the three sites were close to one another: 1.2578 (Xiaan), 1.2379 (Shandan) and 1.1391 (Liuquan). These values reflect the fractal dimension of the overall trend curves. The overall trend and cycle trend are easy to predict with both ANN and WANN models, and the fractal dimension describes the randomness and complexity of time-series GWD via CD1. Table 4 shows that the fractal dimension of the detailed component CD1 was largest at the Shandan site, with a value of 1.6501. That at the Liuquan site (1.5817) was slightly larger than that at the Xiaan site (1.5704). In other words, the randomness of the Shandan GWD is more complex than that of the other two sites, whereas the irregularity of the latter two is similar. The implication is that the GWD of the Shandan site is the most difficult to predict, followed by that of

Table 4 Parameters for the linear regression between log2(r) and log2(N(r)), and the fractal dimension

| Curve | Linear regression equation | Adj. R2 | Fractal dimension (D) |
|---|---|---|---|
| Xiaan CA2 | y=1.25781x+1.60424 | 0.98107 | 1.2578 |
| Xiaan CD1 | y=1.57036x+0.79237 | 0.99894 | 1.5704 |
| Xiaan CD2 | y=1.48349x+0.92634 | 0.98875 | 1.4835 |
| Shandan CA2 | y=1.23791x+0.04461 | 0.99071 | 1.2379 |
| Shandan CD1 | y=1.65013x+0.0007 | 0.9966 | 1.6501 |
| Shandan CD2 | y=1.56548x+0.26127 | 0.99009 | 1.5655 |
| Liuquan CA2 | y=1.13909x+2.41407 | 0.99448 | 1.1391 |
| Liuquan CD1 | y=1.58186x+0.56098 | 0.99901 | 1.5817 |
| Liuquan CD2 | y=1.45271x+1.31834 | 0.99096 | 1.4527 |


the Liuquan and Xiaan sites. According to Table 4, the fractal dimension of CD2 is also largest at the Shandan site (1.5655), followed by the Xiaan (1.4835) and Liuquan (1.4527) sites.

4.4 Comparison of Model Performance Through Evaluative Criteria

The values of the statistical indicators for the four models at the three sites during the training and testing periods are shown in Table 5. All four models demonstrated good performance during both periods at the Xiaan and Liuquan sites. These results show that the WANN BR model more easily captures the change characteristics and trends, including the overall trend and the cycle and randomness trends in CA2, CD2 and CD1. It is clear from these statistical indicators that the three ANN BP training algorithms produced roughly the same results and that the ANN BR training algorithm, in general, performed better than the ANN LM and ANN GDX algorithms. Hence, we selected the BR training algorithm for the WANN method. The three ANN models were unsuitable for the Shandan site, with RMSE values ranging from 0.5142 to 0.5762 m. The WANN BR method, in contrast, worked well during both the training and testing periods at this site when forecasting GWD one month ahead, with an RMSE value of 0.321 m, an ME of −0.0261 m and an R value of 0.8791. These values suggest that all of the training algorithms were suitable for the Xiaan site, although the WANN BR model demonstrated better performance than the ANN models. It is evident that the WANN model describes the non-linear characteristics of GWD well and that its advantages were more obvious in this study at the sites with a larger fractal dimension. Groundwater extrema are affected by random factors, which makes them difficult to predict. The EOP values in Table 5 reflect the models’ performance in simulating the extremum at each site.
Table 5 Comparison of performance statistics for ANN LM, ANN BR, ANN GDX and WANN BR models for 1-month lead time forecasting

| Algorithm | Site | RMSE (m) Training | RMSE (m) Testing | R Training | R Testing | ME (m) Training | ME (m) Testing | EOP (%) Training | EOP (%) Testing |
|---|---|---|---|---|---|---|---|---|---|
| ANN LM | Xiaan | 0.2753 | 1.1597 | 0.9590 | 0.9159 | −0.0523 | 0.5008 | −7.72 | 20.62 |
| ANN LM | Liuquan | 0.0896 | 0.4323 | 0.9793 | 0.8674 | 0.0008 | 0.2417 | −0.96 | 39.12 |
| ANN LM | Shandan | 0.3069 | 0.5142 | 0.7515 | 0.6362 | 0.0034 | 0.1154 | −2.52 | −10.15 |
| ANN BR | Xiaan | 0.3224 | 0.5490 | 0.9403 | 0.9394 | 0 | 0.3719 | −4.33 | 5.02 |
| ANN BR | Liuquan | 0.2067 | 0.3505 | 0.9331 | 0.8080 | 0.0396 | 0.1946 | −22.00 | 12.53 |
| ANN BR | Shandan | 0.3423 | 0.5748 | 0.6814 | 0.4921 | 0 | 0.0884 | −6.90 | −9.46 |
| ANN GDX | Xiaan | 0.4290 | 1.0600 | 0.8924 | 0.9113 | 0.0025 | 0.8299 | −7.37 | 18.31 |
| ANN GDX | Liuquan | 0.1656 | 0.4430 | 0.9279 | 0.8416 | −0.0005 | 0.2372 | −4.73 | 21.91 |
| ANN GDX | Shandan | 0.4197 | 0.5762 | 0.4385 | 0.5751 | 0.0029 | 0.1094 | −11.23 | −15.41 |
| WANN BR | Xiaan | 0.0998 | 0.2018 | 0.9950 | 0.9799 | 0.0135 | −0.0168 | −4.47 | 2.05 |
| WANN BR | Liuquan | 0.0434 | 0.1505 | 0.9954 | 0.9574 | 0.0017 | 0.0333 | −2.43 | 14.71 |
| WANN BR | Shandan | 0.1603 | 0.321 | 0.9434 | 0.8791 | −0.0016 | −0.0261 | −4.31 | −8.44 |

We can see that the EOP values of the ANN LM and GDX algorithms indicate an overestimation at the three sites during the training period, but an underestimation during the testing period. These EOP values indicate that the WANN model was better able to predict randomness than the three ANN models, with the ANN BR model achieving the second best performance. The magnitude of the ME values was also greater for the ANN models than for the WANN model, implying a higher degree of bias in their prediction results. The maximum RMSE value for the WANN model at the three sites was 0.321 m, and the minimum value for the three ANN models was 0.3505 m, again demonstrating that the former generates generally superior prediction results. Figure 8 presents a comparison of the observed and calculated GWDs generated by the four models at the Xiaan site during the testing period. The WANN BR model is clearly the best, whether the GWD values are high, medium or low, followed by the ANN BR model. The ANN LM and GDX models generally underestimated or overestimated the extreme values, although their estimates of the medium values were closer to the observed data, and their overall trend and cycle trend agreed with those observed. These results are consistent with our analysis of Table 5. A possible reason for the WANN BR’s higher degree of accuracy is that its input nodes were able to capture the details of data change through the approximate and detailed inputs at the Xiaan well. Figures 9 and 10 present a comparison between the one-month-ahead GWDs predicted by the four models and the observed GWDs at the Liuquan and Shandan sites. These figures indicate that the models generated good predictions of the overall trend at the two sites, just as they did at the Xiaan site. However, the WANN BR model exhibited even greater superiority at these two sites than at the Xiaan site, particularly at the Shandan site. The WANN BR model produces better prediction results for both the extreme and the median data. Its prediction results were better at the Xiaan site than at the Liuquan site and worst at the Shandan site.
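The RMSE comparison quoted above can be checked directly against the testing-period values in Table 5. The short Python sketch below copies those values from the table; the helper names and the percentage-reduction measure are illustrative additions, not quantities reported by the authors.

```python
# Testing-period RMSE (m) for 1-month-ahead forecasts, copied from Table 5.
TEST_RMSE = {
    "ANN LM":  {"Xiaan": 1.1597, "Liuquan": 0.4323, "Shandan": 0.5142},
    "ANN BR":  {"Xiaan": 0.5490, "Liuquan": 0.3505, "Shandan": 0.5748},
    "ANN GDX": {"Xiaan": 1.0600, "Liuquan": 0.4430, "Shandan": 0.5762},
    "WANN BR": {"Xiaan": 0.2018, "Liuquan": 0.1505, "Shandan": 0.3210},
}
SITES = ("Xiaan", "Liuquan", "Shandan")


def best_ann_rmse(site):
    """Smallest testing RMSE among the three ANN models at a site."""
    return min(v[site] for k, v in TEST_RMSE.items() if k.startswith("ANN"))


def rmse_reduction_percent(site):
    """Relative RMSE reduction of WANN BR versus the best ANN at a site."""
    best = best_ann_rmse(site)
    return (best - TEST_RMSE["WANN BR"][site]) / best * 100.0


# Largest WANN error and smallest ANN error across the three sites.
worst_wann = max(TEST_RMSE["WANN BR"].values())
best_ann_overall = min(best_ann_rmse(s) for s in SITES)
```

Calling `rmse_reduction_percent` for each site gives a per-site view of the same comparison, which is one way to see how the WANN advantage varies between wells.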
None of the ANN models was suitable for the Shandan site, although any of the three algorithms can be used to predict the overall trend in

Fig. 8 Comparison between the observed groundwater depths and the groundwater depths predicted 1 month ahead by ANN LM, ANN BR and ANN GDX algorithms and WANN BR methods at Xiaan site during testing period

Fig. 9 Comparison between the observed groundwater depths and the groundwater depths predicted 1 month ahead by ANN LM, ANN BR and ANN GDX algorithms and WANN BR methods at the Liuquan site during the testing period

GWD in the study region. The fractal dimension of detailed component CD1 at the Shandan site is larger than that at the other two sites, and the resulting randomness and complexity of the series make prediction difficult, as shown in Figs. 9 and 10, in which the extreme values are underestimated or overestimated.
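The approximate and detailed inputs discussed above come from a two-level wavelet decomposition that yields the components CA2, CD2 and CD1. The following is a minimal sketch of that idea in plain Python, assuming the Haar wavelet for brevity; the mother wavelet, the function names and the sample values are illustrative assumptions and are not taken from the study.

```python
import math

def haar_step(signal):
    """One Mallat analysis step with Haar filters (an assumed wavelet).

    Returns (approximation, detail), each half the length of the input.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def decompose_two_levels(signal):
    """Two-level decomposition: returns (CA2, CD2, CD1)."""
    ca1, cd1 = haar_step(signal)   # level 1: finest details go to CD1
    ca2, cd2 = haar_step(ca1)      # level 2: split the level-1 approximation
    return ca2, cd2, cd1

def haar_inverse_step(approx, detail):
    """Invert one Haar analysis step."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out

def reconstruct_two_levels(ca2, cd2, cd1):
    """Rebuild the original series from CA2, CD2 and CD1."""
    return haar_inverse_step(haar_inverse_step(ca2, cd2), cd1)

# Made-up monthly groundwater depths (m), for illustration only.
gwd = [4.2, 4.4, 4.9, 5.3, 5.1, 4.6, 4.3, 4.1]
ca2, cd2, cd1 = decompose_two_levels(gwd)
rebuilt = reconstruct_two_levels(ca2, cd2, cd1)
print("CA2:", ca2)
print("CD2:", cd2)
print("CD1:", cd1)
```

In this sketch CA2 carries the slowly varying part of the series, while CD1 holds the finest-scale fluctuations, which is why its irregularity is the quantity of interest when a site is hard to predict.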

Fig. 10 Comparison between the observed groundwater depths and the groundwater depths predicted 1 month ahead by ANN LM, ANN BR and ANN GDX algorithms and WANN BR methods at the Shandan site during the testing period

In this study, we selected the best ANN method, that with the trainbr algorithm, and compared it with the WANN method for predictions with longer lead times, as shown in Table 6. The predictions in Table 6 show that, at the site with the smallest fractal dimension of CD1, there was little difference between the ANN and WANN models, and the errors became larger with a longer lead time. The WANN BR method remained stable as the lead time changed, with the value of R varying from 0.9570 for a one-month lead time to 0.8888 for a three-month lead time. Further, owing to its robustness, prolongation of the forecast period had no significant effect on the testing accuracy of the WANN model. As previously stated, the WANN BR method alone was able to predict GWD at the Shandan site, albeit only with a one-month lead time. The RMSE values of the WANN BR model were smaller than those of the ANN BR model in both the training and testing stages, which implies the better calibration capability of the former for the given data. The greater the degree of randomness, the more suitable the WANN model becomes, albeit only at a relatively short lead time.
Figure 11 compares the observed GWDs with those forecast one, two and three months ahead by the ANN BR and WANN BR models at the (a) Xiaan, (b) Liuquan and (c) Shandan sites during the testing period. It is apparent from Fig. 11 that an increase in the prediction time horizon from one month to three is accompanied by an increase in the WANN BR model’s range of error at the Xiaan and Liuquan sites, which confirms the earlier findings based on the statistical indicators. At the Shandan site, when the lead time was two or three months, the two BR methods were able to predict only the overall trend, although, as noted, the WANN method produced acceptable prediction results when the lead time was one month.
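The lead-time comparison above can be illustrated with a minimal sketch showing how input windows and targets might be paired for one-, two- and three-month-ahead forecasts, and how RMSE, R and ME might be computed. Several things here are assumptions rather than the authors' setup: R is taken to be the Pearson correlation coefficient, ME is taken as predicted minus observed, the series is made up, and a naive persistence forecast stands in for the ANN and WANN models.

```python
import math

def lead_time_pairs(series, n_lags, lead):
    """Pair each window of n_lags consecutive values with the value
    observed `lead` steps after the end of the window."""
    inputs, targets = [], []
    for t in range(n_lags - 1, len(series) - lead):
        inputs.append(series[t - n_lags + 1 : t + 1])
        targets.append(series[t + lead])
    return inputs, targets

def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mean_error(obs, pred):
    """Mean error; sign convention (predicted minus observed) is assumed."""
    return sum(p - o for o, p in zip(obs, pred)) / len(obs)

def pearson_r(obs, pred):
    """Pearson correlation coefficient between observed and predicted."""
    mo = sum(obs) / len(obs)
    mp = sum(pred) / len(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    so = math.sqrt(sum((o - mo) ** 2 for o in obs))
    sp = math.sqrt(sum((p - mp) ** 2 for p in pred))
    return cov / (so * sp)

# Made-up monthly groundwater depths (m), for illustration only.
series = [4.2, 4.4, 4.9, 5.3, 5.1, 4.6, 4.3, 4.1, 4.5, 5.0, 5.4, 5.2]

for lead in (1, 2, 3):
    inputs, targets = lead_time_pairs(series, n_lags=3, lead=lead)
    # Persistence forecast: repeat the last value of each input window.
    forecast = [window[-1] for window in inputs]
    print(lead,
          round(rmse(targets, forecast), 4),
          round(pearson_r(targets, forecast), 4),
          round(mean_error(targets, forecast), 4))
```

In practice the persistence forecast would be replaced by the output of a trained network, with the same pairing used to build its training and testing sets for each lead time.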
In summary, it can be deduced that, despite the data restrictions, the WANN model and ANN models with a suitable training algorithm considered in this study can

Table 6 Goodness-of-fit statistics for forecasts at different lead times using the ANN BR and WANN BR models

Site      Algorithm and lead time (month)   RMSE (m)              R                     ME (m)
                                            Training   Testing    Training   Testing    Training   Testing
Xiaan     ANN BR -1                         0.3234     0.5490     0.9403     0.9394     0          0.3719
          ANN BR -2                         0.2942     1.0813     0.9511     0.9106     0.0028     0.7378
          ANN BR -3                         0.3225     0.4917     0.8613     0.8951     −0.0058    0.6914
Liuquan   ANN BR -1                         0.2067     0.3505     0.9331     0.8080     0.0396     0.1946
          ANN BR -2                         0.0001     0.6135     0.9738     0.8329     0.1005     0.4082
          ANN BR -3                         0.1805     0.3895     0.9144     0.8557     −0.0204    0.2227
Shandan   ANN BR -1                         0.3423     0.5748     0.6814     0.49921    0.0000     0.0884
          ANN BR -2                         0.4278     0.6193     0.4149     0.4570     −0.0049    0.1271
          ANN BR -3                         0.4267     0.6118     0.4167     0.5758     −0.0018    0.1348
Xiaan     WANN BR -1                        0.0998     0.2018     0.9950     0.9799     0.0135     −0.0168
          WANN BR -2                        0.1717     0.2115     0.9837     0.8934     0.0132     0.0021
          WANN BR -3                        0.2096     0.7093     0.9761     0.8945     0.0009     0.0830
Liuquan   WANN BR -1                        0.0434     0.1505     0.9954     0.957      0.0017     0.0333
          WANN BR -2                        0.0729     0.2534     0.9870     0.8918     0.0084     0.0745
          WANN BR -3                        0.1549     0.3098     0.9440     0.8888     0.0170     0.1874
Shandan   WANN BR -1                        0.1603     0.3210     0.9434     0.8791     −0.0016    0.0261
          WANN BR -2                        0.2115     0.4694     0.8934     0.7062     0.0021     0.0378
          WANN BR -3                        0.3691     0.6521     0.6220     0.2697     −0.0066    0.0710

Fig. 11 Comparison between the observed groundwater depths and the groundwater depths predicted 1, 2 and 3 months ahead by the ANN BR and WANN BR algorithms at the Xiaan (a), Liuquan (b) and Shandan (c) sites during the testing period

predict monthly GWDs at one-, two- and three-month lead times at sites where the degree of randomness is not overly great. The WANN technique is more appropriate when knowledge of the hydrological parameters is restricted. It is also more suitable for GWD series with complex change characteristics.

4.5 Discussion of the Relationship Between Model Performance and the Fractal Dimension

As we have seen, the WANN model was slightly better than the ANN model with the trainbr training algorithm for the time-series groundwater data at the Xiaan site. At the Liuquan site, the WANN model was superior to the ANN model, and at the Shandan site it was the only suitable model. We have also explained that the fractal dimension of detailed component CD1 indicates the randomness and irregularity of a time-series GWD. This fractal dimension was largest at the Shandan site (1.6501), followed by that at the Liuquan site (1.5817). Hence, when the degree of randomness

is great and the changes in the time-series GWD dataset are complex, the WANN model is the most suitable for predicting time-series data. The fractal dimension of detailed component CD1 was only slightly smaller at the Xiaan site (1.5704) than at the Liuquan site. There was thus little difference in the error rate between the WANN model and the ANN model with trainbr at these two sites, although the WANN model was more suitable at the Liuquan site than at the Xiaan site. The foregoing analysis allows us to conclude that the fractal dimension of detailed component CD1 can be used as a criterion for choosing a suitable model (ANN or WANN) to predict time-series GWD. When the fractal dimension of this component is large, as at the Shandan site, the changes in time-series GWD become more complicated, and the WANN model thus becomes more suitable than the ANN model. Approximate component CA2 and detailed component CD2 reflect the overall trend and the cyclic trend of time-series GWD, respectively, and all four models were found capable of predicting these trends. When the fractal dimensions of CA2, CD1 and CD2 are larger, the changes in the time-series GWD data are more complex, the testing error is greater and prediction is more difficult. For such complicated series with a higher fractal dimension, the WANN model is the best choice for accurately predicting GWDs.
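The fractal dimension of a series such as CD1 can be estimated in several ways. The following is a minimal sketch of one of them, a box-counting estimate on the curve of a series rescaled to the unit square. The estimator, the grid sizes and the densification of the curve are assumptions made for illustration and may differ from the procedure used in this study, so the values it returns are not directly comparable with those reported here.

```python
import math

def box_counting_dimension(values, grid_sizes=(4, 8, 16, 32, 64),
                           samples_per_segment=200):
    """Estimate the box-counting dimension of the curve of a series.

    The curve is rescaled to the unit square, covered with k-by-k grids,
    and the slope of log(box count) against log(k) is returned.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0          # avoid division by zero for flat series
    xs = [i / (n - 1) for i in range(n)]
    ys = [(v - lo) / span for v in values]

    # Densify the polyline so that boxes crossed between samples are counted.
    points = []
    for i in range(n - 1):
        for s in range(samples_per_segment):
            f = s / samples_per_segment
            points.append((xs[i] + f * (xs[i + 1] - xs[i]),
                           ys[i] + f * (ys[i + 1] - ys[i])))
    points.append((xs[-1], ys[-1]))

    log_k, log_count = [], []
    for k in grid_sizes:
        # Index of the box containing each point, clamped to the grid.
        boxes = {(min(int(x * k), k - 1), min(int(y * k), k - 1))
                 for x, y in points}
        log_k.append(math.log(k))
        log_count.append(math.log(len(boxes)))

    # Least-squares slope of log(count) versus log(k).
    mean_k = sum(log_k) / len(log_k)
    mean_c = sum(log_count) / len(log_count)
    num = sum((a - mean_k) * (b - mean_c) for a, b in zip(log_k, log_count))
    den = sum((a - mean_k) ** 2 for a in log_k)
    return num / den

# A straight line versus a rapidly alternating series (both made up).
line_dim = box_counting_dimension([float(i) for i in range(50)])
zigzag_dim = box_counting_dimension([i % 2 for i in range(65)])
print(line_dim, zigzag_dim)
```

A smooth curve gives an estimate close to 1 and a highly irregular one approaches 2, which matches the way the fractal dimension is used above as a measure of the randomness of a component.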

5 Conclusion

The fractal dimensions of the three wavelet decomposition components were found to be suitable quantitative criteria for explaining the complexity and randomness of time-series GWD. The combination of wavelet decomposition and the fractal dimension was found to be the best method for revealing the change characteristics of GWD. The fractal dimension describes the irregularity or randomness of series data. Consequently, the fractal dimension of detailed component CD1 can be used as a quantitative criterion to determine which model (ANN or WANN) is suitable for a given time-series GWD, and thus to improve prediction accuracy. Four models and three sites were used as a case study to confirm our conclusions. When the fractal dimension of detailed component CD1 is large, as it was at the Shandan site considered here, time-series GWD changes become more complex and the degree of prediction error becomes greater; the WANN model is then the most suitable, producing highly accurate one-month-ahead forecasts. ANN models are unable to generate reasonable predictions in this case, predicting only the overall trend. When the fractal dimension of detailed component CD1 is small, in contrast, both the WANN and ANN models can predict GWD with an acceptable degree of accuracy. The WANN model easily captures the relationships among the various input and output variables through the approximate and detailed components. In the study reported herein, we investigated the use of wavelet decomposition and the fractal dimension to choose suitable models for time-series GWD data. When these data are restricted, choosing the most suitable method helps to improve the accuracy of GWD prediction. However, further studies covering more kinds of GWD data series are needed. Linking wavelet analysis and the fractal dimension to ANN may also offer further insight into the non-stationary process of GWD, for example in detecting man-made or climate change impacts.
Acknowledgments This work was part of the Ecological-hydrological Modelling and Parameter Optimizing Based on An Integrated Modelling Framework project supported by the National Science Foundation of China (grant number: 91125005/D011004) and the Incubation Foundation for Special Disciplines of the National Science Foundation of China (grant number: J1210003/J0109).
