HYDROLOGICAL PROCESSES Hydrol. Process. 19, 3819– 3835 (2005) Published online in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/hyp.5983

Rainfall-runoff models using artificial neural networks for ensemble streamflow prediction

Dae-Il Jeong and Young-Oh Kim*

School of Civil, Urban & Geosystems Engineering, Seoul National University, San 56-1, Shillim-Dong, Gwanak-gu, Seoul 151-742, South Korea

Abstract: Previous ensemble streamflow prediction (ESP) studies in Korea reported that modelling error significantly affects the accuracy of the ESP probabilistic winter and spring (i.e. dry season) forecasts, and thus suggested that improving the existing rainfall-runoff model, TANK, would be critical to obtaining more accurate probabilistic forecasts with ESP. This study used two types of artificial neural network (ANN), namely the single neural network (SNN) and the ensemble neural network (ENN), to provide better rainfall-runoff simulation capability than TANK, which has been used with the ESP system for forecasting monthly inflows to the Daecheong multipurpose dam in Korea. Using the bagging method, the ENN combines the outputs of member networks so that it can control the generalization error better than an SNN. This study compares the two ANN models with TANK with respect to the relative bias and the root-mean-square error. The overall results showed that the ENN performed the best among the three rainfall-runoff models. The ENN also considerably improved the probabilistic forecasting accuracy, measured in terms of average hit score, half-Brier score and hit rate, of the present ESP system that used TANK. Therefore, this study concludes that the ENN would be more effective for ESP rainfall-runoff modelling than TANK or an SNN. Copyright 2005 John Wiley & Sons, Ltd.

KEY WORDS: artificial neural networks; ensemble neural network; ensemble streamflow prediction; probabilistic forecasting; rainfall-runoff model

* Correspondence to: Young-Oh Kim, School of Civil, Urban and Geosystems Engineering, Seoul National University, San 56-1, Shillim-Dong, Gwanak-gu, 151-742 Seoul, Korea. E-mail: [email protected]

Received 3 February 2004; Accepted 22 June 2005

INTRODUCTION

Introduced in the 1970s, ensemble streamflow prediction (ESP) became a key part of the advanced hydrologic prediction system for the National Weather Service in the USA. ESP inputs historical meteorological scenarios to a rainfall-runoff model to forecast future streamflows using the current soil moisture, river, and reservoir conditions. Therefore, the accuracy of ESP forecasts generally relies primarily on the rainfall-runoff model being used. Jeong and Kim (2002) confirmed that the rainfall-runoff model used for their ESP study did not perform well for winter and spring, i.e. the dry season, and thus suggested that the model should be improved to obtain more accurate probabilistic ESP forecasts. The present study proposes another rainfall-runoff model using artificial neural networks (ANNs), which can be used for ESP. Once this ANN rainfall-runoff model is proven to perform reasonably well, it can be substituted for or combined with (Kim et al., 2003) the existing model to improve the performance of the existing ESP.

ANNs have been widely used for various aspects of hydrology: the ASCE Task Committee on Artificial Neural Networks in Hydrology (2000) and Govindaraju and Rao (2000) reviewed ANN theories and applications in hydrology. Previous studies have demonstrated that ANNs are appropriate for complex nonlinear rainfall-runoff modelling (Hsu et al., 1995; Minns and Hall, 1996; Shamseldin, 1997; Sajikumar and Thandaveswara, 1999; Tokar and Johnson, 1999), streamflow forecasting (Karunanithi et al., 1994; Campolo et al., 1999a,b; Zealand et al., 1999; Zhang and Govindaraju, 2000; Kim and Barros, 2001; Birikundavyi et al., 2002; Sivakumar et al., 2002), and reservoir inflow forecasting (Saad et al., 1996; Jain et al., 1999; Coulibaly et al., 2000, 2001). Models based on ANNs are, in general, simple and reasonably accurate.

Hsu et al. (1995) showed that the (nonlinear) ANN model provided a better representation of the rainfall-runoff relationships than the (linear) ARMAX (autoregressive moving average with exogenous inputs) time series model or the conceptual SAC-SMA (Sacramento soil moisture accounting) model. Zealand et al. (1999) compared ANNs with conventional streamflow forecasting approaches to demonstrate their capabilities. Their study also examined several issues associated with the use of ANNs, including the type and number of input variables and the size of the hidden layers. Campolo et al. (1999b) used ANNs for forecasting the river flow for up to 6 days during low-flow periods. Zhang and Govindaraju (2000) showed the potential of modular neural networks for predicting monthly runoff for three medium-sized watersheds in Kansas, USA. Coulibaly et al. (2000) used the early stopping method, one of the regulation procedures, to train multilayer feedforward neural networks for real-time reservoir inflow forecasting. They showed that the early stopping method can provide better and more reliable generalization performance than the Levenberg–Marquardt backpropagation (LMBP) method alone. In this paper, a single neural network (SNN) using the early stopping method was used for our ANN rainfall-runoff model. Birikundavyi et al. (2002) investigated the performance of ANN models for daily streamflow forecasting and showed that ANNs outperformed a deterministic model called PREVIS and the classic autoregressive model coupled with a Kalman filter.
Cannon and Whitfield (2002) introduced in their climate change studies the bagging (or bootstrap aggregation) method as an ensemble neural network (ENN) approach and showed the suitability of ENNs for downscaling techniques. Combining outputs of several member models can significantly improve generalization performance, because the generalization error of the final predictive model is controlled. In this study, rainfall-runoff models using ENNs with the bagging and early stopping method were compared with those using SNNs with the early stopping method. The primary objective of this paper is to improve the accuracy of the present ESP forecasting system for monthly inflows of the Daecheong dam in Korea. To achieve this objective, we attempted to develop new rainfall-runoff models using ANN techniques, which can be substituted for the existing rainfall-runoff model.

THE ANN

The SNN

ANNs are based on the highly interconnected structure of brain cells. This approach is fast and robust in noisy environments, flexible in the range of problems it can solve, and highly adaptive to newer environments. Owing to these established advantages, ANNs currently have numerous real-world applications, such as time series prediction, rule-based control, and rainfall-runoff modelling (Jain et al., 1999). The multilayer feedforward neural network, also referred to as an SNN in this study, consists of a set of sensory units that constitute the input layer, one or more hidden layers of computation nodes, and an output layer of computation nodes. The input signal propagates through the network in a forward direction, layer by layer. These neural networks are commonly referred to as multilayer perceptrons. A typical three-layer feedforward ANN is shown in Figure 1. The mathematical form of a three-layer feedforward ANN is given as

$$O_k = g_2\left[\sum_j w_{kj}\, g_1\left(\sum_i w_{ji} I_i + w_{j0}\right) + w_{k0}\right] \qquad (1)$$

where $I_i$ is the input value to node $i$ of the input layer, $V_j$ is the hidden value to node $j$ of the hidden layer, and $O_k$ is the output at node $k$ of the output layer. An input layer bias term $I_0 = 1$ with bias weights $w_{j0}$ and an output layer bias term $V_0 = 1$ with bias weights $w_{k0}$ are included to permit adjustments of the mean level at each stage. There are two sets of adjustable weights; $w_{ji}$ controls the strength of the connection between the input node $i$ and the hidden node $j$, and $w_{kj}$ controls the strength of the connection between the hidden node $j$ and the output node $k$. $g_1$ and $g_2$ are activation functions for the hidden layer and the output layer respectively. The activation function is usually selected to be a continuous and bounded nonlinear transfer function, for which the sigmoid and hyperbolic tangent functions are commonly used (e.g. Haykin, 1994; Hsu et al., 1995; Govindaraju and Rao, 2000).

[Figure 1. Multilayer feedforward with one hidden layer: the input signal enters the input layer nodes $I_i$ (with bias $I_0$), passes through the weights $w_{ji}$ to the hidden layer nodes $V_j$ (with bias $V_0$), and then through the weights $w_{kj}$ to the output layer nodes $O_k$, which give the output signal.]

Multilayer perceptrons have been applied successfully to solve some difficult and diverse problems. They have been trained under supervision with a highly popular algorithm known as the error back-propagation algorithm (Haykin, 1994). There are many variations of the back-propagation algorithm, including gradient descent and the faster algorithms using heuristic or optimization techniques (Demuth and Beale, 1998). In this study we use the LMBP algorithm for ANN training. The LMBP, one of the second-order nonlinear optimization techniques, is usually faster and more reliable than other back-propagation techniques. The LMBP approximates the Hessian matrix as

$$H = J^T J \qquad (2)$$

where $J$ is the Jacobian matrix, which contains the first derivatives of the ANN errors with respect to weights and biases. The LMBP algorithm uses this approximation to the Hessian matrix in the following update:

$$\Delta W = -[J^T J + \mu I]^{-1} J^T e \qquad (3)$$
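As an illustration of Equations (1) to (3), the following minimal Python sketch evaluates a three-layer network with a single output node and applies damped updates of the Levenberg–Marquardt form. It is not the implementation used in this study. The logistic hidden activation, the linear output activation, the forward-difference Jacobian, the rule for raising and lowering the damping scalar `mu`, the flat weight layout, and the toy data are all assumptions made for the example.

```python
import math
import random

def sigmoid(x):
    # Logistic function written with tanh so that large |x| cannot overflow.
    return 0.5 * (1.0 + math.tanh(0.5 * x))

def forward(w, x, n_hidden):
    """Equation (1), single output: g1 = logistic sigmoid, g2 = identity."""
    n_in = len(x)
    w_out = n_hidden * (n_in + 1)           # start of hidden-to-output weights w_kj
    out = w[w_out + n_hidden]               # output bias w_k0
    for j in range(n_hidden):
        row = j * (n_in + 1)
        v = w[row + n_in]                   # hidden bias w_j0
        v += sum(w[row + i] * x[i] for i in range(n_in))
        out += w[w_out + j] * sigmoid(v)    # w_kj * V_j
    return out

def errors(w, X, t, n_hidden):
    return [forward(w, x, n_hidden) - ti for x, ti in zip(X, t)]

def sse(w, X, t, n_hidden):
    return sum(e * e for e in errors(w, X, t, n_hidden))

def jacobian(w, X, t, n_hidden, h=1e-6):
    """Forward-difference Jacobian of the errors with respect to the weights."""
    e0 = errors(w, X, t, n_hidden)
    J = [[0.0] * len(w) for _ in e0]
    for c in range(len(w)):
        wp = list(w)
        wp[c] += h
        ec = errors(wp, X, t, n_hidden)
        for r in range(len(e0)):
            J[r][c] = (ec[r] - e0[r]) / h
    return J, e0

def solve(A, b):
    """Gaussian elimination with partial pivoting for A x = b."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[p] = M[p], M[k]
        for r in range(k + 1, n):
            f = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= f * M[k][c]
    x = [0.0] * n
    for k in range(n - 1, -1, -1):
        x[k] = (M[k][n] - sum(M[k][c] * x[c] for c in range(k + 1, n))) / M[k][k]
    return x

def lm_step(w, X, t, n_hidden, mu):
    """Equation (3): delta_w = -(J^T J + mu I)^-1 J^T e."""
    J, e = jacobian(w, X, t, n_hidden)
    n, m = len(w), len(e)
    A = [[sum(J[r][a] * J[r][b] for r in range(m)) + (mu if a == b else 0.0)
          for b in range(n)] for a in range(n)]
    g = [sum(J[r][a] * e[r] for r in range(m)) for a in range(n)]
    delta = solve(A, [-gi for gi in g])
    return [wi + di for wi, di in zip(w, delta)]

def train_lm(w, X, t, n_hidden, mu=0.01, n_iter=50):
    """Assumed damping rule: shrink mu after an improvement, grow it otherwise."""
    best = sse(w, X, t, n_hidden)
    for _ in range(n_iter):
        trial = lm_step(w, X, t, n_hidden, mu)
        trial_sse = sse(trial, X, t, n_hidden)
        if trial_sse < best:
            w, best, mu = trial, trial_sse, max(mu * 0.1, 1e-8)
        else:
            mu *= 10.0
    return w, best

# Toy problem (hypothetical data): fit t = x^2 on [0, 1] with two hidden nodes.
rng = random.Random(0)
X = [[k / 8.0] for k in range(9)]
t = [x[0] ** 2 for x in X]
n_hidden = 2
w0 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden * 2 + n_hidden + 1)]
w_fit, sse_fit = train_lm(w0, X, t, n_hidden)
```

In this sketch a small `mu` makes the step close to a Newton step on the approximate Hessian, while a large `mu` shrinks it towards a short gradient-descent step.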

In Equation (3), $e$ is the residual error vector and $\mu$ is a variable small scalar that controls the learning process. When the scalar $\mu$ is zero, Equation (3) is equivalent to Newton's method, using the approximate Hessian matrix. When $\mu$ is large, Equation (3) is equivalent to the gradient descent method with a small step size. Newton's method is faster and more accurate near an error minimum than the gradient descent method is; thus, the aim is to shift towards Newton's method as quickly as possible. In practice, LMBP is faster in finding better optima for a variety of problems than the other usual methods (Coulibaly et al., 2000).

Early stopping method

An ANN is defined as 'generalized' when it produces reasonable outputs from inputs that have not been used during training. When not generalized, an ANN can suffer from either underfitting or overfitting. While an overly complex ANN is likely to fit the noise and lead to overfitting, an insufficiently complex network can fail to detect the regularities in the data set and, thus, lead to underfitting. Underfitting produces excessive bias in the model outputs, whereas overfitting produces excessive variance (Coulibaly et al., 2000). Among various regulation procedures that generalize SNNs, three methods have been popularly used: (1) the method for limiting the number of hidden units, (2) the weight decay method, and (3) the early stopping method. In this study, the early stopping method was selected because it is generally recognized as the most popular method for optimizing the generalization performance of ANNs in practice.

Typically, the available data are divided into a training set and a test set: the performance of the ANN model calibrated on the training set is verified using the test set. The early stopping method, however, requires one more subset between the training and the test sets, called the validation set. The typical ANN approach monitors the training error only during the training procedure, but the early stopping method monitors the validation error as well as the training error. The training procedure is terminated when the (root-) mean-square error (RMSE) of the validation set reaches its minimum. If the ANN is trained further, then it begins to overfit, and the validation mean-square error begins increasing from its minimum.

Without the early stopping method, data that have high complexity typically produce high variance and low bias when the training is terminated. Since the early stopping method minimizes the mean-square error (which is the bias squared plus the variance), the error variance is decreased at the expense of added bias. As a result, such models become less dependent on the training data and less complex. Therefore, the early stopping method is fast and can be applied successfully to networks where the number of weights exceeds the sample size. However, there are several difficult issues associated with the early stopping method: (1) How many cases do we assign to the training and validation sets? (2) How do we split the data into training and validation sets? (3) When do the validation error rates 'start to go up'? Furthermore, this method also uses data inefficiently because of the additional validation data set requirements.

The ENN and the bagging method

The ENN is a subject of active research.
An ensemble in an ANN is a set of independently trained member SNN models whose predictions are combined in some way to obtain a single estimate of the desired output (Figure 2). This allows greater model stability than any SNN regulation procedure. It can also use training data more efficiently than the early stopping method. Bagging (otherwise referred to as bootstrap aggregation) has been widely accepted as one of the most popular methods for producing ensembles. The bagging approach (Breiman, 1996) also groups the available data into a training set and a test set. A training subset is created by resampling with replacement (i.e. using the bootstrap approach) from the training set. The probability that an individual training sample from the training set will not be part of a bootstrap-resampled training subset is $(1 - 1/N)^N \approx 0.368$, where $N$ is the number of training samples in the training set. This means that only approximately 63% of the distinct training samples will be included in a bootstrap training subset. Each member model is trained using a different training subset. The final model output takes the average of the outputs from all members of the ensemble. By averaging the outputs of member models that typically have low bias and high variance, the variance of the ensemble model can be reduced without increasing the overall ensemble bias (Breiman, 1996; Cannon and Whitfield, 2002).

[Figure 2. Block diagram of ENNs: the input $I(n)$ is passed to member models 1, 2, ..., $L$, whose outputs $O_1(n), O_2(n), \ldots, O_L(n)$ are merged by a combiner into the output $O(n)$.]

To describe bagging in more concrete terms, let $O_i(x)$ be the output, $t(x)$ be the target, and $\varepsilon_i = O_i(x) - t(x)$ be the error of a member model $i$. The average sum-of-squares error ($E_i$) for model $O_i(x)$ can be written as

$$E_i = E[(O_i(x) - t(x))^2] = E[\varepsilon_i^2] \qquad (4)$$

where $E[\cdot]$ denotes the expectation. The average of $E_i$, denoted $E_{AV}$, is given by

$$E_{AV} = \frac{1}{L}\sum_{i=1}^{L} E_i = \frac{1}{L}\sum_{i=1}^{L} E[\varepsilon_i^2] \qquad (5)$$

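We now introduce a simple form of committee. This form takes the output of the committee to be the average of the outputs of the $L$ networks that comprise the committee. Thus, we write the committee prediction in the form

$$O_{com} = \frac{1}{L}\sum_{i=1}^{L} O_i(x) \qquad (6)$$

The error due to the committee, $E_{com}$, can then be written as

$$\begin{aligned}
E_{com} &= E[(O_{com}(x) - t(x))^2] \\
&= E\left[\left(\frac{1}{L}\sum_{i=1}^{L} O_i(x) - t(x)\right)^2\right] \\
&= \frac{1}{L^2} E\left[\left(\sum_{i=1}^{L} (O_i(x) - t(x))\right)^2\right] \\
&= \frac{1}{L^2} E\left[\left(\sum_{i=1}^{L} \varepsilon_i\right)^2\right] \\
&\le \frac{1}{L^2} E\left[L \sum_{i=1}^{L} \varepsilon_i^2\right] = E_{AV}
\end{aligned} \qquad (7)$$

If we now assume that the errors $\varepsilon_i(x)$ have zero mean and are uncorrelated, so that

$$E[\varepsilon_i] = 0, \qquad E[\varepsilon_i \varepsilon_j] = 0 \quad \text{if } j \ne i \qquad (8)$$

then the sum-of-squares error becomes the error variance, which is further evaluated as

$$E_{com} = \frac{1}{L^2} E\left[\left(\sum_{i=1}^{L} \varepsilon_i\right)^2\right] = \frac{1}{L^2}\sum_{i=1}^{L} E[\varepsilon_i^2] = \frac{1}{L} E_{AV} \qquad (9)$$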
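The committee-error relations in Equations (5), (7), and (9) can be checked with a small simulation. The sketch below is illustrative only: the member errors are synthetic Gaussian series, the number of members, the sample size, and the correlation `rho` are hypothetical choices, and the correlated case is built by mixing a shared error component into every member.

```python
import random

def committee_errors(errors_by_member):
    """Return (E_AV, E_com) from per-member error series eps_i(x)."""
    n_members = len(errors_by_member)
    n = len(errors_by_member[0])
    # Equation (5): average of the member mean-square errors.
    e_av = sum(sum(e * e for e in member) / n
               for member in errors_by_member) / n_members
    # Equation (7): mean-square error of the averaged prediction.
    e_com = sum((sum(member[k] for member in errors_by_member) / n_members) ** 2
                for k in range(n)) / n
    return e_av, e_com

rng = random.Random(42)
L, n = 25, 5000

# Case 1: zero-mean, uncorrelated member errors (the assumptions in Equation (8)).
independent = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(L)]
e_av_ind, e_com_ind = committee_errors(independent)

# Case 2: unit-variance member errors with pairwise correlation rho.
rho = 0.8
shared = [rng.gauss(0.0, 1.0) for _ in range(n)]
correlated = [[rho ** 0.5 * shared[k] + (1.0 - rho) ** 0.5 * rng.gauss(0.0, 1.0)
               for k in range(n)] for _ in range(L)]
e_av_cor, e_com_cor = committee_errors(correlated)
```

In both cases the committee error stays at or below $E_{AV}$, as Equation (7) requires. With uncorrelated errors the simulated committee error should be close to $E_{AV}/L$, as in Equation (9).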

Equation (9) represents the apparently dramatic result that the sum-of-squares error can be reduced by a factor of $L$ simply by averaging the predictions of $L$ networks. In practice, the reduction in error is generally much smaller than Equation (9) would suggest, because the errors $\varepsilon_i(x)$ of different models are typically highly correlated, and so the second assumption in Equation (8) does not hold. As Breiman (1996) pointed out, improvements in performance tend to level out when more than 25 member models ($L \ge 25$) are added to the ensemble.

In this study, the early stopping method was also applied to the bagged ensemble to optimize generalization performance as follows. First, $L$ bootstrap resamples were generated from the training set. Then, the generalization error was estimated for each network using the early stopping method. For each network in the ensemble, we found the epoch that minimized the generalization error. The corresponding networks were chosen as the optimal set for the ensemble. Carney and Cunningham (2000) first introduced this technique in their paper, and it has since generally been accepted and often used because it is fast and easy to implement.
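The procedure above can be sketched in a few lines of Python. This is a hedged illustration rather than the code used in this study. A two-parameter linear model trained by gradient descent stands in for each member network. The samples left out of each bootstrap resample are assumed to serve as that member's validation set. The data, learning rate, epoch count, and ensemble size of 30 are hypothetical.

```python
import random

def bootstrap_indices(n, rng):
    """Resample n indices with replacement; also return the left-out indices."""
    bag = [rng.randrange(n) for _ in range(n)]
    in_bag = set(bag)
    oob = [k for k in range(n) if k not in in_bag]
    return bag, oob

def mse(params, xs, ys):
    a, b = params
    return sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def train_member(xs, ys, bag, oob, epochs=300, lr=0.3):
    """Gradient descent on the bootstrap resample; keep the parameters from the
    epoch with the lowest validation error (early stopping)."""
    train_x = [xs[k] for k in bag]
    train_y = [ys[k] for k in bag]
    val_idx = oob if oob else bag          # fall back if nothing was left out
    val_x = [xs[k] for k in val_idx]
    val_y = [ys[k] for k in val_idx]
    a, b = 0.0, 0.0
    best_params, best_val = (a, b), mse((a, b), val_x, val_y)
    n = len(train_x)
    for _ in range(epochs):
        grad_a = sum(2.0 * (a * x + b - y) * x for x, y in zip(train_x, train_y)) / n
        grad_b = sum(2.0 * (a * x + b - y) for x, y in zip(train_x, train_y)) / n
        a, b = a - lr * grad_a, b - lr * grad_b
        val = mse((a, b), val_x, val_y)
        if val < best_val:
            best_params, best_val = (a, b), val
    return best_params

def ensemble_predict(members, x):
    """Bagged output: the average of the member outputs."""
    return sum(a * x + b for a, b in members) / len(members)

# Hypothetical training data: y = 2x + 1 plus noise.
rng = random.Random(1)
xs = [k / 59.0 for k in range(60)]
ys = [2.0 * x + 1.0 + rng.gauss(0.0, 0.1) for x in xs]

L = 30
members, oob_fractions = [], []
for _ in range(L):
    bag, oob = bootstrap_indices(len(xs), rng)
    members.append(train_member(xs, ys, bag, oob))
    oob_fractions.append(len(oob) / len(xs))

mean_oob = sum(oob_fractions) / L
```

The mean left-out fraction `mean_oob` should be near $(1 - 1/N)^N \approx 0.368$, which is why each member can be validated on samples it never saw without setting aside a separate validation set.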

THE ESP METHOD

Probabilistic forecasts with ESP

ESP runs a rainfall-runoff model with meteorological inputs to generate an ensemble of possible streamflow hydrographs. Note that the 'ensemble' of ESP is independent of the ensemble of an ENN. A generated streamflow ensemble is a function of the initial hydrological states at the forecast time in the rainfall-runoff model, and thus the technique is sometimes called a conditional Monte Carlo simulation approach (Day, 1985). A best-fitting probability density function (referred to as the ESP p.d.f. hereafter) can then be fitted to the generated streamflow ensemble to describe the likelihood of an event occurring during a certain time period $t$ being forecasted. In this study, the whole range of streamflow $Q_t$ was divided into three flow categories: low, medium, and high. A probability was then assigned to each category from the ESP p.d.f. The high, medium, and low flow probabilities (denoted $\mathrm{Pr}_H$, $\mathrm{Pr}_M$, and $\mathrm{Pr}_L$) were computed using

$$\begin{aligned}
\mathrm{Pr}_{H,t} &= 1 - \Pr[Q_t \le q_{U,t}] \\
\mathrm{Pr}_{M,t} &= \Pr[Q_t > q_{L,t}] - \Pr[Q_t > q_{U,t}] \\
\mathrm{Pr}_{L,t} &= \Pr[Q_t \le q_{L,t}]
\end{aligned} \qquad (10)$$

respectively, where $q_L$ and $q_U$ are respectively the lower and upper limits of the medium flow category. In this study, the lower and upper limits are defined as the 33.3% and 66.7% cumulative quantiles respectively of a p.d.f. fitted to the historical streamflow data.

Accuracy measures for probabilistic forecasts

Since a probabilistic forecast specifies a probability distribution of the predictand, deterministic forecast accuracy measures such as bias and RMSE are not suitable for probabilistic forecasts. In this study, the half-Brier score, average hit score (AHS), hit rate, and bias ratio were used as accuracy measures. For a forecast that gives probability to events of $r$ categories, the skill score of the overall forecast for $r$ categories and $N$ observations is measured with the half-Brier score (Brier, 1950), according to

$$B = \frac{1}{N}\sum_{j=1}^{r}\sum_{i=1}^{N} (\delta_{ij} - f_{ij})^2 \qquad (11)$$

where $f_{ij}$ is the forecast probability that the event will occur in category $j$ on forecast occasion $i$, and $\delta_{ij} = 1$ or 0, depending respectively on whether or not the event occurred in category $j$. The half-Brier score is the mean-squared error of the categorical forecast and has a value of zero when the forecast is perfect, i.e. when it assigns a probability of one to the observed category. The worst possible forecast has a half-Brier score of two, which implies that the actual event always occurs in a category that was assigned a probability of zero while another category was assigned a probability of one.
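A short Python sketch may help to fix the two calculations described above. It is an illustration under stated assumptions, not the procedure of this study: the category probabilities are taken as empirical frequencies of the ensemble members instead of being read from a fitted p.d.f., and the ensemble values and thresholds are hypothetical.

```python
def category_probabilities(ensemble, q_lower, q_upper):
    """Low, medium, and high flow probabilities from ensemble frequencies."""
    n = len(ensemble)
    pr_low = sum(1 for q in ensemble if q <= q_lower) / n
    pr_high = sum(1 for q in ensemble if q > q_upper) / n
    pr_mid = 1.0 - pr_low - pr_high
    return [pr_low, pr_mid, pr_high]

def half_brier(forecasts, observed):
    """Half-Brier score: forecasts[i][j] is the probability of category j on
    occasion i, and observed[i] is the index of the category that occurred."""
    total = 0.0
    for probs, obs in zip(forecasts, observed):
        for j, p in enumerate(probs):
            delta = 1.0 if j == obs else 0.0
            total += (delta - p) ** 2
    return total / len(forecasts)

# Hypothetical 15-member inflow ensemble and category limits.
ensemble = list(range(10, 160, 10))
probs = category_probabilities(ensemble, q_lower=50, q_upper=120)

perfect = half_brier([[1.0, 0.0, 0.0]], [0])
worst = half_brier([[0.0, 0.0, 1.0]], [0])
climatology = half_brier([[1 / 3, 1 / 3, 1 / 3]], [0])
```

Under these definitions a forecast that always assigns equal probability to the three categories scores 2/3, which gives a convenient baseline between the perfect score of zero and the worst score of two.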


The effectiveness of the probabilistic forecast can also be measured using the AHS (Kim et al., 2001). If flow category $i$ occurs in month $m$, a score of $S_m = 100\,\mathrm{Pr}_i$ is assigned, where $i$ is low, medium, or high, and $\mathrm{Pr}_i$ is the probability assigned to the category in which the flow occurs. For example, if a low flow occurs in month $m = 1$ in Figure 3, then the score will be 50, since the probability assigned to the low flow category is 0.5. This is repeated throughout the entire verification period $T$ in Figure 3. The resulting average score is denoted by $S_{avg}$. If $S_{avg}$ is higher than 33.3, then it can be concluded that the ESP technique performs better than the naive (or climatology) forecast.

[Figure 3. Example of AHS: for each month from 1 to $T$, the forecast probabilities of the three flow categories separated by $q_L$ and $q_U$ are shown, and an asterisk marks the occurrence of the actual streamflow. The scores are $S_1 = 50$, $S_2 = 10$, ..., $S_T = 65$, and the average hit score is $S_{avg} = \frac{1}{T}\sum_{i=1}^{T} 100 \times \mathrm{Pr}_i$.]

Wilks (1995) describes the theoretical basis of hit rate and bias ratio. If we change the probabilistic forecast to a categorical forecast, which consists of a flat statement that only one of a set of possible events will occur, forecast verification is easy to understand. Categorical verification data are conventionally displayed in an $I \times J$ contingency table of absolute frequencies, or counts, of the $I \times J$ possible combinations of forecast and event pairs. Figure 4 illustrates the contingency table of forecasts and observations for the $I = J = 3$ categorical case. The total number of forecast/event pairs in the data set is $n = a + b + c + d + e + f + g + h + i$. The letters $a$, $e$, and $i$ respectively represent the numbers of occasions when the low, medium, and high events were correctly forecasted.

          oL    oM    oH
    fL    a     b     c
    fM    d     e     f
    fH    g     h     i

Figure 4. Contingency table of forecasts (fL, fM, fH) and observations (oL, oM, oH) for the $I = J = 3$ categorical case

The most direct and intuitive measure of the accuracy of categorical forecasts is the hit rate. This rate is simply the fraction of the $n$ forecasting occasions when the categorical forecast correctly anticipates the subsequent event or nonevent. The hit rate is given by

$$H = \frac{a + e + i}{n} \qquad (12)$$

The worst possible hit rate is zero and the best possible hit rate is one. The bias ratio of categorical forecasts, which compares the average forecast with the average observation, is usually represented as a ratio (Wilks, 1995), which is given by

$$B_L = \frac{a + b + c}{a + d + g}, \qquad B_M = \frac{d + e + f}{b + e + h}, \qquad B_H = \frac{g + h + i}{c + f + i} \qquad (13)$$

Unbiased forecasts exhibit B D 1, indicating that the events were forecasted the same number of times as they were observed. However, this result says nothing about the correspondence between the forecasted and observed values of the events. A bias ratio greater than one indicates that an event was forecasted more often than observed, i.e. overforecasted, whereas a bias ratio of less than one indicates that an event was underforecasted.
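The AHS, hit rate, and bias ratio can be computed as in the following Python sketch. Two points are assumptions, since they are not specified above: the probabilistic forecast is converted to a categorical one by taking the most probable category, and a bias ratio is reported as not-a-number when a category was never observed. The four example months are hypothetical.

```python
def average_hit_score(forecasts, observed):
    """Mean of 100 x (probability assigned to the category that occurred)."""
    scores = [100.0 * probs[obs] for probs, obs in zip(forecasts, observed)]
    return sum(scores) / len(scores)

def contingency_table(forecasts, observed, n_cat=3):
    """table[f][o] counts occasions with forecast category f and observed o.
    Assumption: the categorical forecast is the most probable category."""
    table = [[0] * n_cat for _ in range(n_cat)]
    for probs, obs in zip(forecasts, observed):
        fc = max(range(n_cat), key=lambda j: probs[j])
        table[fc][obs] += 1
    return table

def hit_rate(table):
    """Equation (12): correct forecasts divided by all forecast/event pairs."""
    n = sum(sum(row) for row in table)
    return sum(table[k][k] for k in range(len(table))) / n

def bias_ratios(table):
    """Equation (13): times forecast divided by times observed, per category."""
    ratios = []
    for k in range(len(table)):
        forecast_count = sum(table[k])
        observed_count = sum(row[k] for row in table)
        ratios.append(forecast_count / observed_count
                      if observed_count else float("nan"))
    return ratios

# Hypothetical forecasts as [Pr_L, Pr_M, Pr_H]; observed categories as indices
# (0 = low, 1 = medium, 2 = high).
forecasts = [[0.5, 0.2, 0.3], [0.1, 0.2, 0.7], [0.1, 0.25, 0.65], [0.2, 0.6, 0.2]]
observed = [0, 0, 2, 1]

ahs = average_hit_score(forecasts, observed)
table = contingency_table(forecasts, observed)
h = hit_rate(table)
b = bias_ratios(table)
```

In this example the low category is forecast once but observed twice, so its bias ratio is below one, while the high category is forecast twice but observed once.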

APPLICATION

Study basin

Completed in 1981, the Daecheong dam is located approximately 150 km upstream from the mouth of the Geum river basin in Korea. The Geum River flows in a westerly direction for 401 km, draining a 9810 km² area of Korea (Figure 5). As one of 15 multipurpose dams in Korea, the Daecheong dam is operated mainly for flood control, water supply, and energy generation. Observed inflow data at the Daecheong dam have been available since 1981. Like other hydrologic time series, the Daecheong monthly inflow shows strong seasonality (Figure 6). Because of the monsoon climate, two-thirds of the annual flow occurs during the 3 month flood season from July to September, with only a third of the annual streamflow available during the remaining 9 months. The hydrometeorological input ensemble of ESP for the Daecheong monthly inflow consisted of observed precipitation and evaporation data for 15 years from 1981 to 1995. Therefore, an ensemble of 15 streamflow scenarios was available for each forecasting month of the test period from 1996 to 2001.

TANK model

The existing conceptual rainfall-runoff model for the Daecheong dam basin is called TANK. Having three tanks, TANK simulates the net stream discharge as the sum of the discharges from the side orifices of the tanks (Sugawara, 1974). Figure 7 shows the structure and 12 parameters of TANK. The parameters of TANK can be classified into three types: runoff coefficients (A11, A12, A2, and A3), infiltration coefficients (B1 and B2), and storage parameters, such as height of the side outlets of each tank (H11, H12, and H2) and initial storage height (SH1, SH2, and SH3). In this model, the shuffled complex evolution method was used to estimate the required parameters. TANK has been applied to the entire Geum river basin, but its simulation capability was not satisfactory, especially during the dry season from October to May (Jeong and Kim, 2002). This shortcoming is demonstrated in Figure 8, which shows the monthly absolute relative bias (ARB) and relative RMSE (R-RMSE) of the TANK model simulated from 1981 to 2001.
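To make the role of the 12 parameters concrete, the sketch below advances a generic three-tank model of the Sugawara type by one time step. It is not the TANK code used for the Daecheong basin. The order of the updates, the treatment of evaporation as a direct loss from the top tank, the units, and every parameter value are assumptions chosen for illustration only.

```python
def tank_step(storage, precip, evap, p):
    """Advance a three-tank model one step; return (new_storage, runoff).

    storage is (s1, s2, s3); p holds the runoff coefficients (A11, A12, A2, A3),
    the infiltration coefficients (B1, B2), and the side-outlet heights
    (H11, H12, H2). Each tank's coefficients are assumed to sum to at most one.
    """
    s1, s2, s3 = storage

    # Top tank: net input, two side outlets, and infiltration to the second tank.
    s1 = max(s1 + precip - evap, 0.0)
    q11 = p["A11"] * max(s1 - p["H11"], 0.0)
    q12 = p["A12"] * max(s1 - p["H12"], 0.0)
    inf1 = p["B1"] * s1
    s1 -= q11 + q12 + inf1

    # Second tank: one side outlet and infiltration to the third tank.
    s2 += inf1
    q2 = p["A2"] * max(s2 - p["H2"], 0.0)
    inf2 = p["B2"] * s2
    s2 -= q2 + inf2

    # Third tank: a single outlet with no threshold height.
    s3 += inf2
    q3 = p["A3"] * s3
    s3 -= q3

    # Runoff is the sum of the side-outlet discharges.
    return (s1, s2, s3), q11 + q12 + q2 + q3

# Hypothetical parameter values (not calibrated for any basin).
params = {"A11": 0.2, "A12": 0.2, "A2": 0.1, "A3": 0.05,
          "B1": 0.2, "B2": 0.1, "H11": 10.0, "H12": 40.0, "H2": 15.0}
initial = (20.0, 30.0, 50.0)   # plays the role of SH1, SH2, SH3

new_storage, runoff = tank_step(initial, precip=100.0, evap=20.0, p=params)
```

Because every outflow is taken from a tank before it is passed on, the step conserves water: the change in total storage plus the runoff equals the net input.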

Figure 5. Location map of the Daecheong dam in Korea

[Figure 6. Box plot of the Daecheong dam average monthly inflows; x-axis: month (Jan to Dec); y-axis: inflow (CMS), 0 to 800.]

Rainfall-runoff model using ANN techniques

This study developed two rainfall-runoff models for the proposed ESP system in Korea: one is an SNN with the early stopping method and the other is an ENN with the bagging method. Input variables used for these rainfall-runoff models were the current and previous months' precipitation ($R_t, R_{t-1}, \ldots$) and evaporation ($E_t, E_{t-1}, \ldots$), as well as the previous months' inflows ($I_{t-1}, I_{t-2}, \ldots$). Using the current month's inflow $I_t$ as the target variable, the input–output structure was given by

$$\hat{I}_t = f(R_t, R_{t-1}, \ldots, E_t, E_{t-1}, \ldots, I_{t-1}, I_{t-2}, \ldots) \qquad (14)$$
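The input–output structure of Equation (14) amounts to building one row of lagged predictors for each target month. A minimal Python sketch follows; the function name, the lag sets, and the toy series are illustrative assumptions.

```python
def build_samples(rain, evap, inflow, rain_lags, evap_lags, inflow_lags):
    """Return (X, y) where each row of X holds the lagged inputs for month t
    and y holds the target inflow I_t. A lag of 0 means the current month."""
    max_lag = max(list(rain_lags) + list(evap_lags) + list(inflow_lags))
    X, y = [], []
    for t in range(max_lag, len(inflow)):
        row = ([rain[t - k] for k in rain_lags]
               + [evap[t - k] for k in evap_lags]
               + [inflow[t - k] for k in inflow_lags])
        X.append(row)
        y.append(inflow[t])
    return X, y

# Toy monthly series (hypothetical values).
rain = list(range(0, 10))
evap = list(range(10, 20))
inflow = list(range(100, 110))

# Inputs R_t, R_{t-1}, R_{t-2}, E_t, E_{t-1}, E_{t-2}, I_{t-1}.
X, y = build_samples(rain, evap, inflow,
                     rain_lags=(0, 1, 2), evap_lags=(0, 1, 2), inflow_lags=(1,))
```

Inflow lags start at 1 because the current month's inflow is the target and so cannot also be an input. The first `max_lag` months are dropped because their lagged values are not available.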

[Figure 7. Structure and parameters of TANK: precipitation and evapotranspiration act on the top tank (SH1, with outlets A11 at H11 and A12 at H12 and infiltration B1), followed by the second tank (SH2, with outlet A2 at H2 and infiltration B2) and the third tank (SH3, with outlet A3); the outlet discharges form the runoff.]

[Figure 8. Monthly absolute relative bias and R-RMSE of TANK (1981–2001); x-axis: month (Jan to Dec); y-axis: error measures, 0.000 to 1.000.]

The observed precipitation, evaporation, and inflow data from 1981 to 1995 were used for training the ANN rainfall-runoff models. The most recent 6 years, from 1996 to 2001, were used for testing the performance of the trained models. These ANN models were then compared with TANK, a well-known conceptual rainfall-runoff model in Korea. In Equation (14), the historical ensembles of rainfall and evaporation were input for $R_t$ and $E_t$ respectively to create the inflow $I_t$ ensemble. The initial condition of the ANN rainfall-runoff model could be set by using the lagged variables such as $R_{t-1}$, $E_{t-1}$, $I_{t-1}$, etc.

An SNN rainfall-runoff model was developed using a multilayer feedforward ANN with one hidden layer. The input and target data were normalized in the range from zero to one because a sigmoid function was employed as the transfer function. From the input layer to the hidden layer, the log sigmoid function, which has been commonly used in hydrologic ANN models, was employed. From the hidden layer to the output layer, a linear function was employed as the transfer function because the linear function is known to be robust for a continuous output variable, such as the monthly inflow in this study. For the learning algorithm and the regulation procedure, this study used the LMBP algorithm and the early stopping method respectively.

An important issue in ANN modelling is the choice of the input variables and the determination of the number of hidden nodes, but there is no general theory yet to solve this issue, and it is rather problem dependent. In this study, a trial-and-error procedure selected the final structure of the SNN model, i.e. the input variables and the number of hidden nodes. To select the input variables of the SNN, correlation coefficients between the target variable $I_t$ and the candidate input variables such as ($R_t, \ldots, R_{t-4}$), ($E_t, \ldots, E_{t-4}$), and ($I_{t-1}, \ldots, I_{t-4}$) were used as the initial criteria (Table I). Table II shows 10 combinations of the input variables tested in this study. Since there may exist nonlinear relationships that cannot be measured with the correlation coefficient, Table II includes some input variables that showed weak correlations in Table I. For each test model in Table II, the number of hidden nodes was initially set to half of the number of input variables. Figure 9 compares the 10 test models in terms of the correlation coefficient and RMSE, which were calculated for the training, validation, and test sets. Since model 8 gave the largest correlation coefficient and the smallest RMSE in most cases, this study selected model 8 for the SNN. Note that this study used 20 different random initial values for the weights ($w_{ji}$ and $w_{kj}$ in Figure 1) for each structure that was characterized by the input variables and the number of hidden nodes, trained those ANNs, and averaged the 20 resulting values of the correlation coefficient and RMSE. Therefore, each value shown in Figure 9 (also in Figures 10–12) was an average of the 20 random sample results.

Table I. Correlation coefficients between the target variable and the candidate input variables

    Lag months    Inflow I    Precipitation R    Evaporation E
    t             1.000       0.929              0.223
    t-1           0.408       0.503              0.350
    t-2           0.164       0.190              0.428
    t-3           0.090       0.058              0.316
    t-4           0.157       0.228              0.273
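The screening behind Table I can be sketched as a lagged Pearson correlation between the target inflow and each candidate series. The Python code below is a minimal illustration; the function names and the toy series are assumptions, and it does not reproduce the values in Table I.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def lagged_correlation(target, candidate, lag):
    """Correlation between target[t] and candidate[t - lag]."""
    return pearson(target[lag:], candidate[:len(candidate) - lag])

# Toy series (hypothetical): the inflow repeats the rainfall one month later.
rain = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 8.0, 7.0]
inflow = [0.0] + rain[:-1]

r_lag1 = lagged_correlation(inflow, rain, lag=1)
r_lag0 = lagged_correlation(inflow, rain, lag=0)
```

A linear coefficient of this kind can miss nonlinear dependence, which is the reason given above for also testing some weakly correlated inputs.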

Table II. Test cases for the input variables selection

    Model       Input–output structure
    Model 1     $\hat{I}_t = f(R_t, \ldots, R_{t-4}, E_t, \ldots, E_{t-4}, I_{t-1}, \ldots, I_{t-4})$
    Model 2     $\hat{I}_t = f(R_t, \ldots, R_{t-4}, E_t, \ldots, E_{t-4}, I_{t-1}, \ldots, I_{t-3})$
    Model 3     $\hat{I}_t = f(R_t, \ldots, R_{t-3}, E_t, \ldots, E_{t-4}, I_{t-1}, \ldots, I_{t-3})$
    Model 4     $\hat{I}_t = f(R_t, \ldots, R_{t-3}, E_t, \ldots, E_{t-3}, I_{t-1}, \ldots, I_{t-3})$
    Model 5     $\hat{I}_t = f(R_t, \ldots, R_{t-3}, E_t, \ldots, E_{t-3}, I_{t-1}, I_{t-2})$
    Model 6     $\hat{I}_t = f(R_t, R_{t-1}, R_{t-2}, E_t, \ldots, E_{t-3}, I_{t-1}, I_{t-2})$
    Model 7     $\hat{I}_t = f(R_t, R_{t-1}, R_{t-2}, E_t, E_{t-1}, E_{t-2}, I_{t-1}, I_{t-2})$
    Model 8     $\hat{I}_t = f(R_t, R_{t-1}, R_{t-2}, E_t, E_{t-1}, E_{t-2}, I_{t-1})$
    Model 9     $\hat{I}_t = f(R_t, R_{t-1}, E_t, E_{t-1}, E_{t-2}, I_{t-1})$
    Model 10    $\hat{I}_t = f(R_t, R_{t-1}, E_t, E_{t-1}, I_{t-1})$

[Figure 9. Correlation coefficients and RMSEs of 10 SNN models: (a) correlation coefficient; (b) RMSE (CMS). Each panel shows the training, validation, and test sets for models 1 to 10.]

The best number of the hidden layer was then searched within the model-8 structure. Two performance indices were observed with 2, 3, . . ., 20 nodes (see Figure 10). The best results for the validation and test sets were achieved using four hidden nodes. ENN is an aggregate of multiple member models. In this study, each member model of an ENN was developed using the same procedure as the SNN modelling. For example, the LMBP algorithm, the log sigmoid, the linear transfer function, and the early stopping method were also employed for the ENN. Thirty member models were developed and then combined for the ENN, because Breiman (1996) recommended that at least 25 member models should be added to the ENN to reduce the generalization errors. Using the comparison procedure as shown in Figures 11 and 12, model 5 (in Table II) with 10 hidden nodes was finally selected as the best ENN model. Comparing the SNN with the ENN, the test performance of the ENN model in Figure 11 is less sensitive to the input variable selection than that of the SNN in Figure 9. The ENN in Figure 12 is also less sensitive to the number of hidden nodes than the SNN is in Figure 10. Furthermore, the test performance in Figures 9–12 reported that the ENN, in general, produced smaller RMSEs than the corresponding SNN, which implies that the ENN can reduce the generalization error more efficiently than the SNN. Comparison between TANK, SNN, and ENN The ANN models developed in this study were compared with the existing TANK rainfall-runoff model for the Daecheong basin. Table III shows the relative bias (RB) and R-RMSE of the three rainfall-runoff Copyright 2005 John Wiley & Sons, Ltd.


Figure 10. Correlation coefficients and RMSEs with various numbers of hidden nodes (SNN): (a) correlation coefficient; (b) RMSE (CMS). Horizontal axis: number of hidden nodes (0 to 20); curves: training (trn), validation (val), and test sets

models (TANK, SNN, and ENN) at seasonal and annual scales. The best results for each period are highlighted in bold. Table III reports some promising results. First, the SNN and the ENN generally performed better than TANK; TANK gave the best result only in the one case of RMSE in spring. In particular, the ANN models substantially reduced the bias of the existing TANK model. Of special note is that the ENN on an annual basis was almost unbiased (0.006). Second, overall the ENN was superior to the SNN. In winter, when TANK performed worst, the ENN performed considerably better than the SNN in both bias and RMSE. Note that the seasonal performances of the ANN models would have been improved if the models had been trained on a seasonal basis. In this study, however, the ANN models were trained on an annual basis because of the short record, and thus the annual performance indices are more meaningful.

TANK and the ENN were further compared with respect to the performance of the ESP probabilistic forecasts. In other words, ESP was applied with TANK and with the ENN to make 1-month-ahead probabilistic forecasts from 1996 to 2001. For each forecasting month, historical rainfall and evaporation scenarios for the 15-year period from 1981 to 1995 were input into the TANK and ENN rainfall-runoff models to create 15 inflow scenarios. Figure 13 compares the AHSs of TANK and the ENN. In March and August, TANK was superior to the ENN, but the ENN generally outperformed TANK, especially in the dry season (November–February). The overall averages of AHS were 38.8% and 48.5% for TANK and the ENN respectively, which implies that the ENN was a considerable improvement over TANK in terms of ESP probabilistic forecasting. Figure 14 shows the half-Brier scores of the two rainfall-runoff models, which lead to a similar result.
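The scenario loop described above can be sketched in a few lines. This is only an illustration under stated assumptions: `toy_runoff_model` is a hypothetical single linear reservoir standing in for TANK or the ENN, `esp_ensemble` is a name introduced here, and the forcing values are invented rather than the observed 1981–1995 data.

```python
# Hypothetical stand-in for a rainfall-runoff model (NOT TANK or the ENN):
# a single linear reservoir evaluated for one month.
def toy_runoff_model(storage, rain, evap, k=0.3):
    """Return the monthly inflow produced from an initial storage and one month of forcing."""
    storage = max(storage + rain - evap, 0.0)  # add net rainfall, never below zero
    return k * storage                         # release a fixed fraction as inflow


def esp_ensemble(model, initial_storage, scenarios):
    """Run the model once per historical (rain, evap) scenario, always from the same initial state."""
    return [model(initial_storage, rain, evap) for rain, evap in scenarios]


# Invented forcing for one forecast month: one (rain, evap) pair per historical year.
scenarios = [(20.0 + 5.0 * year, 10.0) for year in range(15)]
inflows = esp_ensemble(toy_runoff_model, initial_storage=50.0, scenarios=scenarios)
print(len(inflows), min(inflows), max(inflows))
```

The point of the sketch is that every ensemble member shares the current initial state and differs only in the meteorological scenario, so 15 historical years give 15 inflow scenarios per forecast month.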


Figure 11. (a) Correlation coefficients and (b) RMSEs (CMS) of 10 ENN models. Horizontal axis: model 1 to model 10; curves: calibration (cal) and test sets

Table IV summarizes the contingency table, hit rates, and bias ratios of TANK and the ENN. The hit rate of the ENN was greater than that of TANK, and the bias ratios of the two models are significantly different. Table IV indicates that the medium flows were forecasted less often than they were actually observed, i.e. underforecasted, when TANK was used for ESP. Similarly, Table IV indicates that the high flows were underforecasted while the low flows were overforecasted when the ENN was used for ESP.

CONCLUSIONS

This study developed new rainfall-runoff models that can be used for ensemble streamflow prediction. The new models used two types of ANN, i.e. an SNN and an ENN. Both ANN models used the early stopping method to optimize generalization performance during training. The bagging method was used for the ENN to control the generalization error better than the SNN. The ANN models were applied to making 1-month-ahead probabilistic forecasts of inflows to the Daecheong multipurpose dam in Korea.

The calibrated ANN models were first compared with each other. The results show that the ENN is less sensitive to the input variable selection and the number of hidden nodes than the SNN, and that the ENN, in general, produced smaller RMSEs than the corresponding SNN, which implies that the ENN can reduce the generalization error more efficiently than the SNN. Comparing the SNN and the ENN with TANK, a rainfall-runoff model that has been widely used in Korea, with respect to simulation accuracy, this study found that the new ANN models performed better than TANK for 9 out of 10 test cases. Finally, this study tested TANK and the ENN using several probabilistic forecasting accuracy measures and showed that, for most


Figure 12. (a) Correlation coefficients and (b) RMSEs (CMS) with various numbers of hidden nodes (ENN). Horizontal axis: number of hidden nodes (0 to 20); curves: calibration (cal) and test sets

Table III. Relative bias and R-RMSE of three rainfall-runoff models (the best value for each period is marked with *)

Measure             Model   Year     Spring   Summer   Autumn   Winter
Relative bias (RB)  TANK    0.061    0.145    0.020    0.240    0.279
                    SNN     0.048   −0.062*   0.033    0.048    0.493
                    ENN    −0.017*   0.122   −0.003*   0.005*   0.013*
R-RMSE              TANK    0.393    0.281*   0.277    0.396    0.977
                    SNN     0.346    0.304    0.249    0.339*   0.602
                    ENN     0.319*   0.336    0.221*   0.352    0.294*

of the test months from 1996 to 2001, the skill of the ENN was better than that of TANK. During the dry season in particular, the ENN improved ESP performance considerably relative to TANK. Therefore, this study concludes that the ENN should be substituted for the existing rainfall-runoff model, TANK, in the ESP probabilistic forecasting system for the Daecheong dam inflows in Korea.

ACKNOWLEDGEMENTS

This research was supported by a grant (code 1-6-1) from the Sustainable Water Resources Research Center of the 21st Century Frontier Research Program and was also supported by the Research Institute of Engineering Science (RIES), Seoul National University, Seoul, Korea.


Figure 13. AHS of ESP forecast of TANK and the ENN. Vertical axis: AHS (%); horizontal axis: month (Jan to Dec) and average (Avg.)

Figure 14. Half-Brier score (HBS) of ESP forecast of TANK and the ENN. Vertical axis: HBS; horizontal axis: month

Table IV. Contingency table, hit rate (H), and bias ratios (BL, BM, BH) of TANK and the ENN (f: forecast category; o: observed category)

              TANK                ENN
        oL    oM    oH      oL    oM    oH
fL      16    10     6      23    10     9
fM       6     1     6       2    16     8
fH       3    16     8       0     1     3
H             0.35                0.58
BL            1.28                1.68
BM            0.48                0.96
BH            1.35                0.20

REFERENCES

ASCE Task Committee on Artificial Neural Networks in Hydrology. 2000. Artificial neural networks in hydrology. II. Hydrologic applications. Journal of Hydrologic Engineering 5(2): 124–137.
Birikundavyi S, Labib R, Trung HT, Rousselle J. 2002. Performance of neural networks in daily streamflow forecasting. Journal of Hydrologic Engineering 7(5): 392–398.


Breiman L. 1996. Bagging predictors. Machine Learning 24(2): 123–140.
Brier GW. 1950. Verification of forecasts expressed in terms of probabilities. Monthly Weather Review 78: 1–3.
Campolo M, Andreussi P, Soldati A. 1999a. River flood forecasting with a neural network model. Water Resources Research 35(4): 1191–1197.
Campolo M, Soldati A, Andreussi P. 1999b. Forecasting river flow rate during low-flow periods using neural networks. Water Resources Research 35(11): 3547–3552.
Cannon AJ, Whitfield PH. 2002. Downscaling recent streamflow conditions in British Columbia, Canada using ensemble neural network models. Journal of Hydrology 259: 136–151.
Carney JG, Cunningham P. 2000. Tuning diversity in bagged ensembles. International Journal of Neural Systems 10: 267–279.
Coulibaly P, Anctil F, Bobee B. 2000. Daily reservoir inflow forecasting using temporal neural networks. Journal of Hydrology 230: 244–257.
Coulibaly P, Anctil F, Bobee B. 2001. Multivariate reservoir inflow forecasting using temporal neural networks. Journal of Hydrologic Engineering 6(5): 367–376.
Day GN. 1985. Extended streamflow forecasting using NWSRFS. Journal of Water Resources Planning and Management 111(WR2): 147–170.
Demuth H, Beale M. 1998. Neural Network Toolbox: For Use with MATLAB User’s Guide. The Math Works Inc.
Govindaraju RS, Rao AR. 2000. Artificial Neural Networks in Hydrology. Kluwer: The Netherlands.
Haykin S. 1994. Neural Networks: A Comprehensive Foundation. MacMillan: New York.
Hsu KL, Gupta HV, Sorooshian S. 1995. Artificial neural network modeling of the rainfall-runoff process. Water Resources Research 31(10): 2517–2530.
Jain SK, Das D, Srivastava DK. 1999. Application of ANN for reservoir inflow prediction and operation. Journal of Water Resources Planning and Management 125(5): 263–271.
Jeong DI, Kim Y-O. 2002. Forecasting monthly inflow to Chungju dam using ensemble streamflow prediction. Journal of Korean Society of Civil Engineers 22(3-B): 321–331 (in Korean).
Karunanithi N, Grenney WJ, Whitley D, Bovee K. 1994. Neural networks for river flow prediction. Journal of Computing in Civil Engineering 8(2): 201–220.
Kim G, Barros AP. 2001. Quantitative flood forecasting using multisensor data and neural networks. Journal of Hydrology 246: 45–62.
Kim Y-O, Jeong DI, Kim HS. 2001. Improving water supply outlooks in Korea with ensemble streamflow prediction. Water International 26(4): 563–568.
Kim Y-O, Jeong DI, Ko IH. 2003. Combining rainfall-runoff models for improving ensemble streamflow prediction. Journal of Hydrologic Engineering, in press.
Minns AW, Hall MJ. 1996. Artificial neural networks as rainfall-runoff models. Hydrological Sciences Journal 41(3): 399–419.
Murphy AH, Daan H. 1985. Forecast evaluation. In Probability, Statistics, and Decision Making in the Atmospheric Sciences, Murphy AH, Katz RW (eds). Westview Press: New York; 379–437.
Saad M, Bigras P, Turgeon A, Duquette R. 1996. Fuzzy learning decomposition for the scheduling of hydroelectric power systems. Water Resources Research 32(1): 179–186.
Sajikumar N, Thandaveswara BS. 1999. A non-linear rainfall–runoff model using an artificial neural network. Journal of Hydrology 216: 32–55.
Shamseldin AY. 1997. Application of a neural network technique to rainfall–runoff modelling. Journal of Hydrology 199: 272–294.
Sivakumar B, Jayawardena AW, Fernando TMKG. 2002. River flow forecasting: use of phase-space reconstruction and artificial neural networks approaches. Journal of Hydrology 265: 225–245.
Sugawara M. 1974. Tank Model with Snow Component. National Research Center for Disaster Prevention: Japan.
Tokar AS, Johnson PA. 1999. Rainfall–runoff modeling using artificial neural networks. Journal of Hydrologic Engineering 4(3): 232–239.
Wilks DS. 1995. Statistical Methods in the Atmospheric Sciences: An Introduction. Academic Press: San Diego; 238–344.
Zealand CM, Burn DH, Simonovic SP. 1999. Short term streamflow forecasting using artificial neural networks. Journal of Hydrology 214: 32–48.
Zhang B, Govindaraju RS. 2000. Prediction of watershed runoff using Bayesian concepts and modular neural networks. Water Resources Research 36(3): 753–762.



INTRODUCTION

Introduced in the 1970s, ensemble streamflow prediction (ESP) became a key part of the advanced hydrologic prediction system of the National Weather Service in the USA. ESP inputs historical meteorological scenarios to a rainfall-runoff model to forecast future streamflows using the current soil moisture, river, and reservoir conditions. Therefore, the accuracy of ESP forecasts generally relies primarily on the rainfall-runoff model being used. Jeong and Kim (2002) confirmed that the rainfall-runoff model used for their ESP study did not perform well for winter and spring, i.e. the dry season, and thus suggested that the model should be improved to obtain more accurate probabilistic ESP forecasts. The present study proposes another rainfall-runoff model using artificial neural networks (ANNs), which can be used for ESP. Once this ANN rainfall-runoff model is proven to perform reasonably well, it can be substituted for or combined with (Kim et al., 2003) the existing model to improve the performance of the existing ESP.

ANNs have been widely used for various aspects of hydrology: the ASCE Task Committee on Artificial Neural Networks in Hydrology (2000) and Govindaraju and Rao (2000) reviewed ANN theories and applications in hydrology. Previous studies have demonstrated that ANNs are appropriate for complex nonlinear rainfall-runoff modelling (Hsu et al., 1995; Minns and Hall, 1996; Shamseldin, 1997; Sajikumar and Thandaveswara, 1999; Tokar and Johnson, 1999), streamflow forecasting (Karunanithi et al., 1994; Campolo et al., 1999a,b; Zealand et al., 1999; Zhang and Govindaraju, 2000; Kim and Barros, 2001; Birikundavyi et al.,

Received 3 February 2004 Accepted 22 June 2005


2002; Sivakumar et al., 2002), and reservoir inflow forecasting (Saad et al., 1996; Jain et al., 1999; Coulibaly et al., 2000, 2001).

Models based on ANNs are, in general, simple and reasonably accurate. Hsu et al. (1995) showed that the (nonlinear) ANN model provided a better representation of the rainfall-runoff relationship than the (linear) ARMAX (autoregressive moving average with exogenous inputs) time series model or the conceptual SAC-SMA (Sacramento soil moisture accounting) model. Zealand et al. (1999) compared ANNs with conventional streamflow forecasting approaches to demonstrate their capabilities; their study examined several issues associated with the use of ANNs, including the type and number of input variables and the size of the hidden layers. Campolo et al. (1999b) used ANNs for forecasting the river flow for up to 6 days during low-flow periods. Zhang and Govindaraju (2000) showed the potential of modular neural networks for predicting monthly runoff for three medium-sized watersheds in Kansas, USA. Coulibaly et al. (2000) used the early stopping method, one of the regularization procedures, to train multilayer feedforward neural networks for real-time reservoir inflow forecasting. They showed that the early stopping method can provide better and more reliable generalization performance than the Levenberg–Marquardt backpropagation (LMBP) method alone. In this paper, a single neural network (SNN) trained with the early stopping method was used for our ANN rainfall-runoff model. Birikundavyi et al. (2002) investigated the performance of ANN models for daily streamflow forecasting and showed that ANNs outperformed a deterministic model called PREVIS and the classic autoregressive model coupled with a Kalman filter.

Cannon and Whitfield (2002) introduced the bagging (or bootstrap aggregation) method as an ensemble neural network (ENN) approach in their climate change studies and showed the suitability of ENNs for downscaling. Combining the outputs of several member models can significantly improve generalization performance, because the generalization error of the final predictive model is controlled. In this study, rainfall-runoff models using ENNs with the bagging and early stopping methods were compared with those using SNNs with the early stopping method.

The primary objective of this paper is to improve the accuracy of the present ESP forecasting system for monthly inflows to the Daecheong dam in Korea. To achieve this objective, we attempted to develop new rainfall-runoff models using ANN techniques, which can be substituted for the existing rainfall-runoff model.

THE ANN

The SNN

ANNs are based on the highly interconnected structure of brain cells. This approach is fast and robust in noisy environments, flexible in the range of problems it can solve, and highly adaptive to new environments. Owing to these established advantages, ANNs currently have numerous real-world applications, such as time series prediction, rule-based control, and rainfall-runoff modelling (Jain et al., 1999).

The multilayer feedforward neural network, also referred to as an SNN in this study, consists of a set of sensory units that constitute the input layer, one or more hidden layers of computation nodes, and an output layer of computation nodes. The input signal propagates through the network in a forward direction, layer by layer. These neural networks are commonly referred to as multilayer perceptrons. A typical three-layer feedforward ANN is shown in Figure 1. The mathematical form of a three-layer feedforward ANN is given as

$$O_k = g_2\left[\sum_j w_{kj}\, g_1\left(\sum_i w_{ji} I_i + w_{j0}\right) + w_{k0}\right] \qquad (1)$$

where $I_i$ is the input value to node $i$ of the input layer, $V_j$ is the hidden value at node $j$ of the hidden layer, and $O_k$ is the output at node $k$ of the output layer. An input layer bias term $I_0 = 1$ with bias weights $w_{j0}$ and an output layer bias term $V_0 = 1$ with bias weights $w_{k0}$ are included to permit adjustments of the mean level at each stage. There are two sets of adjustable weights; $w_{ji}$ controls the strength of the connection between


Figure 1. Multilayer feedforward network with one hidden layer: input layer (I_i, bias node I_0), weights w_ji, hidden layer (V_j, bias node V_0), weights w_kj, output layer (O_k)

the input node $i$ and the hidden node $j$, and $w_{kj}$ controls the strength of the connection between the hidden node $j$ and the output node $k$. $g_1$ and $g_2$ are activation functions for the hidden layer and the output layer respectively. The activation function is usually selected to be a continuous and bounded nonlinear transfer function, for which the sigmoid and hyperbolic tangent functions are commonly used (e.g. Haykin, 1994; Hsu et al., 1995; Govindaraju and Rao, 2000).

Multilayer perceptrons have been applied successfully to solve difficult and diverse problems. They are trained under supervision with a highly popular algorithm known as the error back-propagation algorithm (Haykin, 1994). There are many variations of the back-propagation algorithm, including gradient descent and faster algorithms using heuristic or optimization techniques (Demuth and Beale, 1998). In this study we use the Levenberg–Marquardt back-propagation (LMBP) algorithm for ANN training. The LMBP, a second-order nonlinear optimization technique, is usually faster and more reliable than other back-propagation techniques. The LMBP approximates the Hessian matrix as

$$H = J^{T} J \qquad (2)$$

where $J$ is the Jacobian matrix, which contains the first derivatives of the ANN errors with respect to the weights and biases. The LMBP algorithm uses this approximation to the Hessian matrix in the following update:

$$\Delta W = -\left[J^{T} J + \mu I\right]^{-1} J^{T} e \qquad (3)$$

where $e$ is the residual error vector and $\mu$ is a variable small scalar that controls the learning process. When $\mu$ is zero, Equation (3) is equivalent to Newton’s method using the approximate Hessian matrix. When $\mu$ is large, Equation (3) is equivalent to the gradient descent method with a small step size. Newton’s method is faster and more accurate near an error minimum than the gradient descent method; thus, the aim is to shift towards Newton’s method as quickly as possible. In practice, LMBP finds better optima faster than the other usual methods for a variety of problems (Coulibaly et al., 2000).

Early stopping method

An ANN is said to be ‘generalized’ when it produces reasonable outputs from inputs that have not been used during training. When not generalized, an ANN can suffer from either underfitting or overfitting. Whereas an overly complex ANN is likely to fit the noise and lead to overfitting, an insufficiently complex network can fail to detect the regularities in the data set and thus lead to underfitting. Underfitting produces excessive bias in the model outputs, whereas overfitting produces excessive variance (Coulibaly et al., 2000). Among the various regularization procedures that generalize SNNs, three methods are popular: (1) limiting the number of hidden units, (2) the weight decay method, and (3) the early stopping method. In


this study, the early stopping method was selected because it is generally recognized as the most popular method for optimizing the generalization performance of ANNs in practice.

Typically, the available data are divided into a training set and a test set: the performance of the ANN model calibrated on the training set is verified using the test set. The early stopping method, however, requires one more subset in addition to the training and test sets, called the validation set. The typical ANN approach monitors only the training error during the training procedure, but the early stopping method monitors the validation error as well as the training error. The training procedure is terminated when the (root-) mean-square error (RMSE) of the validation set reaches its minimum. If the ANN is trained further, then it begins to overfit, and the validation mean-square error begins increasing from its minimum.

Without the early stopping method, highly complex models typically show high variance and low bias when training is terminated. Since the early stopping method minimizes the mean-square error (which is the bias squared plus the variance), the error variance is decreased at the expense of added bias. As a result, such models become less dependent on the training data and less complex. Therefore, the early stopping method is fast and can be applied successfully to networks in which the number of weights exceeds the sample size. However, there are several difficult issues associated with the early stopping method: (1) How many cases should be assigned to the training and validation sets? (2) How should the data be split into training and validation sets? (3) When does the validation error ‘start to go up’? Furthermore, this method uses data inefficiently because of the additional validation data set.

The ENN and the bagging method

The ENN is a subject of active research. An ensemble in an ANN is a set of independently trained member SNN models whose predictions are combined in some way to obtain a single estimate of the desired output (Figure 2). This allows greater model stability than any SNN regularization procedure. It can also use training data more efficiently than the early stopping method. Bagging (otherwise referred to as bootstrap aggregation) has been widely accepted as one of the most popular methods for producing ensembles. The bagging approach (Breiman, 1996) also divides the available data into a training set and a test set. A training subset is created by resampling with replacement (i.e. using the bootstrap approach) from the training set. The probability that an individual training sample from the training set will not be part of the bootstrap-resampled training subset is $(1 - 1/N)^N \approx 0.368$, where $N$ is the number of training samples in the training set. This means that only approximately 63% of the distinct training samples will be included in a bootstrap training subset. Each member model is trained using a different training subset. The final model

Figure 2. Block diagram of ENNs: the input I(n) feeds member models 1, 2, ..., L, whose outputs O_1(n), O_2(n), ..., O_L(n) enter a combiner that produces the output O(n)


output takes the average of the outputs from all members of the ensemble. By averaging the outputs of member models that typically have low bias and high variance, the variance of the ensemble model can be reduced without increasing the overall ensemble bias (Breiman, 1996; Cannon and Whitfield, 2002).

To describe bagging in more concrete terms, let $O_i(x)$ be the output, $t(x)$ be the target, and $\varepsilon_i = O_i(x) - t(x)$ be the error of a member model $i$. The average sum-of-squares error ($E_i$) for model $O_i(x)$ can be written as

$$E_i = E[\{O_i(x) - t(x)\}^2] = E[\varepsilon_i^2] \qquad (4)$$

where $E[\cdot]$ denotes the expectation. The average of $E_i$, denoted $E_{AV}$, is given by

$$E_{AV} = \frac{1}{L}\sum_{i=1}^{L} E_i = \frac{1}{L}\sum_{i=1}^{L} E[\varepsilon_i^2] \qquad (5)$$

We now introduce a simple form of committee. This form takes the output of the committee to be the average of the outputs of the $L$ networks that comprise the committee. Thus, we write the committee prediction in the form

$$O_{com}(x) = \frac{1}{L}\sum_{i=1}^{L} O_i(x) \qquad (6)$$

The error due to the committee, $E_{com}$, can then be written as

$$\begin{aligned}
E_{com} &= E[\{O_{com}(x) - t(x)\}^2] \\
&= E\left[\left\{\frac{1}{L}\sum_{i=1}^{L} O_i(x) - t(x)\right\}^2\right] \\
&= \frac{1}{L^2} E\left[\left\{\sum_{i=1}^{L} \big(O_i(x) - t(x)\big)\right\}^2\right] \\
&= \frac{1}{L^2} E\left[\left(\sum_{i=1}^{L} \varepsilon_i\right)^2\right] \\
&\le \frac{1}{L^2} E\left[L \sum_{i=1}^{L} \varepsilon_i^2\right] = E_{AV}
\end{aligned} \qquad (7)$$

If we now assume that the errors $\varepsilon_i(x)$ have zero mean and are uncorrelated, so that

$$E[\varepsilon_i] = 0, \qquad E[\varepsilon_i \varepsilon_j] = 0 \quad \text{if } j \ne i \qquad (8)$$

then the sum-of-squares error becomes the error variance, which is further evaluated as

$$E_{com} = \frac{1}{L^2} E\left[\left(\sum_{i=1}^{L} \varepsilon_i\right)^2\right] = \frac{1}{L^2}\sum_{i=1}^{L} E[\varepsilon_i^2] = \frac{1}{L} E_{AV} \qquad (9)$$

This represents the apparently dramatic result that the sum-of-squares error can be reduced by a factor of $L$ simply by averaging the predictions of $L$ networks. In practice, the reduction in error is generally much smaller than Equation (9) would suggest, because the errors $\varepsilon_i(x)$ of different models are typically highly correlated,


and so the second assumption in Equation (8) does not hold. As Breiman (1996) pointed out, improvements in performance tend to level out when more than 25 member models ($L \ge 25$) are added to the ensemble.

In this study, the early stopping method was also applied to the bagged ensemble to optimize generalization performance as follows. First, $L$ bootstrap resamples were generated from the training set. Then, the generalization error was estimated for each network using the early stopping method. For each network in the ensemble, we found the epoch that minimized the generalization error. The corresponding networks were chosen as the optimal set for the ensemble. Carney and Cunningham (2000) first introduced this technique, and it has since been generally accepted and often used because it is fast and easy to implement.
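The bagging procedure with early stopping described above can be illustrated with a small self-contained sketch. It is not the authors' implementation and rests on several assumptions: a cubic polynomial stands in for each SNN member, plain batch gradient descent stands in for LMBP, the samples left out of each bootstrap resample serve as the validation set (the text does not say how the validation set was formed), and all function names and the synthetic data are invented for illustration.

```python
import math
import random


def predict(w, x):
    """Polynomial member model, a stand-in for an SNN: w[0] + w[1]*x + w[2]*x**2 + ..."""
    return sum(wk * x ** k for k, wk in enumerate(w))


def mse(w, data):
    """Mean-square error of one member model on (x, target) pairs."""
    return sum((predict(w, x) - t) ** 2 for x, t in data) / len(data)


def train_early_stopping(train, val, degree=3, lr=0.1, max_epochs=200):
    """Batch gradient descent that returns the weights of the epoch with the lowest validation error."""
    w = [0.0] * (degree + 1)
    best_w, best_val = list(w), mse(w, val)
    for _ in range(max_epochs):
        grad = [0.0] * len(w)
        for x, t in train:
            err = predict(w, x) - t
            for k in range(len(w)):
                grad[k] += 2.0 * err * x ** k / len(train)
        w = [wk - lr * gk for wk, gk in zip(w, grad)]
        val_error = mse(w, val)
        if val_error < best_val:  # remember the best epoch seen so far
            best_val, best_w = val_error, list(w)
    return best_w


def bagged_ensemble(data, n_members=30, seed=0):
    """Train n_members models, each on its own bootstrap resample of the training set."""
    rng = random.Random(seed)
    n = len(data)
    members = []
    for _ in range(n_members):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        drawn = set(idx)
        train = [data[i] for i in idx]
        # Assumption: samples left out of the resample act as the validation set.
        val = [data[i] for i in range(n) if i not in drawn] or train
        members.append(train_early_stopping(train, val))
    return members


def ensemble_predict(members, x):
    """Committee output: the plain average of the member outputs, as in Equation (6)."""
    return sum(predict(w, x) for w in members) / len(members)


# Synthetic data (invented for illustration): a smooth curve plus noise on [0, 1].
rng = random.Random(1)
data = [(i / 39.0, math.sin(3.0 * i / 39.0) + rng.gauss(0.0, 0.1)) for i in range(40)]
members = bagged_ensemble(data)
e_av = sum(mse(w, data) for w in members) / len(members)
e_com = sum((ensemble_predict(members, x) - t) ** 2 for x, t in data) / len(data)
print(f"E_AV = {e_av:.4f}, E_com = {e_com:.4f}")
```

On any evaluation set the committee error computed this way cannot exceed the average member error, which is the inequality in Equation (7); how close it comes to the factor-of-L reduction in Equation (9) depends on how correlated the member errors are.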

THE ESP METHOD

Probabilistic forecasts with ESP

ESP runs a rainfall-runoff model with meteorological inputs to generate an ensemble of possible streamflow hydrographs. Note that the ‘ensemble’ of ESP is independent of the ensemble of an ENN. A generated streamflow ensemble is a function of the initial hydrological states at the forecast time in the rainfall-runoff model, and thus the technique is sometimes called a conditional Monte Carlo simulation approach (Day, 1985). A best-fitting probability density function (referred to as the ESP p.d.f. hereafter) can then be fitted to the generated streamflow ensemble to describe the likelihood of an event occurring during the time period $t$ being forecasted. In this study, the whole range of streamflow $Q_t$ was divided into three flow categories: low, medium, and high. A probability was then assigned to each category from the ESP p.d.f. The high, medium, and low flow probabilities (denoted $\mathrm{Pr}_H$, $\mathrm{Pr}_M$, and $\mathrm{Pr}_L$) were computed using

$$\begin{aligned}
\mathrm{Pr}_{H,t} &= 1 - \Pr[Q_t \le q_{U,t}] \\
\mathrm{Pr}_{M,t} &= \Pr[Q_t > q_{L,t}] - \Pr[Q_t > q_{U,t}] \\
\mathrm{Pr}_{L,t} &= \Pr[Q_t \le q_{L,t}]
\end{aligned} \qquad (10)$$

respectively, where $q_L$ and $q_U$ are respectively the lower and upper limits of the medium flow category. In this study, the lower and upper limits are defined as the 33.3% and 66.7% cumulative quantiles respectively of a p.d.f. fitted to the historical streamflow data.

Accuracy measures for probabilistic forecasts

Since a probabilistic forecast specifies a probability distribution of the predictand, deterministic forecast accuracy measures such as bias and RMSE are not suitable for probabilistic forecasts. In this study, the half-Brier score, average hit score (AHS), hit rate, and bias ratio were used as accuracy measures. For a forecast that assigns probabilities to events in $r$ categories, the skill of the overall forecast for $r$ categories and $N$ observations is measured with the half-Brier score (Brier, 1950), according to

$$B = \frac{1}{N}\sum_{j=1}^{r}\sum_{i=1}^{N}\left(\delta_{ij} - \phi_{ij}\right)^2 \qquad (11)$$

where $\phi_{ij}$ is the forecast probability that the event will occur in category $j$ and $\delta_{ij} = 1$ or 0, depending respectively on whether or not the event occurred in category $j$. The half-Brier score is the mean-squared error of the categorical probability forecast; it has a value of zero when the forecast is perfect, i.e. when a probability of one is assigned to the observed category. The worst possible forecast has a half-Brier score of two, which implies that the actual event always occurs in a category that was assigned a probability of zero.
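As a rough illustration of Equations (10) and (11), the sketch below derives the three category probabilities from an ensemble and scores forecasts with the half-Brier score. It makes one simplifying assumption that differs from the text: the empirical distribution of the ensemble members is used in place of a fitted ESP p.d.f. The function names, the ensemble values, and the category limits are invented for the example.

```python
CATEGORIES = ("low", "medium", "high")


def category_probabilities(ensemble, q_low, q_up):
    """Low/medium/high probabilities from an ensemble, following the form of Equation (10).

    Assumption: the empirical distribution of the ensemble replaces the fitted ESP p.d.f.
    """
    n = len(ensemble)
    p_below_low = sum(q <= q_low for q in ensemble) / n  # Pr[Q <= qL]
    p_below_up = sum(q <= q_up for q in ensemble) / n    # Pr[Q <= qU]
    return {
        "low": p_below_low,
        "medium": p_below_up - p_below_low,
        "high": 1.0 - p_below_up,
    }


def half_brier(forecasts, observed):
    """Equation (11): squared error summed over categories, averaged over forecast occasions."""
    total = 0.0
    for probs, obs in zip(forecasts, observed):
        for cat in CATEGORIES:
            delta = 1.0 if cat == obs else 0.0
            total += (delta - probs[cat]) ** 2
    return total / len(forecasts)


# Invented 15-member inflow ensemble and category limits, for illustration only.
ensemble = [12.0, 15.0, 18.0, 20.0, 22.0, 25.0, 27.0, 30.0,
            31.0, 35.0, 40.0, 44.0, 50.0, 58.0, 70.0]
probs = category_probabilities(ensemble, q_low=21.0, q_up=33.0)
print(probs)

# A climatology forecast assigns 1/3 to every category.
climatology = {cat: 1.0 / 3.0 for cat in CATEGORIES}
print(half_brier([climatology], ["low"]))
```

A climatology forecast always scores (2/3)^2 + 2(1/3)^2 = 2/3 under Equation (11), which gives a convenient reference level between the perfect score of zero and the worst score of two.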


The effectiveness of the probabilistic forecast can also be measured using the AHS (Kim et al., 2001). If flow category $i$ occurs in month $m$, a score of $S_m = 100\,\mathrm{Pr}_i$ is assigned, where $i$ is low, medium, or high, and $\mathrm{Pr}_i$ is the probability assigned to the category in which the flow occurs. For example, if a low flow occurs in month $m = 1$ in Figure 3, then the score will be 50, since $\mathrm{Pr}_L = 0.5$. This is repeated throughout the entire verification period $T$ in Figure 3. The resulting average score is denoted by $S_{avg}$. If $S_{avg}$ is higher than 33.3, then it can be concluded that the ESP technique performs better than the naive (or climatology) forecast.

Wilks (1995) describes the theoretical basis of the hit rate and bias ratio. If the probabilistic forecast is converted to a categorical forecast, which consists of a flat statement that exactly one of a set of possible events will occur, forecast verification is easy to understand. Categorical verification data are conventionally displayed in an $I \times J$ contingency table of absolute frequencies, or counts, of the $I \times J$ possible combinations of forecast and event pairs. Figure 4 illustrates the contingency table of forecasts and observations for the $I = J = 3$ categorical case. The total number of forecast/event pairs in the data set is $n = a + b + c + d + e + f + g + h + i$. The letters $a$, $e$, and $i$ respectively represent the numbers of occasions when the low, medium, and high events were correctly forecasted. The most direct and intuitive measure of the accuracy of categorical forecasts is the hit rate. This rate is simply the fraction of the $n$ forecasting occasions when the categorical forecast correctly

month = 1

50%*

20%

qL,1

30%

10%

month = 2 20%

70%

qL,2

*

S2 = 10

65% *

ST = 65

qU,2

25% 10% qL,n

…

…

… month = T

S1 = 50

qU,1

qU,n

Average Hit Score:

Savg =

1

T

T

∑ 100 × Pri

i =1

Figure 3. Example of AHS

oL

oM

oH

fL

a

b

c

fM

d

e

f

fH

g

h

i

Figure 4. Contingency table of forecasts and observations for the I D J D 3 categorical case

Copyright 2005 John Wiley & Sons, Ltd.

Hydrol. Process. 19, 3819– 3835 (2005)

3826

D.-I. JEONG AND Y.-O. KIM

anticipates the subsequent event or nonevent. The hit rate is given by aCeCi 12 n The worst possible hit rate is zero and the best possible hit rate is one. The bias ratio (or comparison of the average forecast with the average observation), of categorical forecasts, is usually represented as a ratio (Wilks, 1995), which is given by HD

aCbCc aCdCg dCeCf BM D bCeCh gChCi BH D cCfCi BL D

13

Unbiased forecasts exhibit B D 1, indicating that the events were forecasted the same number of times as they were observed. However, this result says nothing about the correspondence between the forecasted and observed values of the events. A bias ratio greater than one indicates that an event was forecasted more often than observed, i.e. overforecasted, whereas a bias ratio of less than one indicates that an event was underforecasted.
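The three measures above (average hit score, hit rate, and bias ratios) can be sketched in a few lines of Python. The function names are illustrative assumptions. The contingency counts are those listed for TANK in Table IV, with rows as forecast categories and columns as observed categories (low, medium, high), and the three probabilities passed to the average hit score are the example months shown in Figure 3.

```python
def average_hit_score(probs_of_observed):
    """Mean of 100 * Pr_i, where Pr_i was assigned to the category that occurred."""
    return 100.0 * sum(probs_of_observed) / len(probs_of_observed)


def hit_rate(table):
    """Fraction of forecasts on the diagonal of the contingency table."""
    n = sum(sum(row) for row in table)
    return sum(table[k][k] for k in range(len(table))) / n


def bias_ratios(table):
    """Forecast count divided by observed count, one ratio per category."""
    k = len(table)
    forecast_totals = [sum(row) for row in table]                               # row sums
    observed_totals = [sum(table[r][c] for r in range(k)) for c in range(k)]   # column sums
    return [f / o for f, o in zip(forecast_totals, observed_totals)]


# Rows: forecast low/medium/high; columns: observed low/medium/high (TANK, Table IV)
tank = [[16, 10, 6],
        [6, 1, 6],
        [3, 16, 8]]

ahs_example = average_hit_score([0.50, 0.10, 0.65])  # months 1, 2 and T of Figure 3
tank_hit_rate = hit_rate(tank)
tank_bias = bias_ratios(tank)
```

With these counts the sketch gives a hit rate of 25/72 (about 0.35) and bias ratios of about 1.28, 0.48, and 1.35, which agree with the values reported for TANK in Table IV.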

APPLICATION

Study basin

Completed in 1981, the Daecheong dam is located approximately 150 km upstream from the mouth of the Geum river basin in Korea. The Geum River flows in a westerly direction for 401 km, draining a 9810 km² area of Korea (Figure 5). As one of 15 multipurpose dams in Korea, the Daecheong dam is operated mainly for flood control, water supply, and energy generation. Observed inflow data at the Daecheong dam have been available since 1981. Like other hydrologic time series, the Daecheong monthly inflow shows strong seasonality (Figure 6). Because of the monsoon climate, two-thirds of the annual flow occurs during the 3 month flood season from July to September, with only a third of the annual streamflow available during the remaining 9 months.

The hydrometeorological input ensemble of ESP for the Daecheong monthly inflow consisted of observed precipitation and evaporation data for 15 years from 1981 to 1995. Therefore, an ensemble of 15 streamflow scenarios was available for each forecasting month of the test period from 1996 to 2001.

TANK model

The existing conceptual rainfall-runoff model for the Daecheong dam basin is called TANK. Having three tanks, TANK simulates the net stream discharge as the sum of the discharges from the side orifices of the tanks (Sugawara, 1974). Figure 7 shows the structure and 12 parameters of TANK. The parameters of TANK can be classified into three types: runoff coefficients (A11, A12, A2, and A3), infiltration coefficients (B1 and B2), and storage parameters, such as the heights of the side outlets of each tank (H11, H12, and H2) and the initial storage heights (SH1, SH2, and SH3). In this model, the shuffled complex evolution method was used to estimate the required parameters. TANK has been applied to the entire Geum river basin, but its simulation capability was not satisfactory, especially during the dry season from October to May (Jeong and Kim, 2002). This shortcoming is demonstrated in Figure 8, which shows the monthly absolute relative bias (ARB) and relative RMSE (RRMSE) of the TANK model simulated from 1981 to 2001.
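To make the roles of the 12 parameters concrete, here is a rough single-time-step sketch of a three-tank model in Python. It follows only what the text states (side-outlet discharges are summed to give runoff, and infiltration passes water to the next tank). The order of operations, the treatment of evaporation, and all parameter values are assumptions made for illustration, not the calibrated Daecheong model.

```python
def tank_step(storages, precip, evap, p):
    """Advance three tank storages by one time step; return (new_storages, runoff)."""
    s1, s2, s3 = storages

    # Top tank: add precipitation, remove evaporation (storage cannot go negative)
    s1 = max(s1 + precip - evap, 0.0)
    q1 = p["A11"] * max(s1 - p["H11"], 0.0) + p["A12"] * max(s1 - p["H12"], 0.0)
    inf1 = p["B1"] * s1          # infiltration to the second tank
    s1 -= q1 + inf1

    # Second tank: one side outlet above height H2, plus infiltration downward
    s2 += inf1
    q2 = p["A2"] * max(s2 - p["H2"], 0.0)
    inf2 = p["B2"] * s2
    s2 -= q2 + inf2

    # Third tank: side outlet with no threshold height
    s3 += inf2
    q3 = p["A3"] * s3
    s3 -= q3

    return (s1, s2, s3), q1 + q2 + q3


# Hypothetical parameter values, chosen only to exercise the function
params = {"A11": 0.2, "A12": 0.1, "A2": 0.1, "A3": 0.02,
          "B1": 0.2, "B2": 0.05, "H11": 10.0, "H12": 30.0, "H2": 15.0}
initial = (20.0, 30.0, 50.0)     # plays the role of SH1, SH2, SH3
new_storages, runoff = tank_step(initial, precip=40.0, evap=10.0, p=params)
```

In this example the water balance closes: the initial storage plus precipitation minus evaporation equals the final storage plus runoff.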


Figure 5. Location map of the Daecheong dam in Korea

Figure 6. Box plot of the Daecheong dam average monthly inflows (x-axis: month, January to December; y-axis: inflow in CMS, 0 to 800)

Rainfall-runoff model using ANN techniques

This study developed two rainfall-runoff models for the proposed ESP system in Korea: one is an SNN with the early stopping method and the other is an ENN with the bagging method. Input variables used for these rainfall-runoff models were the current and previous months' precipitation ($R_t, R_{t-1}, \ldots$) and evaporation ($E_t, E_{t-1}, \ldots$), as well as the previous months' inflows ($I_{t-1}, I_{t-2}, \ldots$). Using the current month's inflow $I_t$ as the target variable, the input–output structure was given by

$$\hat{I}_t = f(R_t, R_{t-1}, \ldots, E_t, E_{t-1}, \ldots, I_{t-1}, I_{t-2}, \ldots) \qquad (14)$$
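The input–output structure of Equation (14) amounts to building one row of lagged predictors for every target month. A minimal Python sketch follows. The function name and the toy series are assumptions; the default lags correspond to the structure listed as model 8 in Table II (three precipitation terms, three evaporation terms, and one lagged inflow).

```python
def build_samples(R, E, I, r_lags=(0, 1, 2), e_lags=(0, 1, 2), i_lags=(1,)):
    """Return (X, y): lagged inputs and the matching target inflow I_t."""
    max_lag = max(r_lags + e_lags + i_lags)
    X, y = [], []
    for t in range(max_lag, len(I)):      # skip months without a full lag history
        row = ([R[t - k] for k in r_lags]
               + [E[t - k] for k in e_lags]
               + [I[t - k] for k in i_lags])
        X.append(row)
        y.append(I[t])
    return X, y


# Toy monthly series (arbitrary numbers, not observed data)
R = [10, 20, 30, 40]
E = [1, 2, 3, 4]
I = [5, 6, 7, 8]
X, y = build_samples(R, E, I)
```

Each row of `X` is ordered as `[R_t, R_{t-1}, R_{t-2}, E_t, E_{t-1}, E_{t-2}, I_{t-1}]`, so other structures from Table II can be formed by changing the lag tuples.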


Figure 7. Structure and parameters of TANK (diagram labels: precipitation, evapotranspiration, and runoff; parameters A11, A12, H11, H12, B1, SH1, A2, H2, B2, SH2, A3, and SH3)

Figure 8. Monthly absolute relative bias (ARB) and R-RMSE of TANK (1981–2001) (x-axis: month, January to December; y-axis: error measures, 0.000 to 1.000)

The observed precipitation, evaporation, and inflow data from 1981 to 1995 were used for training the ANN rainfall-runoff models. Data from the most recent 6 years, 1996 to 2001, were used for testing the performance of the trained models. The ANN models were then compared with TANK, a well-known conceptual rainfall-runoff model in Korea. In Equation (14), the historical ensembles of rainfall and evaporation were input for $R_t$ and $E_t$ respectively to create the ensemble of inflows $I_t$. The initial condition of the ANN rainfall-runoff model could be set by using the lagged variables such as $R_{t-1}$, $E_{t-1}$, $I_{t-1}$, etc.
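The way an ensemble is generated from Equation (14) can be sketched as follows: the lagged variables stay fixed at their current values, each historical year supplies one $(R_t, E_t)$ pair, and the resulting inflows are counted into low, medium, and high categories. The toy model, its coefficients, the thresholds `q_low` and `q_up`, and the use of five scenarios rather than the 15 used in the study are all assumptions made for illustration.

```python
def esp_forecast(model, hist_precip, hist_evap, lagged, q_low, q_up):
    """Run the model once per historical scenario and return category probabilities."""
    ensemble = [model(r, e, lagged) for r, e in zip(hist_precip, hist_evap)]
    n = len(ensemble)
    n_low = sum(1 for q in ensemble if q < q_low)
    n_high = sum(1 for q in ensemble if q > q_up)
    probs = {"low": n_low / n,
             "medium": (n - n_low - n_high) / n,
             "high": n_high / n}
    return ensemble, probs


def toy_model(r_t, e_t, lagged):
    """Hypothetical stand-in for a trained rainfall-runoff model."""
    r_prev, e_prev, i_prev = lagged
    return max(0.0, 0.6 * r_t - 0.4 * e_t + 0.2 * r_prev - 0.1 * e_prev + 0.3 * i_prev)


hist_precip = [10, 50, 100, 200, 20]   # one value per historical year
hist_evap = [40, 40, 40, 40, 40]
ensemble, probs = esp_forecast(toy_model, hist_precip, hist_evap,
                               lagged=(30, 40, 20), q_low=10, q_up=60)
```

Here two of the five scenarios fall below `q_low` and one exceeds `q_up`, so the category probabilities are 0.4, 0.4, and 0.2 for low, medium, and high flows.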


An SNN rainfall-runoff model was developed using a multilayer feedforward ANN with one hidden layer. The input and target data were normalized in the range from zero to one because a sigmoid function was employed as the transfer function. From the input layer to the hidden layer, the log sigmoid function was used, as is common in hydrologic ANN models. From the hidden layer to the output layer, a linear function was employed as the transfer function because the linear function is known to be robust for a continuous output variable, such as the monthly inflow in this study. For the learning algorithm and the regularization procedure, this study used the LMBP algorithm and the early stopping method respectively.

An important issue in ANN modelling is the choice of the input variables and the determination of the number of hidden nodes, but there is no general theory yet to solve this issue, and it is rather problem dependent. In this study, a trial-and-error procedure selected the final structure of the SNN model, i.e. the input variables and the number of hidden nodes. To select the input variables of the SNN, correlation coefficients between the target variable $I_t$ and the candidate input variables ($R_t, \ldots, R_{t-4}$), ($E_t, \ldots, E_{t-4}$), and ($I_{t-1}, \ldots, I_{t-4}$) were used as the initial criteria (Table I). Table II shows 10 combinations of the input variables tested in this study. Since there may exist nonlinear relationships that cannot be measured with the correlation coefficient, Table II includes some input variables that showed weak correlations in Table I. For each test model in Table II, the number of hidden nodes was initially set to half of the number of input variables. Figure 9 compares the 10 test models in terms of the correlation coefficient and RMSE, which were calculated for the training, validation, and test sets. Since model 8 gave the largest correlation coefficient and the smallest RMSE in most cases, this study selected model 8 for the SNN.

Note that, for each structure characterized by the input variables and the number of hidden nodes, this study drew 20 different random sets of initial weights ($w_{ji}$ and $w_{kj}$ in Figure 1), trained the corresponding ANNs, and averaged the 20 resulting values of the correlation coefficient and RMSE. Therefore, each value shown in Figure 9 (and in Figures 10–12) is an average of the 20 random sample results.

Table I. Correlation coefficients between the target variable and the candidate input variables

    Lag months   Inflow I   Precipitation R   Evaporation E
    t            1.000      0.929             0.223
    t−1          0.408      0.503             0.350
    t−2          0.164      0.190             0.428
    t−3          0.090      0.058             0.316
    t−4          0.157      0.228             0.273
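The screening step behind Table I, correlating the target with each candidate series at several lags, can be sketched in plain Python. The function names and the toy series are assumptions; in the toy data the inflow is simply the precipitation delayed by one month, so the lag-1 correlation is perfect.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def lagged_correlations(target, candidate, max_lag):
    """Correlation between target[t] and candidate[t - k] for k = 0..max_lag."""
    n = len(target)
    return {k: pearson(target[k:], candidate[:n - k]) for k in range(max_lag + 1)}


precip = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
inflow = [0] + precip[:-1]            # toy inflow: precipitation delayed by one month
corr = lagged_correlations(inflow, precip, max_lag=2)
```

As the text notes, a linear correlation of this kind cannot reveal nonlinear dependence, which is why it serves only as an initial criterion.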

Table II. Test cases for the input variables selection

    Model      Input–output structure
    Model 1    $\hat{I}_t = f(R_t, \ldots, R_{t-4}, E_t, \ldots, E_{t-4}, I_{t-1}, \ldots, I_{t-4})$
    Model 2    $\hat{I}_t = f(R_t, \ldots, R_{t-4}, E_t, \ldots, E_{t-4}, I_{t-1}, \ldots, I_{t-3})$
    Model 3    $\hat{I}_t = f(R_t, \ldots, R_{t-3}, E_t, \ldots, E_{t-4}, I_{t-1}, \ldots, I_{t-3})$
    Model 4    $\hat{I}_t = f(R_t, \ldots, R_{t-3}, E_t, \ldots, E_{t-3}, I_{t-1}, \ldots, I_{t-3})$
    Model 5    $\hat{I}_t = f(R_t, \ldots, R_{t-3}, E_t, \ldots, E_{t-3}, I_{t-1}, I_{t-2})$
    Model 6    $\hat{I}_t = f(R_t, R_{t-1}, R_{t-2}, E_t, \ldots, E_{t-3}, I_{t-1}, I_{t-2})$
    Model 7    $\hat{I}_t = f(R_t, R_{t-1}, R_{t-2}, E_t, E_{t-1}, E_{t-2}, I_{t-1}, I_{t-2})$
    Model 8    $\hat{I}_t = f(R_t, R_{t-1}, R_{t-2}, E_t, E_{t-1}, E_{t-2}, I_{t-1})$
    Model 9    $\hat{I}_t = f(R_t, R_{t-1}, E_t, E_{t-1}, E_{t-2}, I_{t-1})$
    Model 10   $\hat{I}_t = f(R_t, R_{t-1}, E_t, E_{t-1}, I_{t-1})$


Figure 9. Correlation coefficients and RMSEs of 10 SNN models: (a) correlation coefficient; (b) RMSE (CMS). Each panel shows the training, validation, and test sets for models 1 to 10
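The SNN set-up described above (one hidden layer with a log sigmoid transfer function, a linear output node, inputs scaled to the range zero to one, and early stopping on a validation set) can be illustrated with a small pure-Python network. One substitution is made for brevity: the weights are updated with plain stochastic gradient descent rather than the LMBP algorithm used in the study. The synthetic data, learning rate, patience, and network size are likewise assumptions.

```python
import copy
import math
import random


def logsig(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_network(n_in, n_hidden, rng):
    """Small random weights; the last entry of each weight list is the bias."""
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return w_hidden, w_out


def forward(net, x):
    w_hidden, w_out = net
    h = [logsig(sum(w * xi for w, xi in zip(row[:-1], x)) + row[-1]) for row in w_hidden]
    y = sum(w * hj for w, hj in zip(w_out[:-1], h)) + w_out[-1]   # linear output node
    return h, y


def mse(net, data):
    return sum((forward(net, x)[1] - t) ** 2 for x, t in data) / len(data)


def sgd_epoch(net, data, lr):
    """One pass of stochastic gradient descent (stand-in for LMBP)."""
    w_hidden, w_out = net
    for x, t in data:
        h, y = forward(net, x)
        err = y - t
        # Hidden-layer deltas use the output weights before they are updated
        deltas = [err * w_out[j] * h[j] * (1.0 - h[j]) for j in range(len(h))]
        for j, hj in enumerate(h):
            w_out[j] -= lr * err * hj
        w_out[-1] -= lr * err
        for j, row in enumerate(w_hidden):
            for i, xi in enumerate(x):
                row[i] -= lr * deltas[j] * xi
            row[-1] -= lr * deltas[j]


def train_early_stopping(net, train, val, lr=0.1, max_epochs=300, patience=20):
    """Keep the weights with the lowest validation error; stop after `patience` bad epochs."""
    best_err, best_net, wait = mse(net, val), copy.deepcopy(net), 0
    for _ in range(max_epochs):
        sgd_epoch(net, train, lr)
        err = mse(net, val)
        if err < best_err:
            best_err, best_net, wait = err, copy.deepcopy(net), 0
        else:
            wait += 1
            if wait >= patience:
                break
    return best_net, best_err


rng = random.Random(0)


def make_data(n):
    """Synthetic samples with inputs already scaled to [0, 1]."""
    samples = []
    for _ in range(n):
        x = [rng.random(), rng.random()]
        samples.append((x, 0.6 * x[0] + 0.3 * x[1] ** 2))
    return samples


train, val = make_data(60), make_data(30)
net = init_network(n_in=2, n_hidden=2, rng=rng)
initial_val_error = mse(net, val)
best_net, best_val_error = train_early_stopping(net, train, val)
```

Repeating the last four lines with different seeds and averaging the resulting errors would mimic the averaging over 20 random weight initializations described in the text.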

The best number of hidden nodes was then searched within the model-8 structure. Two performance indices were observed with 2, 3, ..., 20 nodes (see Figure 10). The best results for the validation and test sets were achieved using four hidden nodes.

The ENN is an aggregate of multiple member models. In this study, each member model of the ENN was developed using the same procedure as the SNN modelling: the LMBP algorithm, the log sigmoid and linear transfer functions, and the early stopping method were also employed for the ENN. Thirty member models were developed and then combined for the ENN, because Breiman (1996) recommended that at least 25 member models should be combined to reduce the generalization error. Using the comparison procedure shown in Figures 11 and 12, model 5 (in Table II) with 10 hidden nodes was finally selected as the best ENN model. Comparing the SNN with the ENN, the test performance of the ENN model in Figure 11 is less sensitive to the input variable selection than that of the SNN in Figure 9. The ENN in Figure 12 is also less sensitive to the number of hidden nodes than the SNN is in Figure 10. Furthermore, the test performance in Figures 9–12 shows that the ENN, in general, produced smaller RMSEs than the corresponding SNN, which implies that the ENN can reduce the generalization error more efficiently than the SNN.

Figure 10. Correlation coefficients and RMSEs with various numbers of hidden nodes (SNN): (a) correlation coefficient; (b) RMSE (CMS). Each panel shows the training, validation, and test sets for 0 to 20 hidden nodes

Comparison between TANK, SNN, and ENN

The ANN models developed in this study were compared with the existing TANK rainfall-runoff model for the Daecheong basin. Table III shows the relative bias (RB) and R-RMSE of the three rainfall-runoff models (TANK, SNN, and ENN) on seasonal and annual scales. The best results for each period are highlighted in bold. Table III reports some promising results. First, both the SNN and the ENN always performed better than TANK, except in the one case of RMSE in spring. In particular, the ANN models performed very well in reducing the bias of the existing model, TANK. Of special note is that the ENN on an annual basis was almost unbiased (0.006). Second, overall the ENN was superior to the SNN. In winter, when TANK performed worst, the ENN performed considerably better than the SNN in bias and RMSE. Note that the seasonal performances of the ANN models would have been improved if the models had been trained on a seasonal basis. In this study, however, the ANN models were trained on an annual basis because of the short record, and thus the annual performance indices are more meaningful.

TANK and the ENN were further compared with respect to the performance of the ESP probabilistic forecasts. In other words, using TANK and the ENN, ESP was applied to making 1-month ahead probabilistic forecasts from 1996 to 2001. For each forecasting month, historical rainfall and evaporation scenarios for the 15 year period from 1981 to 1995 were input into the TANK and ENN rainfall-runoff models to create 15 inflow scenarios. Figure 13 compares the AHSs of TANK and the ENN. In March and August, TANK was superior to the ENN, but the ENN generally outperformed TANK, especially in the dry season (November–February). The overall averages of the AHS were 38.8% and 48.5% for TANK and the ENN respectively, which implies that the ENN was a considerable improvement over TANK in terms of ESP probabilistic forecasting. Figure 14 shows the half-Brier scores of the two rainfall-runoff models, which report a similar result.


Figure 11. (a) Correlation coefficients and (b) RMSEs (CMS) of 10 ENN models, shown for the calibration and test sets for models 1 to 10

Table IV summarizes the contingency table, hit rates, and bias ratios of TANK and the ENN. The hit rate of the ENN was greater than that of TANK, and the bias ratios of the two models are significantly different. Table IV indicates that the medium flows were forecasted less often than they were actually observed, i.e. underforecasted, when TANK was used for ESP. Similarly, Table IV indicates that the high flows were underforecasted while the low flows were overforecasted when the ENN was used for ESP.
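For reference, the bagging step that turns member models into an ENN (bootstrap resampling of the training data, one member trained per resample, outputs averaged) can be sketched as below. To keep the example short and self-contained, each member is a least-squares straight line rather than a neural network; that substitution, the function names, and the toy data are all assumptions. Only the count of 30 members is taken from the text.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Least-squares line; stands in for training one member network."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def bagged_ensemble(xs, ys, n_members=30, seed=0, fit=fit_line):
    """Train each member on a bootstrap resample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    members = []
    for _ in range(n_members):
        idx = [rng.randrange(n) for _ in range(n)]
        members.append(fit([xs[i] for i in idx], [ys[i] for i in idx]))
    return members


def ensemble_predict(members, x):
    """Ensemble output is the plain average of the member outputs."""
    return mean(m(x) for m in members)


xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 for x in xs]      # noise-free toy data on a straight line
members = bagged_ensemble(xs, ys)
prediction = ensemble_predict(members, 3.0)
```

Because each member sees a slightly different resample, averaging tends to damp the variability of any single member, which is the behaviour the text attributes to the ENN.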

Figure 12. (a) Correlation coefficients and (b) RMSEs (CMS) with various numbers of hidden nodes (ENN), shown for the calibration and test sets for 0 to 20 hidden nodes

Table III. Relative bias and R-RMSE of three rainfall-runoff models

    Measure              Model   Year      Spring    Summer    Autumn    Winter
    Relative bias (RB)   TANK    0.061     0.145     0.020     0.240     0.279
                         SNN     0.048     −0.062*   0.033     0.048     0.493
                         ENN     −0.017*   0.122     −0.003*   0.005*    0.013*
    R-RMSE               TANK    0.393     0.281*    0.277     0.396     0.977
                         SNN     0.346     0.304     0.249     0.339*    0.602
                         ENN     0.319*    0.336     0.221*    0.352     0.294*

    * Best result for each period (shown in bold in the typeset table)

CONCLUSIONS

This study developed new rainfall-runoff models that can be used for ensemble streamflow prediction. The new models used two types of ANN, i.e. an SNN and an ENN. Both ANN models used the early stopping method to optimize generalization performance during training. The bagging method was used in this study for the ENN to control the generalization error better than the SNN. The ANN models were applied to making 1-month ahead probabilistic forecasts for inflows to the Daecheong multipurpose dam in Korea.

The calibrated ANN models were compared with each other first. The results show that the ENN is less sensitive to the input variable selection and the number of hidden nodes than the SNN is, and the ENN, in general, produced smaller RMSEs than the corresponding SNN, which implies that the ENN can reduce the generalization error more efficiently than the SNN can. Comparing the SNN and the ENN with TANK, a rainfall-runoff model that has been widely used in Korea, with respect to their simulation accuracy, this study found that the new ANN models performed better than TANK for 9 out of 10 test cases. Finally, this study tested TANK and the ENN using some probabilistic forecasting accuracy measures and showed that, for most of the test months from 1996 to 2001, the skills of the ENN were better than those of TANK. During the dry season in particular, the ESP performance of the ENN was considerably better than that of TANK. Therefore, this study concludes that an ENN should be substituted for the existing rainfall-runoff model, TANK, for the ESP probabilistic forecasting system for the Daecheong dam inflows in Korea.

ACKNOWLEDGEMENTS

This research was supported by a grant (code 1-6-1) from the Sustainable Water Resources Research Center of 21st Century Frontier Research Program and was also supported by the Research Institute of Engineering Science (RIES), Seoul National University, Seoul, Korea.


Figure 13. AHS of ESP forecast of TANK and the ENN (x-axis: month, January to December, plus the average; y-axis: AHS in %, with ticks at 0.0, 33.3, and 66.7)

Figure 14. Half-Brier score (HBS) of ESP forecast of TANK and the ENN (x-axis: month, January to December, plus the average; y-axis: HBS, with ticks at 0.000 and 0.667)

Table IV. Contingency table, hit rate, and bias ratios of the TANK and the ENN model

                  TANK                  ENN
            oL    oM    oH        oL    oM    oH
    fL      16    10     6        23    10     9
    fM       6     1     6         2    16     8
    fH       3    16     8         0     1     3
    H            0.35                  0.58
    BL           1.28                  1.68
    BM           0.48                  0.96
    BH           1.35                  0.20

REFERENCES

ASCE Task Committee on Artificial Neural Networks in Hydrology. 2000. Artificial neural networks in hydrology. II. Hydrologic applications. Journal of Hydrologic Engineering 5(2): 124–137.
Birikundavyi S, Labib R, Trung HT, Rousselle J. 2002. Performance of neural networks in daily streamflow forecasting. Journal of Hydrologic Engineering 7(5): 392–398.
Breiman L. 1996. Bagging predictors. Machine Learning 24(2): 123–140.
Brier GW. 1950. Verification of forecasts expressed in terms of probabilities. Monthly Weather Review 78: 1–3.
Campolo M, Andreussi P, Soldati A. 1999a. River flood forecasting with a neural network model. Water Resources Research 35(4): 1191–1197.
Campolo M, Soldati A, Andreussi P. 1999b. Forecasting river flow rate during low-flow periods using neural networks. Water Resources Research 35(11): 3547–3552.
Cannon AJ, Whitfield PH. 2002. Downscaling recent streamflow conditions in British Columbia, Canada using ensemble neural network models. Journal of Hydrology 259: 136–151.
Carney JG, Cunningham P. 2000. Tuning diversity in bagged ensembles. International Journal of Neural Systems 10: 267–279.
Coulibaly P, Anctil F, Bobee B. 2000. Daily reservoir inflow forecasting using temporal neural networks. Journal of Hydrology 230: 244–257.
Coulibaly P, Anctil F, Bobee B. 2001. Multivariate reservoir inflow forecasting using temporal neural networks. Journal of Hydrologic Engineering 6(5): 367–376.
Day GN. 1985. Extended streamflow forecasting using NWSRFS. Journal of Water Resources Planning and Management 111(WR2): 147–170.
Demuth H, Beale M. 1998. Neural Network Toolbox: For Use with MATLAB User's Guide. The Math Works Inc.
Govindaraju RS, Rao AR. 2000. Artificial Neural Networks in Hydrology. Kluwer: The Netherlands.
Haykin S. 1994. Neural Networks: A Comprehensive Foundation. MacMillan: New York.
Hsu KL, Gupta HV, Sorooshian S. 1995. Artificial neural network modeling of the rainfall-runoff process. Water Resources Research 31(10): 2517–2530.
Jain SK, Das D, Srivastava DK. 1999. Application of ANN for reservoir inflow prediction and operation. Journal of Water Resources Planning and Management 125(5): 263–271.
Jeong DI, Kim Y-O. 2002. Forecasting monthly inflow to Chungju dam using ensemble streamflow prediction. Journal of Korean Society of Civil Engineers 22(3-B): 321–331 (in Korean).
Karunanithi N, Grenney WJ, Whitley D, Bovee K. 1994. Neural networks for river flow prediction. Journal of Computing in Civil Engineering 8(2): 201–220.
Kim G, Barros AP. 2001. Quantitative flood forecasting using multisensor data and neural networks. Journal of Hydrology 246: 45–62.
Kim Y-O, Jeong DI, Kim HS. 2001. Improving water supply outlooks in Korea with ensemble streamflow prediction. Water International 26(4): 563–568.
Kim Y-O, Jeong DI, Ko IH. 2003. Combining rainfall-runoff models for improving ensemble streamflow prediction. Journal of Hydrologic Engineering, in press.
Minns AW, Hall MJ. 1996. Artificial neural network as rainfall-runoff models. Hydrological Science Journal 41(3): 399–419.
Murphy AH, Daan H. 1985. Forecast evaluation. In Probability, Statistics, and Decision Making in the Atmospheric Sciences. Murphy AH, Katz RW (eds). Westview Press: New York; 379–437.
Saad M, Bigras P, Turgeon A, Duquette R. 1996. Fuzzy learning decomposition for the scheduling of hydroelectric power systems. Water Resources Research 32(1): 179–186.
Sajikumar N, Thandaveswara BS. 1999. A non-linear rainfall–runoff model using an artificial neural network. Journal of Hydrology 216: 32–55.
Shamseldin AY. 1997. Application of a neural network technique to rainfall–runoff modelling. Journal of Hydrology 199: 272–294.
Sivakumar B, Jayawardena AW, Fernando TMKG. 2002. River flow forecasting: use of phase-space reconstruction and artificial neural networks approaches. Journal of Hydrology 265: 225–245.
Sugawara M. 1974. Tank Model with Snow Component. National Research Center for Disaster Prevention: Japan.
Tokar AS, Johnson PA. 1999. Rainfall–runoff modeling using artificial neural networks. Journal of Hydrologic Engineering 4(3): 232–239.
Wilks DS. 1995. Statistical Method in the Atmospheric Sciences: An Introduction. Academic Press: San Diego; 238–344.
Zealand CM, Burn DH, Simonovic SP. 1999. Short term streamflow forecasting using artificial neural networks. Journal of Hydrology 214: 32–48.
Zhang B, Govindaraju RS. 2000. Prediction of watershed runoff using Bayesian concepts and modular neural networks. Water Resources Research 36(3): 753–762.
