
The 5th Student Conference on Research and Development - SCOReD 2007
11-12 December 2007, Malaysia

Artificial Neural Network Based Technique for Lightning Prediction

¹Dalina Johari, ²Titik Khawa Abdul Rahman, Senior Member, IEEE, ³Ismail Musirin, Member, IEEE

Abstract - Malaysia has high lightning and thunderstorm occurrences throughout the year. A vast amount of lightning data has been recorded, which allows various lightning-related studies to be conducted. This paper presents the application of an artificial neural network (ANN) to predicting the occurrence of lightning events based on historical lightning and meteorological data. The ANN, which was inspired by the way biological nervous systems process information, is used in this study because of its strong pattern recognition capabilities, which come from learning patterns and relationships in data. A back-propagation neural network with two hidden layers has been developed to predict the occurrence of lightning at least four hours prior to its arrival. Several network structures, training algorithms and activation functions were tested in order to obtain the most suitable network with high accuracy and convergence capability, while the quality of the developed network was assessed through post-processing, indicated by the closeness of the correlation coefficient to unity. The computational burden of reaching a converged solution was alleviated by adding an indicator module to the original features of the training and testing patterns.

Index Terms - Artificial neural network, back-propagation, correlation coefficient, lightning prediction, training, testing, indicator.

¹Dalina Johari can be reached at [email protected]. ²Titik Khawa Abdul Rahman is with Universiti Teknologi MARA, Shah Alam, Malaysia. She can be reached at [email protected]. ³Ismail Musirin is with Universiti Teknologi MARA, Shah Alam, Malaysia. He can be reached at [email protected].

I. INTRODUCTION
Malaysia lies near the equator, a region characterized by high lightning and thunderstorm activity [1, 2]. With an isokeraunic level of more than 200 days per year, vast amounts of lightning data have been recorded by the Malaysian Meteorological Services (MMS) and a private utility company, Tenaga Nasional Berhad (TNB). The extensive database has enabled various lightning studies to be conducted. These include Yahaya et al. [2, 3], who presented the characteristics of lightning in West Malaysia, and Hartono et al. [4], who described thunderstorm days and ground flash density in Malaysia.

An artificial neural network (ANN) is a machine learning technique inspired by the way biological nervous systems process information. It has emerged as a powerful tool in various engineering and non-engineering applications. With the ability to learn by example and perform tasks based on training experience, it is well suited to pattern recognition and forecasting tasks. Various studies have used ANNs because of their computational speed, robustness and ability to handle complex non-linear functions. Among them were lightning performance evaluation [5], lightning location systems [6] and lightning model evaluation [7].

A pilot study [8] to predict lightning using an ANN was conducted in 1991 by the National Aeronautics and Space Administration of the United States (NASA). It was initiated in an effort to minimise losses due to shuttle launch delays or cancellations, and loss of life due to lightning-related incidents during operations. Using input data such as wind, electric field, nearby lightning and products of these data, NASA used a three-level back-propagation ANN to predict lightning in four future intervals. The results showed satisfactory performance, with further improvement expected from a larger number of data sets and other types of inputs such as temperature, humidity and satellite data.

In another study, an ANN was employed to forecast thunderstorms. Choudhury et al. [9] used a three-layer multilayer perceptron (MLP) with back-propagation to forecast a thunderstorm at least ten to twelve hours before its arrival, using relevant weather data such as wind and pressure as the inputs. These inputs were chosen because they play a key role in the occurrence of storms, and performance was observed to improve when the training set size was increased.

This paper presents an ANN-based technique for lightning prediction. Malaysian data were collected prior to the development of the ANN block for the prediction task. Pre-processing and post-processing tasks were applied to the data in an attempt to develop a fully trained network. Results obtained from the experiment indicated that the fully trained network is able to predict lightning occurrence accurately, characterized by R ≈ 1.0.

II. ANN DEVELOPMENT
This study involved the development of an ANN that is capable of predicting lightning four hours before its arrival. The work comprised data collection, network development, training and testing processes, and post-processing of results. The overall system block diagram is illustrated in Fig. 1. Meteorological data are cascaded with the month indicator and season indicator to form the overall input data of the ANN. The developed ANN receives the input data, denoted P, and the targeted output, denoted t.


1-4244-1470-9/07/$25.00 ©2007 IEEE.

A. Data Preparation
Two types of data were used in the proposed ANN: meteorological data as the original input and lightning data as the target output. They were taken at Subang (3°7' N, 101°33' E), a city in Selangor state, Malaysia, which has the highest mean annual number of days with lightning, 317 days, based on a 50-year analysis (1951-2000) [10]. The data cover one environmental cycle consisting of all the monsoon and inter-monsoon seasons. This ensures that all patterns are captured for the purpose of training and testing. Meteorological parameters such as wind, dew point, humidity, pressure, temperature, cloud height and moisture difference were chosen as they play important roles in shaping the weather and hence the occurrence or non-occurrence of lightning [8, 9], [11-15].

Fig. 1: Lightning prediction system architecture
[Block diagram: meteorological data, month indicator and season indicator form the input data; the ANN receives the input data and the target output and produces the network output.]

B. Design of Neural Networks
In this study, an ANN with two hidden layers, trained with the back-propagation technique, was developed in Matlab. The first layer, where the input enters the ANN, is called the input layer. It is immediately followed by two hidden layers, while the last layer, where the output is produced, is called the output layer. Various four-layered network models with different network configurations were tested. Network configurations such as the number of neurons, transfer functions, learning rate and momentum constant were determined heuristically [16-18]. The syntax to represent the network configuration is written in the following form:

net = newff(minmax, [α β γ], {T1, T2, T3}, λ)    (1)

where:
α = number of nodes in the first layer
β = number of nodes in the second layer
γ = number of nodes in the third layer
T1, T2, T3 = transfer functions for layers 1, 2, 3
λ = training technique

The general features of the input matrix are prepared in the following form:

\[ P = \begin{bmatrix} D \\ I \end{bmatrix} \tag{2} \]

where:
D = meteorological data
I = indicator

\[ D = \begin{bmatrix} x_{11} & x_{12} & x_{13} & \cdots & x_{1n} \\ x_{21} & x_{22} & x_{23} & \cdots & x_{2n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ x_{m1} & x_{m2} & x_{m3} & \cdots & x_{mn} \end{bmatrix} \tag{3} \]

where:
n = number of input patterns
m = number of input variables of the meteorological data

\[ I = \begin{bmatrix} \Gamma_A \\ \Gamma_B \end{bmatrix} \tag{4} \]

where:
$\Gamma_A$ = month indicator
$\Gamma_B$ = season indicator

\[ \Gamma_A = \begin{bmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\ a_{21} & a_{22} & a_{23} & \cdots & a_{2n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ a_{k1} & a_{k2} & a_{k3} & \cdots & a_{kn} \end{bmatrix} \tag{5} \]

matrix size [k x n], where each element $a_{kn}$ can only be either '0' or '1':
0 = the months in which the nth input pattern is not recorded
1 = the month in which the nth input pattern is recorded

\[ \Gamma_B = \begin{bmatrix} b_{11} & b_{12} & b_{13} & \cdots & b_{1n} \\ b_{21} & b_{22} & b_{23} & \cdots & b_{2n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ b_{j1} & b_{j2} & b_{j3} & \cdots & b_{jn} \end{bmatrix} \tag{6} \]

matrix size [j x n], where each element $b_{jn}$ can only be either '0' or '1':
0 = the seasons in which the nth input pattern is not recorded
1 = the season in which the nth input pattern is recorded

The targeted output, denoted by t, is given in the following matrix form:

\[ t = \begin{bmatrix} c_1 & c_2 & c_3 & \cdots & c_n \end{bmatrix} \tag{7} \]

where each $c_n$ can be either '0' or '1':
0 = indicating the non-occurrence of lightning
1 = indicating the occurrence of lightning
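The layered structure behind (1) can be illustrated with a short forward-pass sketch. It is written in Python rather than Matlab purely for illustration and is not the authors' code. The layer sizes [8, 5, 1], the 24 input variables and the logsig, logsig, purelin transfer functions are those reported in Table I; the random weights, the seed and the helper names `init_layer` and `forward` are assumptions standing in for a trained network.

```python
import math
import random

def logsig(x):
    """Log-sigmoid transfer function, output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def purelin(x):
    """Linear transfer function (identity)."""
    return x

def init_layer(n_in, n_out, rng):
    """Random placeholder weights and biases (assumption, not trained values)."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-1, 1) for _ in range(n_out)]
    return weights, biases

def forward(pattern, layers):
    """Propagate one input pattern through (weights, biases, transfer) layers."""
    activation = pattern
    for weights, biases, transfer in layers:
        activation = [
            transfer(sum(w * a for w, a in zip(row, activation)) + b)
            for row, b in zip(weights, biases)
        ]
    return activation

rng = random.Random(0)
sizes = [24, 8, 5, 1]                  # 24 inputs, then the [8, 5, 1] layers
transfers = [logsig, logsig, purelin]
layers = []
for n_in, n_out, transfer in zip(sizes[:-1], sizes[1:], transfers):
    weights, biases = init_layer(n_in, n_out, rng)
    layers.append((weights, biases, transfer))

pattern = [rng.uniform(-1, 1) for _ in range(24)]  # one normalized input column
output = forward(pattern, layers)
print(len(output))
```

With untrained weights the output value itself carries no meaning; the sketch only shows how one column of the input matrix flows through two logsig layers and a linear output node.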

B1. Indicator Matrix
A data indicator is a factor that helps the ANN training process converge [19, 20]. It is needed to keep the ANN solution from diverging and producing undesirable results. Two indicators are added to the original input data: a month indicator and a season indicator. They were chosen so that elements of season and period could be incorporated. The month indicator is a matrix of size [k x n], where k denotes the number of months in a year and n is the number of input patterns. The season indicator is a matrix of size [j x n], where j represents the number of seasons Malaysia has in one environmental cycle. For this study, the matrices take the forms [12 x n] and [4 x n] for the month and season indicators respectively. The meteorological data have eight parameters and thus form a matrix of size [8 x n], giving 24 input variables in total. The indicators proved useful in this study: without them, training would converge slowly or even fail to converge.
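How the indicators extend the input matrix in (2) to (6) can be sketched as follows. This is an illustrative Python sketch, not the authors' Matlab code. The month-to-season mapping in `SEASON_OF_MONTH` is a hypothetical placeholder, since the paper does not state which months belong to which of the four seasons, and the sample values in `D` are made up.

```python
# Hypothetical month -> season index (0..3); the paper does not give this mapping.
SEASON_OF_MONTH = {11: 0, 12: 0, 1: 0, 2: 0, 3: 0,
                   4: 1, 5: 1,
                   6: 2, 7: 2, 8: 2, 9: 2,
                   10: 3}

def one_hot_columns(indices, size):
    """Build a [size x n] 0/1 matrix with a single 1 per column."""
    return [[1 if idx == row else 0 for idx in indices] for row in range(size)]

def build_input_matrix(D, months):
    """Stack D [m x n] on the month [12 x n] and season [4 x n] indicators."""
    n = len(months)
    if any(len(row) != n for row in D):
        raise ValueError("every row of D must have one value per pattern")
    gamma_a = one_hot_columns([m - 1 for m in months], 12)
    gamma_b = one_hot_columns([SEASON_OF_MONTH[m] for m in months], 4)
    return D + gamma_a + gamma_b

# Three made-up patterns with 8 meteorological variables each (m = 8, n = 3).
D = [[0.1 * (r + c) for c in range(3)] for r in range(8)]
months = [1, 6, 10]                    # month of each pattern (made up)
P = build_input_matrix(D, months)
print(len(P), len(P[0]))
```

Each column of P then holds 8 meteorological values followed by a 12-element month flag and a 4-element season flag, which matches the 24 input variables listed in Table I.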


B2. System Algorithm
The training and testing processes are embedded in a single algorithm. The whole process is represented as a flowchart in Fig. 2.


B3. Training Process
Training can be made more efficient if certain pre-processing steps are performed. In one such step, the data are normalized so that they always fall within a specified range. The normalization equation embedded in the Matlab toolbox is given as:

pn = 2*(p-minp)/(maxp-minp) - 1;    (8)

where:
p = input data
minp, maxp = minimum and maximum values of the input data
pn = normalized data

For this study, the data are pre-processed using their min and max values, which places them in the range [-1, 1]. After training, the results can be returned to their original form by performing the post-processing step (de-normalization). Other pre-processing routines include normalization by mean and standard deviation, and principal component analysis. The network is trained until it achieves a very small error. The network is then saved for the testing process.
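Equation (8), its inverse, and the reuse of the training min and max values on testing data can be sketched as below. The sketch is plain Python for illustration only; the function names `fit_minmax`, `normalize` and `denormalize` are assumptions and are not Matlab toolbox functions, and the sample numbers are made up.

```python
def fit_minmax(values):
    """Return (minp, maxp) of one input variable from the training data."""
    return min(values), max(values)

def normalize(values, minp, maxp):
    """Apply pn = 2*(p - minp)/(maxp - minp) - 1 element-wise."""
    if maxp == minp:
        raise ValueError("constant variable: min and max are equal")
    return [2 * (p - minp) / (maxp - minp) - 1 for p in values]

def denormalize(values, minp, maxp):
    """Invert the normalization to recover the original scale."""
    return [(pn + 1) * (maxp - minp) / 2 + minp for pn in values]

train_temperature = [24.0, 27.5, 31.0]      # made-up training values
test_temperature = [26.0, 32.0]             # made-up testing values

minp, maxp = fit_minmax(train_temperature)
train_n = normalize(train_temperature, minp, maxp)
# Testing data are transformed with the training min and max, so values
# outside the training range can fall outside [-1, 1].
test_n = normalize(test_temperature, minp, maxp)
restored = denormalize(train_n, minp, maxp)
print(train_n)
```

Keeping `minp` and `maxp` from the training set, rather than recomputing them on the testing set, means both sets pass through the same transformation before reaching the network.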

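Fig. 2: System algorithm
[Flowchart boxes: Start; Load data; Define training data; Normalize training data; Design network; Set network parameters; Initialize network; Train network; Simulate network; De-normalize results; Solution converges? (No/Yes); Save network; Define testing data; Transform testing data; Simulate network using testing data; Regression value close to 1? (No/Yes); Measure network performance; End]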

B4. Testing Process
The testing process is carried out to measure the performance of the developed network. This can be achieved either by measuring the errors obtained on the training, validation and test sets, or by performing a linear regression analysis. Before testing, the data should be pre-processed so that they are transformed according to the min and max values of the training data. A fully trained network should be able to predict lightning from a set of unseen data. However, a developed neural network may not necessarily perform well during testing, in which case re-training is essential.
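The two performance measures mentioned above, an error measure and a linear regression between network outputs A and targets T, can be sketched as follows. This is illustrative Python, not the authors' Matlab code. It assumes the conventional root-mean-square definition (square root of the mean squared difference), which may differ from the exact expression the authors evaluated, and an ordinary least-squares fit A = slope * T + intercept with the Pearson correlation as R. The function names and sample values are made up.

```python
import math

def rms_error(outputs, targets):
    """Root-mean-square difference between network outputs and targets."""
    diffs = [a - t for a, t in zip(outputs, targets)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

def linear_regression(outputs, targets):
    """Least-squares fit outputs = slope*targets + intercept, plus Pearson R."""
    n = len(targets)
    mean_t = sum(targets) / n
    mean_a = sum(outputs) / n
    s_tt = sum((t - mean_t) ** 2 for t in targets)
    s_aa = sum((a - mean_a) ** 2 for a in outputs)
    s_ta = sum((t - mean_t) * (a - mean_a) for t, a in zip(targets, outputs))
    slope = s_ta / s_tt
    intercept = mean_a - slope * mean_t
    r = s_ta / math.sqrt(s_tt * s_aa)
    return slope, intercept, r

targets = [0, 1, 0, 1, 1, 0]                       # made-up lightning targets
outputs = [0.02, 0.97, -0.01, 1.01, 0.99, 0.03]    # made-up network outputs

slope, intercept, r = linear_regression(outputs, targets)
print(round(rms_error(outputs, targets), 4), round(r, 4))
```

A slope near 1, an intercept near 0 and R near 1 together indicate that the network outputs track the targets closely, which is the criterion used to accept the network in the testing stage.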

[Fig. 3 plot annotations: Best Linear Fit: A = (0.999) T + (0.000885), R = 1; legend: Data Points, Best Linear Fit, A = T; axes: A versus T]

III. RESULTS AND DISCUSSION
The data were randomly selected from a pool of meteorological and lightning data. 378 patterns were used for training the network, while 197 patterns were used for testing. As the accuracy of the developed network can be measured from the training errors, one option is to calculate the rms error using the following formula:

\[ \text{rms error} = \frac{1}{P}\left(\sum_{i=1}^{P} [f(x)]^{2}\,dx\right) \tag{9} \]

where f(x) is the difference between the network output A and the target output T for each of the P input patterns. Further evaluation of the network performance can be conducted using post-regression analysis, in which the network response is compared with the corresponding targets. A regression coefficient R close to one indicates a strong correlation between the targeted outputs and the network outputs, while a value close to zero indicates otherwise.

A heuristic technique was employed to train the network. Results showed that the best configuration for the system is [8, 5, 1] with logsig, logsig and purelin as the transfer functions, trained using the Levenberg-Marquardt learning technique. The 8 and 5 represent the number of neurons in the first and second hidden layers respectively, while the 1 represents the single output. The rms error achieved was ~0.41 % with an R-value of 0.99997. This implies that the network is fully trained; it was achieved using a learning rate of 0.4819 and a momentum constant of 0.0577. The properties of the developed network are summarized in Table I.

TABLE I
PROPERTIES OF DEVELOPED NETWORK

ANN Properties               Properties
Network configuration        [8,5,1]
Transfer functions           logsig, logsig, purelin
Learning rate                0.4819
Momentum constant            0.0577
Training technique           Levenberg-Marquardt
Epochs                       57
Training goal                1e-8
Regression coefficient, R    0.99997
Training patterns            378
Testing patterns             197
No. of input variables       24

Fig. 3: Results for the regression analysis

IV. CONCLUSION
An artificial neural network based technique for predicting lightning occurrence was presented; it serves as an alternative to other available forecasting methods. Using historical lightning and meteorological data, the developed network was trained and tested until it obtained a correlation coefficient close to unity. This was achieved after testing different network structures, training algorithms and activation functions, all of which were determined using a heuristic technique. The final network achieved an rms error of about 0.41 % with R = 0.99997, and the introduction of the indicator module to the original data significantly alleviated the computational burden of reaching the converged solution.

V. ACKNOWLEDGEMENTS
The authors would like to express their gratitude to the Malaysian Meteorological Services (MMS) and Tenaga Nasional Berhad (TNB) Research and Development for the supply of meteorological data and lightning data respectively.

VI. REFERENCES

[1] R. B. Anderson, "Lightning research: where do we go from here?," Power Engineering Journal [see also Power Engineer], vol. 6, pp. 179-190, 1992.
[2] M. P. Yahaya, Norhafzanizam, and S. A. Fuad, "Lightning Detection System in Tenaga Nasional," presented at 3rd TNB Technical Conference, University Tenaga Nasional, Bangi, Malaysia, 2001.
[3] M. P. Yahaya and S. A. F. S. Zain, "Characteristics of Cloud-to-Ground Lightning in Malaysia," presented at Lightning Protection and Earthing Systems, Kuala Lumpur, Malaysia, 2000.
[4] Z. A. Hartono and I. Robiah, "Thunderstorm day and ground flash density in Malaysia," presented at National Power and Energy Conference (PECon), Bangi, Malaysia, 2003.
[5] L. Ekonomou, D. P. Iracleous, I. F. Gonos, and I. A. Stathopulos, "Application of artificial neural network methods for the lightning performance evaluation of Hellenic high voltage transmission lines," Electric Power Systems Research, vol. 77, pp. 55-63, 2007.
[6] J. L. Bermudez, A. Piras, and M. Rubinstein, "Artificial neural networks in lightning location systems," presented at International Symposium on Neuro-Fuzzy Systems (AT'96), Lausanne, Switzerland, 1996.
[7] I. N. Da Silva, A. N. De Souza, and M. E. Bordon, "Evaluation and identification of lightning models by artificial neural networks," International Joint Conference on Neural Networks (IJCNN '99), 1999.
[8] D. Frankel, I. Schiller, J. S. Draper, and A. A. Barnes, Jr., "Use of neural networks to predict lightning at Kennedy Space Center," presented at Seattle International Joint Conference, Seattle, WA, 1991.
[9] S. Choudhury, S. Mitra, and H. Chakraborty, "A connectionist approach to thunderstorm forecasting," presented at Processing NAFIPS '04 (IEEE Annual Meeting of the Fuzzy Information), 2004.
[10] Malaysian Meteorological Services (MMS), "General Climate Information," The Malaysian Meteorological Department, http://www.met.gov.my/home_e.html, 2006.
[11] W. R. Burrows and P. King, "Neuro-Statistical Models for predicting Lightning Occurrence in Canada: Climatology and Potential Predictors," in International Lightning Detection Conference. Tucson, Arizona, USA: Global Atmosphere, Inc., 2000.
[12] W. R. Burrows, C. Price, and L. J. Wilson, "Warm season lightning probability prediction for Canada and the northern United States," Journal of Weather and Forecasting, vol. 20, pp. 971-988, 2005.
[13] W. C. Lambert, M. Wheeler, and W. Roeder, "Objective Lightning Forecasting at Kennedy Space Center and Cape Canaveral Air Force Station using Cloud-to-Ground Lightning Surveillance System Data," presented at Conference on Meteorological Applications of Lightning Data, San Diego, CA, 2005.
[14] W. Kise, M. Ito, A. Mitsuishi, Y. Kosuge, F. Asano, and S. I. Watanabe, "Prediction support system for lightning flash based on case-based retrieval," presented at Proceedings of the 39th SICE Annual Conference (International Session Papers), Iizuka, 2000.
[15] R. A. Maddox, C. E. Wallace, J. Zhang, J. J. Gourley, K. W. Howard, and C. L. Dempsey, "Use of Real-Time CG Strike Data for Short-term Forecasts for the Phoenix Metropolitan Area," in International Lightning Detection Conference. Tucson, Arizona, USA: Global Atmosphere, Inc., 2000.
[16] I. Musirin and T. K. A. Rahman, "Simulation Technique for Voltage Collapse Prediction and Contingency Ranking in Power System," in Student Conference on Research and Development Proceedings. Shah Alam, Malaysia, 2002, pp. 188-191.
[17] I. Musirin, T. K. A. Rahman, and M. K. Idris, "Artificial Neural Network Based Prediction Technique for Voltage Collapse Assessment," in International Conference on Artificial Intelligent and Engineering Technology (ICAIET). Sabah, 2004.
[18] I. Musirin and T. K. A. Rahman, "ANN Based Technique for Voltage Stability and Transmission Loss Prediction in Power Security Assessment," in ICGST International Conference on Automation, Robotics and Autonomous Systems (ARAS-06). Sharm El Sheikh, Egypt, 2006.
[19] I. Musirin and T. K. A. Rahman, "Hybrid Neural Network Topology (HNNT) for Line Outage Contingency Ranking," presented at National Power and Energy Conference (PECon) Proceedings, Bangi, Malaysia, 2003.
[20] I. Musirin and T. K. A. Rahman, "Line Outage Security Assessment Using Combinatorial Form Neural Network Topology (CFNNT)," in European Power and Energy Systems (EuroPES). Marbella, Spain, 2003.

VII. BIOGRAPHIES
Dalina Johari was born in Kuala Lumpur, Malaysia on May 25, 1976. She graduated from the University of Liverpool, UK with an honours degree in Electrical Engineering in 1999. She is currently pursuing a Master's degree in power systems at the Universiti Teknologi MARA Malaysia. Her research interests include lightning protection systems, artificial neural networks and evolutionary programming techniques.

Dr. Titik Khawa Abdul Rahman received a BSc E.E. (Hons) from Loughborough University of Technology in 1988 and a PhD from University of Malaya, Malaysia in 1996. She is an Assoc. Prof. at the Faculty of Electrical Engineering and is currently appointed as the Head of Quality Academic Assurance, Academic Affair Division of Universiti Teknologi MARA Malaysia. She has written more than 80 technical papers and also supervises post-graduate students. Her research interests include voltage profile studies, artificial neural networks, Evolutionary Computation such as Evolutionary Programming and Genetic Algorithm, Artificial Immune System (AIS), Economic Dispatch and Loss Minimization.

Dr. Ismail Musirin obtained a Diploma of Electrical Power Engineering in 1987 and a Bachelor of Electrical Engineering (Hons) in 1990, both from Universiti Teknologi Malaysia, an MSc in Pulsed Power Technology in 1992 from University of Strathclyde, United Kingdom, and a PhD in Electrical Engineering from Universiti Teknologi MARA, Malaysia in 2005. He has published 2 books and more than 80 technical papers in international and national conferences and in international journals. He is a reviewer for IEEE transactions and IET journals. He is also appointed as a permanent reviewer for the World Scientific and Engineering Academy and Society (WSEAS), centered in Greece. His research interests include power system stability, optimization techniques, distributed generators and artificial intelligence. He is also a member of IEEE, the IEEE Power Engineering Society and the Artificial Immune System Society (ARTIST).