Epileptic Seizure Detection Using Neural Fuzzy Networks

Nasser Sadati, Member, IEEE, Hamid R. Mohseni, Arash Maghsoudi

Abstract - The electroencephalogram (EEG) is a representative signal containing information about the condition of the brain. The shape of the wave may contain useful information about its state. However, a human observer cannot directly monitor these subtle details. Moreover, since bio-signals are highly subjective, the symptoms may appear at random on the time scale. Therefore, EEG signal parameters that are extracted and analyzed using computers are highly useful in diagnosis. The aim of this work is to compare different classifiers applied to EEG data from normal and epileptic subjects. For this purpose, an adaptive neural fuzzy network (ANFN) is proposed to classify normal and epileptic EEG signals. The results are compared with other classifiers: the support vector machine (SVM), the adaptive-network-based fuzzy inference system (ANFIS) and the feedforward back-propagation neural network (FBNN). It is shown that a classification accuracy of about 85.9% can be achieved using ANFN.

I. INTRODUCTION

Epileptic seizures are the result of transient and unexpected electrical disturbances of the brain. Approximately one in every 100 persons will experience a seizure at some time in their life [1]. However, the occurrence of an epileptic seizure is not predictable and its process is not yet completely understood. The electroencephalogram (EEG), a representative signal containing information about the electrical activity generated by the cerebral cortex nerve cells, has been the most utilized signal to clinically assess brain activity. The detection of epileptiform discharges in the EEG is also an important component in the diagnosis of epilepsy. The detection of epilepsy, which includes visual scanning of EEG recordings for spikes and seizures, is usually time consuming, especially in the case of long recordings. In addition, bio-signals are highly subjective, so disagreement on the same record is possible; consequently, EEG signal parameters extracted and analyzed using computers are highly useful in diagnostics. The early methods of automatic EEG processing were based on the Fourier transform. This approach is based on earlier observations that the EEG spectrum contains some characteristic waveforms that fall primarily within four frequency bands. Such methods have proved beneficial for various EEG characterizations, although the fast Fourier transform (FFT) suffers from large noise sensitivity.

The authors are with the Department of Electrical Engineering, Intelligent Systems Laboratory, Sharif University of Technology, Tehran, Iran.

Parametric methods for power spectrum estimation, such as autoregressive (AR) modeling, reduce the spectral loss problems and give better frequency resolution. However, since EEG signals are non-stationary, parametric methods are not suitable for frequency decomposition of these signals. A powerful method was proposed in the late 1980s to perform time-scale and also frequency analysis of signals: the wavelet transform (WT). This method provides a unified framework for different techniques that have been developed for various applications. One useful property of the WT is that it is appropriate for the analysis of non-stationary signals, which represents a major advantage over spectral analysis. Hence, the WT is well suited to locating transient events, such as the spikes that can occur during epileptic seizures. The feature extraction and representation properties of the WT can be used to analyze various transient events in biological signals. Adeli et al. gave an overview of the discrete wavelet transform (DWT) developed for recognizing and quantifying spikes, sharp waves and spike-waves [2]. They used the wavelet transform to analyze and characterize epileptiform discharges in the form of 3-Hz spike and wave complexes in patients with absence seizures. Through wavelet decomposition of the EEG records, transient features are accurately captured and localized in both time and frequency context. It has been shown that an artificial neural network (ANN) performs better if the input and output data are processed to capture the characteristic features of the signal [3]; in that work, a wavelet representation was used for automated detection of EEG spikes. More recently, ANNs that apply Bayesian methods have been shown to be more robust than other techniques because they incorporate measures of confidence in their output for the Levenberg-Marquardt (LM) procedure. In addition, the standard multilayer perceptron (MLP) was improved by using finite impulse response (FIR) filters instead of static weights for temporal processing of data [4]. Petrosian et al. also showed the ability of specifically designed and trained recurrent neural networks (RNN), combined with wavelet preprocessing, to predict the onset of epileptic seizures on both scalp and intracranial recordings [5]. Recently, Kiymik et al. presented a time-frequency analysis of EEG signals for detecting information on alertness and drowsiness, using spectral densities of DWT coefficients as input to an ANN [6]. Compared with the conventional methods of frequency analysis using the Fourier transform or the short-time Fourier transform, wavelets enable analysis with a coarse-to-fine multi-resolution perspective of the signals [7].

Neuro-fuzzy systems combine the power of two paradigms, fuzzy logic and ANNs, by utilizing the mathematical properties of ANNs in tuning the rule base of fuzzy systems. A particular advance in neuro-fuzzy development is the fuzzy neural network (FNN), which has shown considerable results in modeling nonlinear functions. In an FNN, the membership function parameters are extracted from a data set that describes the system behavior. The FNN learns features in the data set and adjusts the system parameters according to a given error criterion. In this work, a novel neuro-fuzzy network is used to classify EEG signals into two categories: normal and epileptic. In order to compare the efficiency of this network, we also use other well-known methods: the feedforward back-propagation neural network (FBNN), the adaptive-network-based fuzzy inference system (ANFIS) and the support vector machine (SVM). The energy of the discrete wavelet transform (DWT) sub-bands is used as the network input. The network is trained with these features so that it can classify normal and epileptic EEG signals precisely.

II. PRELIMINARIES

A. Database

The data described by Andrzejak et al. [8], which are publicly available, have been used in this paper. In this section we restrict ourselves to a short description and refer to Andrzejak et al. for further details. The complete dataset consists of five sets (denoted A–E), each containing 100 single-channel EEG signals of 23.6 s. Sets A and B have been taken from surface EEG recordings of five healthy volunteers with eyes open and closed, respectively. Signals in two sets have been measured in seizure-free intervals from five patients, in the epileptogenic zone (D) and from the hippocampal formation of the opposite hemisphere of the brain (C). Set E contains seizure activity, selected from all recording sites exhibiting ictal activity. Sets A and B have been recorded extracranially, whereas sets C, D, and E have been recorded intracranially.
In our applications, performance degraded for a more detailed classification that further discriminated between sets A (healthy volunteer, eyes open) and B (healthy volunteer, eyes closed), and between sets D (epileptogenic zone) and C (hippocampal formation of the opposite hemisphere). Therefore, in the present study, we classify three sets (A, D and E) of the complete dataset.

B. Pre-Processing and Wavelet Transform

First, the data mentioned above were segmented into signals with a length of 5.9 s (256 points). The energy of each segment was also normalized. These signals were used for feature extraction with the wavelet transform. The discrete wavelet transform is a versatile signal processing tool that has many engineering and scientific applications. One area in which the DWT has been particularly successful is epileptic seizure detection, because it captures transient features and localizes them accurately in both time and frequency context.
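The segmentation and energy normalization steps described at the start of this subsection can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes non-overlapping segments and interprets "normalized" as scaling each segment to unit energy, neither of which the text states explicitly. The function names `segment` and `normalize_energy` and the toy recording are hypothetical.

```python
import math

def segment(signal, segment_len):
    """Split a signal into consecutive, non-overlapping segments (remainder dropped)."""
    last_start = len(signal) - segment_len
    return [signal[i:i + segment_len] for i in range(0, last_start + 1, segment_len)]

def normalize_energy(seg):
    """Scale a segment so that the sum of its squared samples equals 1 (assumed convention)."""
    energy = sum(v * v for v in seg)
    if energy == 0.0:
        return list(seg)  # an all-zero segment cannot be rescaled
    scale = 1.0 / math.sqrt(energy)
    return [v * scale for v in seg]

# Toy stand-in for one EEG recording (synthetic, not real data).
recording = [math.sin(0.05 * n) + 0.3 * math.cos(0.4 * n) for n in range(1030)]
segments = [normalize_energy(s) for s in segment(recording, 256)]
```

Normalizing each segment removes overall amplitude differences between recordings, so that the later sub-band energies describe how the energy is distributed across scales rather than how large the signal is.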

The wavelet functions are derived from a prototype function $\psi$ (i.e., the mother wavelet) according to

$$\psi_{j,k}(x) = 2^{-j/2}\,\psi\left(2^{-j}x - k\right) \tag{1}$$

The associated scaling functions obey a similar relation:

$$\varphi_{j,k}(x) = 2^{-j/2}\,\varphi\left(2^{-j}x - k\right) \tag{2}$$

The integers $j$ and $k$ are the dilation (scale) and translation parameters, respectively. Note that for a given space resolution $N = 2^J$, the dilation (scale) parameter $j$ belongs to the set $\{1, 2, \ldots, J\}$, and each scale value $j$ is associated with $N_j = 2^{J-j}$ wavelets. The spatial function is given by

$$f(x) = \sum_{j=1}^{J} \sum_{k=0}^{N_j - 1} d(j,k)\,\psi_{j,k}(x) + a(J,0)\,\varphi_{J,0}(x) \tag{3}$$

where

$$d(j,k) = \int_x \psi_{j,k}(x)\, f(x)\, dx, \qquad a(J,0) = \int_x \varphi_{J,0}(x)\, f(x)\, dx$$
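In discrete form, the coefficients $d(j,k)$ and $a(J,0)$ of (3) are usually computed with a two-channel filter bank. The sketch below does this for the five-level DB4 decomposition that this section uses for its features and returns the six sub-band energies. It is an illustration under stated assumptions: periodic signal extension is assumed (the paper does not specify boundary handling), the filter values are the standard 8-tap Daubechies-4 low-pass coefficients, and the names `dwt_level` and `subband_energies` and the test signal are hypothetical.

```python
import math

# Standard 8-tap Daubechies-4 (db4) low-pass filter coefficients.
DB4_LO = [
    0.23037781330885523, 0.7148465705525415, 0.6308807679295904,
    -0.02798376941698385, -0.18703481171888114, 0.030841381835986965,
    0.032883011666982945, -0.010597401784997278,
]
# Matching high-pass filter via the quadrature mirror relation.
DB4_HI = [(-1) ** k * DB4_LO[len(DB4_LO) - 1 - k] for k in range(len(DB4_LO))]

def dwt_level(x):
    """One analysis step: filter and downsample by 2, with periodic extension (assumed)."""
    n = len(x)
    approx, detail = [], []
    for i in range(n // 2):
        window = [x[(2 * i + k) % n] for k in range(len(DB4_LO))]
        approx.append(sum(h * v for h, v in zip(DB4_LO, window)))
        detail.append(sum(g * v for g, v in zip(DB4_HI, window)))
    return approx, detail

def subband_energies(x, levels=5):
    """Return [E(d1), ..., E(d_levels), E(a_levels)] for an even-length signal."""
    energies = []
    approx = list(x)
    for _ in range(levels):
        approx, detail = dwt_level(approx)
        energies.append(sum(d * d for d in detail))
    energies.append(sum(a * a for a in approx))
    return energies

# Synthetic 256-point segment (not EEG data) used only to exercise the code.
signal = [math.sin(0.2 * n) + 0.5 * math.sin(1.3 * n) for n in range(256)]
features = subband_energies(signal)
```

Because the periodic DB4 filter bank is orthogonal, the six energies sum to the energy of the input segment, which gives a convenient sanity check on any implementation of this feature extraction.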

The scaling function $\varphi_{J,0}$ covers the entire space to be transformed in a given direction, and plays the role of an averaging function similar to the center point of k-space. The wavelet function $\psi$ is localized to parts of space in the same direction according to the translation and dilation parameter values. To cover the space in a given direction up to a resolution $N$, $(N-1)$ inner products between the translated and dilated wavelets and the spatial function, plus one inner product with the averaging function, are required. The coefficients $a(J,0)$ and $d(j,0)$ are obtained with no translation of the scaling and the wavelet functions, respectively, at the resolution level $j = J$. In this work, the signals were decomposed into five levels using the DB4 wavelet filter. The energies of the details $d_1, d_2, d_3, d_4, d_5$ and of the approximation $a_5$ (six features in total) were used as the input features.

C. The Support Vector Machine

The support vector machine (SVM) is a relatively new approach for solving supervised classification problems and is very useful due to its generalization ability. In essence, such an approach maximizes the margin between the training data and the decision boundary, which can be cast as a quadratic optimization problem. The subset of patterns that are closest to the decision boundary are called the support vectors. For a linearly separable binary classification problem, the construction of a hyperplane such that the margin between the hyperplane and the nearest point is maximized can be posed as a quadratic optimization problem [9]. Maximizing the margin corresponds to minimizing the Euclidean norm of the weight vector. Often, in practice, a separating hyperplane does not exist. Hence, the constraint is relaxed by introducing slack variables. In this paper, an SVM using a Gaussian radial basis function (GRBF) as the kernel function is also constructed for comparison.

D. Adaptive-Network-Based Fuzzy Inference Systems

ANFIS is a neuro-fuzzy inference system implemented in the framework of adaptive networks [10]. An adaptive network is a superset of all kinds of feedforward neural networks with supervised learning capability. ANFIS serves as a basis for constructing a set of fuzzy if–then rules with appropriate membership functions to generate the stipulated input–output pairs. The ANFIS network is a five-layer network in which the layers are not fully connected. The transfer function of a neuron is determined by the layer in which the neuron is located. All the ANFIS structures used in this paper possess six inputs and one output. Each input of the ANFIS structures has two Gaussian membership functions and the rule base contains 144 rules. The parameters of the membership functions were obtained for each Gaussian by training the ANFIS structures for 20 epochs. A combination of least squares and back-propagation gradient descent methods is used for training the fuzzy structure, in order to compute the parameters of the membership functions, which are used to model the given set of inputs and output.
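To make the five-layer structure concrete, the following sketch runs one forward pass through a small Sugeno-type ANFIS with Gaussian membership functions. It is an illustration only, not the configuration used in the paper: it assumes a grid partition (one rule per combination of membership functions), first-order linear consequents, and only two inputs, and the parameter values and the name `anfis_forward` are hypothetical.

```python
import math
from itertools import product

def gaussian_mf(x, center, sigma):
    """Gaussian membership degree of x."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)

def anfis_forward(inputs, mf_params, consequents):
    """Forward pass of a first-order Sugeno ANFIS with a grid-partition rule base (assumed)."""
    # Layer 1: membership degrees of each input in each of its fuzzy sets.
    degrees = [[gaussian_mf(x, c, s) for (c, s) in mfs]
               for x, mfs in zip(inputs, mf_params)]
    # Layer 2: firing strength of each rule (product of one degree per input).
    strengths = [math.prod(combo) for combo in product(*degrees)]
    # Layer 3: normalized firing strengths.
    total = sum(strengths)
    normalized = [w / total for w in strengths]
    # Layer 4: linear rule consequents p_1*x_1 + ... + p_n*x_n + r.
    rule_outputs = [sum(p * x for p, x in zip(coeffs[:-1], inputs)) + coeffs[-1]
                    for coeffs in consequents]
    # Layer 5: weighted sum of the rule outputs.
    return sum(w * f for w, f in zip(normalized, rule_outputs))

# Two inputs with two Gaussian sets each (center, sigma): 2 * 2 = 4 rules.
mf_params = [[(-1.0, 1.0), (1.0, 1.0)], [(-1.0, 1.0), (1.0, 1.0)]]
consequents = [[0.5, -0.2, 0.1], [0.3, 0.4, -0.1], [-0.6, 0.2, 0.0], [0.1, 0.1, 0.7]]
output = anfis_forward([0.3, -0.7], mf_params, consequents)
```

In hybrid training of the kind described above, the consequent coefficients are typically the parameters fitted by least squares, while the membership centers and widths are adjusted by gradient descent.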

E. Adaptive Neural Fuzzy Network

Becerikli et al. used a dynamical neural network followed by a fuzzy system for trajectory planning in nonlinear control [11]. Subasi used this network for EEG classification, and the results show that it works more efficiently than an MLP [12]. As shown in Fig. 1, we modified the network to be appropriate for classification purposes and call it ANFN (Adaptive Neural Fuzzy Network).

Fig. 1. An ANFN with 3 inputs and 1 output

By ANFN, we mean a neural network followed by an ANFIS network. As can be seen from Fig. 1, this network is a neural network whose activation functions are trainable ANFIS networks. Compared to other networks, this network has the advantage that the level of each neuron's firing can be varied adaptively. By using Mamdani's inference rule (product inference), the TSK (Takagi-Sugeno-Kang) scheme, a singleton fuzzifier, a center average defuzzifier, and Gaussian membership functions, the activation function of the ith neuron can be expressed as

$$\varphi_i(x_i) = \frac{\sum_{j=1}^{R_i} a_{ij}\,\mu_j(x_i)}{\sum_{j=1}^{R_i} \mu_j(x_i)} = \frac{\sum_{j=1}^{R_i} a_{ij} \exp\left[-\frac{1}{2}\left(\frac{x_i - c_{ij}}{\sigma_{ij}}\right)^2\right]}{\sum_{j=1}^{R_i} \exp\left[-\frac{1}{2}\left(\frac{x_i - c_{ij}}{\sigma_{ij}}\right)^2\right]} \tag{4}$$

where $c_{ij}$ is the center and $\sigma_{ij}$ is the standard deviation of the jth receptive field unit of the ith neuron, and $R_i$ is the number of fuzzy rules. The general computational model that we have used for ANFN is summarized in the following equations:

$$Z = \sum_{i=1}^{N} q_i\, y_i \tag{5}$$

where

$$y_i = \varphi_i(x_i) = \frac{\sum_{j=1}^{R_i} a_{ij}\,\mu_j(x_i)}{\sum_{j=1}^{R_i} \mu_j(x_i)} = \frac{\sum_{j=1}^{R_i} a_{ij} \exp\left[-\frac{1}{2}\left(\frac{x_i - c_{ij}}{\sigma_{ij}}\right)^2\right]}{\sum_{j=1}^{R_i} \exp\left[-\frac{1}{2}\left(\frac{x_i - c_{ij}}{\sigma_{ij}}\right)^2\right]} \tag{6}$$

and

$$x_i = \sum_{j=1}^{M} w_{ij}\, u_j + b_i = \sum_{j=1}^{M+1} w_{ij}\, u_j, \qquad u_{M+1} = 1. \tag{7}$$

Here $N$ is the number of neurons, $Z$ is the output of the network, $M$ is the number of inputs, $b_i$ is the bias of each neuron and $u_j$ is the input of the network. To train the network, back-propagation with gradient descent is used. Back-propagation has some problems in many applications. The algorithm is not guaranteed to find the global minimum of the error function, since gradient descent may get stuck in local minima, where it may remain indefinitely. In addition, long training sessions are often required to find an acceptable weight solution because of the well-known difficulties inherent in gradient descent optimization [13]. In order to train the network using the gradient descent method, an error function is defined as follows:

$$E = \frac{1}{2}(Z - d)^2 \tag{8}$$

where $d$ is the desired output and $Z$ is the network output. By applying the proposed approach, we obtain a learning rule for each parameter. For example, the learning rule for $w_{ij}$ can be expressed as

$$\frac{\partial E}{\partial w_{ij}} = \frac{\partial E}{\partial Z} \times \frac{\partial Z}{\partial y_i} \times \frac{\partial y_i}{\partial x_i} \times \frac{\partial x_i}{\partial w_{ij}} = e \times q_i \times \varphi_i'(x_i) \times u_j \tag{9}$$

where $e = Z - d$.
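The forward model (5)-(7) and the weight gradient (9) can be sketched in a few lines and checked against a finite-difference estimate. This is a minimal illustration under stated assumptions, not the authors' MATLAB implementation: the network size, all parameter values, and the function names are hypothetical, and only the weights $w_{ij}$ are updated, using a plain gradient descent step of the form $w \leftarrow w - \mu\,\partial E/\partial w$.

```python
import math

def memberships(x, c, s):
    """Gaussian membership values mu_j(x) for one neuron."""
    return [math.exp(-0.5 * ((x - cj) / sj) ** 2) for cj, sj in zip(c, s)]

def phi(x, a, c, s):
    """Neuron activation as in Eqs. (4) and (6)."""
    mu = memberships(x, c, s)
    return sum(aj * mj for aj, mj in zip(a, mu)) / sum(mu)

def phi_prime(x, a, c, s):
    """Derivative of phi with respect to x (quotient rule)."""
    mu = memberships(x, c, s)
    dmu = [-mj * (x - cj) / sj ** 2 for mj, cj, sj in zip(mu, c, s)]
    num, den = sum(aj * mj for aj, mj in zip(a, mu)), sum(mu)
    dnum, dden = sum(aj * dj for aj, dj in zip(a, dmu)), sum(dmu)
    return (dnum * den - num * dden) / den ** 2

def forward(u, W, q, A, C, S):
    """Eqs. (5)-(7): returns the output Z, the pre-activations x, and the extended input."""
    u_ext = list(u) + [1.0]  # u_{M+1} = 1, so the last weight of each row acts as the bias
    x = [sum(w * v for w, v in zip(row, u_ext)) for row in W]
    y = [phi(xi, A[i], C[i], S[i]) for i, xi in enumerate(x)]
    return sum(qi * yi for qi, yi in zip(q, y)), x, u_ext

def error(u, d, W, q, A, C, S):
    """Eq. (8): E = 0.5 * (Z - d)^2."""
    Z, _, _ = forward(u, W, q, A, C, S)
    return 0.5 * (Z - d) ** 2

def grad_w(u, d, W, q, A, C, S):
    """Eq. (9): dE/dw_ij = e * q_i * phi_i'(x_i) * u_j."""
    Z, x, u_ext = forward(u, W, q, A, C, S)
    e = Z - d
    return [[e * q[i] * phi_prime(x[i], A[i], C[i], S[i]) * uj for uj in u_ext]
            for i in range(len(W))]

def descent_step(W, grad, lr):
    """One gradient descent update of the weights with learning rate lr."""
    return [[w - lr * g for w, g in zip(wr, gr)] for wr, gr in zip(W, grad)]

# Hypothetical toy network: M = 3 inputs, N = 2 neurons, 2 rules per neuron.
u, d = [0.2, -0.4, 0.7], 1.0
W = [[0.5, -0.3, 0.8, 0.1], [-0.6, 0.4, 0.2, -0.2]]
q = [1.0, -0.5]
A = [[-1.0, 1.0], [0.5, -0.5]]
C = [[-1.0, 1.0], [-0.5, 0.5]]
S = [[1.0, 1.0], [0.8, 0.8]]

g = grad_w(u, d, W, q, A, C, S)
W_new = descent_step(W, g, lr=0.1)
```

Comparing `g` with a central finite difference of `error` is a quick way to confirm that the chain rule in (9) has been coded correctly before extending the same pattern to the remaining parameters.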

Then, by applying the gradient descent method, the parameters are updated as follows:

$$w(t+1) = w(t) - \mu \nabla E \tag{10}$$

where $\mu$ is the learning rate.

III. SIMULATION RESULTS

The EEG database mentioned above has 100 cuts of EEG signals from both healthy and epileptic subjects. After segmentation, 400 signals were obtained; 200 were used as the training data and the others were used as the testing data. Table I summarizes the performance of the various classifiers using the sub-bands obtained from the DWT of the EEG signals. In this table, the results are given for both training and testing data. In addition, the execution times for training are rounded and shown for the various classifiers. The SVM with RBF kernels and ANFIS have the best results for the training data, because of their nonlinear characteristics, but their performance decreases dramatically for the testing data.

Table I. Results of classification for training and testing data

| Classifier | Correctly classified, train (%) | Correctly classified, test (%) | Training time (s) |
|---|---|---|---|
| SVM (linear) | 83.5 | 83.1 | 510 |
| SVM (RBF) | 100 | 77.3 | 70 |
| ANFIS | 99.7 | 66.6 | 600 |
| FBNN | 96.0 | 79.6 | 90 |
| ANFN | 86.5 | 85.9 | 2700 |

We also used the FBNN classifier, a well-known network used in many areas. This network has three layers with five neurons in the first layer. It can be seen that this network has an acceptable performance for the training data but, like ANFIS and the SVM with RBF kernels, it has a poor performance for the testing data. The ANFN was first trained on simulated data with two features. The classification boundary and the features are shown in Fig. 2. The feature space depicts a classical classification problem. The network is able to solve this problem by fitting a Gaussian-like boundary to separate the two classes. We then used the input data to train the network. After training, the network was evaluated on the testing data. As can be seen from Table I, despite its results for the training data, this network has the best results for the testing data and shows more robustness compared to the other methods. These algorithms were developed with MATLAB Ver. 7 software, and the computer processor was a Pentium IV, 3 GHz. The execution times are rounded and shown in Table I. The SVM with RBF kernels had the shortest execution time and the ANFN had the longest, due to the complexity of the network, which results from the large number of parameters that have to be trained. See also equations (4)-(7).

Fig.2. Feature space for simulated data and the classification boundaries

IV. CONCLUSION

Diagnosing epilepsy is a difficult task requiring observation of the patient, an EEG, and the gathering of additional clinical information. An artificial neural network that classifies subjects as having or not having an epileptic seizure provides a valuable diagnostic decision support tool for neurologists treating potential epilepsy, since differing etiologies of seizures result in different treatments. In this work, we proposed a neural fuzzy network for the classification of epileptic and normal EEG signals. This is a neural network that uses ANFIS networks as its activation functions. After segmentation, the EEG signals were decomposed into time-frequency representations using a five-level DWT. The energies of the details of the five levels and of the approximation of the fifth level were used as the classifier inputs. SVM, FBNN, ANFIS and the proposed network (ANFN) were implemented for the classification of EEG signals using DWT sub-bands as inputs. The results are shown in Table I. As can be seen, the proposed ANFN classifier has the best results on the testing data (85.9%). Because it uses both a neural network and a fuzzy system, the ANFN has the properties of both. It has the flexibility to handle a large scope of features extracted from other data sets.

REFERENCES

[1] Iasemidis, L. D., Shiau, D. S., Chaovalitwongse, W., Sackellares, J. C., Pardalos, P. M., & Principe, J. C. “Adaptive epileptic seizure prediction system,” IEEE Transactions on Biomedical Engineering, 50(5), 616–627, 2003.
[2] Adeli, H., Zhou, Z., & Dadmehr, N. “Analysis of EEG records in an epileptic patient using wavelet transform,” Journal of Neuroscience Methods, 123, 69–87, 2003.
[3] Kalayci, T., & Ozdamar, O. “Wavelet preprocessing for automated neural network detection of EEG spikes,” IEEE Engineering in Medicine and Biology Magazine, Mar/Apr, 160–166, 1995.
[4] Haselsteiner, E., & Pfurtscheller, G. “Using time-dependent neural networks for EEG classification,” IEEE Transactions on Rehabilitation Engineering, 8, 457–463, 2000.
[5] Petrosian, A., Prokhorov, D., Homan, R., Dashei, R., & Wunsch, D. “Recurrent neural network based prediction of epileptic seizures in intra- and extracranial EEG,” Neurocomputing, 30, 201–218, 2000.
[6] Kiymik, M. K., Akin, M., & Subasi, A. “Automatic recognition of alertness level by using wavelet transform and artificial neural network,” Journal of Neuroscience Methods, 139(2), 231–240, 2004.
[7] Subasi, A. “Automatic recognition of alertness level from EEG by using neural network and wavelet coefficients,” Expert Systems with Applications, 28, 701–711, 2005.
[8] Andrzejak, R. G., Widman, G., Lehnertz, K., Rieke, C., David, P., & Elger, C. E. “The epileptic process as nonlinear deterministic dynamics in a stochastic environment: an evaluation on mesial temporal lobe epilepsy,” Epilepsy Res., 44, 129–140, 2001.
[9] Haykin, S. Neural Networks: A Comprehensive Foundation, NJ: Prentice Hall, 1999.
[10] Jang, J. S. R. “ANFIS: adaptive-network-based fuzzy inference systems,” IEEE Transactions on Systems, Man, and Cybernetics, 23(3), 665–685, 1993.
[11] Becerikli, Y., Oysal, Y., & Konar, A. F. “Trajectory priming with dynamic fuzzy networks in nonlinear optimal control,” IEEE Transactions on Neural Networks, 15(2), 383–394, 2004.
[12] Subasi, A., & Erçelebi, E. “Classification of EEG signals using neural network and logistic regression,” Computer Methods and Programs in Biomedicine, 78, 87–99, 2005.
[13] Chaudhuri, B. B., & Bhattacharya, U. “Efficient training and improved performance of multilayer perceptron in pattern classification,” Neurocomputing, 34, 11–27, 2000.